THE UC CLIOMETRIC HISTORY PROJECT AND FORMATTED OPTICAL CHARACTER RECOGNITION
Abstract
In what ways—and to what degree—have universities contributed to the long-run growth, health, economic mobility, and gender/ethnic equity of their students’ communities and home states? The University of California ClioMetric History Project (UC-CHP), based at the Center for Studies in Higher Education, extends prior research on this question in two ways. First, we have developed a novel digitization protocol—formatted optical character recognition (fOCR)—which transforms scanned structured and semi-structured texts like university directories and catalogs into high-quality computer-readable databases. We use fOCR to produce annual databases of students (1890s to 1940s), faculty (1900 to present), course descriptions (1900 to present), and detailed budgets (1911-2012) for many California universities. Digitized student records, for example, illuminate the high proportion of 1900s university students who were female and from rural areas, as well as large family income differences between male and female students and between students at public and private universities. Second, UC-CHP is working to photograph, process with fOCR, and analyze restricted student administrative records to construct a comprehensive database of California university students and their enrollment behavior. This paper describes UC-CHP’s methodology and provides technical documentation for the project, while also presenting examples of the range of data the project is exploring and prospects for future research.
This year the University of California celebrates the 150th anniversary of its establishment in 1868. This ROPS contribution is part of a series published this year by the Center for Studies in Higher Education on the history of the University of California and, more broadly, America’s unique investment and faith in public universities.