eScholarship
Open Access Publications from the University of California

Using Spanish Surname Ratios to Estimate Proportion Hispanic via Bayes Theorem

Abstract

We offer a new tool for improving matches between surname and ethnicity, and illustrate it with data on Spanish surname matching drawn from the U.S. Census. We first show that there is no single, fixed proportion of the bearers of a given name who are Hispanic. How Hispanic a given name turns out to be depends on the overall Hispanicity of the population, which affects both the distribution of names and the conditional probability that the bearer of any given name is Hispanic. We then propose a simple approach that uses only two common names, one far more likely to be Hispanic and one far more likely to be non-Hispanic. Applying Bayes' Theorem to very limited data, here the ratio of those with the name ‘Garcia’ to those with the name ‘Anderson’, this approach yields remarkably accurate estimates of the size of Hispanic populations in California cities.
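The reasoning in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the four name frequencies (the share of Hispanics and of non-Hispanics bearing each name) are made-up placeholders rather than Census figures. The sketch assumes those within-group frequencies are the same in every city, so that only the Hispanic share varies.

```python
def posterior_hispanic(prior, p_name_given_h, p_name_given_n):
    """Bayes' Theorem: P(Hispanic | name) for a population whose
    Hispanic share is `prior`."""
    num = prior * p_name_given_h
    return num / (num + (1.0 - prior) * p_name_given_n)


def expected_ratio(prior, g_h, g_n, a_h, a_n):
    """Expected Garcia-to-Anderson count ratio when the Hispanic share is `prior`.

    g_h, g_n: share of Hispanics / non-Hispanics named Garcia (assumed known)
    a_h, a_n: share of Hispanics / non-Hispanics named Anderson (assumed known)
    """
    garcia = prior * g_h + (1.0 - prior) * g_n
    anderson = prior * a_h + (1.0 - prior) * a_n
    return garcia / anderson


def estimate_prior(ratio, g_h, g_n, a_h, a_n):
    """Invert expected_ratio: solve for the Hispanic share implied by an
    observed Garcia-to-Anderson ratio, clipped to [0, 1]."""
    denom = ratio * (a_h - a_n) - (g_h - g_n)
    p = (g_n - ratio * a_n) / denom
    return min(1.0, max(0.0, p))


# Placeholder frequencies, chosen only for illustration (NOT Census values).
G_H, G_N = 0.03, 0.0003    # Garcia among Hispanics / non-Hispanics
A_H, A_N = 0.0002, 0.004   # Anderson among Hispanics / non-Hispanics

# The same name is "more Hispanic" in a more Hispanic population.
low = posterior_hispanic(0.05, G_H, G_N)
high = posterior_hispanic(0.50, G_H, G_N)

# Round trip: a city that is 30% Hispanic produces a ratio that maps back to 0.30.
observed = expected_ratio(0.30, G_H, G_N, A_H, A_N)
recovered = estimate_prior(observed, G_H, G_N, A_H, A_N)
```

In this sketch `posterior_hispanic` makes the abstract's first point concrete: the probability that a bearer of a name is Hispanic moves with the population's Hispanic share. `estimate_prior` makes the second: a single observed ratio of two names is enough to solve for that share, provided the within-group name frequencies can be treated as known.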
