Background: The National Children's Study (NCS) is a prospective epidemiological study in the USA tasked with identifying a nationally representative sample of 100,000 children and following them from gestation until they are 21 years of age. The objective of the study is to measure environmental and genetic influences on growth, development, and health. Determining the ancestry of NCS participants is important for assessing the diversity of the study population and for examining the effect of ancestry on various health outcomes.
Results: We estimated the genetic ancestry of a convenience sample of 641 parents enrolled at the 7 original NCS Vanguard sites by analyzing 30,000 markers on exome arrays, using the 1000 Genomes Project superpopulations as reference populations, and compared these estimates with self-reported ethnicity and race. For 99% of the individuals, self-reported ethnicity and race agreed with the predicted superpopulation. NCS individuals self-reporting as Asian had genetic ancestry matching either the South Asian or the East Asian group, while those reporting as Hispanic White and those reporting as Hispanic Other had similar genetic ancestry. Of the 33 individuals who self-reported as Multiracial or Non-Hispanic Other, 33% matched the South Asian or East Asian groups, whereas these groups accounted for only 4.4% of individuals in the other reported categories.
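The agreement figure reported above is a concordance rate between each individual's self-reported category and the superpopulation predicted from genotype data. A minimal sketch of that calculation follows. The category-to-superpopulation mapping, the category labels other than those named in the text, and the toy records are all assumptions made for illustration; they are not the study's actual coding scheme or data.

```python
# Hypothetical mapping from self-reported categories to 1000 Genomes
# superpopulation codes (AFR, AMR, EAS, EUR, SAS). This mapping is an
# assumption for illustration, not the study's definition of agreement.
EXPECTED = {
    "Non-Hispanic White": {"EUR"},
    "Non-Hispanic Black": {"AFR"},
    "Hispanic White": {"AMR"},
    "Hispanic Other": {"AMR"},
    "Asian": {"EAS", "SAS"},
}


def concordance(records):
    """Return (agreeing, assessed, rate) for (self_report, predicted) pairs.

    Categories without an entry in EXPECTED (for example Multiracial)
    have no single expected superpopulation and are skipped.
    """
    agreeing = 0
    assessed = 0
    for self_report, predicted in records:
        expected = EXPECTED.get(self_report)
        if expected is None:
            continue
        assessed += 1
        if predicted in expected:
            agreeing += 1
    rate = agreeing / assessed if assessed else float("nan")
    return agreeing, assessed, rate


# Invented toy records, not study data.
toy = [
    ("Non-Hispanic White", "EUR"),
    ("Non-Hispanic Black", "AFR"),
    ("Asian", "SAS"),
    ("Hispanic White", "AMR"),
    ("Hispanic Other", "EUR"),
    ("Multiracial", "EAS"),
]

agreeing, assessed, rate = concordance(toy)
print(f"{agreeing}/{assessed} concordant ({rate:.0%})")
```

Skipping unmapped categories keeps groups such as Multiracial out of the denominator, which is one possible convention; a different treatment of those groups would change the rate.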
Conclusions: Our data suggest that self-reported ethnicity and race have some limitations in accurately capturing Hispanic and South Asian populations. Overall, however, our data indicate that, despite the complexity of the US population, individuals know their ancestral origins and that self-reported ethnicity and race are a reliable indicator of genetic ancestry.