Background
Data missingness can bias interpretation and outcomes resulting from data use. We describe data missingness in the longest-standing US-based youth fitness surveillance system (2006/07-2019/20).
Methods
This observational study uses the New York City FITNESSGRAM (NYCFG) database, which covers 1,983,629 unique 4th-12th grade students (9,147,873 student-year observations) from 1,756 schools. NYCFG tests for aerobic capacity, muscular strength, and muscular endurance were administered annually. Mixed-effects models estimated the prevalence of missingness by demographic group and the associations between demographics and missingness.
Results
Across years, 20.1% of students were missing data from all three tests (11.7% of elementary, 15.6% of middle, and 36.3% of high school students). Missingness did not differ by sex but differed significantly by race/ethnicity and by student home neighborhood socioeconomic status (SES).
Conclusion
In the nation's largest youth fitness surveillance system, fitness data missingness is highest among high school students, more than one third of whom are missing data. Across all grade levels, non-Hispanic Black students and students with very poor home neighborhood SES have the highest odds of missing data.
Implications for school health
Strategies to better understand and ameliorate the causes of school-based fitness testing data missingness will increase overall data quality and begin to address health inequities in this critical metric of youth health.