Seasonally Optimized Calibrations Improve Low-cost Sensor Performance: Long-term Field Evaluation of PurpleAir Sensors in Urban and Rural India
Lower-cost air pollution sensors can fill critical air quality data gaps in India, which experiences very high fine particulate matter (PM2.5) air pollution but has sparse regulatory air monitoring. Challenges for low-cost PM2.5 sensors in India include high aerosol mass concentrations and pronounced regional and seasonal gradients in aerosol composition. Here, we report on a detailed, long-term performance evaluation of a popular sensor, the PurpleAir PA-II, at multiple sites in India. We established three distinct sites in India across land-use categories and population density extremes (North India: Delhi [urban], Hamirpur [rural]; South India: Bangalore [urban]), where we collocated the PA-II with reference beta-attenuation monitors. We evaluated the performance of uncalibrated sensor data, and then developed, optimized, and evaluated calibration models using a comprehensive feature selection process focused on reproducibility in the Indian context. We assessed the seasonal and spatial transferability of sensor calibration schemes, which is especially important in India because of the paucity of reference instrumentation. Without calibration, the PA-II was moderately correlated with the reference signal (R²: 0.55-0.74) but was inaccurate (NRMSE ≥ 40%). Relative to uncalibrated data, parsimonious annual calibration models improved PA-II performance at all sites (cross-validated NRMSE 20-30%, R²: 0.82-0.95), and greatly reduced seasonal and diurnal biases. Because aerosol properties and meteorology vary regionally, the form of these long-term models differed among our sites, suggesting that local calibrations are desirable when possible. Using a moving-window calibration, we found that seasonally specific information improves performance relative to a static annual calibration model, while a short-term calibration model generally does not transfer reliably to other seasons.
Overall, we find that the PA-II can provide reliable PM2.5 data with better than ± 25% precision and accuracy when paired with a rigorous calibration scheme that accounts for seasonality and local aerosol composition.
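As an illustration of the evaluation metrics named in this abstract, the following minimal sketch fits a simple linear calibration to synthetic collocation data and compares NRMSE and R² before and after calibration. It rests on several assumptions that the abstract does not confirm: NRMSE is taken as RMSE divided by the mean reference concentration, R² as the squared Pearson correlation, and the calibration as a single-predictor least-squares fit. The function names and the synthetic data are hypothetical, and the fit is scored in-sample rather than cross-validated.

```python
import numpy as np


def nrmse(reference, estimate):
    """RMSE divided by the mean reference value (assumed normalization)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    rmse = np.sqrt(np.mean((estimate - reference) ** 2))
    return rmse / np.mean(reference)


def r_squared(reference, estimate):
    """Squared Pearson correlation between reference and estimate."""
    r = np.corrcoef(reference, estimate)[0, 1]
    return r ** 2


def fit_linear_calibration(sensor, reference):
    """Least-squares fit of reference = slope * sensor + intercept."""
    slope, intercept = np.polyfit(sensor, reference, deg=1)
    return slope, intercept


# Synthetic collocation data (illustrative only, not from the study):
# the "sensor" overestimates the reference and carries random noise.
rng = np.random.default_rng(0)
reference = rng.uniform(20.0, 300.0, size=500)
sensor = 1.8 * reference + 10.0 + rng.normal(0.0, 15.0, size=500)

slope, intercept = fit_linear_calibration(sensor, reference)
calibrated = slope * sensor + intercept

raw_nrmse = nrmse(reference, sensor)
cal_nrmse = nrmse(reference, calibrated)
raw_r2 = r_squared(reference, sensor)
cal_r2 = r_squared(reference, calibrated)

print(f"uncalibrated: NRMSE={raw_nrmse:.2f}, R2={raw_r2:.2f}")
print(f"calibrated:   NRMSE={cal_nrmse:.2f}, R2={cal_r2:.2f}")
```

Note that a linear rescaling leaves the Pearson-based R² unchanged while reducing NRMSE, which is one reason correlation and accuracy are reported separately.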
Mapping Air Pollution in Polluted Data-Sparse Environments: Resolving Spatial-Temporal PM2.5 Trends with Lower-cost Sensors in North India
North India faces the world’s highest air pollution health burden across the urban-rural divide. Ground-based monitoring is essential for validating models and conducting holistic exposure assessments, but the high costs of reference instruments have limited their adoption to megacities. Here, we demonstrate an analytical approach to scaling a lower-cost sensor (LCS) network in North India using the commercially available PurpleAir PA-II sensor package. Our key constraint was to balance data completeness with representativeness, given the dynamic nature of LCS performance both out-of-the-box and for long-term deployment. We developed and applied our models across 13 diverse land-use collocations in the Delhi National Capital Region, resulting in robust out-of-sample seasonal performance (mean bias < 10%). Validating our framework in Lucknow, another megacity in the region, demonstrated a high correlation between aggregated community-hosted LCS nodes and reference networks (Pearson’s r ≥ 0.9) and well-constrained annual PM2.5 estimates (≤ 10%). While our data pipeline was developed in North India, our statistical framework offers paths forward for scaling similar regional networks in polluted, sparsely monitored environments.
Air Pollution Management Urgently Needed in Non-urban North India
High emissions from a diverse source mixture and unfavorable meteorology have resulted in extreme fine particulate matter (PM2.5) air pollution in the Indo-Gangetic Plain (IGP) of North India. Most studies and abatement strategies have focused on densely populated megacities, but the majority of the population in the region (60-70%) resides in non-urban settlements. Therefore, although non-urban air quality is understudied in the IGP, characterizing it is critical for understanding the regional health burden of PM2.5 and developing informed policy. We implemented a lower-cost PM2.5 monitoring network of over 80 sites across more than 15 settlements using the popular PurpleAir sensor. Our network spanned megacities to regional background sites and represents the first ground-based observations of PM2.5 in most settlements. We observed sustained poor ambient air quality across the region (annual average PM2.5 concentration ≥ 60 µg m−3), with weak spatial gradients from urban core sites to regional background sites. Leveraging the high temporal resolution afforded by lower-cost sensors, we observed that although annual and seasonal trends were strongly correlated, diurnal patterns diverged across settlement population density strata. Megacities featured smooth diurnal profiles, with peaks later in the evening and larger intra-settlement variability, indicating the complex mixture of traffic, heavy industry, and other sources. Conversely, non-urban and small city sites featured earlier and higher magnitude diurnal peaks (1.5-1.7 × daily minimum), with low intra-settlement variability, likely indicating a higher relative prevalence of biomass burning. Policymakers should ensure clean air programs do not solely focus on relatively well-studied megacities at the expense of the large non-urban population distributed across the IGP.
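The diurnal peak magnitude quoted in this abstract, expressed as a multiple of the daily minimum, can be illustrated with a short sketch. It assumes that the diurnal profile is the mean concentration for each hour of day and that the ratio is the profile maximum divided by the profile minimum; the abstract does not specify either definition. The function names and the synthetic hourly series are hypothetical, and every hour of the day is assumed to have at least one observation.

```python
import numpy as np


def diurnal_profile(hours, pm25):
    """Mean concentration for each hour of day (0-23).

    Assumes every hour has at least one observation.
    """
    hours = np.asarray(hours, dtype=int)
    pm25 = np.asarray(pm25, dtype=float)
    return np.array([pm25[hours == h].mean() for h in range(24)])


def peak_to_min_ratio(profile):
    """Diurnal peak expressed as a multiple of the diurnal minimum."""
    return profile.max() / profile.min()


# Synthetic hourly series for 30 days (illustrative only): a flat
# background of 60 ug/m3 with an evening bump centered on 19:00.
hours = np.tile(np.arange(24), 30)
pm25 = 60.0 + 36.0 * np.exp(-0.5 * ((hours - 19) / 2.0) ** 2)

profile = diurnal_profile(hours, pm25)
ratio = peak_to_min_ratio(profile)
peak_hour = int(np.argmax(profile))

print(f"peak hour: {peak_hour}:00, peak-to-minimum ratio: {ratio:.2f}")
```

Comparing the peak hour and the ratio across sites grouped by settlement size is one simple way to summarize the divergence in diurnal patterns that the abstract describes.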