Ensemble learning of model hyperparameters and spatiotemporal data for calibration of low-cost PM2.5 sensors.
- Author(s): Yin, Peng-Yeng
- Tsai, Chih-Chun
- Day, Rong-Fuh
- Tung, Ching-Ying
- Bhanu, Bir
- et al.
Published Web Location: https://doi.org/10.3934/mbe.2019343
The PM2.5 air quality index (AQI) measurements from government-built supersites are accurate but cannot provide dense coverage of the monitored areas. Low-cost PM2.5 sensors can be deployed as a fine-grained internet-of-things (IoT) network to complement government facilities. Calibrating low-cost sensors against high-accuracy supersites is thus essential. Moreover, the imputation of missing values in training data may affect the calibration result, the best performance of a calibration model requires hyperparameter optimization, and the factors affecting PM2.5 concentrations, such as climate, geographical landscape, and anthropogenic activities, are uncertain in the spatial and temporal dimensions. In this paper, an ensemble learning approach for imputation method selection, calibration model hyperparameterization, and spatiotemporal training data composition is proposed. Three government supersites in central Taiwan are chosen for the deployment of low-cost sensors, and hourly PM2.5 measurements are collected for 60 days for the experiments. Three optimizers, the Sobol sequence, the Nelder-Mead method, and particle swarm optimization (PSO), are compared across various versions of the ensemble. The best calibration results are obtained with PSO, and the improvement ratios with respect to R², RMSE, and NME are 4.92%, 52.96%, and 56.85%, respectively.
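The abstract reports improvement ratios for three metrics without defining them. The sketch below shows one common way such figures could be computed from paired hourly readings. It rests on assumptions not stated in the abstract: R² is taken as the coefficient of determination, NME as the sum of absolute errors divided by the sum of reference values, and the improvement ratio as the relative change from the uncalibrated to the calibrated score. All function names and the sample readings are hypothetical and are not taken from the paper.

```python
import math

def r2(obs, pred):
    # Coefficient of determination (assumed definition of R^2).
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse(obs, pred):
    # Root mean squared error between reference and sensor readings.
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nme(obs, pred):
    # Normalized mean error (assumed: total absolute error / total reference).
    return sum(abs(o - p) for o, p in zip(obs, pred)) / sum(obs)

def improvement(before, after, higher_is_better=False):
    # Relative change in percent; the sign convention is an assumption.
    change = after - before if higher_is_better else before - after
    return 100.0 * change / abs(before)

# Made-up illustrative readings in ug/m^3, not data from the study.
supersite = [10.0, 20.0, 30.0, 40.0]   # reference measurements
raw = [14.0, 26.0, 33.0, 47.0]         # uncalibrated low-cost sensor
calibrated = [11.0, 19.0, 31.0, 41.0]  # sensor after calibration

for name, metric, higher in (("R2", r2, True), ("RMSE", rmse, False), ("NME", nme, False)):
    before = metric(supersite, raw)
    after = metric(supersite, calibrated)
    print(f"{name}: {before:.3f} -> {after:.3f} "
          f"({improvement(before, after, higher):.2f}% improvement)")
```

Under these conventions a positive ratio always means the calibrated readings track the supersite more closely, whether the metric is one to maximize (R²) or to minimize (RMSE, NME).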