eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

The Role of Data Quality and Heterogeneity on the Calibration of Neural Networks

Abstract

Neural networks have been widely studied and used in recent years due to their high classification accuracy and training efficiency. As network depth increases, however, the models become less well calibrated, meaning that their output probabilities no longer reflect the true probabilities. Yet in many applications, such as medical diagnosis, facial recognition, and self-driving cars, calibrated output probabilities are of critical importance. Understanding the causes of miscalibration in deep neural networks is therefore of considerable concern.

The influence of model structure on output calibration has been explored. However, the impact of training dataset quality and heterogeneity, such as dataset size and label noise, remains unclear. In this thesis, the impact of data quality and heterogeneity on output calibration is investigated theoretically and experimentally. Afterwards, the shortcomings of calibration methods that use a single global parameter are discussed. To overcome the calibration issues resulting from dataset heterogeneity, we propose an improved calibration technique that can give better performance.
