UC Santa Barbara
Fitting Mixed Effects Models with Big Data
- Author(s): He, Jingyi
- Advisor(s): Wang, Yuedong
- et al.
As technology evolves, big data brings great opportunities to identify patterns that could not previously be detected from observations.
At the same time, it challenges statisticians to analyze massive data sets and transform them into knowledge. Many existing implementations of traditional statistical methods cannot cope with the volume of big data. Our research is motivated by the need to fit linear mixed effects (LME) models to big data.
Subsampling and divide-and-conquer (D&C) methods have been proposed for analyzing big data. In this thesis, we focus on subsampling and D&C methods for fitting LME models to big data.
We start with the one-way random effects model in Chapter 2 and consider several ways to estimate its parameters: subsampling of subjects, subsampling of both subjects and repeated measurements, and D&C methods. Estimation procedures, statistical properties, and simulation results are presented. After comparing the estimators from the different methods for the one-way random effects model, we consider subsampling of subjects and the D&C method for the random intercepts model and the general linear mixed effects model in Chapters 3 and 4, respectively. Comparisons of the different methods are provided at the end of each chapter. Overall, we find that the D&C method performs better than the subsampling methods. Finally, in Chapter 5, we apply the subsampling and D&C methods to investigate the relationship between ultraviolet radiation and blood pressure.
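
To make the two strategies concrete, the following is a minimal sketch for the one-way random effects model y_ij = mu + a_i + e_ij. The abstract does not state which estimators the thesis uses, so this sketch assumes a balanced design, ANOVA (method-of-moments) estimators, and simple averaging of the block estimates in the D&C step. All function names are hypothetical and are not taken from the thesis.

```python
import random
from statistics import fmean


def simulate(n_subjects, n_reps, mu, sigma_a, sigma_e, seed=0):
    """Simulate balanced data y_ij = mu + a_i + e_ij (one row per subject)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        a_i = rng.gauss(0.0, sigma_a)  # subject-level random effect
        data.append([mu + a_i + rng.gauss(0.0, sigma_e) for _ in range(n_reps)])
    return data


def anova_estimates(data):
    """ANOVA estimates (mu, sigma_a^2, sigma_e^2) for balanced data.

    Assumed estimator for illustration only; uses E[MSB] = sigma_e^2 + n*sigma_a^2
    and E[MSW] = sigma_e^2.
    """
    m, n = len(data), len(data[0])
    subject_means = [fmean(row) for row in data]
    grand_mean = fmean(subject_means)
    msb = n * sum((s - grand_mean) ** 2 for s in subject_means) / (m - 1)
    msw = sum(
        (y - s) ** 2 for row, s in zip(data, subject_means) for y in row
    ) / (m * (n - 1))
    sigma_a2 = max((msb - msw) / n, 0.0)  # truncate negative estimates at zero
    return grand_mean, sigma_a2, msw


def subsample_subjects(data, size, seed=0):
    """Subsampling of subjects: estimate from a simple random sample of rows."""
    rng = random.Random(seed)
    return anova_estimates(rng.sample(data, size))


def divide_and_conquer(data, n_blocks):
    """D&C: split subjects into blocks, estimate per block, average the results."""
    blocks = [data[k::n_blocks] for k in range(n_blocks)]
    estimates = [anova_estimates(block) for block in blocks]
    return tuple(fmean(e[i] for e in estimates) for i in range(3))
```

For example, `divide_and_conquer(simulate(2000, 5, 10.0, 2.0, 1.0), 10)` uses every observation while touching only 200 subjects at a time, whereas `subsample_subjects(data, 200)` discards the remaining subjects. That difference in the amount of data used is one plausible reason for a D&C estimator to be less variable, although the sketch itself does not establish the thesis's conclusion.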