Identifying Differentially Expressed Genes for Time-course Microarray Data through Functional Data Analysis
- Author(s): Chen, Kun
- Wang, Jane-Ling
- et al.
Published Web Location: https://doi.org/10.1007/s12561-010-9024-z
Identification of differentially expressed (DE) genes across two conditions is a common task with microarray data. Most existing approaches accomplish this goal by examining each gene separately based on a model and then controlling the false discovery rate over all genes. We took a different approach that employs a uniform platform to simultaneously depict the dynamics of the gene trajectories for all genes and select differentially expressed genes. A new Functional Principal Component (FPC) approach is developed for time-course microarray data to borrow strength across genes. The approach is flexible because the temporal trajectory of the gene expressions is modeled nonparametrically through a set of orthogonal basis functions, and often fewer basis functions are needed to capture the shape of the gene expression trajectory than with existing nonparametric methods. These basis functions are estimated from the data and reflect its major modes of variation. The correlation structure of the gene expressions over time is also incorporated without any parametric assumptions and is estimated from all genes, so that information from other genes can be shared when making inference about an individual gene. Estimation of the parameters is carried out by an efficient hybrid EM algorithm. In simulations across different scenarios, the proposed method compared favorably with two-way mixed-effects ANOVA and the EDGE method using B-spline basis functions. Application to real data on C. elegans developmental stages also suggested that FPC analysis combined with the hybrid EM algorithm provides a computationally fast and efficient method for identifying DE genes based on time-course microarray data.
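To make the idea of data-driven basis functions concrete, the following is a minimal sketch of functional principal component analysis for trajectories observed on a common time grid. It is not the hybrid EM procedure described in the abstract: it simply eigendecomposes the sample covariance pooled over all genes, and it assumes every gene is measured at the same time points with no missing values. The function name `fpca_on_grid`, the synthetic data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fpca_on_grid(Y, n_components=2):
    """Plain FPCA for curves on a shared grid (illustrative assumption).

    Y has shape (n_genes, n_timepoints). Returns the mean curve, the
    leading eigenvectors (discretized basis functions), the per-gene
    scores, and the fraction of variance each component explains.
    """
    mean_curve = Y.mean(axis=0)
    centered = Y - mean_curve
    # Covariance over time, pooled across all genes.
    cov = centered.T @ centered / (Y.shape[0] - 1)
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]          # columns are orthonormal vectors
    scores = centered @ basis          # projection of each gene's curve
    explained = eigvals[order] / np.trace(cov)
    return mean_curve, basis, scores, explained


# Synthetic trajectories: two smooth modes of variation plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 12)
n_genes = 500
phi1 = np.sin(2 * np.pi * t)
phi2 = np.cos(2 * np.pi * t)
xi = rng.normal(size=(n_genes, 2)) * np.array([3.0, 1.0])
noise = rng.normal(scale=0.1, size=(n_genes, t.size))
Y = xi[:, [0]] * phi1 + xi[:, [1]] * phi2 + noise

mean_curve, basis, scores, explained = fpca_on_grid(Y, n_components=2)
print("variance explained by the first two components:", explained)
```

In this sketch a small number of components captures most of the variation because the synthetic curves were built from two modes. Each gene is then summarized by a short score vector, which is the sense in which information is pooled across genes before any per-gene comparison is made.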