eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Time-Course Analysis and Clustering of Gene Expression Data

Abstract

High-throughput time-course studies collect measurements from samples across time. In particular, longer-duration high-throughput time-course studies are becoming more common, such as in the case of 16S sequencing of bacterial communities or single-cell mRNA sequencing of developmental lineages. A common focus of these studies is per-gene significance testing. However, in many settings, particularly those studying developmental processes, large numbers of genes show temporal changes, and the relevant question is instead how to classify genes into different types of temporal changes. We propose a mixture-model clustering method that estimates a functional spline model for the mean of each cluster, in order to cluster the temporal patterns of genes independent of scale. The model allows for a wide range of likelihood models to suit a variety of data types. In addition, this clustering strategy accounts for time-course data under different experimental conditions or developmental lineages, and it provides a method for evaluating, per cluster, significant differences in temporal patterns between conditions. This integrates differential expression analysis with clustering. We demonstrate the benefits of our method using simulated data. In addition, we explore several real data sets to illustrate both the context for and the application of the mixture model.
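
To make the idea of clustering genes by temporal shape concrete, here is a minimal sketch of a mixture model in which each cluster mean is a spline curve over time. It is an illustration under simplifying assumptions, not the model developed in the thesis: it assumes a Gaussian likelihood with one shared variance, uses per-gene centering and scaling as a simple stand-in for scale independence, uses a cubic truncated-power basis with a single knot, and does not handle multiple conditions or per-cluster testing. The function names (`spline_basis`, `standardize_rows`, `fit_spline_mixture`) and the simulated data are hypothetical.

```python
import numpy as np


def spline_basis(t, knots):
    """Cubic truncated-power basis: 1, t, t^2, t^3, and (t - k)_+^3 per knot."""
    cols = [np.ones_like(t), t, t**2, t**3]
    cols += [np.clip(t - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)


def standardize_rows(Y):
    """Center and scale each gene so clustering depends on shape, not magnitude.

    Assumes no gene is constant over time (a zero standard deviation would
    divide by zero); such genes should be filtered out beforehand.
    """
    Y = Y - Y.mean(axis=1, keepdims=True)
    return Y / Y.std(axis=1, keepdims=True)


def squared_residuals(Z, B, beta):
    """Squared distance of every gene to every cluster mean curve, shape (N, K)."""
    mu = (B @ beta).T  # (K, T): one fitted spline curve per cluster
    return ((Z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)


def fit_spline_mixture(Y, t, n_clusters, knots=(0.5,), n_iter=50):
    """EM for a Gaussian mixture whose cluster means are spline curves.

    The default knot assumes time has been rescaled to [0, 1].
    """
    Z = standardize_rows(np.asarray(Y, dtype=float))
    B = spline_basis(np.asarray(t, dtype=float), knots)
    n_genes, n_times = Z.shape

    # Deterministic farthest-first initialization of the cluster curves.
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    beta = np.linalg.lstsq(B, np.array(centers).T, rcond=None)[0]  # (n_basis, K)
    log_pi = np.full(n_clusters, -np.log(n_clusters))
    sigma2 = 1.0

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene.
        log_r = log_pi - 0.5 * squared_residuals(Z, B, beta) / sigma2
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: because all genes share the basis B, the weighted least-squares
        # fit reduces to fitting the responsibility-weighted mean curve.
        weights = np.maximum(r.sum(axis=0), 1e-12)
        mean_curves = (r.T @ Z) / weights[:, None]  # (K, T)
        beta = np.linalg.lstsq(B, mean_curves.T, rcond=None)[0]
        sq = squared_residuals(Z, B, beta)
        sigma2 = max((r * sq).sum() / (n_genes * n_times), 1e-8)
        log_pi = np.log(weights / n_genes)

    return r.argmax(axis=1), beta


# Simulated example: 20 increasing and 20 decreasing genes at very different scales.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 12)
scales = rng.uniform(1.0, 10.0, size=(40, 1))
shapes = np.vstack([np.tile(t, (20, 1)), np.tile(1.0 - t, (20, 1))])
Y = scales * (shapes + rng.normal(0.0, 0.05, size=(40, 12)))
labels, beta = fit_spline_mixture(Y, t, n_clusters=2)
```

In this toy setting the per-gene standardization removes the tenfold differences in magnitude, so genes are grouped by whether they rise or fall over time rather than by how strongly they are expressed. Swapping the Gaussian likelihood for a count-based one, as the abstract's mention of multiple likelihood models suggests, would require a different E-step and M-step than the ones sketched here.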
