eScholarship
Open Access Publications from the University of California

UCLA

UCLA Previously Published Works

Generalized correlation measure using count statistics for gene expression data with ordered samples.

Author(s):

  • Wang, YX Rachel
  • Liu, Ke
  • Theusch, Elizabeth
  • Rotter, Jerome I
  • Medina, Marisa W
  • Waterman, Michael S
  • Huang, Haiyan
  • Stegle, Oliver
  • et al.

Published Web Location

https://doi.org/10.1093/bioinformatics/btx641
No data is associated with this publication.
Abstract

Motivation

Capturing association patterns in gene expression levels under different conditions or time points is important for inferring gene regulatory interactions. In practice, temporal changes in gene expression may result in complex association patterns that require more sophisticated detection methods than simple correlation measures. For instance, the effect of regulation may lead to time-lagged associations and interactions local to a subset of samples. Furthermore, expression profiles of interest may not be aligned or directly comparable (e.g. gene expression profiles from two species).

Results

We propose a count statistic for measuring association between pairs of gene expression profiles consisting of ordered samples (e.g. time-course data), where correlation may exist only locally, in subsequences separated by a position shift. The statistic is simple and fast to compute, and we illustrate its use in two applications. In a cross-species comparison of developmental gene expression levels, we show that our method not only measures the association of gene expression between the two species, but also provides an alignment between different developmental stages. In the second application, we apply our statistic to expression profiles from two distinct phenotypic conditions, where the samples in each profile are ordered by the associated phenotypic values. The detected associations can be useful in building correspondence between gene association networks under different phenotypes. On the theoretical side, we provide asymptotic distributions of the statistic for different regions of the parameter space and test its power on simulated data.
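
To make the idea of a count statistic with a position shift concrete, the following is a minimal illustrative sketch in Python. It is not the statistic defined in the paper, whose exact form is given in the article and its Supplementary Material. It assumes one simple variant: count the contiguous windows of length k whose rank patterns agree between two ordered profiles when the second profile is offset by a shift s. The names `rank_pattern` and `shifted_pattern_count`, the window length `k`, and the shift range `max_shift` are all hypothetical choices made for illustration.

```python
def rank_pattern(window):
    """Return the indices of `window` ordered by increasing value.

    Two windows with the same tuple have the same local up/down shape,
    regardless of the actual expression magnitudes.
    """
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def shifted_pattern_count(x, y, k=3, max_shift=2):
    """Count matching length-k rank patterns between x and y for each shift.

    Assumption (illustrative only): window i of x is compared with
    window i + s of y, for every shift s in [-max_shift, max_shift].
    Returns a dict mapping each shift to its count of matching windows.
    """
    px = [rank_pattern(x[i:i + k]) for i in range(len(x) - k + 1)]
    py = [rank_pattern(y[j:j + k]) for j in range(len(y) - k + 1)]

    counts = {}
    for s in range(-max_shift, max_shift + 1):
        matches = 0
        for i, pattern in enumerate(px):
            j = i + s
            # Skip windows that fall outside y after shifting.
            if 0 <= j < len(py) and pattern == py[j]:
                matches += 1
        counts[s] = matches
    return counts


if __name__ == "__main__":
    # Toy profiles: y repeats the start of x, delayed by two positions.
    x = [1, 3, 2, 5, 4, 6, 8, 7]
    y = [9, 0, 1, 3, 2, 5, 4, 6]
    counts = shifted_pattern_count(x, y, k=3, max_shift=2)
    best_shift = max(counts, key=counts.get)
    print(counts, best_shift)
```

In this toy setting, the shift with the largest count suggests how the two profiles might be aligned, which is the intuition behind using such a measure both to score association and to align ordered samples. Assessing whether a count is larger than expected by chance would require a null distribution, which this sketch does not provide.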

Availability and implementation

The code used to perform the analysis is available as part of the Supplementary Material.

Contact

msw@usc.edu or hhuang@stat.berkeley.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.
