SDEAP: a splice graph based differential transcript expression analysis tool for population data.
- Author(s): Yang, Ei-Wen
- Jiang, Tao
- et al.
Published Web Locationhttps://doi.org/10.1093/bioinformatics/btw513
Differential transcript expression (DTE) analysis without predefined conditions is critical to biological studies. For example, it can be used to discover biomarkers to classify cancer samples into previously unknown subtypes such that better diagnosis and therapy methods can be developed for the subtypes. Although several DTE tools for population data, i.e. data without known biological conditions, have been published, these tools either assume binary conditions in the input population or require the number of conditions as a part of the input. Fixing the number of conditions to binary is unrealistic and may distort the results of a DTE analysis. Estimating the correct number of conditions in a population could also be challenging for a routine user. Moreover, the existing tools only provide differential usages of exons, which may be insufficient to interpret the patterns of alternative splicing across samples and restrains the applications of the tools from many biology studies.We propose a novel DTE analysis algorithm, called SDEAP, that estimates the number of conditions directly from the input samples using a Dirichlet mixture model and discovers alternative splicing events using a new graph modular decomposition algorithm. By taking advantage of the above technical improvement, SDEAP was able to outperform the other DTE analysis methods in our extensive experiments on simulated data and real data with qPCR validation. The prediction of SDEAP also allowed us to classify the samples of cancer subtypes and cell-cycle phases more accurately.SDEAP is publicly available for free at https://github.com/ewyang089/SDEAP/wiki CONTACT: email@example.com; firstname.lastname@example.orgSupplementary information: Supplementary data are available at Bioinformatics online.