Improved Algorithms for Predicting Polyadenylation Sites and Cell Membranes From Expression, Sequence, and Image Data
- Author(s): Arefeen, Ashraful
- Advisor(s): Jiang, Tao
- et al.
Alternative polyadenylation (polyA) sites near the 3' end of a pre-mRNA create multiple mRNA transcripts with different 3' untranslated regions (3' UTRs). The sequence elements of a 3' UTR are essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding, and translation efficiency. Moreover, numerous studies have reported a correlation between diseases and the shortening (or lengthening) of 3' UTRs. Because alternative polyA sites are common in mammalian genes, we develop two algorithms, named TAPAS and DeepPASTA, that predict polyA sites from two different kinds of data: RNA-Seq expression data and sequence data. TAPAS detects novel polyA sites of a gene from RNA-Seq reads by treating read coverage as time series data. The method is then extended to identify polyA sites that are differentially expressed between two biological samples and genes that contain 3' UTRs with shortening/lengthening events. DeepPASTA, on the other hand, predicts polyA sites from sequence and RNA secondary structure data using a deep learning framework. As polyadenylation is a tissue-specific event, the tool also predicts tissue-specific polyA sites. In addition, it can predict the most dominant (i.e., most frequently used) polyA site of a gene in a specific tissue, as well as the relative dominance of two given polyA sites of the same gene. Our extensive experiments demonstrate that both TAPAS and DeepPASTA significantly outperform the existing tools in polyA site analysis.
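To make the idea of reading coverage as a time series more concrete, the following is a minimal sketch of a single change-point search over a per-base coverage vector. It is an illustration under stated assumptions, not the TAPAS algorithm itself: the function name `best_change_point`, the two-segment piecewise-constant model, and the toy coverage values are all hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def best_change_point(coverage: Sequence[float]) -> int:
    """Return the split index k (1 <= k < n) that minimizes the total
    squared deviation of coverage[:k] and coverage[k:] from their own means.

    Hypothetical helper for illustration only; not part of TAPAS.
    """
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two positions to place a split")

    total = float(sum(coverage))
    total_sq = float(sum(x * x for x in coverage))

    best_k, best_cost = 1, float("inf")
    left_sum = 0.0
    left_sq = 0.0
    for k in range(1, n):
        # Extend the left segment by one position.
        x = coverage[k - 1]
        left_sum += x
        left_sq += x * x
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Sum of squared errors of a segment = sum(x^2) - (sum(x))^2 / length.
        cost = (left_sq - left_sum ** 2 / k) + (right_sq - right_sum ** 2 / (n - k))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Toy per-base read coverage along a 3' UTR (made-up numbers):
# coverage is high upstream and drops sharply after the fifth position.
toy_coverage = [30, 31, 29, 30, 32, 5, 4, 6, 5, 4]
candidate_site = best_change_point(toy_coverage)
print(candidate_site)
```

On the toy vector the split lands at index 5, the position where coverage drops, which is the kind of abrupt change a coverage-based method could flag as a candidate polyA site. A practical tool would need to handle several change points per gene, noisy coverage, and annotation boundaries, none of which this sketch attempts.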
Cells and their internal organelles carry genetic information in all living organisms. An effective way to study cells and their organelles at different time points is to analyze fluorescent microscopic images of tissues. As a result, computer-automated analyses of such microscopic images are becoming popular because of their efficiency and the minimal human interaction they require. One of the most important of these analyses is the prediction of cell membranes from cell nucleus data. We propose a new tool, named DeepCEP, to predict cell membranes from nuclei using fluorescent microscopic image data. Our experiments demonstrate that DeepCEP is a potentially useful tool for analyzing microscopic images in practice.