- Mangul, Serghei;
- Yang, Harry Taegyun;
- Strauli, Nicolas;
- Gruhl, Franziska;
- Porath, Hagit T;
- Hsieh, Kevin;
- Chen, Linus;
- Daley, Timothy;
- Christenson, Stephanie;
- Wesolowska-Andersen, Agata;
- Spreafico, Roberto;
- Rios, Cydney;
- Eng, Celeste;
- Smith, Andrew D;
- Hernandez, Ryan D;
- Ophoff, Roel A;
- Santana, Jose Rodriguez;
- Levanon, Erez Y;
- Woodruff, Prescott G;
- Burchard, Esteban;
- Seibold, Max A;
- Shifman, Sagiv;
- Eskin, Eleazar;
- Zaitlen, Noah
High-throughput RNA-sequencing (RNA-seq) technologies provide an unprecedented opportunity to explore the individual transcriptome. Unmapped reads are a large and often overlooked output of standard RNA-seq analyses. Here, we present Read Origin Protocol (ROP), a tool for discovering the source of all reads originating from complex RNA molecules. We apply ROP to samples from 2630 individuals across 54 diverse human tissues. Our approach can account for 99.9% of 1 trillion reads of various read lengths. Additionally, we use ROP to investigate the functional mechanisms underlying connections between the immune system, microbiome, and disease. ROP is freely available at https://github.com/smangul1/rop/wiki.