Computational Techniques to Investigate Structural Variation
- Author(s): Kinsella, Marcus Christopher
- et al.
The importance of structural variation as a source of phenotypic variation has become more and more apparent in recent years. At the same time, tools and techniques that detect structural variation using high-throughput data have proliferated. These trends have spurred interest in making increasingly sophisticated inferences about structural variation, including identifying complex or difficult-to-observe variants and elucidating the biological mechanisms that produce structural variants. Here, we identify several challenging problems in the investigation of structural variation and discuss computational techniques that solve them. First, we examine the discovery of fusion genes in the transcriptome using paired-end reads, a task complicated by reads that map to multiple locations in the genome. Earlier methods ignored these reads to control false discoveries. We demonstrate a method to resolve these ambiguous mappings and increase the sensitivity of fusion gene detection. Second, we investigate whether the breakage-fusion-bridge mechanism leaves a reliable footprint in high-throughput data, a question that had largely been addressed using ad hoc analyses. Using novel algorithms and simulation, we identify the surprisingly limited circumstances under which the presence of breakage-fusion-bridge can be inferred. Finally, we examine evidence for the phenomenon known as chromothripsis, the shattering and reannealing of chromosomes. We show that there are alternative hypotheses that can account for the structural variation patterns that form the currently proposed signature of chromothripsis.