eScholarship
Open Access Publications from the University of California

UC San Francisco Previously Published Works

SuPreMo: a computational tool for streamlining in silico perturbation using sequence-based predictive models

Published Web Location

https://www.biorxiv.org/content/10.1101/2023.11.03.565556v1
No data is associated with this publication.
Creative Commons 'BY-NC-SA' version 4.0 license
Abstract

Summary

The rapid development of sequence-based machine learning models has increased the demand for tools that manipulate sequences for use as model input. However, existing approaches to editing and evaluating genome sequences with models have limitations, such as incompatibility with structural variants, difficulty identifying the sequence perturbations responsible for a predicted effect, and the need for VCF file inputs and phased data. To address these bottlenecks, we present Sequence Mutator for Predictive Models (SuPreMo), a scalable and comprehensive tool for performing and supporting in silico mutagenesis experiments. We then demonstrate how pairs of reference and perturbed sequences can be used with machine learning models to prioritize pathogenic variants or discover new functional sequences.
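To make the idea of a reference and perturbed sequence pair concrete, here is a minimal Python sketch. It is not SuPreMo's implementation or API: the function names `apply_variant` and `make_pair`, the 0-based coordinate convention, and the choice to pad or trim at the right end so both sequences keep the same length are all assumptions made for illustration. Sequence-based models generally expect fixed-length input, which is why the sketch keeps the pair length-matched.

```python
def apply_variant(reference: str, pos: int, ref: str, alt: str) -> str:
    """Replace the REF allele at 0-based `pos` with ALT (hypothetical helper)."""
    if reference[pos:pos + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference sequence")
    return reference[:pos] + alt + reference[pos + len(ref):]


def make_pair(reference: str, pos: int, ref: str, alt: str, pad: str = "N"):
    """Return a (reference, perturbed) pair of equal length.

    Assumption for this sketch: length changes are absorbed at the right
    end, trimming after an insertion and padding with `pad` after a deletion.
    """
    perturbed = apply_variant(reference, pos, ref, alt)
    target = len(reference)
    if len(perturbed) > target:
        perturbed = perturbed[:target]            # insertion: trim the right end
    else:
        perturbed = perturbed.ljust(target, pad)  # deletion: pad the right end
    return reference, perturbed


reference = "ACGTACGTAC"
snv_pair = make_pair(reference, 2, "G", "T")        # single-nucleotide change
deletion_pair = make_pair(reference, 2, "GTA", "G")  # 2-bp deletion
insertion_pair = make_pair(reference, 2, "G", "GTT")  # 2-bp insertion
```

Each pair could then be passed to a predictive model, and the difference between the two predictions used as a score for the variant. How best to align, pad, or shift sequences around a structural variant is a design decision that this sketch deliberately simplifies.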

Availability and implementation

SuPreMo is written in Python and can be run with a single line of code to generate both sequences and 3D genome disruption scores. The codebase, instructions for installation and use, and tutorials are available on the GitHub page: https://github.com/ketringjoni/SuPreMo.
