eScholarship
Open Access Publications from the University of California

This series is home to publications and data sets from the Bourns College of Engineering at the University of California, Riverside.

Related Units

Center for Environmental Research and Technology

Properties and predicted functions of large genes and proteins of apicomplexan parasites.

(2024)

Evolutionary constraints greatly favor compact genomes that efficiently encode proteins. However, several eukaryotic organisms, including apicomplexan parasites such as Toxoplasma gondii, Plasmodium falciparum and Babesia duncani, the causative agents of toxoplasmosis, malaria and babesiosis, respectively, encode very large proteins, exceeding 20 times their average protein size. Although these large proteins represent <1% of the total protein pool and are generally expressed at low levels, their persistence throughout evolution raises important questions about their functions and possible evolutionary pressures to maintain them. In this study, we examined the trends in gene and protein size, function and expression patterns within seven apicomplexan pathogens. Our analysis revealed that certain large proteins in apicomplexan parasites harbor domains potentially important for functions such as antigenic variation, erythrocyte invasion and immune evasion. However, these domains are not limited to or strictly conserved within large proteins. While some of these proteins are predicted to engage in conventional metabolic pathways within these parasites, others fulfill specialized functions for pathogen-host interactions, nutrient acquisition and overall survival.

Oxidative Transformation of Nafion-Related Fluorinated Ether Sulfonates: Comparison with Legacy PFAS Structures and Opportunities of Acidic Persulfate Digestion for PFAS Precursor Analysis.

(2024)

The total oxidizable precursor (TOP) assay has been extensively used for detecting PFAS pollutants that do not have analytical standards. It uses hydroxyl radicals (HO•) from the heat activation of persulfate under alkaline pH to convert H-containing precursors to perfluoroalkyl carboxylates (PFCAs) for target analysis. However, the current TOP assay oxidation method does not apply to emerging PFAS because (i) many structures do not contain C-H bonds for HO• attack and (ii) the transformation products are not necessarily PFCAs. In this study, we explored the use of classic acidic persulfate digestion, which generates sulfate radicals (SO4•−), to extend the capability of the TOP assay. We examined the oxidation of Nafion-related ether sulfonates that contain C-H or -COO-, characterized the oxidation products, and quantified the F atom balance. The SO4•− oxidation greatly expanded the scope of oxidizable precursors. The transformation was initiated by decarboxylation, followed by various spontaneous steps, such as HF elimination and ester hydrolysis. We further compared the oxidation of legacy fluorotelomers using SO4•− versus HO•. The results suggest novel product distribution patterns, depending on the functional group and oxidant dose. The general trends and strategies were also validated by analyzing a mixture of 100,000- or 10,000-fold diluted aqueous film-forming foam (containing various fluorotelomer surfactants and organics) and a spiked Nafion precursor. Therefore, (1) the combined use of SO4•− and HO• oxidation, (2) the expanded list of standard chemicals, and (3) further elucidation of SO4•− oxidation mechanisms will provide more critical information to probe emerging PFAS pollutants.

Novel Anti-CRISPR-Assisted CRISPR Biosensor for Exclusive Detection of Single-Stranded DNA (ssDNA).

(2024)

Nucleic acid analysis plays an important role in disease diagnosis and treatment. The discovery of CRISPR technology has provided novel and versatile approaches to the detection of nucleic acids. However, the most widely used CRISPR-Cas12a detection platforms lack the capability to distinguish single-stranded DNA (ssDNA) from double-stranded DNA (dsDNA). To overcome this limitation, we first employed an anti-CRISPR protein (AcrVA1) to develop a novel CRISPR biosensor to detect ssDNA exclusively. In this sensing strategy, AcrVA1 cuts the CRISPR guide RNA (crRNA) to inhibit the cleavage activity of the CRISPR-Cas12a system. Only ssDNA can recruit the cleaved crRNA fragment to recover the detection ability of the CRISPR-Cas12a biosensor, whereas dsDNA cannot. By measuring the recovered cleavage activity of the CRISPR-Cas12a biosensor, our AcrVA1-assisted CRISPR biosensor is capable of distinguishing ssDNA from dsDNA, providing a simple and reliable method for the detection of ssDNA. Furthermore, we demonstrated that our AcrVA1-assisted CRISPR biosensor can be used to monitor the enzymatic activity of helicase and to screen its inhibitors.

High-fidelity, hyper-accurate, and evolved mutants rewire atomic-level communication in CRISPR-Cas9.

(2024)

The high-fidelity (HF1), hyper-accurate (Hypa), and evolved (Evo) variants of the CRISPR-associated protein 9 (Cas9) endonuclease are critical tools to mitigate off-target effects in the application of CRISPR-Cas9 technology. The mechanisms by which mutations in recognition subdomain 3 (Rec3) mediate specificity in these variants are poorly understood. Here, solution nuclear magnetic resonance and molecular dynamics simulations establish the structural and dynamic effects of high-specificity mutations in Rec3, and how they propagate the allosteric signal of Cas9. We reveal conserved structural changes and dynamic differences at regions of Rec3 that interface with the RNA:DNA hybrid, transducing chemical signals from Rec3 to the catalytic His-Asn-His (HNH) domain. The variants remodel the communication sourcing from the Rec3 α helix 37, previously shown to sense target DNA complementarity, either directly or allosterically. This mechanism increases communication between the DNA mismatch recognition helix and the HNH active site, shedding light on the structure and dynamics underlying Cas9 specificity and providing insight for future engineering principles.

A view of the pan‐genome of domesticated Cowpea (Vigna unguiculata [L.] Walp.)

(2024)

Cowpea, Vigna unguiculata L. Walp., is a diploid warm-season legume of critical importance as both food and fodder in sub-Saharan Africa. This species is also grown in Northern Africa, Europe, Latin America, North America, and East to Southeast Asia. To capture the genomic diversity of domesticates of this important legume, de novo genome assemblies were produced for representatives of six subpopulations of cultivated cowpea identified previously from genotyping of several hundred diverse accessions. In the most complete assembly (IT97K-499-35), 26,026 core and 4,963 noncore genes were identified, with 35,436 pan genes when considering all seven accessions. GO terms associated with response to stress and defense response were highly enriched among the noncore genes, while core genes were enriched in terms related to transcription factor activity, and transport and metabolic processes. Over 5 million single nucleotide polymorphisms (SNPs) relative to each assembly and over 40 structural variants >1 Mb in size were identified by comparing genomes. Vu10 was the chromosome with the highest frequency of SNPs, and Vu04 had the most structural variants. Noncore genes harbor a larger proportion of potentially disruptive variants than core genes, including missense, stop gain, and frameshift mutations; this suggests that noncore genes substantially contribute to diversity within domesticated cowpea.

Comprehensive assessment of 11 de novo HiFi assemblers on complex eukaryotic genomes and metagenomes.

(2024)

Pacific Biosciences (PacBio) HiFi sequencing technology generates long reads (>10 kbp) with very high accuracy (<0.01% sequencing error). Although several de novo assembly tools are available for HiFi reads, there are no comprehensive studies on the evaluation of these assemblers. We evaluated the performance of 11 de novo HiFi assemblers on (1) real data for three eukaryotic genomes; (2) 34 synthetic data sets with different ploidy, sequencing coverage levels, heterozygosity rates, and sequencing error rates; (3) one real metagenomic data set; and (4) five synthetic metagenomic data sets with different composition abundance and heterozygosity rates. The 11 assemblers were evaluated using the Quality Assessment Tool (QUAST) and Benchmarking Universal Single-Copy Orthologs (BUSCO). We also used several additional criteria, namely, completion rate, single-copy completion rate, duplicated completion rate, average proportion of largest category, average distance difference, quality value, run-time, and memory utilization. Results show that hifiasm and hifiasm-meta should be the first choice for assembling eukaryotic genomes and metagenomes with HiFi data. We performed a comprehensive benchmarking study of commonly used assemblers on complex eukaryotic genomes and metagenomes. Our study will help the research community to choose the most appropriate assembler for their data and identify possible improvements in assembly algorithms.
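
For readers who want to relate the error figure quoted in this abstract to a quality value, the sketch below assumes the standard Phred scaling, QV = -10 * log10(p), where p is the per-base error probability. The function names are illustrative and are not taken from the study, and the study's own "quality value" criterion may be computed differently.

```python
import math


def phred_qv(error_rate: float) -> float:
    """Convert a per-base error probability into a Phred-scaled quality value."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10.0 * math.log10(error_rate)


def error_rate_from_qv(qv: float) -> float:
    """Inverse conversion: Phred-scaled quality value back to error probability."""
    return 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # 0.01% sequencing error expressed as a probability is 1e-4.
    print(round(phred_qv(1e-4), 2))
    print(error_rate_from_qv(30.0))
```

Under this assumption, an error rate below 0.01% corresponds to a quality value above 40.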

Drug target prediction through deep learning functional representation of gene signatures.

(2024)

Many machine learning applications in bioinformatics currently rely on matching gene identities when analyzing input gene signatures and fail to take advantage of preexisting knowledge about gene functions. To further enable comparative analysis of OMICS datasets, including target deconvolution and mechanism of action studies, we develop an approach that represents gene signatures projected onto their biological functions, instead of their identities, similar to how the word2vec technique works in natural language processing. We develop the Functional Representation of Gene Signatures (FRoGS) approach by training a deep learning model and demonstrate that its application to the Broad Institute's L1000 datasets results in more effective compound-target predictions than models based on gene identities alone. By integrating additional pharmacological activity data sources, FRoGS significantly increases the number of high-quality compound-target predictions relative to existing approaches, many of which are supported by in silico and/or experimental evidence. These results underscore the general utility of FRoGS in machine learning-based bioinformatics applications. Prediction networks pre-equipped with the knowledge of gene functions may help uncover new relationships among gene signatures acquired by large-scale OMICS studies on compounds, cell types, disease models, and patient cohorts.
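
The contrast between matching gene identities and comparing functional representations can be illustrated with a toy sketch. This is not the FRoGS model: the gene names, the three-dimensional embedding values, and the averaging scheme are all hypothetical and chosen only to show why two signatures with no shared genes can still look similar once each gene is mapped to a vector describing its function.

```python
import math

# Hypothetical functional embeddings; names and values are made up for illustration.
GENE_EMBEDDINGS = {
    "GENE_A1": [0.9, 0.1, 0.0],
    "GENE_A2": [0.8, 0.2, 0.1],
    "GENE_B1": [0.0, 0.1, 0.9],
    "GENE_B2": [0.1, 0.0, 0.8],
}


def identity_overlap(sig1, sig2):
    """Jaccard overlap of gene identities (the identity-matching baseline)."""
    s1, s2 = set(sig1), set(sig2)
    return len(s1 & s2) / len(s1 | s2)


def signature_vector(genes):
    """Represent a signature as the mean of its genes' embedding vectors."""
    vectors = [GENE_EMBEDDINGS[g] for g in genes]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


if __name__ == "__main__":
    sig_x, sig_y, sig_z = ["GENE_A1"], ["GENE_A2"], ["GENE_B1", "GENE_B2"]
    # No shared gene identities, so the identity-based overlap is zero.
    print(identity_overlap(sig_x, sig_y))
    # The functional vectors of sig_x and sig_y point in nearly the same direction.
    print(cosine(signature_vector(sig_x), signature_vector(sig_y)))
    # sig_z is functionally distant from sig_x.
    print(cosine(signature_vector(sig_x), signature_vector(sig_z)))
```

In this toy setting the identity-based overlap cannot distinguish a functionally related pair from an unrelated one, while the vector-based similarity can.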

An alpha-helical lid guides the target DNA toward catalysis in CRISPR-Cas12a.

(2024)

CRISPR-Cas12a is a powerful RNA-guided genome-editing system that generates double-strand DNA breaks using its single RuvC nuclease domain by a sequential mechanism in which initial cleavage of the non-target strand is followed by target strand cleavage. How the spatially distant DNA target strand traverses toward the RuvC catalytic core is presently not understood. Here, continuous tens of microsecond-long molecular dynamics and free-energy simulations reveal that an α-helical lid, located within the RuvC domain, plays a pivotal role in the traversal of the DNA target strand by anchoring the crRNA:target strand duplex and guiding the target strand toward the RuvC core, as also corroborated by DNA cleavage experiments. In this mechanism, the REC2 domain pushes the crRNA:target strand duplex toward the core of the enzyme, while the Nuc domain aids the bending and accommodation of the target strand within the RuvC core by bending inward. Understanding of this critical process underlying Cas12a activity will enrich fundamental knowledge and facilitate further engineering strategies for genome editing.

On-Site Fluorescent Detection of Sepsis-Inducing Bacteria using a Graphene-Oxide CRISPR-Cas12a (GO-CRISPR) System.

(2024)

Sepsis is an extremely dangerous medical condition that emanates from the body's response to a pre-existing infection. Early detection of sepsis-inducing bacterial infections can greatly enhance the treatment process and potentially prevent the onset of sepsis. However, current point-of-care (POC) sensors are often complex and costly or lack the ideal sensitivity for effective bacterial detection. Therefore, it is crucial to develop rapid and sensitive biosensors for the on-site detection of sepsis-inducing bacteria. Herein, we developed a graphene oxide CRISPR-Cas12a (GO-CRISPR) biosensor for the detection of sepsis-inducing bacteria in human serum. In this strategy, single-stranded DNA (ssDNA) FAM probes were quenched with single-layer graphene oxide (GO). Target-activated Cas12a trans-cleavage was utilized for the degradation of the ssDNA probes, detaching the short ssDNA probes from GO and recovering the fluorescent signals. Under optimal conditions, we employed our GO-CRISPR system for the detection of Salmonella Typhimurium (S. Typhimurium) with a detection sensitivity as low as 3 × 10³ CFU/mL in human serum, as well as a good detection specificity toward other competing bacteria. In addition, the GO-CRISPR biosensor exhibited excellent sensitivity to the detection of S. Typhimurium in spiked human serum. The GO-CRISPR system offers superior rapidity for the detection of sepsis-inducing bacteria and has the potential to enhance the early detection of bacterial infections in resource-limited settings, expediting the response for patients at risk of sepsis.