Proteins are fundamental building blocks of life: understanding protein structure, function, aggregation, and degradation is, therefore, one of the central questions in biology. My work investigates protein aggregation and degradation through computational modeling, protein structure network analysis, and experimental verification.
One theme of my work is the discovery of new enzymes from the carnivorous plant \textit{Drosera capensis} (\textit{D. capensis}). As genomic data continue to expand, it is imperative to move swiftly from raw genomic data to chemical results. Using the ``target selection pipeline'' that we invented, protein structures can be predicted rapidly in silico to direct the experimental characterization of promising candidates. Network analysis of these structures then predicts protein properties such as potential enzyme activity, enzyme specificity, and functional pH range, aiding the selection of functionally useful proteins for experimental characterization. This approach illustrates a generally applicable way to leverage the wealth of information provided by whole genome shotgun sequencing for proteomics. Computational techniques, despite their limitations, are now powerful enough to allow potentially useful proteins to be identified directly from the genome and filtered for strong indicators of biochemical function. So far, this work has resulted in three publications, covering proteases, chitinases, and esterases/lipases.
The protease resistance of amyloid fibrils and their central role in more than 40 human diseases, including Alzheimer's, make them an attractive target for testing the activity of new proteases from \textit{D. capensis}. To advance and streamline scientific discovery related to amyloid fibrils, it was crucial to have a standardized nomenclature. With collaborators, I introduced a systematic approach to the nomenclature of fibril topology, using graph-theoretic concepts to abstract the structure. The scheme encompasses all amyloid fibrils currently in the Protein Data Bank (PDB) and can be easily extended to accommodate newer discoveries. The work also showed that the vast majority of known fibril structures fall into just three topological categories, something that had previously gone unnoticed. My work has improved the discussion of fibril structures by condensing the descriptions of complicated structural features into a set of universal structural motifs.
The other theme of my work is solving the structure of J2 crystallin, an aggregation-resistant protein. J2 crystallin is a novel eye lens protein that is highly expressed in \textit{Tripedalia cystophora} (box jellyfish) and is an interesting target because of its very high stability and water solubility. Unlike most non-cephalopod invertebrates, box jellyfish have camera-type eyes; their crystallins are therefore interesting from an evolutionary biology perspective and an intriguing model system for vertebrates. A Basic Local Alignment Search Tool (BLAST) search of J2 against the Protein Data Bank (PDB) found no proteins above 32\% similarity. The structure determination of J2 is therefore important not only from an evolutionary standpoint but also because the lack of known homology suggests the hypothesis that J2 possesses a novel protein fold. Here, I present the biophysical characterization and solution-state NMR assignments of J2 crystallin, a previously uncharacterized eye lens protein, addressing the interplay of sequence, structure, and function in the eye lens crystallins.