- Heins, Richard A;
- Cheng, Xiaoliang;
- Nath, Sangeeta;
- Deng, Kai;
- Bowen, Benjamin P;
- Chivian, Dylan C;
- Datta, Supratim;
- Friedland, Gregory D;
- D’Haeseleer, Patrik;
- Wu, Dongying;
- Tran-Gyamfi, Mary;
- Scullin, Chessa S;
- Singh, Seema;
- Shi, Weibing;
- Hamilton, Matthew G;
- Bendall, Matthew L;
- Sczyrba, Alexander;
- Thompson, John;
- Feldman, Taya;
- Guenther, Joel M;
- Gladden, John M;
- Cheng, Jan-Fang;
- Adams, Paul D;
- Rubin, Edward M;
- Simmons, Blake A;
- Sale, Kenneth L;
- Northen, Trent R;
- Deutsch, Samuel
Harnessing the biotechnological potential of the large number of proteins available in sequence databases requires scalable methods for functional characterization. Here we propose a workflow to address this challenge by combining phylogenomic-guided DNA synthesis with high-throughput mass spectrometry, and apply it to the systematic characterization of GH1 β-glucosidases, a family of enzymes necessary for biomass hydrolysis, an important step in the conversion of lignocellulosic feedstocks to fuels and chemicals. We synthesized and expressed 175 GH1s, selected from over 2000 candidate sequences to cover maximum sequence diversity. These enzymes were functionally characterized over a range of temperatures and pH values using nanostructure-initiator mass spectrometry (NIMS), generating over 10,000 data points. When combined with HPLC-based sugar profiling, we observed GH1 enzymes active over a broad temperature range and toward many different β-linked disaccharides. For some GH1s we also observed activity toward laminarin, a more complex oligosaccharide present as a major component of macroalgae. An area of particular interest was the identification of GH1 enzymes compatible with the ionic liquid 1-ethyl-3-methylimidazolium acetate ([C2mim][OAc]), a next-generation biomass pretreatment technology. We thus searched for GH1 enzymes active at 70 °C and 20% (v/v) [C2mim][OAc] over the course of a 24-h saccharification reaction. Using our unbiased approach, we identified multiple enzymes of different phylogenetic origin with such activities. Our approach of characterizing sequence diversity through targeted gene synthesis coupled to high-throughput screening technologies is a broadly applicable paradigm for a wide range of biological problems.
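
The abstract does not say how the 175 GH1s were picked from the candidate pool beyond "to cover maximum sequence diversity". As a purely illustrative sketch, and not the authors' phylogenomic procedure, one common way to pick a diverse subset is greedy farthest-point selection: start from one sequence and repeatedly add the candidate that is farthest from everything already chosen. The function names `p_distance` and `select_diverse`, the use of pre-aligned equal-length sequences, and the simple mismatch-fraction distance are all assumptions made for this example.

```python
from typing import Dict, List


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned sequences.

    Assumes both sequences come from the same alignment (equal length).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def select_diverse(aligned: Dict[str, str], k: int) -> List[str]:
    """Greedy farthest-point selection of up to k sequence IDs.

    Hypothetical stand-in for a diversity-driven selection step; a real
    pipeline would more likely work from a phylogenetic tree.
    """
    ids = sorted(aligned)  # sorted for deterministic behaviour
    if k <= 0 or not ids:
        return []

    chosen = [ids[0]]  # arbitrary but reproducible seed
    # Distance from every candidate to its nearest already-chosen sequence.
    nearest = {i: p_distance(aligned[i], aligned[chosen[0]]) for i in ids}

    while len(chosen) < min(k, len(ids)):
        remaining = [i for i in ids if i not in chosen]
        # Pick the candidate that is farthest from the current selection.
        nxt = max(remaining, key=lambda i: nearest[i])
        chosen.append(nxt)
        # Update nearest-chosen distances with the newly added sequence.
        for i in remaining:
            nearest[i] = min(nearest[i], p_distance(aligned[i], aligned[nxt]))

    return chosen


if __name__ == "__main__":
    # Toy aligned "sequences"; real input would be a GH1 protein alignment.
    toy = {"a": "AAAA", "b": "AAAT", "c": "TTTT", "d": "AATT"}
    print(select_diverse(toy, 3))
```

In this toy run the near-duplicate `b` is skipped in favour of the more distant `c` and `d`, which is the behaviour one would want when a fixed synthesis budget has to span a large candidate set.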