Lawrence Berkeley National Laboratory
Transcript level and sequence determinants of protein abundance and noise in Escherichia coli.
- Author(s): Guimaraes, Joao
- Rocha, Miguel
- Arkin, Adam
- et al.
Published Web Location: https://doi.org/10.1093/nar/gku126
The range over which a protein is expressed and its cell-to-cell variability are often thought to be linked to the demand for its activity. Steady-state protein level is determined by multiple mechanisms controlling transcription and translation, many of which are limited by DNA- and RNA-encoded signals that affect initiation, elongation and termination of polymerases and ribosomes. We performed a comprehensive analysis of >100 sequence features to derive a predictive model composed of a minimal non-redundant set of factors explaining 66% of the total variation of protein abundance observed in >800 genes in Escherichia coli. The model suggests that protein abundance is primarily determined by the transcript level (53%) and by effectors of translation elongation (12%), whereas only a small fraction of the variation is explained by translational initiation (1%). Our analyses uncover a previously undescribed sequence determinant affecting translation initiation and suggest that elongation rate is affected by both codon biases and specific amino acid composition. We also show that transcription and translation efficiency may have an effect on expression noise, which is more similar than previously assumed.
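The percentages quoted above (53% + 12% + 1% = 66%) read like an additive split of explained variance across groups of features. The abstract does not say how the split was computed, so the following is only a minimal sketch of one common approach: fit nested least-squares models and record the gain in R² as each feature group is added. The function names, the group labels, and the data are all hypothetical; the data are synthetic random numbers, not measurements from the paper, and the numbers printed will not match the paper's values.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of the variance in y explained by a least-squares fit on X."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

def incremental_r2(groups, y):
    """Add feature groups in the given order and record each group's R^2 gain."""
    gains, used, prev = {}, [], 0.0
    for name, X in groups:
        used.append(X)
        total = r_squared(np.column_stack(used), y)
        gains[name] = total - prev
        prev = total
    return gains, prev

# Synthetic stand-in data (assumption): 800 "genes", three feature groups.
rng = np.random.default_rng(0)
n = 800
transcript = rng.normal(size=(n, 1))
elongation = rng.normal(size=(n, 2))
initiation = rng.normal(size=(n, 1))
protein = (
    1.0 * transcript[:, 0]
    + 0.4 * elongation[:, 0]
    + 0.3 * elongation[:, 1]
    + 0.1 * initiation[:, 0]
    + rng.normal(scale=0.8, size=n)  # unexplained variation
)

gains, total = incremental_r2(
    [("transcript", transcript), ("elongation", elongation), ("initiation", initiation)],
    protein,
)
for name, gain in gains.items():
    print(f"{name}: {gain:.1%}")
print(f"total explained: {total:.1%}")
```

In a decomposition of this kind the per-group gains always sum to the total R², but the share credited to each group depends on the order in which groups are added whenever the features are correlated, so the ordering is itself a modelling choice.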