- Ochoa, David;
- Jarnuczak, Andrew F;
- Viéitez, Cristina;
- Gehre, Maja;
- Soucheray, Margaret;
- Mateus, André;
- Kleefeldt, Askar A;
- Hill, Anthony;
- Garcia-Alonso, Luz;
- Stein, Frank;
- Krogan, Nevan J;
- Savitski, Mikhail M;
- Swaney, Danielle L;
- Vizcaíno, Juan A;
- Noh, Kyung-Min;
- Beltrao, Pedro
Protein phosphorylation is a key post-translational modification regulating protein function in almost all cellular processes. Although tens of thousands of phosphorylation sites have been identified in human cells, approaches to determine the functional importance of each phosphosite are lacking. Here, we manually curated 112 datasets of phospho-enriched proteins, generated from 104 different human cell types or tissues. We re-analyzed the 6,801 proteomics experiments that passed our quality control criteria, creating a reference phosphoproteome containing 119,809 human phosphosites. To prioritize functional sites, we used machine learning to identify 59 features indicative of proteomic, structural, regulatory or evolutionary relevance and integrate them into a single functional score. Our approach identifies regulatory phosphosites across different molecular mechanisms, processes and diseases, and reveals genetic susceptibilities at a genomic scale. Several regulatory phosphosites were experimentally validated; among them, phosphosites in SMARCC2, a member of the SWI/SNF chromatin-remodeling complex, were found to have a role in neuronal differentiation.