The coronavirus genome encodes four structural proteins that are incorporated into virions: spike (S), envelope (E), nucleocapsid (N), and membrane (M). M is the most abundant structural protein and is known to drive viral particle assembly and budding through direct protein-protein interactions. Throughout the COVID-19 pandemic caused by the virus SARS-CoV-2, M has displayed high sequence conservation, making it a compelling target for vaccines and therapeutics. Earlier studies of viral particles from related betacoronaviruses had suggested that M may be able to adopt two distinct conformations, a compact form and an elongated form, yet prior to 2019 its structure and the molecular basis for its functions in infectious particle formation were unknown. The work presented in this dissertation focuses on the structural characterization of the M protein from SARS-CoV-2, both to gain insight into how M functions in the viral life cycle and to leverage these findings to develop new avenues to combat disease.
First, we develop routines to express and purify full-length M protein and use cryo-electron microscopy (cryo-EM) to determine its structure in nanodiscs, describing its architecture in a lipid environment. We show that M is a 50 kDa homodimer that is structurally homologous to an accessory SARS-CoV-2 protein, ORF3a, suggesting that the two share a common ancestor. Subtle structural differences between the two proteins lend insight into how they have evolved to fill drastically different roles in the coronavirus life cycle. One striking feature of M is its overwhelmingly electropositive cytosolic surface, which is likely important for protein-protein interactions in viral assembly, as well as for allowing close contact with the dense core of viral RNA. Molecular dynamics simulations show that this conformation is stable in a simple lipid bilayer. Together, these results yield the first atomic model of a coronavirus M protein and provide insight into roles for M in viral assembly and structure.
The next chapter of this dissertation describes the structural basis and biophysical characterization of a neutralizing antibody that targets the N-terminus of the SARS-CoV-2 M protein. We used biolayer interferometry (BLI) to show that the SARS37 neutralizing antibody specifically recognizes the alternate long conformation of M, rather than the short form we had previously characterized in nanodiscs. We used cryo-EM to determine the structure of the SARS37 Fab in complex with long-form M and show that the epitope involves residues from the N-termini of both subunits, as well as the outer surface of the transmembrane domain, which the Fab reaches by inserting a long CDR3 loop into the lipid bilayer. In agreement with our BLI data, we found that the complex formed between the SARS37 Fab and M in the short conformation was structurally heterogeneous and unstable. Importantly, we found that the residues of M that the Fab recognizes are highly conserved, remaining unchanged as the virus has evolved. Given the strong and stable preference of SARS37 for the long conformation, we infer that this form of M is present in circulating viral particles presented to the immune system and can be targeted to prevent further infection.
The last chapter of this dissertation investigates the role that specific M-lipid interactions may play in the process of viral assembly. We show that M directly binds a Golgi-resident anionic lipid, ceramide-1-phosphate (C1P), and investigate the effect that C1P binding has on M conformational dynamics. We find an overall stabilizing effect of C1P binding to M in the short conformation and describe a specific C1P binding site that involves M residues spanning the transmembrane, hinge, and C-terminal domains. Comparing our C1P-bound structure with the previously determined apo short-conformation M structures, we observe a previously uncharacterized potassium ion binding site and a series of side chain rotamer changes distributed across the C-terminal domain and dimer interface. We design point mutants at these sites and use AlphaFold to show that the predicted models exist in a continuum between the established short and long conformations. The residues involved are highly conserved among sarbecoviruses, suggesting a potential conserved principle underlying M protein conformational dynamics.