SARS-CoV-2 antibody magnitude and detectability are driven by disease severity, timing, and assay

Serosurveillance studies are critical for estimating SARS-CoV-2 transmission and immunity, but interpretation of results is currently limited by poorly defined variability in the performance of antibody assays to detect seroreactivity over time in individuals with different clinical presentations. We measured longitudinal antibody responses to SARS-CoV-2 in plasma samples from a diverse cohort of 128 individuals over 160 days using 14 binding and neutralization assays. For all assays, we found a consistent and strong effect of disease severity on antibody magnitude, with fever, cough, hospitalization, and oxygen requirement explaining much of this variation. We found that binding assays measuring responses to spike protein had consistently higher correlation with neutralization than those measuring responses to nucleocapsid, regardless of assay format and sample timing. However, assays varied substantially with respect to sensitivity during early convalescence and in time to seroreversion. Variations in sensitivity and durability were particularly dramatic for individuals with mild infection, who had consistently lower antibody titers and represent the majority of the infected population, with sensitivities often differing substantially from reported test characteristics (e.g., amongst commercial assays, sensitivity at 6 months ranged from 33% for ARCHITECT IgG to 98% for VITROS Total Ig). Thus, the ability to detect previous infection by SARS-CoV-2 is highly dependent on the severity of the initial infection, timing relative to infection, and the assay used. These findings have important implications for the design and interpretation of SARS-CoV-2 serosurveillance studies.


INTRODUCTION
Despite advances in severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2) prevention and treatment, the novel coronavirus continues to infect individuals at an unprecedented rate. Because vaccination programs remain limited in scope, millions of individuals worldwide continue to rely upon natural post-infection immunity for protection from reinfection. Serosurveillance studies measuring the prevalence of antibodies to SARS-CoV-2 have been and will continue to be a key means for estimating transmission over time and extrapolating potential levels of immunity in populations, though precise correlates of protection have yet to be established. However, limited available data on the sensitivity of antibody assays to detect prior infection, particularly in appropriately representative populations and over time, make it difficult to accurately interpret results from these studies.[1] For these reasons, longitudinal characterization of antibody responses following SARS-CoV-2 infection with a range of clinical presentations is an important research gap and will be critical to interpreting seroepidemiological data and informing public health responses to the pandemic.

Infection with SARS-CoV-2 is associated with substantial variability in disease presentation, with severity ranging from asymptomatic infection to the need for high-level oxygen support and mechanical ventilation.[2,3] There appear to be important relationships between the severity of illness and the magnitude and durability of the antibody response,[4-12] but limited data are available evaluating the contributions of demographic factors and clinical features. Numerous platforms are available for the detection of antibody responses to SARS-CoV-2, which rely on different viral antigens and utilize different assay methods, and there is no guarantee that they will provide comparable data. With a few notable exceptions,[6,7] most studies to date have produced antibody data from a single or limited number of platforms to evaluate antibody responses following infection.[8-10,13-15] Comparisons across platforms and assay format differences (e.g., direct vs. indirect detection), including the correlation between binding assays and neutralization capacity, have thus far been limited.[4,7,16]

Here, we characterize the antibody responses to SARS-CoV-2 among a diverse cohort [...]

medRxiv preprint doi: https://doi.org/10.1101/2021.03.03.21251639; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

[...] between 14 and 90 days after onset of COVID-19 symptoms and are offered monthly visits until 4 months after illness onset; they are then seen every 4 months thereafter. Clinical data from the initial LIINC study visit was used for this analysis. At this visit, LIINC participants underwent a detailed clinical interview conducted by a study physician or research coordinator using a standardized data collection instrument. Demographic data collected included age, sex, gender, race, ethnicity, education level, income level, and housing status. Data related to SARS-CoV-2 infection included the date and circumstances of diagnosis, illness, and treatment history. Each participant was asked to estimate the date of symptom onset in relation to the timing of their first SARS-CoV-2 nucleic acid amplification test result. Participants were questioned regarding the presence, duration (in days), and current status of a list of COVID-19 symptoms and additional somatic symptoms derived from the Patient Health Questionnaire,[17] as well as measures of quality of life derived from the EQ-5D-5L Instrument.[18] We determined from medical records whether each individual was hospitalized (defined as spending >24 hours in the emergency department or hospital) and whether they required supplemental oxygen, admission to an intensive care unit (ICU), or mechanical ventilation. Past medical history was ascertained and concomitant medications recorded.

At each visit, blood was collected by venipuncture. Serum and plasma were isolated via centrifugation of non-anticoagulated and heparinized blood, respectively, and stored at -80°C.
For the current analyses, we included 128 participants who were enrolled between April and July 2020 and who had at least one measurement on a binding assay or neutralization platform.

[...] for the pseudovirus. Non-commercial research use assays included the Luciferase Immunoprecipitation Systems (LIPS) assay (total Ig) targeting the nucleocapsid (N) protein and the receptor binding domain (RBD) of the spike (S) protein performed in the Burbelo laboratory (additional raw data on responses to full S protein, highly correlated with RBD responses, are included in Supplementary Tables 2 and 3),[19] the split luciferase assay (total Ig) targeting N and S performed in the Wells laboratory,[20] and the Luminex assay (IgG) targeting N (one full-length and one fragment), S, and RBD performed in the Greenhouse laboratory.

For the research use Luminex assay, we used a published protocol with modifications.[21] Plasma samples were diluted to 1:100 in blocking buffer A (1xPBS, 0.05% Tween, 0.5% bovine serum albumin, 0.02% sodium azide). Antigens were produced using previously described constructs.[22,23] Antigen concentrations used for COOH-bead coupling were as follows: S, 4 µg/mL; RBD, 2 µg/mL; and N, 3 µg/mL. Concentration values were calculated from the Luminex median fluorescent intensity (MFI) using a plate-specific standard curve consisting of serial dilutions of a pool of positive control samples.
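As an illustration of how an MFI reading might be converted to a relative concentration from a plate-specific standard curve, the sketch below interpolates linearly on a log-log scale between neighboring standards. The interpolation method, the function name `concentration_from_mfi`, and all plate values are assumptions made for illustration; the text does not specify how the curve was fit.

```python
import math

def concentration_from_mfi(mfi, standards):
    """Interpolate a relative concentration from an MFI on a log-log scale.

    standards: list of (relative_concentration, mfi) pairs from the plate's
    serial dilution of the positive control pool (hypothetical values).
    Returns None when the MFI lies outside the range covered by the curve.
    """
    # Sort the standards by log10(MFI) so neighboring points bracket a reading.
    pts = sorted((math.log10(m), math.log10(c)) for c, m in standards)
    x = math.log10(mfi)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            frac = (x - x0) / (x1 - x0)
            return 10 ** (y0 + frac * (y1 - y0))
    return None  # outside the curve, e.g. a sample that needs further dilution

# Hypothetical plate: 4-fold serial dilution of the positive control pool.
plate_standards = [
    (1.0, 20000.0),
    (0.25, 6000.0),
    (0.0625, 1500.0),
    (0.015625, 400.0),
]

reading = concentration_from_mfi(3000.0, plate_standards)
```

A reading above the highest standard returns `None` here, which mirrors the need to dilute and rerun such samples rather than extrapolate.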
Any samples with MFIs above the linear range of the standard curve were serially diluted and rerun until values fell within range to obtain a relative concentration. A cutoff for positivity was established for each antigen above the maximum concentration value observed across 114 pre-pandemic SARS-CoV-2 negative control samples tested on the platform.

Comparing individuals across assays and estimating time to seroreversion: For each assay, we fit a linear mixed effects model that included a patient-specific random intercept. Given the longitudinal nature of our data set, we fit mixed effects models to explicitly account for the repeated measurement of individuals over time. We log-transformed the response variable for a subset of the assays based on assessment of their correlations with log-transformed neutralization titers (Supplementary Table 1 [...] represents an individual-level random effect that is normally distributed with a mean of 0 and a standard deviation of τ, and e_hi represents the residual error that is normally distributed with a mean of 0 and a standard deviation of σ. We also considered models that included additional fixed effects for covariates such as age, ethnicity, sex, and HIV status (Supplementary Table 4). We did not find consistent differences in the slope of the antibody responses (λ) by hospitalization status across the majority of the assays used here; therefore we used a single slope for each assay throughout (Supplementary Table 5).
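One way to write a random-intercept model consistent with this description is sketched below. Only λ, τ, σ, and e_hi come from the text. The indexing (i for individual, h for visit), the severity-specific mean α_{s(i)}, the symbol a_i for the random effect, and the centering of time at day 21 are assumptions, not the authors' notation.

```latex
\begin{aligned}
y_{hi} &= \alpha_{s(i)} + \lambda\,(t_{hi} - 21) + a_i + e_{hi},\\
a_i &\sim \mathcal{N}(0, \tau^2), \qquad e_{hi} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here y_hi would be the (possibly log-transformed) response of individual i at visit h, t_hi the number of days since symptom onset, and λ the single slope shared across severity classes on a given assay.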
Since the timing of the baseline visit was variable between individuals, in order to directly compare the magnitude of measured responses for individuals on each assay, we used the mixed-effects model to estimate the antibody response that each person would have at 21 days post symptom onset (random intercept).[24] We also used the model estimates to calculate the mean time to seroreversion T_s for severity class s on each assay, given the cutoff value for positivity (Supplementary Table 1), as follows: [...] We performed bootstrapping to obtain 95% confidence intervals of T_s for each of the 14 assays. We used the time to seroreversion as the outcome here rather than alternative quantities such as the half-life, as the serologic responses obtained here did not all necessarily represent direct measurements of antibody titers. These models were fit using the lme4 package in the R statistical software (https://www.R-project.org/).

Random forests modeling of demographic/clinical predictors and antibody responses: For each assay, we used random forests to model antibody responses based on 50 demographic and clinical predictors (Supplementary Table 6). We dichotomized the antibody response for each individual on each assay based on whether their estimated random intercept was in the top half or the bottom half of all fitted random intercepts on that assay.
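Two of the calculations described above can be illustrated with a short sketch: the time at which a linearly declining mean response crosses the positivity cutoff, and the median split of day-21 estimates into high and low responders. The function names, the linear-decline formula, and the example values are assumptions for illustration, not the authors' code. The analysis code itself is in the repository cited in the text.

```python
from statistics import median

def time_to_seroreversion(day21_mean, slope, cutoff, ref_day=21.0):
    """Day post symptom onset at which the mean response reaches the cutoff.

    Assumes mean(t) = day21_mean + slope * (t - ref_day), with all quantities
    on the model scale (log-transformed where the assay was log-transformed).
    Returns None if the response is not declining.
    """
    if slope >= 0:
        return None
    return ref_day + (cutoff - day21_mean) / slope

def dichotomize(intercepts):
    """Map each individual to True if their estimated day-21 response is in
    the top half of all fitted values on that assay, otherwise False."""
    mid = median(intercepts.values())
    return {pid: value > mid for pid, value in intercepts.items()}

# Hypothetical values on a log10 scale for one assay and one severity class.
t_s = time_to_seroreversion(day21_mean=2.0, slope=-0.01, cutoff=1.0)
labels = dichotomize({"p1": 1.0, "p2": 2.0, "p3": 3.0, "p4": 4.0})
```

A confidence interval for T_s could then be obtained by repeating the fit and this calculation on bootstrap resamples, as the text describes.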
We first fit models to these dichotomized antibody responses using all available predictors; we then refit them on a down-selected set of predictors chosen by variable importance (i.e., mean decrease in accuracy). We quantified prediction accuracy using the out-of-bag error rate and the area under the curve (AUC). These models were built using the randomForest package in the R statistical software (version 3.5.3).

Estimating time-varying assay sensitivity: For each assay, we fit an extension of the linear mixed effects models described in Equation 1 above in a Bayesian hierarchical modeling framework, where we allowed the standard deviation of the random intercept (τ) to be severity-specific (now referred to as [...]

Data and code to reproduce all analyses are available at: https://github.com/EPPIcenter/liinc-Ab-dynamics.

Ethical considerations
All participants signed a written informed consent form. The study was approved by the University of California, San Francisco Institutional Review Board.

Participant demographics and characteristics
As shown in Table 1, the cohort of 128 participants had an average age of 48 years (range 19-85 years) and was relatively balanced in terms of sex (45% female at birth); 26% of participants self-identified as being of Latinx ethnicity, a group that has been identified as being at risk for COVID-19. Common medical comorbidities were hypertension (23%), lung disease (16%), and diabetes (13%). Notably, 18 individuals (14%) were [...]

[...] participants were tested using at least one of the 14 assays evaluated; 171 samples from 88 individuals were tested using all 14 assays (Supplementary Table 1 [...]

Substantial heterogeneity in antibody responses across individuals and assays
We observed substantial heterogeneity in measured antibody responses in individuals [...] Table 7). When comparing antibody levels between individuals, responses were very heterogeneous, with some individuals mounting strong responses for all assays and others with weak responses even at the initial visit (below the positivity cutoff for some assays).

Strong correlation between binding and neutralization assays
We observed high levels of correlation between estimated antibody levels at 21 days [...] were consistently higher between binding assays using the same antigenic target (S/RBD vs. N) than between those using different targets, despite the variety in platforms used and the measurement of responses to both targets on some platforms (LIPS, Luminex, split luciferase). Titers of neutralizing antibodies correlated well with all binding assays (range: 0.60 to 0.88) and correlated most highly with responses to the S protein (range: 0.76 to 0.88), as might be expected given the expression of S protein on the pseudovirus used in the neutralization assay (Figure 2B, Supplementary Figure 3). We found no substantive differences in correlations between binding and neutralization assays at timepoints before vs. after 90 days, suggesting these relationships did not appreciably change over the duration of observed follow-up (Supplementary Table 8).

Disease severity is strongly associated with the magnitude of antibody responses
Baseline antibody responses for each study participant showed remarkably consistent patterns across all assays when stratified by severity class, with asymptomatic individuals having the lowest responses, hospitalized individuals having the highest, and symptomatic but not hospitalized individuals having intermediate responses (Figure 3). While the number of asymptomatic individuals was small, responses were significantly lower in these individuals than those who were symptomatic but not hospitalized for multiple assays; hospitalized individuals had significantly higher responses than both other groups for all assays with the notable exception of the neutralization assay (Supplementary Table 9). Despite these consistent patterns, there was still substantial variation in the magnitude of responses between participants within each severity category. Notably, age, sex, HIV status, and Latinx ethnicity showed little association with antibody responses after adjusting the analysis for hospitalization (Supplementary Table 4).

Need for hospitalization, cough, and fever are key predictors of antibody responses
We next examined which of 50 individual demographic and clinical variables were the strongest predictors of the magnitude of the antibody response (top vs. bottom half of responders for each assay, Supplementary Figure 4) using a random forests algorithm. Among the entire cohort (n=128), the presence and duration of cough and fever, and need for hospitalization and supplemental oxygen during the initial illness, were the most important predictors of the antibody response (Figure 4A). The ranks of their importance varied subtly but were largely consistent across the 14 assays evaluated, and random forests models including only these 6 variables were able to predict high versus low magnitude of response on each assay with reasonably high accuracy (AUCs [...]

[...] each other and, particularly for those measuring responses to spike protein, with pseudovirus neutralization. For all assays, we found a consistent, strong, and dose-dependent effect of disease severity on antibody magnitude. Despite these similarities, assays performed quite differently in terms of sensitivity to detect prior infection and in the durability of measured responses, leading to large discrepancies in sensitivity between assays in the months following infection. Thus, the ability to detect previous infection by SARS-CoV-2 using an antibody test is highly dependent on the severity of the initial infection, when the sample is obtained relative to infection, and the assay used.

Prior work has shown that antibody responses in individuals with symptomatic COVID-19 have in some cases been associated with disease severity.[4-12] We observed significant variability in antibody responses between study participants which was largely explained by the self-reported symptom constellation and the severity of the acute illness. A few simple variables consistently predicted the magnitude of the antibody responses across multiple assay platforms and antigen targets; these symptoms (e.g., fever, cough) are similar to those recently described in a population-based Icelandic cohort.[6] Importantly, in contrast to that cohort, characteristics like age and sex were not predictive of these responses, after accounting for disease severity. We also observed substantial heterogeneity between assays in terms of overall [...]

[...] scales of data that we have here, may not be accurate for extrapolation into the distant future as antibody responses often follow more complex dynamics of boosting and waning over time.[28] Finally, it is important to recognize that assays optimally suited for serosurveillance may not be equally suitable for other use-cases, such as identifying recent infection, detecting reinfection, determining protective capacity, or determining potency of COVID-19 convalescent plasma.[29] Evaluating the performance of assays for each of these use-cases will require different study designs and sample sets.

As SARS-CoV-2 vaccination becomes a reality, many serosurveillance efforts will need to increasingly rely upon assays that can distinguish vaccination from natural infection, [...]

In this study, we demonstrated substantial differences in the detectability of antibody responses to SARS-CoV-2 related to illness severity, time since infection, and assay platform. These results will be important in choosing and interpreting serologic assays for evaluating infection and immunity in population surveillance studies.

ACKNOWLEDGEMENTS
We are grateful to the LIINC study participants and to the clinical staff who provided care to these individuals during their acute illness period. We acknowledge LIINC study [...]

[...] Table 1.

[...] Negative predictive values shown are based on the estimated assay sensitivities for non-hospitalized individuals in Figure 5B, for a range of prevalence between 5% and 50% (x-axis). Lower panels show the same data with a smaller range in the y axis to visualize small differences.

Figure axis labels (predictor variables): heart, preg, hyper, auto, runny, vomit_days, diarr, diarr_days, symp, cancer, gender, short_days, race_BAA, sex, short, vomit, race_AIAN, runny_days, calcage, smell_days, nausea_days, venti, race_PNTA, smell, race_NHOPI, race_A, ethnicity, walkwst, fatig_days, chills, nausea, kidney, sore, hiv, fatig, lung, diab, chills_days, activwst, sore_days, race_W, icu, washwst, muscle, head_days, head, n_symptoms, muscle_days, BMI, fever, fever_days, oxy, cough
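The negative predictive values described in the figure legend above depend on sensitivity, specificity, and prevalence through Bayes' rule. The sketch below shows that relationship for the 5% to 50% prevalence range. The specificity of 0.99 is an assumed placeholder, and the two sensitivities are the 6-month extremes quoted in the abstract, used only for illustration rather than as the values plotted in the figure.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """NPV = P(not previously infected | negative test), via Bayes' rule."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)

# Assumed specificity (hypothetical); sensitivities are illustrative only.
assumed_specificity = 0.99
for sensitivity in (0.33, 0.98):
    for prevalence in (0.05, 0.25, 0.50):
        npv = negative_predictive_value(sensitivity, assumed_specificity, prevalence)
        print(f"sensitivity={sensitivity:.2f} prevalence={prevalence:.2f} NPV={npv:.3f}")
```

Because the false-negative term scales with prevalence, differences in sensitivity between assays matter most for NPV at the high end of the prevalence range.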