Does nonbinary they inherit the binary pronoun production system?

The English pronoun system is undergoing a change in progress as singular they is used more frequently to refer to specific individuals, especially those who identify as nonbinary. How does this change affect the language production system? Research has shown that the production of he/she pronouns is supported by salient discourse status and inhibited in contexts where the pronoun would be ambiguous. In an analysis of naturally-occurring written texts, we test whether they production patterns with he/she production, controlling for discourse context. Results show that the overall rate of pronoun use is lower for references to nonbinary individuals than for references to binary individuals. This difference is not explained by the potential ambiguity of a referent in context. We speculate that relative unfamiliarity with nonbinary they and nonbinary gender may inhibit the activation of they during production, or may lead writers to avoid using a form that may not be familiar to their addressees.


Introduction
Pronouns are some of the most frequent words in language, and they create cohesion between utterances. But recently English pronouns have also been recruited for a social function, where identifying one's pronouns (e.g. "My pronouns are she/her") signals gender identity (Out and Equal, 2021). This practice is inter-related with a growing awareness that gender identity is not always identifiable based on appearance, names, or social role. An increasing number of individuals have identified their personal pronouns as they/them, signaling a nonbinary or gender-queer identity. While nontraditional gender identities have been around for a long time (Vincent & Manzano, 2017), the recent salience of this trend led the American Dialect Society to adopt "they" as the word of the decade in 2019.
What this means is that the pronominal system is changing. It is not novel that they is used in a singular sense; singular they has been used for centuries (Baron, 2020; Bjorkman, 2017; Conrod, 2020; Konnelly & Cowper, 2020; Nunberg, 2016), especially in cases where the referent is quantified or ungendered, e.g. Everyone…they, or If you know a student who is absent please send them a recording of today's lecture. Singular they can also refer to specific individuals, but is more common when the person's identity is not contextually critical. For example, it is more acceptable when the referent is "socially distant" (my dentist…they) than when the referent is someone the speaker knows personally (my friend…they; Camilliere et al., 2021). But the use that attracts the most attention is when it refers to a specific and salient individual (Ackerman, 2019; Bjorkman, 2017; Conrod, 2020; Konnelly & Cowper, 2020), e.g. "Sam Smith has opened up about how they feel since publicly coming out as non-binary," (Young, 2020). This form is socially appropriate (and for many, required) in contexts where the referent is a person who identifies as nonbinary and uses they/them as personal pronouns, so we will use the term "nonbinary they" to refer to this specific linguistic usage.
This change parallels a dramatic re-drawing of the conceptual representations of gender, and has thus drawn substantial attention. For transgender people and allies, using someone's appropriate pronouns is a critical signal of respect, leading many to advocate for the adoption of they. Yet others criticize the use of singular they as either ungrammatical or needlessly ambiguous, for example "The reader enters a minefield of confusion when any individual or any group might be referred to as they or them," (Flynn, 2020). Thus, use of this form is variable across individuals.
This change in progress raises questions about the psycholinguistic processes behind reference production. Every instance of referring requires speakers and writers to make a choice about how specific to be: Should I say Kamala Harris? The Vice President? She? Even though nonbinary they is a relatively new form, it is poised to inherit the same well-practiced decision-making system that allows speakers to choose between he or she and more explicit expressions.
The question we ask here is how people choose between singular third-person pronouns (he, she, they) and more explicit names or descriptions. Decades of research on he and she have revealed two major factors that drive reference production. First, pronouns (and other reduced expressions) are used in discourse contexts where the referent is already available, e.g. when the referent is given (recently mentioned or evoked) vs. new (Ariel, 1990;Chafe, 1976;Givon, 1983;Gundel et al., 1993;Prince, 1981). Pronouns are also more frequent for referents that have a prominent or topical discourse status, for example referents that were mentioned recently (Givon, 1983), or occurred as subject or first-mentioned referent in the previous sentence (Arnold et al., 2000;Gernsbacher et al., 1989;Jarvikivi et al., 2005). The effect of discourse status is observed cross-linguistically, including in languages where pronouns do not mark gender (Gundel et al., 1993;Hwang, 2020), or in languages where the functional equivalent to English pronouns are null references (e.g., Spanish; Medina-Fetterman et al., 2022).
The second major consideration for pronoun production is whether the discourse context includes competitor referents (Ariel, 1990; Givon, 1983; Hwang, 2020). For example, when referring to Mickey, people are less likely to use a pronoun in a story about Mickey and Donald than in a matched story about Mickey and Daisy (Arnold & Griffin, 2007; Francik, 1985). This might reflect a desire to avoid ambiguity: the pronoun he is situationally ambiguous in the presence of two male referents, but not in a context with one male and one female (but for an alternate explanation see Arnold & Griffin, 2007; Fukumura et al., 2011; Fukumura, Hyona, & Scholfield, 2013). This idea is related to claims that speakers aim to be communicatively cooperative by producing referential expressions that are informative for the context (Davies & Katsos, 2010; Deutsch & Pechmann, 1982; Engelhardt, Bailey, & Ferreira, 2006; Olson, 1970; Pogue, Kurumada, & Tanenhaus, 2016), for example using modifiers like "blue cup" in contexts with more than one cup (for a review see Davies & Arnold, 2019).
In sum, research shows that both 1) discourse status and 2) the presence of competitor referents affect the production of pronouns. We strongly predict that discourse status should affect they use, given its widespread constraint on reference cross-linguistically. Yet relatively little is known about how speakers and writers use nonbinary they in English. Here we ask two questions: First, is nonbinary they chosen as frequently as binary he/she, controlling for discourse context? Nonbinary they is relatively new to mainstream American English communities, and individuals vary highly in their acceptance and use of it (Ackerman, 2018;Bjorkman, 2017;Camilliere et al., 2021;Conrod, 2020;Konnelly & Cowper, 2020). The newness of this form raises the possibility that speakers and writers may shy away from using they in favor of more explicit names and descriptions.
Second, if we do find a difference between the rates of nonbinary they and binary he/she, what explains this difference? We focus on the hypothesis that they could match a wider range of referents in the context, including plural, socially-distant-singular, and nonbinary singular referents. If ambiguity drives production choices, we may find that writers avoid all pronouns (he, she, and they) in contexts with other potential referents. If they is more likely to occur in potentially ambiguous contexts than he/she, and if ambiguity drives pronoun use in our dataset, we might observe a lower use of they than he/she that is entirely a function of ambiguity. On the other hand, referential forms, even he and she, are frequently ambiguous, and listeners generally resolve them rapidly (e.g., Arnold et al., 2000). Data on whether ambiguity avoidance drives they production provides an opportunity for psycholinguistic data to inform public discussion, given that many critics of singular they rely on the claim that it is problematic because it is ambiguous (e.g., Flynn, 2020).
Alternatively, a difference between they and he/she production could also be explained by two known effects in reference production. First, the unfamiliar form may be harder to retrieve and suffer competition from alternative forms (e.g., Jescheniak & Schriefers, 1998). Second, writers may be less certain that singular they will be interpretable to readers and avoid it through a process of audience design (Ferreira, 2019).
To investigate these questions, we assembled a naturally-occurring sample of published articles about individuals who identify as nonbinary and use the pronouns they/them, along with a comparison sample of references to binary individuals. This corpus analysis thus examines the production of reference form within a sample of authors, taking an average of pronoun use in each category across several tokens. We analyzed references to both nonbinary and binary individuals and calculated the rate of pronouns for each, controlling for discourse status and potential ambiguity. These articles represent a real-world sample; indeed, for individuals who do not already know people who use they/them pronouns, articles may be one of their first exposures to this usage.

Sample
We searched the internet for published articles that included references to a person who both identifies as nonbinary/genderqueer and uses they/them as their personal pronouns (termed "nonbinary" references for the analysis). For a comparison sample, we found references by the same author to individuals who go by he/him or she/her pronouns (termed "binary" references), either in the same or a different article. Our criterion for including an author in the analysis was that we could find at least 5 nonbinary references and at least 5 binary references in one or more articles. In some cases the references were to different people. We used all the articles we found that met our criteria.
To roughly equate the number of tokens contributed by each author, we only coded up to the first 20 references in each category for a particular author, distributed equally.1 The articles and …
1 If an article had references to more than one nonbinary or binary individual, we aimed for about equal tokens from each referent.
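The inclusion and capping rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (dictionaries with hypothetical `author` and `category` keys) and the function name `select_tokens` are ours, not the authors', and the sketch does not implement the footnoted balancing of tokens across different referents.

```python
from collections import defaultdict

MIN_PER_CATEGORY = 5   # an author needs at least 5 references of each kind
MAX_PER_CATEGORY = 20  # only the first 20 references per category are coded


def select_tokens(references):
    """Apply the inclusion criterion and the per-author cap.

    `references` is a list of dicts in document order, each with the
    hypothetical keys "author" and "category" ("nonbinary" or "binary").
    """
    by_author = defaultdict(lambda: {"nonbinary": [], "binary": []})
    for ref in references:
        by_author[ref["author"]][ref["category"]].append(ref)

    selected = []
    for categories in by_author.values():
        # Keep an author only if both categories reach the minimum.
        if all(len(refs) >= MIN_PER_CATEGORY for refs in categories.values()):
            for refs in categories.values():
                selected.extend(refs[:MAX_PER_CATEGORY])
    return selected
```

An author with 25 nonbinary and 6 binary references would contribute 26 tokens under this sketch, while an author with only 3 nonbinary references would be dropped entirely.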

Coding Scheme
Our key question was whether reference forms would relate to two features of the discourse context: 1) whether the referent was mentioned recently, and 2) whether potentially competing characters were mentioned recently. To assess this we identified the binary or nonbinary target character(s) in each narrative, and sampled the first reference to that character in each sentence as the token for analysis.
Our basic unit of analysis was the sentence, defined by orthographic conventions: the unit in writing that begins with a capital letter and ends with a period, exclamation mark or question mark. In most cases the sentence corresponded with one or more syntactic clauses, but in a few cases it was only a fragment (e.g., "Aren't scientists, developers, and such an international community of mostly liberal-leaning people who eat this woke stuff up? Not so much."; Greene, 2020). The only exception was multiple sentences in a quote, which were condensed into a single line, along with any nonquoted material (e.g. "she said"). Quotes were excluded from our sample of tokens to analyze, but were included for the purpose of identifying the discourse status of referents.
We chose the sentence as the unit because it is a natural unit in writing, can be reliably identified, and provides a reasonable unit of analysis to approximate the effects of discourse status. Even though some scholars operationalize the effect of the prior context in terms of clauses (e.g., Grosz, Weinstein, & Joshi, 1995), discourse status effects emerge from all prior context, where the recent context has the strongest effects (e.g., Ariel, 1990;Arnold, 1998).
In each sentence, we identified the first mention of the target referent and marked the way it was referred to (pronoun vs. name/description), which served as the dependent measure.
Our analysis only examined direct mentions of the target in the singular, either with names, descriptions, or pronouns. We excluded possessives, reflexive pronouns, and any plural mention that included the target; these expressions are driven by different constraints than singular nonpossessive pronouns. We also excluded the first mention of each character in the article, which could not be pronominal.
We coded two critical predictors. First, was the referent mentioned in the previous sentence? "Given" referents were those that received any sort of mention, including mention with a possessive or reflexive pronoun, as a part of a plural expression, or within a quote. If it was not mentioned in the previous sentence or quote it was coded as "New". Second, were there any other characters mentioned in the previous sentence? We identified the presence of any other characters, and also whether they would qualify as a potential referent for a pronoun in that condition. Each token was coded by at least two experimenters and any discrepancies were discussed.
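To make the coding scheme concrete, the following Python sketch derives tokens from sentence-level annotations. It is a simplified illustration, not the authors' procedure (which was carried out by human coders): the annotation keys `mentioned`, `target_form`, and `is_quote`, and the function name `code_tokens`, are hypothetical. The sketch only flags whether other characters were present; judging whether a competitor qualifies as a potential referent for the pronoun remains a manual decision.

```python
def code_tokens(sentences, target):
    """Turn annotated sentences into analysis tokens for one target referent.

    Each sentence is a dict with hypothetical keys:
      "mentioned":   set of characters mentioned in any form (including
                     possessives, plurals, or material inside a quote)
      "target_form": "pronoun" or "other" for the first singular,
                     nonpossessive mention of the target; None if absent
      "is_quote":    True if the line is a quote (never sampled as a token)
    """
    tokens = []
    introduced = False       # has the target been mentioned before?
    prev_mentioned = set()   # characters mentioned in the previous sentence
    for sent in sentences:
        form = sent.get("target_form")
        # Skip quotes and the first mention of the target in the article.
        if form is not None and not sent.get("is_quote", False) and introduced:
            tokens.append({
                "pronoun": 1 if form == "pronoun" else 0,
                "given": target in prev_mentioned,
                "other_characters": bool(prev_mentioned - {target}),
            })
        if target in sent["mentioned"]:
            introduced = True
        prev_mentioned = sent["mentioned"]
    return tokens


example = [
    {"mentioned": {"Sam"}, "target_form": "other"},            # first mention
    {"mentioned": {"Sam", "Alex"}, "target_form": "pronoun"},  # given
    {"mentioned": {"Alex"}, "target_form": None},              # no target
    {"mentioned": {"Sam"}, "target_form": "other"},            # new
    {"mentioned": {"Sam"}, "target_form": "pronoun", "is_quote": True},
    {"mentioned": {"Sam"}, "target_form": "pronoun"},          # given via quote
]
tokens = code_tokens(example, "Sam")
```

Note that the quoted line contributes no token of its own but still makes the target "Given" for the sentence that follows, mirroring the treatment of quotes described above.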

Analysis
We analyzed the rate of pronoun use as a binary dependent measure, calculated as the number of pronouns (they, she, he) in each category out of the total number of references, including also names and descriptions (e.g., the man, Argueta). Thus our dependent measure was pronoun (coded 1) vs. other form of reference (coded 0). We used SAS proc glimmix to perform a mixed effects logistic regression, with author as the random effect. The predictor variables were grand-mean centered. Analysis 1 tested the predictors gender condition (nonbinary vs. binary) and discourse condition (given vs. new), along with the gender × discourse interaction.
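As an illustration of the predictor coding, the sketch below grand-mean centers two binary predictors and forms their interaction as the product of the centered values. The authors' analysis was run in SAS; this Python version is our own, and the 0/1 coding direction, the toy data, and the names `grand_mean_center`, `nonbinary`, and `given` are assumptions made for the example.

```python
def grand_mean_center(values):
    """Subtract the mean over all tokens from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Toy tokens: 1 = nonbinary / given, 0 = binary / new (assumed coding).
nonbinary = [1, 0, 0, 0]
given = [1, 1, 0, 0]

gender_c = grand_mean_center(nonbinary)
given_c = grand_mean_center(given)

# Interaction term built from the centered predictors.
interaction = [g * d for g, d in zip(gender_c, given_c)]
```

Centering in this way means each main effect is evaluated at the average level of the other predictor, which is useful when the cells contain unequal numbers of tokens.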
Analysis 2 tested whether the effect of gender occurs even when ambiguity is controlled for. We operationalized potential ambiguity as whether there was a competitor referent that would make a pronoun ambiguous: for the binary condition this was a same-gender competitor; for the nonbinary condition we counted any potential referent for they, including plural 3rd-person referents, individuals who use they/them pronouns, or any two people who might be combined in a plural reference. For examples of potential ambiguity, see Appendix B.
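One way to write down the two models is given below. The notation is ours rather than the authors': y_ij is the reference form (1 = pronoun) for token i from author j, G, D, and A are the centered gender, discourse, and potential-ambiguity predictors, and u_j is a by-author random intercept. We assume a random intercept only, and we assume that Analysis 2 adds ambiguity as a main effect; the text does not state whether random slopes or interactions with ambiguity were included.

```latex
\begin{align*}
\operatorname{logit} P(y_{ij} = 1) &= \beta_0 + \beta_1 G_{ij} + \beta_2 D_{ij}
  + \beta_3 G_{ij} D_{ij} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
  && \text{(Analysis 1)} \\
\operatorname{logit} P(y_{ij} = 1) &= \beta_0 + \beta_1 G_{ij} + \beta_2 D_{ij}
  + \beta_3 G_{ij} D_{ij} + \beta_4 A_{ij} + u_j
  && \text{(Analysis 2)}
\end{align*}
```

Under this formulation, a negative estimate of the gender coefficient corresponds to a lower rate of pronoun use for nonbinary referents, and the question in Analysis 2 is whether that estimate remains reliable once the ambiguity term is in the model.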

Results and Discussion
Results (see Figure 1) demonstrated that writers used pronouns more for given referents than new, and more for binary than nonbinary referents. Our statistical analysis (see Table 1) revealed that both givenness and gender effects were significant.
A potential concern is that the nonbinary pronouns may be more likely to be ambiguous than binary pronouns, and indeed this pattern obtains in our dataset (34% potentially ambiguous tokens in the nonbinary condition and 10% in the binary condition). If pronoun use is driven by ambiguity, this imbalance could explain our contrast between binary and nonbinary conditions. If it isn't, we should see the same gender effect even when we control for potential ambiguity. To test this, we added potential ambiguity to the model in Table 1, coded as presence of competitor in previous sentence vs. not (see Table 2). Yet even accounting for ambiguity, we still observed an overall lower rate of they use than he/she use, and a significant effect of gender (see Table 3). Potential ambiguity had no effect. Further evidence that ambiguity is not driving the contrast between binary and nonbinary conditions comes from an analysis of only the unambiguous tokens (where we have the most data), which reveals significant effects of Givenness (b = 2.19 (.46), t = 4.73, p < .001) and Nonbinary condition (b = -0.66 (.30), t = -2.23, p = .03), but no interaction (t = 0.46, p = 0.64).
Table 2: Rate of pronoun use according to the gender of the target (binary vs. nonbinary), discourse condition (given vs. new), and whether there were any potential competitors for the pronoun. A potential competitor was defined as a same-gender character in the binary condition, and as a potential referent for "they" in the nonbinary condition.
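Cell rates of the kind summarized in Table 2 can be computed with a short tabulation. The sketch below is illustrative only: the token keys (`gender`, `given`, `competitor`, `pronoun`), the function name `pronoun_rates`, and the toy data are hypothetical and do not reproduce the study's figures.

```python
from collections import defaultdict


def pronoun_rates(tokens, keys):
    """Return the proportion of pronouns in each cell defined by `keys`."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [pronouns, references]
    for tok in tokens:
        cell = tuple(tok[k] for k in keys)
        totals[cell][0] += tok["pronoun"]
        totals[cell][1] += 1
    return {cell: p / n for cell, (p, n) in totals.items()}


# Toy data, invented for the example.
toy = [
    {"gender": "binary", "given": True, "competitor": False, "pronoun": 1},
    {"gender": "binary", "given": True, "competitor": False, "pronoun": 1},
    {"gender": "nonbinary", "given": True, "competitor": False, "pronoun": 1},
    {"gender": "nonbinary", "given": True, "competitor": False, "pronoun": 0},
    {"gender": "nonbinary", "given": True, "competitor": True, "pronoun": 0},
]

by_cell = pronoun_rates(toy, ["gender", "given", "competitor"])

# Restricting to tokens without a competitor parallels the follow-up
# analysis of unambiguous tokens only.
unambiguous = [t for t in toy if not t["competitor"]]
by_gender = pronoun_rates(unambiguous, ["gender"])
```

Comparing `by_gender` with `by_cell` shows whether a gender difference survives once potentially ambiguous contexts are set aside, which is the logic of the subset analysis reported above.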

General Discussion
Our primary finding was that nonbinary singular they was produced at a significantly lower rate than the binary singular pronouns he and she. Even though nonbinary they has not been adopted by all users of English (Geiger & Graf, 2019), this sample represents a set of writers who are willing to use they in this context. Even so, the relative rate of names or descriptions was higher for reference to nonbinary than binary individuals.
Our second finding was that the difference between binary and nonbinary pronoun use cannot be explained by a context-dependent ambiguity-avoidance strategy. Writers used they around 34-37% of the time for given referents, regardless of whether the context included another potential referent for they. Thus, writers are not selectively avoiding they in contexts where it would be ambiguous.
We speculate that nonbinary they may be suppressed for one or more of the following three reasons. First, we consider and dismiss the idea that ambiguity avoidance may operate as a general strategy: writers may be aware that they often has multiple possible referents, and so may generally suppress it (and not just specifically as a function of contextual ambiguity). However, a purely ambiguity-driven strategy would predict a resistance to they for both plural and nonbinary uses, which is inconsistent with the fact that they is one of the most frequent words in English (Davies, 2008). This suggests that they-avoidance is specific to the relatively new nonbinary case, consistent with the intuition that this new usage is hard for some people (Joyner, 2019).
A second possibility is that the newness of nonbinary they may inhibit the activation of this form during reference production. Word choice is a competitive process (e.g. Dell, 1986; Britt, Ferrara & Mirman, 2016; Griffin & Bock, 1998; Jescheniak & Levelt, 1994), and word production leads to co-activation of near synonyms (e.g., sofa and couch; Jescheniak & Levelt, 1998; Peterson & Savoy, 1998). This process of word production depends on a connection between the word and the concept. For newly-learned concepts, this link may be weak; for example, word production is slower for bilinguals and for low frequency words (Gollan et al., 2005). Binary genders are highly practiced and familiar categories to most people, and he/she pronouns are frequently used as singular pronouns. Thus, both she and Kamala are likely to be activated as potential expressions for referring to the Vice President, and the production system selects the best match for the discourse context. By contrast, familiarity with nonbinary pronouns is variable; only 22% of American adults report having heard a lot about the use of gender-neutral pronouns, 38% say they have heard a little, and 39% say they haven't heard anything at all (Geiger & Graf, 2019). Thus, nonbinary they may have reduced activation for producers who are less familiar with either the usage or the nonbinary concept. In essence, speakers may act like second language learners for this one concept, but their native name-producing ability is still intact. If so, the name may be the most highly activated word and be selected for production. This on-line production process may also influence the writer's intuition about which forms are most appropriate, and additionally influence the editing stage.
Third, speakers and writers may also take into account the likelihood that their audience is familiar with nonbinary they. If writers are not sure that all readers will be comfortable with nonbinary they, they may use names even in contexts where pronouns are more appropriate.
Such an effect would be an example of audience design, that is, the process of designing one's utterances for the addressee. However, research shows that audience design effects can be variable. On the one hand, speakers and writers do take their audience into account in multiple ways, and in particular in the production of contextually-interpretable referential forms (e.g., Brown-Schmidt & Tanenhaus, 2006; Clark & Krych, 2004; Clark & Wilkes-Gibbs, 1986; Ferreira et al., 2005). Thus, it is plausible that writers avoid low-frequency usages such as nonbinary they. On the other hand, there are limits on speakers' consideration of the addressee's needs (e.g., Engelhardt et al., 2006; Ferreira & Dell, 2000; Kraljic & Brennan, 2005). Moreover, ambiguity avoidance is not the primary determinant of reference form. Even though some studies have reported a tendency for English speakers to avoid binary pronouns in same-gender contexts, this finding does not always emerge in empirical studies (e.g. Rosa & Arnold, 2017, Exp. 1). In our sample here, we did not observe any effect of potential ambiguity, likely because our sample had very few tokens occurring in potentially ambiguous contexts. Yet we still observed substantial variation across discourse contexts, demonstrating that reference form choice is not driven purely by ambiguity considerations.
Our study examined written and published articles, in contrast with many studies that focus on spontaneous spoken language. This means that the wording may result from editing, and may be more influenced by metalinguistic judgments about ideal phrasing than by the momentary activation of information during online processing. Moreover, many of our articles were specifically about the topic of nonbinary gender and the use of they, and not about people who just happened to be nonbinary. Thus, writers were likely thinking about nonbinary they, and perhaps more likely to use it than usual. Yet we still observed a relative suppression of nonbinary vs. binary pronouns.
This analysis also provides a benchmark for the unfolding change in the English pronominal system. The articles we analyzed appeared from 2015 to 2020, when singular they was relatively new. We predict that as the rate of nonbinary references increases over time, so will facility with this form, supporting the future production of they at a similar rate as he and she.

Additional File
The additional file for this article can be found as follows: • Supplementary Material. Appendix A and B. DOI: https://doi.org/10.5070/G601183.s1