eScholarship
Open Access Publications from the University of California

Glossa Psycholinguistics


A Meta-analysis of Syntactic Satiation in Extraction from Islands

Published Web Location: https://doi.org/10.5070/G60111425
Creative Commons 'BY' version 4.0 license
Abstract

Sentence acceptability judgments are often affected by a pervasive phenomenon called satiation: native speakers give increasingly higher ratings to initially degraded sentences after repeated exposure. Various studies have investigated the satiation effect experimentally, the vast majority of which focused on different types of island-violating sentences in English (sentences with illicit long-distance syntactic movements). However, mixed findings are reported regarding which types of island violations are affected by satiation and which ones are not. This article presents a meta-analysis of past experimental studies on the satiation of island effects in English, with the aim of providing accurate estimates of the rate of satiation for each type of island, testing whether different island effects show different rates of satiation, exploring potential factors that contributed to the heterogeneity in past results, and spotting possible publication bias. The meta-analysis shows that adjunct islands, the Complex NP Constraint (CNPC), subject islands, the that-trace effect, the want-for construction, and whether-islands reliably exhibit satiation, albeit at different rates. No evidence for satiation is found for the Left Branch Condition (LBC). Whether context sentences were presented in the original acceptability judgment experiments predicts the differences in the rates of satiation reported across studies. Potential publication bias is found among studies testing the CNPC and whether-islands. These meta-analytic results can be used to inform debates regarding the nature of island effects and serve as a proof of concept that meta-analysis can be a valuable tool for linguistic research.


1. Introduction

Linguists have long relied on acceptability judgments by native speakers, collected either introspectively or experimentally, to inform syntactic theory (Schutze, 1996). Recent studies found that acceptability judgments are affected by a pervasive phenomenon called syntactic satiation, or simply satiation: participants in acceptability rating experiments rate degraded sentences as increasingly acceptable as they see more instances of such sentences (Braze, 2002; Hiramatsu, 2001; Snyder, 2000, inter alia). The satiation effect has recently drawn much attention, and there is an abundance of experimental studies testing whether various unacceptable sentence types satiate in English (Braze, 2002; Chaves & Dery, 2014, 2019; Crawford, 2012; Do & Kaiser, 2017; Francom, 2009; Goodall, 2011; Hiramatsu, 2001; Lu et al., 2021, 2022; Snyder, 2000, 2022a; Sprouse, 2009) and other languages (Abugharsa, 2016; Brown et al., 2021; Goodall, 2011; Myers, 2012; Sommer, 2022). However, the satiation literature is replete with mixed empirical findings and non-replications regarding which sentence types are affected by satiation (see Snyder, 2022b, for a review), thus hindering the development of a coherent theoretical picture.

Quantitative meta-analysis offers the promise of remedying this situation. Meta-analysis is a statistical procedure for synthesizing information from multiple studies, thereby providing more precise estimates of an effect size than any single study (Borenstein et al., 2009). In addition, meta-analysis also allows the researcher to investigate the extent to which an effect varies across studies, and if so, to test whether certain study characteristics systematically produce different results. Meta-analysis is a commonly employed method in a wide variety of fields, including medicine (Haidich, 2010; L’Abbé et al., 1987), education (Glass, 1976; Slavin, 1984), criminology (Pratt, 2010; Turanovic & Pratt, 2021), business (Kirca & Yaprak, 2010), ecology (Gurevitch et al., 2001), and even other areas of psycholinguistics (Bergmann et al., 2018; Cao & Lewis, 2022; Cao et al., 2023).

In the current study, we present a meta-analysis of past findings in the satiation literature, with the aim of assessing which sentence types reliably satiate. Specifically, we limit our focus to past studies examining the satiation of island effects: the degradation in acceptability of sentences that include long-distance syntactic movement operations that are illicit according to standard syntactic theories (Ross, 1967). There are two reasons to restrict the domain of study to island effects. First, the vast majority of past experimental studies on satiation have exclusively tested sentences with island violations, making island effects the only syntactic domain where a meta-analysis has sufficient statistical power and thus the potential to be informative. Second, findings from the literature on satiation have been used to adjudicate between different theories in the domain of island effects. Therefore, the results of a meta-analysis on the satiation of island effects could help inform theoretical claims in the island literature.

In the remainder of this section, we introduce island effects and the satiation effect, respectively. We then provide statistical background on meta-analysis and report a meta-analysis we conducted on 25 island satiation experiments in Section 2. Finally, we discuss the implications of these results in Section 3, focusing especially on their potential for adjudicating between grammatical and processing accounts of island effects.

1.1 Island effects

There is a long-standing generalization that certain structural domains restrict syntactic movement operations, a phenomenon termed island effects (Ross, 1967). Attempting to extract from islands results in degraded sentence acceptability, as in the examples in (1), all involving illicit wh-movement.1

(1) Island-violating sentences                                        (Snyder, 2000, p. 576)

    a. The Left Branch Condition
       *How many_i did John buy t_i books?

    b. Adjunct island
       *Who_i did John talk with Mary after seeing t_i?

    c. The Complex NP Constraint (CNPC)
       *Who_i does Mary believe the claim that John likes t_i?

    d. Subject island
       *What_i does John know that a bottle of t_i fell on the floor?

    e. The that-trace effect
       *Who_i does Mary think that t_i likes John?

    f. The want-for construction
       *Who_i does John want for Mary to meet t_i?

    g. Whether-island
       *Who_i does John wonder whether Mary likes t_i?

The nature of these island effects has been a long-standing source of debate in the linguistic literature. The degraded acceptability of island-violating sentences like (1a-g) has been variably attributed to constraints in grammar (Bresnan, 1976; Chomsky, 1964, 1973, 1977, 1986; Huang, 1982; Rizzi, 1990; Ross, 1967; Sag, 1976) or processing (Culicover et al., 2022; Hofmeister et al., 2013; Hofmeister & Sag, 2010; Kluender, 1991; Kluender & Kutas, 1993).2 Grammar-based approaches to island effects claim that the sentences in (1) are ungrammatical because they violate certain grammatical constraints (e.g., the Subjacency Condition, the Phase Impenetrability Condition, etc.). Processing-based approaches to island effects claim that the sentences in (1) are grammatical, but are unacceptable due to the high processing burdens they incur (analogous to the difficulty in processing center-embedding sentences).

In addition to the debate over whether island effects are best explained as the result of grammatical or processing factors, there is a lack of consensus regarding whether certain islands form natural classes. For example, some have grouped subject and adjunct islands together as a natural class in opposition to the other island types, and have attributed the two distinct classes of islands to two different constraints in the grammar (Chomsky, 1986; Huang, 1982; Nunes & Uriagereka, 2000). Others, however, reject this grouping, either proposing that subject and adjunct island effects involve different grammatical constraints (Haegeman et al., 2014; Hiramatsu, 2001; Stepanov, 2007), or arguing for a unifying (syntactic, information structural, or processing-based) account of a larger set of island effects, including but not limited to subject and adjunct island effects (Abeillé et al., 2020; Bošković, 2016; Erteschik-Shir, 1973; Goldberg, 2013). In sum, the island literature lacks a consensus on the source and nature of island effects.

1.2 Satiation

The effect of repeated exposure on the perceived acceptability of island-violating sentences has been brought to bear on the debate over grammatical vs. processing accounts of island effects. A linking assumption that has been (implicitly or explicitly) adopted by some is that degraded acceptability due to grammatical violations should not be affected by repeated exposure. In contrast, if the source of degraded acceptability is processing difficulty, exposing participants to similar sentences of the same type should increase familiarity with this sentence type and ease the associated processing burden. In turn, acceptability should increase with repeated exposure (i.e., show the satiation effect).3 If we accept this linking hypothesis, whether or not the acceptability of island-violating sentences increases with exposure can be used to diagnose whether certain island effects are grammatical or the result of processing constraints.

Acceptability increase after exposure, or the satiation effect (Stromswold, 1986), was first demonstrated experimentally for island-violating sentences in Snyder (2000).4 The observation of satiation effects in island-violating sentences has subsequently been interpreted as evidence for the extra-grammatical nature of islands, including the Complex NP Constraint (Hofmeister & Sag, 2010), the superiority effect (Hofmeister et al., 2011), and subject islands (Chaves, 2022; Chaves & Dery, 2014, 2019).

Other studies assume a different linking hypothesis for the satiation effect, whereby certain grammatical constraints may also be “unlearned” or weakened throughout repeated exposure to sentences violating those constraints, and differences in satiation profiles reflect differences in the types of grammatical constraints involved (Braze, 2002; Goodall, 2011; Hiramatsu, 2001; Snyder, 2000). Assuming this hypothesis, satiation results cannot be used to inform the grammar vs. processing debate regarding the nature of islands. Instead, satiation can be used to probe for natural classes formed by different islands. If two island types show different patterns of satiation (e.g., one satiates, while the other does not), they are assumed to have different underlying sources of unacceptability and should not be grouped together as a natural class. This linking hypothesis underlies the argument against Huang’s (1982) proposal that subject and adjunct islands form a natural class (Hiramatsu, 2001; Stepanov, 2007).

In sum, while assuming slightly different linking hypotheses, the satiation effect has been used as evidence in multiple debates surrounding the nature of island effects.5 In the current study, our goal is not to further complicate the picture by taking sides in any of these debates. We also remain agnostic about which linking hypothesis for satiation is appropriate. Instead, we aim to clarify the empirical landscape on satiation so that satiation can be better leveraged as a source of evidence.

As mentioned earlier, the satiation literature abounds with mixed findings. Statistically significant satiation of island effects has been observed with some experimental procedures and items, but these effects have been inconsistent (see Snyder, 2022b, for a comprehensive overview). For example, the adjunct island is found to satiate in Chaves & Putnam (2020), but not in Crawford (2012); Francom (2009); Hiramatsu (2001); Snyder (2000); Sprouse (2009, inter alia). For a comprehensive list of past satiation studies and whether they observed satiation in each island type, the reader is referred to Tables 4 and 5 from Snyder (2022a). The current study reports a quantitative meta-analysis of past satiation studies, with the aim of summarizing and aggregating past findings in the service of assessing which island types reliably satiate.

2. A meta-analysis of island effect satiation

Meta-analysis is a way to systematically synthesize evidence from multiple studies to estimate effect size more precisely than is possible in an individual study, and discover inconsistencies between studies (Borenstein et al., 2009). This practice is particularly important in a field like linguistics, where many experimental studies are plagued by low statistical power, leading to mixed results and non-replications (Vasishth & Gelman, 2021; Vasishth et al., 2018).

The main goal of a meta-analysis is to give an effect size estimate informed by the effect size estimates from multiple studies. In particular, the meta-analytic estimate is taken to be the average of all effect sizes, weighted by how reliable or informative each is, based on its variance. Two kinds of statistical models can be used to achieve this goal, which make different assumptions about the homogeneity of the effect: fixed-effects and random-effects meta-analytic models.6

The fixed-effects model estimates the overall effect size μ̂ as an average of each study's point estimate θ̂i, weighted by the inverse of that study's variance σ̂i2, as shown in (E1). This weighted average approach is intuitively justified: larger and more informative studies with less variance should be given more weight in the model compared to smaller studies with greater variance.

\hat{\mu} = \frac{\sum_i \frac{1}{\hat{\sigma}_i^2}\, \hat{\theta}_i}{\sum_i \frac{1}{\hat{\sigma}_i^2}}       (E1)
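As a minimal sketch in plain Python (the analyses reported below used the metafor package in R), the inverse-variance weighting in (E1) can be written as follows; the effect sizes and standard errors are hypothetical illustration values, not data from any study in this meta-analysis:

```python
def fixed_effects_estimate(effects, ses):
    """Fixed-effects meta-analytic estimate (E1): the inverse-variance
    weighted average of per-study effect estimates, plus its standard error."""
    weights = [1 / se ** 2 for se in ses]            # w_i = 1 / sigma_i^2
    mu_hat = sum(w * th for w, th in zip(weights, effects)) / sum(weights)
    se_mu = (1 / sum(weights)) ** 0.5                # SE of the pooled estimate
    return mu_hat, se_mu

# Hypothetical satiation rates (acceptability gain per repetition) and SEs
mu_hat, se_mu = fixed_effects_estimate([0.012, 0.020, 0.008], [0.004, 0.010, 0.003])
```

The most precise study (smallest standard error) dominates the pooled estimate, which is exactly the intuition behind (E1).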

In many cases, the estimated effect may vary across the studies included in a meta-analysis due to differences in experimental methods or the sampling process. The fixed-effects meta-analytic model, which assumes a single population effect size for all studies, does not take such heterogeneity into account. In contrast, a random-effects meta-analytic model assumes the population effects of all studies come from a normal distribution with mean μ and standard deviation τ, as shown in (E2).

\theta_i = \mu + \epsilon_i, \quad \text{where } \epsilon_i \sim N(0, \tau^2)       (E2)

The model then provides estimates for τ in addition to the effect estimate μ. As shown in (E3), the random-effects meta-analytic model estimates the effect size as an average of each study’s point estimate, weighted by the inverse of each study’s variance adjusted by the estimated between-study variance τ^2.

\hat{\mu} = \frac{\sum_i \frac{1}{\hat{\tau}^2 + \hat{\sigma}_i^2}\, \hat{\theta}_i}{\sum_i \frac{1}{\hat{\tau}^2 + \hat{\sigma}_i^2}}       (E3)
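The between-study variance needed for (E3) must itself be estimated from the data. A minimal sketch using the closed-form DerSimonian-Laird moment estimator (note that metafor's default is the iterative REML estimator, which has no closed form and is not shown here); all inputs are hypothetical:

```python
def dersimonian_laird(effects, ses):
    """Random-effects estimate (E3) with the DerSimonian-Laird tau^2."""
    k = len(effects)
    w = [1 / se ** 2 for se in ses]
    mu_fe = sum(wi * th for wi, th in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effects mean
    q = sum(wi * (th - mu_fe) ** 2 for wi, th in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)               # truncated at zero
    # Re-weight by 1 / (tau^2 + sigma_i^2), as in (E3)
    w_re = [1 / (tau2 + se ** 2) for se in ses]
    mu_re = sum(wi * th for wi, th in zip(w_re, effects)) / sum(w_re)
    return mu_re, tau2

mu_re, tau2 = dersimonian_laird([0.005, 0.030, 0.015], [0.004, 0.006, 0.005])
```

When the studies disagree more than their within-study variances can explain (large Q), tau2 grows and the weights become more uniform, so no single large study dominates.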

Since there is variation in methods and designs across satiation studies, we use the random-effects model instead of the fixed-effects model for our meta-analysis.

The random-effects meta-analytic model does not provide a structured analysis of the factors that contribute to the cross-study heterogeneity. To investigate heterogeneity, one can include different moderators (study-level factors that may affect effect size) to form a mixed-effects meta-analytic model as in (E4), where xij represents the jth moderator for the ith study. From the parameter estimates β1 to βj, we can infer whether the moderators influence the effect size.

\theta_i = \mu + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_j x_{ij} + \epsilon_i, \quad \text{where } \epsilon_i \sim N(0, \tau^2)       (E4)
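As an illustrative sketch (not the authors' implementation), with a single binary moderator the model in (E4) reduces to a weighted least-squares regression with weights 1/(τ̂² + σ̂i2); here tau2 is treated as known for simplicity, whereas a real mixed-effects meta-analytic model estimates it jointly with the betas. All inputs are hypothetical:

```python
def meta_regression(effects, ses, x, tau2):
    """Weighted least squares for one moderator: theta_i = b0 + b1 * x_i + e_i."""
    w = [1 / (tau2 + se ** 2) for se in ses]
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)     # weighted means
    ybar = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    b1 = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, effects))
          / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    b0 = ybar - b1 * xbar
    return b0, b1                                            # intercept, moderator slope

b0, b1 = meta_regression(
    effects=[0.010, 0.012, 0.024, 0.026],   # hypothetical satiation rates
    ses=[0.005, 0.005, 0.005, 0.005],
    x=[0, 0, 1, 1],                         # e.g., context absent (0) vs. present (1)
    tau2=1e-4,
)
```

A positive b1 would indicate that studies with the moderator present report larger satiation effects.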

Finally, meta-analyses can also be used to detect publication bias, the tendency for studies reporting null results to not get published. One simple way to do so is by creating a funnel plot: a scatter plot of effect size against standard error (Light & Pillemer, 1984).7 In the absence of heterogeneity or publication bias, studies with smaller standard errors are expected to have effect sizes closer to the meta-analytic estimate, while those with larger standard errors spread out further. Thus, the scatter plot should show a funnel shape (hence the name funnel plot), as in the hypothetical plot in Figure 1a. The white funnel-shaped area in the plot represents the 95% confidence interval for the observed effects, calculated based on the meta-analytic estimate and the standard error, and serves as a visual aid for what the funnel-shaped distribution should look like in the absence of any publication bias. In contrast, if there is publication bias, studies with positive effect estimates are more likely to be reported. As a result, the funnel plot will show an asymmetric distribution in the shape of a right-skewed triangle rather than a funnel; an example is shown in Figure 1b. The existence of publication bias can be statistically assessed using Egger's regression test of funnel plot asymmetry (Egger et al., 1997), which detects a correlation between effect size and standard error. A significant correlation suggests that the funnel plot is asymmetric and that there is potentially publication bias.

Figure 1: Hypothetical funnel plots showing standard error against effect estimate for each individual study. The dashed vertical line indicates the meta-analytic effect size estimate.
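Egger's test can be sketched in a few lines. In its classic formulation, each study's standardized effect (the estimate divided by its standard error) is regressed on its precision (one over the standard error); an intercept far from zero indicates funnel plot asymmetry. This simplified version, with hypothetical inputs, returns only the intercept and omits the t-test used to assess its significance:

```python
def egger_intercept(effects, ses):
    """OLS intercept from regressing theta_i / se_i on 1 / se_i."""
    t = [th / se for th, se in zip(effects, ses)]   # standardized effects
    prec = [1 / se for se in ses]                   # precisions
    n = len(t)
    xbar, ybar = sum(prec) / n, sum(t) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(prec, t))
             / sum((x - xbar) ** 2 for x in prec))
    return ybar - slope * xbar

# Symmetric case: every study estimates the same true effect, so the
# intercept should be numerically zero.
intercept = egger_intercept([0.01, 0.01, 0.01], [0.002, 0.005, 0.01])
```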

In the study reported below, we conducted a meta-analysis of satiation in seven different island types. We report analyses of satiation effect estimates, heterogeneity, and publication bias for each island type.

2.1 Method

2.1.1 Dataset selection

The study selection process is summarized in Figure 2, which depicts a PRISMA flow chart (Moher et al., 2009). Our goal was to include as many studies as possible on island satiation effects in English. To this end, we first collected all results returned on the first 20 pages of Google Scholar with the search keywords “syntactic satiation” (200 entries). Excluding 3 duplicate entries, 3 non-English entries, and 20 entries without links to full text or abstract, we then screened the abstracts of the remaining 174 entries and excluded 150 entries that did not report any experimental study on satiation. This narrowed the selection down to 24 entries. After accessing the full text of these 24 entries, we excluded 5 entries that did not study the satiation of island effects in English, and 3 entries whose experimental results were also reported in other publications by the same authors.

Figure 2: PRISMA flow chart summarizing the study selection process.

Since not all studies used the same method of data processing and statistical analysis, we reached out to all authors for the raw data files, so that we could extract effect size estimates from the data in a systematic and comparable manner. Our final analysis included all selected papers whose raw data files were kindly made available by the authors, in addition to those that directly reported the measurements we planned to use in the meta-analysis (unit satiation per repetition), which we introduce in detail in the next section. Five papers were excluded from the meta-analysis, because the relevant effect sizes were not reported and could not be computed from the reported statistics, and the data files were not made available.

For the purpose of standardizing effect estimates across different studies, we only summarized acceptability judgment experiments with closed rating scales.8 Although this meta-analysis was not pre-registered, all inclusion criteria were determined before computing meta-analytic estimates. The following studies were included based on our selection criteria: Snyder (2000), Experiments 1 and 2 of Francom (2009), the three replication experiments reported in Section 2 of Sprouse (2009) (labeled Exp.1a, b, c, respectively, in the discussion below) and the two experiments reported in Section 4 of the same paper (labeled Exp.2a and b below), Hofmeister & Sag (2010), Crawford (2012), Experiments 1 and 2 of Chaves & Dery (2014), Experiment 1 of Do & Kaiser (2017) (the two sub-experiments “lag 1” and “lag 5” in the original paper are labeled Exp.1a and Exp.1b, respectively, below for simplicity), Chaves & Dery (2019), the experiments reported in Sections 6.2.2 (labeled Exp.1 below), 6.2.4 (the two sub-experiments labeled Exp.2a and 2b below), and 6.2.5 (labeled Exp.3 below) of Chaves & Putnam (2020), Experiments 1 and 2 of Lu et al. (2021), Experiments 1 and 2 of Lu et al. (2022), and Experiments 1 and 3 of Snyder (2022a).9 See Table 1 for a summary of the studies included and their reported findings. Note that the studies in Table 1 have vastly different sample sizes and used different data processing and hypothesis testing methods to reach their conclusions about which island effects satiate. Therefore, any “vote-counting” procedure comparing the numbers of studies finding significant satiation effects for each island type would not be informative.

Table 1: Summary of all studies included in the meta-analysis. For some studies, experiment numbering is added for ease of presentation (see section 2.1.1 for details). Each acronym refers to a type of island effect tested. AI: adjunct island; CNPC: Complex NP Constraint; CSC: coordinate structure constraint; LBC: left branch condition; RC: relative clause island; SI: subject island; TT: that-trace effect; WF: want-for construction; WI: whether-island.

Study | Sample size | Reported satiating island(s) | Reported non-satiating island(s) | Repetition | Scale | Context
Snyder (2000) | 22 | CNPC, SI*, WI | AI, LBC, TT, WF | 5 | Binary | Yes
Francom (2009) Exp.1 | 205 | SI, WF, WI | AI, CNPC, LBC, TT | 5 | Binary | No
Francom (2009) Exp.2 | 17 | SI | CNPC, AI, TT, LBC | 8 | Binary | No
Sprouse (2009) Exp.1a | 21 | none | AI, CNPC, LBC, SI, TT, WF, WI | 5 | Binary | No
Sprouse (2009) Exp.1b | 21 | none | AI, CNPC, LBC, SI, TT, WF, WI | 5 | Binary | No
Sprouse (2009) Exp.1c | 22 | none | AI, CNPC, LBC, SI, WI | 5 | Binary | No
Sprouse (2009) Exp.2a | 25 | none | AI, CNPC, CSC, LBC, RC, SI, WI | 10 | Binary | No
Sprouse (2009) Exp.2b | 25 | none | AI, CNPC, RC, WI | 10 | Binary | No
Hofmeister and Sag (2010) | 22 | CNPC | none | 31 | Multi-point | No
Crawford (2012) | 22 | WI | AI, SI | 7 | Multi-point | Yes
Chaves and Dery (2014) Exp.1 | 60 | SI | none | 20 | Multi-point | No
Chaves and Dery (2014) Exp.2 | 51 | SI | none | 14 | Multi-point | No
Do and Kaiser (2017) Exp.1a | 44 | CNPC | SI | 6 | Multi-point | No
Do and Kaiser (2017) Exp.1b | 40 | CNPC, SI | none | 6 | Multi-point | No
Chaves and Dery (2019) | 48 | SI | none | 22 | Multi-point | No
Chaves and Putnam (2020) Exp.1 | 74 | SI | none | 16 | Multi-point | No
Chaves and Putnam (2020) Exp.2a | 40 | AI | none | 12 | Multi-point | No
Chaves and Putnam (2020) Exp.2b | 40 | AI | none | 6 | Multi-point | No
Chaves and Putnam (2020) Exp.3 | 106 | AI | none | 24 | Multi-point | No
Lu, Lassiter, and Degen (2021) Exp.1 | 106 | CNPC, SI, WI | none | 15 | Multi-point | Yes
Lu, Lassiter, and Degen (2021) Exp.2 | 102 | CNPC, SI, WI | none | 15 | Multi-point | Yes
Lu, Wright, and Degen (2022) Exp.1 | 294 | SI, WI | none | 12 | Multi-point | Yes
Lu, Wright, and Degen (2022) Exp.2 | 311 | SI, WI | none | 12 | Multi-point | Yes
Snyder (2022) Exp.1 | 20 | WI | CNPC, LBC, SI | 5 | Binary | Yes
Snyder (2022) Exp.3 | 151 | CNPC, WI | AI, LBC, SI, TT, WF | 5 | Binary | Yes
    *: Only marginal significance reported.

2.1.2 Meta-analytic methods

We first grouped the selected studies by the island types they investigated. When a paper contained multiple experiments, each experiment was treated as a different study. When a study tested multiple variants of the same island effect type (e.g., the subject island effect induced by extraction from DP subjects vs. CP subjects), the different variants were treated as the same island type for the purpose of our analysis. A total of seven island types (those shown in (1)), each studied in at least three experiments, were included in the meta-analysis.

We defined satiation as a positive main effect of the number of repetitions of an island-violating sentence type on the acceptability of that sentence type. We chose the number of repetitions, instead of the overall experimental trial number, as the predictor, because different studies included different numbers of fillers, and not all raw data files provided by the authors contained filler information. We detected satiation as a main effect of repetition number of island-violating sentences, rather than as the change in contrast between island-violating sentences and a non-satiating grammatical control, because various studies included in the meta-analysis either did not include a grammatical control condition or included non-satiating control conditions in their design, but did not include the results for the control sentences in the data files made available. For consistency, we only analyzed the change in absolute acceptability ratings of the island-violating sentence types.10

Many meta-analyses are conducted across standardized unitless effect size measures (e.g., Cohen’s d) to ensure comparability of effects across studies. However, we are interested in a particular effect size that has interpretable units: change in acceptability (0–1) per sentence repetition. If we computed standardized effects as is typical, we would run the risk of combining effects from studies with different definitions for satiation, using different satiation manipulations, and measuring acceptability with different scales. In contrast, our acceptability per repetition effect is much more interpretable in its magnitude across studies than a standardized effect. Therefore, we depart from standard meta-analyses, and compute effect sizes and measures of variation directly from raw data files of the original studies, either available in the public domain or provided by the authors upon request.11

To compute the quantity needed for meta-analysis, we first linearly transformed the acceptability ratings for each study to a value between 0 and 1 through min-max scaling, with 0 representing the “completely unacceptable” endpoint of the scale (or “ungrammatical” in binary judgment tasks), and 1 representing the “completely acceptable” endpoint (or “grammatical” in binary judgment tasks). For studies that directly reported the repetition number effects on acceptability ratings, we directly used the reported estimates, standard errors, and sample sizes in the meta-analysis. For the rest of the studies, we fit linear mixed-effects regression models predicting the adjusted acceptability ratings of each island-violating sentence type with a fixed effect of repetition number. Each model also included random by-participant and by-item intercepts and slopes for the fixed effect when the participant and item information was provided in the data files. In cases of non-convergence, random effects with the least variance were removed until convergence. We recorded the repetition number effect estimates and standard errors for meta-analysis.12
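The min-max scaling step can be sketched as follows; a 7-point Likert scale and a binary yes/no judgment scale are shown as examples:

```python
def min_max_scale(ratings, lo, hi):
    """Linearly map ratings on a closed [lo, hi] scale onto [0, 1]."""
    return [(r - lo) / (hi - lo) for r in ratings]

likert = min_max_scale([1, 4, 7], lo=1, hi=7)   # -> [0.0, 0.5, 1.0]
binary = min_max_scale([0, 1], lo=0, hi=1)      # -> [0.0, 1.0]
```

Because every study's ratings end up on the same 0-1 scale, a slope of "acceptability gain per repetition" is directly comparable across binary and multi-point experiments.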

Using the metafor package (Viechtbauer, 2010) in R, we fit a random-effects meta-analytic model for studies testing each island effect type. Then, Cochran’s Q test (Cochran, 1950) was used to detect any cross-study heterogeneity. In case of significance, we fit a mixed-effects meta-analytic model to examine different moderators as possible sources of heterogeneity. Snyder (2022a) speculated that differences in scale types, total numbers of repetitions, and the use of context sentences13 could contribute to different findings in satiation experiments. Therefore, we included these three factors as moderators.

Finally, a funnel plot (a plot of standard errors against point estimates) was created for each island type. The Egger’s regression test (Egger et al., 1997) on funnel plot asymmetry was conducted to detect publication bias.

2.2 Results

Below, we report the meta-analysis results for satiation studies examining the seven different island effects listed above: the Left Branch Condition (LBC), adjunct islands, the Complex NP Constraint (CNPC), subject islands, the that-trace construction, the want-for construction, and whether-islands.

2.2.1 Satiation effect estimates

Figure 3 summarizes the satiation effect estimates and 95% confidence intervals of the estimates obtained from the random-effects meta-analytic models for each island type. The effect sizes represent the estimated increase in acceptability on a 0–1 scale per repetition. A positive effect estimate with a 95% confidence interval not overlapping with 0 is taken as evidence for satiation. Based on the random-effects meta-analysis, we found significant evidence for the satiation of adjunct islands, the CNPC, subject islands, the that-trace construction, the want-for construction, and whether-islands. There was no evidence for the satiation of the LBC. Figures 4–10 are forest plots summarizing all selected studies that tested each of the seven island types.

Figure 3: Forest plot summarizing estimates of satiation rate for all seven island types. Error bars represent 95% CIs.

Figure 4: Forest plot of studies testing subject island satiation. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

Figure 5: Forest plot of studies testing the satiation of CNPC. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

Figure 6: Forest plot of studies testing whether-island satiation. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

Figure 7: Forest plot of studies testing adjunct island satiation. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

Figure 8: Forest plot of studies testing the satiation of the LBC. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

Figure 9: Forest plot of studies testing the satiation of the that-trace construction. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

Figure 10: Forest plot of studies testing the satiation of the want-for construction. Effect size captures acceptability increase per repetition. Error bars represent 95% CI, and the area of squares represents the weight given to each study based on its standard error and the estimated cross-study variance.

For the satiating island types, the estimated effect magnitudes may seem small, but are generally on par with contrasts reported in the past literature between the various island-violating sentence types and their corresponding grammatical controls. For example, for subject island sentences, we found a 0.0168 increase per repetition on a 0–1 scale, which translates to a 0.1008 increase on a 7-point Likert scale, or 1.008 after ten repetitions. Indeed, Chaves & Dery (2014) reported a 1.065 contrast between subject island sentences and grammatical controls on a 7-point Likert scale, almost exactly the expected acceptability gain based on our estimates. For whether-island sentences, we found a 0.0303 increase per repetition on a 0–1 scale. The contrast between the mean ratings for whether-island sentences and the grammatical control, as reported in Lu et al. (2021), is 0.36 on a 0–1 scale, which again represents roughly the expected acceptability gain after 11–12 repetitions.14
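The unit conversions above can be verified directly; the per-repetition estimates and the 0.36 contrast are the values quoted in the text:

```python
per_rep_01 = 0.0168                   # subject islands: gain per repetition, 0-1 scale
per_rep_likert = per_rep_01 * 6       # a 7-point scale spans 7 - 1 = 6 units -> 0.1008
after_ten_reps = per_rep_likert * 10  # -> 1.008, close to the 1.065 contrast reported

reps_needed = 0.36 / 0.0303           # whether-islands: ~11.9 repetitions to close the gap
```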

In Figure 3, there appear to be varying rates of satiation among the six island effect types that do satiate. This observation is confirmed by an analysis in which we pooled the results of the satiating island types (all but the LBC) and fit a mixed-effects meta-analytic model predicting the rate of satiation with island type as a Helmert-coded predictor (ordered by satiation effect size estimates from small to large, as shown in Figure 3). The analysis revealed significant contrasts for subject islands (z = 2.24, p < 0.05) and whether-islands (z = 2.06, p < 0.05), indicating that each of these two island types shows a significantly higher satiation rate than the mean of the preceding levels. We can thus conclude that among the satiating island types, there are varying rates of satiation.15
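As a concrete illustration of this coding scheme (a sketch, not the authors' analysis code): each Helmert contrast compares one island type against the mean of all island types preceding it in the ordering. The level ordering below is illustrative, not the exact ordering from Figure 3.

```python
import numpy as np

def helmert_contrasts(k):
    """(k, k-1) contrast matrix: column j compares level j+1 to the mean
    of levels 0..j (sometimes called reverse Helmert or difference coding)."""
    C = np.zeros((k, k - 1))
    for j in range(1, k):
        C[:j, j - 1] = -1.0 / j  # mean of the preceding j levels
        C[j, j - 1] = 1.0        # the level being compared
    return C

# Six satiating island types, in an illustrative small-to-large ordering
levels = ["adjunct", "that-trace", "want-for", "CNPC", "subject", "whether"]
C = helmert_contrasts(len(levels))

# Every contrast column sums to zero, as required for a centered contrast
assert np.allclose(C.sum(axis=0), 0.0)
```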

2.2.2 Heterogeneity

The random-effects meta-analytic models revealed significant cross-study heterogeneity in three of the seven island types tested: subject islands, the CNPC, and whether-islands. Results from the Cochran’s Q test of heterogeneity for each island type are shown in Table 3. For the three islands showing significant heterogeneity, we fit mixed-effects meta-analytic models with three moderators to explore the sources of heterogeneity: context (whether or not a context sentence was provided in the experiment), the total number of repetitions, and scale type (binary vs. multi-point). Categorical moderators were sum-coded. The moderator analyses results are shown in Table 4. For subject islands and whether-islands, inclusion of a context sentence in the task resulted in greater satiation effects. None of the other moderator-island pairs reached significance.
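To make the heterogeneity measures in Table 3 concrete, here is a minimal sketch of how Cochran's Q and I² are computed from per-study effect estimates and standard errors. The inputs below are made-up placeholders, not the actual study data.

```python
import numpy as np

def cochran_q_i2(effects, ses):
    """Cochran's Q and the I^2 heterogeneity statistic (in percent)."""
    effects = np.asarray(effects, float)
    w = 1.0 / np.asarray(ses, float) ** 2        # inverse-variance weights
    theta = np.sum(w * effects) / np.sum(w)      # fixed-effect pooled estimate
    q = np.sum(w * (effects - theta) ** 2)
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical per-repetition increases and their standard errors
q, i2 = cochran_q_i2([0.010, 0.025, 0.018, 0.005],
                     [0.004, 0.006, 0.005, 0.003])
```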

For the subject island studies and the whether-island studies, we further conducted a subset analysis. For subject island studies, a random-effects meta-analytic model estimated a 0.0241 increase in acceptability per repetition (95% CI = [0.0103, 0.0379]) when context was provided, and a 0.0109 increase per repetition (95% CI = [0.0059, 0.0159]) when context was absent; for whether-island studies, a random-effects meta-analytic model estimated a 0.0364 increase in acceptability per repetition (95% CI = [0.0113, 0.0616]) when context was provided, and a 0.0201 increase per repetition (95% CI = [–0.0108, 0.0510]) when context was absent.
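The subset estimates above come from random-effects models. As a sketch of the general recipe, here is a standard DerSimonian–Laird estimator with hypothetical inputs; this is not the authors' exact model or data.

```python
import numpy as np

def dersimonian_laird(effects, ses):
    """Random-effects pooled estimate with a DerSimonian-Laird tau^2."""
    effects = np.asarray(effects, float)
    v = np.asarray(ses, float) ** 2
    w = 1.0 / v
    theta_fe = np.sum(w * effects) / np.sum(w)     # fixed-effect estimate
    q = np.sum(w * (effects - theta_fe) ** 2)      # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_re = 1.0 / (v + tau2)                        # random-effects weights
    theta = np.sum(w_re * effects) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return theta, (theta - 1.96 * se, theta + 1.96 * se)

# Hypothetical per-repetition increases from three studies
est, (lo, hi) = dersimonian_laird([0.024, 0.011, 0.030],
                                  [0.007, 0.003, 0.010])
```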

For all three island types, there was significant residual heterogeneity even when the three moderators were included in the meta-analytic model, suggesting that additional moderators might modulate the rate of satiation.

2.2.3 Publication bias

To assess whether these results are affected by publication bias, Figures 11a–11g present funnel plots for the selected studies of each island type. The funnel plots for the CNPC and whether-island studies are visibly asymmetric.16 The asymmetry is further confirmed by Egger’s regression test results, as shown in Table 2. This suggests possible publication bias among the CNPC and whether-island studies, so we should take the positive effect estimates for these two island types with a grain of salt.
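For readers unfamiliar with the test: Egger's regression regresses each study's standardized effect (effect / SE) on its precision (1 / SE), and a non-zero intercept signals funnel plot asymmetry. A minimal sketch with placeholder numbers, not the actual study data:

```python
import numpy as np

def eggers_test(effects, ses):
    """Return the intercept and its t statistic from Egger's regression."""
    z = np.asarray(effects, float) / np.asarray(ses, float)  # standardized effects
    prec = 1.0 / np.asarray(ses, float)                      # precisions
    X = np.column_stack([np.ones_like(prec), prec])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)             # OLS fit
    resid = z - X @ beta
    sigma2 = resid @ resid / (len(z) - 2)                    # residual variance
    se_beta = np.sqrt(sigma2 * np.linalg.inv(X.T @ X).diagonal())
    return beta[0], beta[0] / se_beta[0]

# Placeholder effect estimates and standard errors for five studies
intercept, t_stat = eggers_test([0.020, 0.031, 0.014, 0.042, 0.012],
                                [0.005, 0.011, 0.004, 0.016, 0.003])
```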

Table 2: Egger’s regression test results for each island type. Marginally significant effects (suggesting possible publication bias) are shaded in light gray.

Island type              t       p
Subject Island           1.36    0.190
Complex NP Constraint    2.08    0.059
Whether-Island           1.91    0.081
Adjunct Island           –0.54   0.600
Left Branch Condition    0.61    0.564
That-trace Effect        0.74    0.495
Want-for Construction    0.39    0.720

Table 3: Heterogeneity measures from random-effects meta-analytic models. Statistically significant effects are shaded in gray.

Island type              Cochran’s Q   p        I²
Subject Island           346.4         <0.001   94.06%
Complex NP Constraint    31.19         0.006    69.07%
Whether-Island           326.91        <0.001   98.38%
Adjunct Island           8.03          0.783    0.00%
Left Branch Condition    2.21          0.978    0.00%
That-trace Construction  7.31          0.293    9.26%
Want-for Construction    3.28          0.657    0.00%

Table 4: Mixed-effects meta-analytic model results for island types showing significant heterogeneity in the random-effects analyses. Statistically significant effects are shaded in gray.

Island type              Context inclusion   Scale type          Repetition number   Residual heterogeneity
                         t      p            t      p            t      p            Cochran’s Q   p
Subject Island           2.37   0.031        –1.17  0.259        1.02   0.321        202.49        <0.001
Complex NP Constraint    1.12   0.287        –1.43  0.180        –0.08  0.936        22.31         0.023
Whether-Island           2.99   0.014        –1.55  0.154        –0.34  0.738        60.52         <0.001

Figure 11: Funnel plots for studies testing the satiation of each island type. Stars represent marginal significance in Egger’s test.

3. General discussion

3.1 Summary

In this study, we conducted a meta-analysis of past experimental studies on the satiation of island-violating sentences in English. The results of this study provide answers to the following three questions. First, which types of islands reliably satiate? Second, which island types display heterogeneity in satiation, and which factors contribute to this heterogeneity? Third, is there evidence for a publication bias in the satiation literature?

To answer the first question, random-effects meta-analytic models revealed significant increases in acceptability with repetition for sentences violating constraints on adjunct islands, complex noun phrase islands, subject islands, the that-trace effect, the want-for construction, and whether-islands. In contrast, the models revealed no evidence of acceptability increases for sentences violating the Left Branch Condition (see Figure 3).

To answer the second question, significant cross-study heterogeneity was detected among the CNPC studies, the subject island studies, and the whether-island studies. Following speculation by Snyder (2022a), we tested the contribution of three moderators (presence of context sentences in the experiment, total number of repetitions, and the scale type) to the heterogeneity. For CNPC studies, there was no evidence for any of the moderators modulating satiation; for subject island studies and whether-island studies, the presence of a context sentence increased the rate of satiation, while the other two moderators did not reach significance. For all three groups of studies, residual heterogeneity persisted even when the three moderators were included in the mixed-effects model, suggesting that additional moderators contribute to cross-study heterogeneity.

One such moderator may be the number of fillers. The increase in acceptability of the critical items due to exposure might gradually decay when participants see many unrelated filler items, similar to the observed effect of number of intervening items between prime and target on the strength of priming effects. Thus, the rate of satiation might be smaller when more filler items are included in the experiment. However, since not all selected studies reported filler information in the original papers or in the data files shared by the authors, we could not include the number of fillers as a moderator. Another factor that could affect satiation is the inclusion of multiple island types in the same experiment. It has been demonstrated that exposure to one island-violating sentence type leads to a change in the acceptability of another island-violating sentence type (Lu et al., 2022).17 Therefore, it is possible that when multiple island sentence types are tested together in a single experiment, they might influence each other’s rate of satiation. Cross-study heterogeneity could then arise as a result of different studies testing different sets of island sentence types.

To answer the third question, there is possible publication bias favoring studies that report significant satiation for sentences violating the CNPC and whether-islands. However, the evidence is not strong, resting on marginally significant results from Egger’s regression test of funnel plot asymmetry. We did not find funnel plot asymmetry in any other group of studies, suggesting that there is no evidence for publication bias in those groups. However, we should note that these groups also tended to be the ones with fewer studies. It is possible that for island types with fewer studies, we lacked the statistical power to detect publication bias even if it exists, creating a false picture that there is no publication bias in those groups. An exception is subject islands, which have the largest number of studies devoted to them and yet did not show evidence of publication bias.

3.2 Implications

The results of this meta-analysis are valuable in at least four different ways: in the debate over grammatical vs. processing accounts of islands, in the debate between different linking hypotheses for satiation, in the debate over whether subject and adjunct islands form a natural class in the taxonomy of islands, and in revealing the varying rates of satiation of different islands as a future direction for research.

First, the results may inform the debate over grammatical and processing accounts of islands. As discussed in Section 1, it is often implicitly assumed that when a degraded sentence type satiates, the source of the degradation should come from extra-grammatical factors like high working memory burden (Hofmeister et al., 2013; Hofmeister & Sag, 2010) or low frequency (Chaves & Dery, 2014, 2019). Assuming that this linking hypothesis is accurate (and we shall return to the possibility that it is not), the results reported here can be used to inform theories of island effects: the reliable satiation effects for adjunct islands, the CNPC, subject islands, the that-trace construction, the want-for construction, and whether-islands suggest that these sentence types are grammatical but degraded due to processing factors. In contrast, the LBC does not satiate, and should therefore be considered ungrammatical. These grammaticality statuses pose a challenge to syntactic theories that predict the ungrammaticality of the satiating island types. These include Government and Binding (GB) theory, which attributes island effects to grammatical constraints like the Subjacency Condition and the Empty Category Principle (Chomsky, 1986; Huang, 1982), and syntactic theories under the minimalist program that attribute island effects to the cyclic nature of spell-out and linearization (Fox & Pesetsky, 2005; Nunes & Uriagereka, 2000). In contrast, our results are, in general, compatible with syntactic theories without non-local syntactic constraints. For example, in certain versions of Head-driven Phrase Structure Grammar (HPSG) (Boas & Sag, 2012; Michaelis, 2013), only island effects that can be framed in terms of local constraints are predicted to exist without arbitrary stipulations of filtering constraints.
Most island-violating sentences, including the ones that reliably satiate according to this meta-analysis, are predicted to be grammatical in this framework (see Chaves & Putnam (2020) for a detailed discussion of the predicted island effects under this framework). In contrast, the LBC, where we found no evidence for satiation, can be captured in HPSG by an independently motivated local constraint requiring that only elements in the arg(ument)-st(ructure) feature list of a head can appear in the gap feature list of the same head. Possessors and modifiers are not part of the arg-st list of an N head, and thus cannot appear in the gap list of N (i.e., cannot be extracted from an NP, Chaves & Putnam, 2020; Runner et al., 2006; Sag, 2012). Nevertheless, one should note that the that-trace construction and the want-for construction are largely left out of the debate between processing-level and grammar-level accounts of island effects, and processing-based accounts of their pre-satiation degradedness are yet to be worked out.18 This may be particularly challenging for the want-for construction, where the declarative form (example (2)) without the long-distance dependency is also considered by many speakers to be degraded, suggesting that the unacceptability of the want-for construction cannot be attributed to a hard-to-process dependency.

    (2) ??I want for Mary to buy a cake.

As for the that-trace construction, past accounts mostly attributed its degradedness to syntactic constraints (e.g., the anti-locality constraint) or phonological constraints (e.g., a phonological requirement that a complementizer cannot be linearly adjacent to a gap).19 It remains unclear how the that-trace effect can be captured by a processing-level explanation. If one believes that processing-level accounts for the want-for and the that-trace constructions are implausible, the satiation of these two sentence types may also be taken as evidence against the linking hypothesis that satiation reflects grammaticality.

Second, the discussion up to this point assumes the linking hypothesis that grammaticality determines the satiability of degraded sentences. However, this hypothesis is not unchallenged. Other factors have been claimed to affect the satiability of sentences, including whether the grammatical constraint violated is part of Universal Grammar (UG) (Braze, 2002; Hiramatsu, 2001; Snyder, 2000), the surface similarity with a grammatical alternative (Sprouse, 2007), and sentence interpretability (Francom, 2009). Instead of assuming a particular linking hypothesis under which to test theories of island effects, one can also use the meta-analysis results to inform the linking hypothesis itself. For example, the hypothesis that grammatical principles in UG determine satiability is rejected by our results. Under this hypothesis, the LBC, which is the only non-satiating island type among the ones investigated, should be a principle of UG. However, the LBC is in fact subject to cross-linguistic variation. Left branch extraction (extraction of modifiers or possessors out of an NP) is permitted in many Slavic languages (Bošković, 2005). For example, Serbo-Croatian allows the equivalent of “Whosei does Petko like ti car?”, as shown in (3). This shows that the LBC cannot be a grammatical principle encoded in UG, thus rejecting the hypothesis that satiation diagnoses UG principles.

    (3) Left Branch Extraction in Serbo-Croatian                                         (Bošković, 2005, p. 3)
        Čijai    xaresva   Petko   ti   kola?
        Whosei   like      Petko   ti   car
        “Whose car does Petko like?”

Consider also the linking hypothesis that satiation is underlyingly linguistic adaptation, whereby participants update their beliefs about the frequency distribution of degraded sentence types throughout acceptability judgment experiments (Lu et al., 2021, 2022). Under this proposal, participants should be able to update their expectations for any given sentence type whose underlying linguistic representations can be recovered. This predicts that any sentence type, with the exception of word-salad sentences from which no abstract linguistic representation can be recovered, should demonstrate satiation. However, our results show that the LBC sentences resist satiation altogether, which is not predicted by the vanilla version of the adaptation-based linking hypothesis. It is likely that there are other gate-keeping factors at play, restricting the satiation of an otherwise perfectly representable structure.

One possible candidate for such gate-keeping factors is the availability of grammatical alternatives: when the intended meaning of a sentence can be expressed by a more acceptable alternative sentence, satiation of the sentence is suppressed, perhaps because the participants do not adapt to sentences produced by hypothetically non-cooperative speakers. Among all seven sentence types investigated, the LBC sentences are the only type with a clear meaning-equivalent acceptable alternative. For example, the unacceptable LBC-violating sentence (4a) has the meaning-equivalent acceptable alternative (4b).

    (4) a. *How many did John buy books?
        b.  How many books did John buy?

Whether the availability of acceptable alternatives is indeed the reason why the LBC sentences resist satiation is beyond the scope of the current study, but further studies should test this possibility.

Third, our results also have implications for the debate regarding whether subject and adjunct islands are reducible to the same underlying constraint. In syntactic theories, among the various types of island effects, adjunct islands and subject islands are traditionally considered to form a natural class. For example, Huang (1982) attributes both adjunct and subject island effects to a single syntactic principle: the Condition on Extraction Domains (CED), which states that constituents that are not properly governed restrict extraction from within. Subject DPs and adjuncts are not properly governed, and therefore are both CED islands.20 Chomsky (1986), which aims to provide an analysis for all island effects using the concept of barriers, also grouped subject and adjunct islands together. Movements out of adjuncts and subjects need to cross two barriers, whereas movements out of other islands (e.g., whether-islands, complex NPs) cross only one barrier. Studies including Hiramatsu (2001) and Stepanov (2007) question Huang’s (1982) and Chomsky’s (1986) accounts on the grounds that subject island sentences satiate while adjunct island sentences do not. Assuming a linking hypothesis under which structures with a common source of unacceptability should show a similar satiation profile (Braze, 2002; Goodall, 2011; Snyder, 2000), we would expect subject and adjunct islands to either both satiate or both resist satiation, contrary to Hiramatsu’s (2001) observation. However, as is evident in the current meta-analysis, there is reliable evidence that both subject and adjunct island sentences satiate. Although this result cannot distinguish between the CED account and the barriers account, it at least provides sufficient grounds for rejecting Hiramatsu’s (2001) counterargument.

Fourth, our results point to a future direction of research. As mentioned in Section 2.2.1, there appear to be varying rates of satiation among the satiating island types. The differences in the rates of satiation might signal differences in the linguistic properties underlying these sentence types, and could potentially become a useful diagnostic tool for experimental syntacticians. Differences in rates of satiation could reflect different sources of unacceptability (see Goodall, 2011, for a similar proposal). Further research is needed to determine which factors govern the rate of satiation.

3.3 Limitations

Finally, we would like to acknowledge several limitations of the current study.

First, the results of Egger’s test provide evidence for publication bias only among the CNPC and whether-island studies. This does not mean that there is no publication bias for the other island types: it is possible that the Egger’s tests we conducted simply lack the statistical power to detect publication bias for the island types with fewer studies (e.g., the that-trace construction and the want-for construction each have fewer than 10 studies).

Second, the current meta-analysis reports results on the satiation of island-violating sentences (i.e., the increase in acceptability ratings for sentences that contain island violations), as opposed to the amelioration of island constraints (i.e., the amelioration of the acceptability degradation induced by violating an island constraint). This is because the current meta-analysis is constrained by the design of early satiation experiments that did not include the relevant grammatical control conditions that would enable us to probe for the amelioration of island constraints. However, we should note that the satiation of an island-violating sentence type does not necessarily reflect any change in the relevant island constraint, which is only one of various possible sources of degradation for the tested sentence type.

Below, we would like to point to a possible future direction for satiation researchers interested in teasing these two concepts apart. Recent experimental syntax studies on islands have largely adopted the “factorial design of measuring island effects” (Fukuda et al., 2022; Keshev & Meltzer-Asscher, 2019; Kim & Goodall, 2016; Ko et al., 2019; Kush et al., 2018, 2019; Lu et al., 2020; Sprouse et al., 2016, 2012; Stepanov et al., 2018, inter alia). Consider a CNPC sentence such as (5).

    (5) *What did you make the claim that John bought?

The degraded status of (5) could be due to any of the following factors: (a) a long-distance dependency crossing a clausal boundary, which may be cognitively taxing to process; (b) a Complex NP embedded clause structure, which may be structurally infrequent; (c) a special penalty for having a long-distance dependency crossing a Complex NP embedded clause structure (i.e., the violation of an island constraint). Under the factorial design of measuring island effects, these potential sources of degradation are teased apart using a 2 × 2 factorial design manipulating dependency distance (movement from embedded clause vs. movement from matrix) and embedded clause structure (island structure vs. non-island structure). An example set of stimuli is shown in (6), adapted from Sprouse et al. (2012).

    (6) An example stimulus set under the factorial design of probing for island effects
        a. non-island | matrix:    Who claimed that John bought a car?
        b. non-island | embedded:  What did you claim that John bought?
        c. island | matrix:        Who made the claim that John bought a car?
        d. island | embedded:      What did you make the claim that John bought?

If the island constraint violation contributes unacceptability, we should detect it as an interaction effect between dependency distance and clause structure, where the dependency distance penalty is larger in the island conditions than in the non-island conditions. This factorial design of measuring island effects can be valuable for future research on the satiation of island effects. If the island constraint itself is affected by repeated exposure, we should expect a three-way interaction of dependency distance, clause structure, and presentation order, in the direction where the super-additive interaction of dependency distance and clause structure decreases as presentation order increases. This would serve as a more rigorous test for whether the satiation of island-violating sentences truly reflects the amelioration of the underlying island constraints.
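The interaction logic can be illustrated with a differences-in-differences computation over the four condition means in (6). The mean ratings below are hypothetical, chosen only to show the arithmetic:

```python
# Hypothetical mean acceptability ratings (0-1 scale) for the four
# conditions in (6)
means = {
    ("non-island", "matrix"):   0.90,  # (6a)
    ("non-island", "embedded"): 0.75,  # (6b)
    ("island", "matrix"):       0.85,  # (6c)
    ("island", "embedded"):     0.30,  # (6d)
}

# Dependency-distance penalty inside vs. outside the island structure
penalty_non_island = means[("non-island", "matrix")] - means[("non-island", "embedded")]
penalty_island = means[("island", "matrix")] - means[("island", "embedded")]

# The super-additive interaction (the island effect proper) is the extra
# distance penalty incurred within the island structure
island_effect = penalty_island - penalty_non_island  # ~0.55 - 0.15 = 0.40
```

If the island constraint itself satiates, this interaction score should shrink as presentation order increases, which is exactly the three-way interaction described in the preceding paragraph.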

4. Conclusion

In this article, we present a meta-analysis of past experimental studies investigating the satiation of island-violating sentences. The meta-analysis provides effect size estimates for the satiation rates of different types of island-violating sentences, identifies the island types that display heterogeneity in satiation and the factors that modulate such heterogeneity, and identifies publication bias in the literature.

On a broader level, the current study also demonstrates that meta-analysis, already widely employed in disciplines such as medicine and psychology, can be a valuable research tool for linguists. A common issue for quantitative studies in linguistics is low statistical power due to small sample sizes or poor research design, giving rise to mixed findings and non-replications (Prasad & Linzen, 2021; Sönning & Werner, 2021; Vasishth & Gelman, 2021; Vasishth et al., 2018). Moreover, academic journals typically discourage the publication of null experimental results, leading to widespread publication bias (Roettger, 2021; Vasishth et al., 2018). The current study shows that meta-analysis can address these issues by synthesizing results from individual studies, even if they are underpowered, without the need for new experiments with substantially larger sample sizes. Moreover, as demonstrated, meta-analysis can help identify publication bias. Overall, meta-analytic methods can improve the quality and rigor of quantitative research in linguistics and should be considered an essential component of the linguistic research toolkit.

Notes

  1. In the current study, we use the terms islands and island effects purely descriptively, and do not intend them to entail that the various sentence types labeled as such share the same underlying violation. [^]
  2. Cf. Phillips (2006), Sprouse et al. (2012), and Yoshida et al. (2014) for experimental evidence against the processing-based accounts of island effects. [^]
  3. For discussions of this assumption and alternatives, see Snyder (2000), Hiramatsu (2001), Braze (2002), Hofmeister & Sag (2010), Hofmeister et al. (2013), and Goodall (2011). [^]
  4. The term syntactic satiation was first used in Stromswold (1986), and was defined as the decrease in certainty in participants’ judgments for sentences after repeated exposure. This is different from the working definition for satiation that we adopt: following the various studies since Snyder (2000), we define syntactic satiation as the increase in acceptability rating throughout repeated exposure. [^]
  5. Sprouse (2009) suggested a linking hypothesis where satiation is considered the result of an “equalization response strategy” employed by participants: they tend to balance the number of positive and negative responses. If an experiment contains an overwhelming number of degraded sentences, participants will gradually increase the number of positive ratings throughout the experiment to maintain a balance between positive and negative ratings. Under Sprouse’s (2009) linking hypothesis, satiation should not contribute to our understanding of island effects (or any linguistic phenomenon) at all, because the equalization response strategy that underlies satiation is non-linguistic. However, there is now abundant evidence suggesting that satiation effects cannot be reduced to only the equalization response strategy (Chaves & Dery, 2014; Crawford, 2012; Francom, 2009; Lu et al., 2021, 2022). Thus, we will not further discuss the equalization response strategy in the current study. [^]
  6. These are not to be confused with fixed-effects and mixed-effects regression models. [^]
  7. There are also more sophisticated models available for testing publication bias. See Hedges & Vevea (2005) for examples of non-graphical methods of detecting publication bias. [^]
  8. The majority of studies on syntactic satiation utilized some form of the closed rating scale (e.g., 5/7-point Likert scale, 0–1 continuous slider scale). We did not include open-scale acceptability judgment experiments (e.g., magnitude estimation experiments), because there is no well-justified way of converting such scales to a closed interval such that the results can be aggregated with the other studies for meta-analytic purposes. Five experiments from Sprouse (2009) were excluded for this reason. These five experiments each tested subject islands, adjunct islands, whether-islands, and CNPC with and without context sentences. The subject island and the adjunct island experiments have 14 repetitions, while the other three experiments have 10 repetitions. None of the five reported a satiation effect. [^]
  9. As noted by Snyder (2022a), the CNPC condition in Francom’s (2009) Experiment 1 includes several sentence tokens that, in fact, do not contain CNPC violations. Therefore, the CNPC satiation results in that study might be confounded. However, we still include all the data from Francom (2009) for systematicity. Note that removing the data from Francom (2009) does not lead to any qualitative change in the meta-analytic results. [^]
  10. We should note that there is a crucial difference between the satiation of island-violating sentences and the amelioration of island violations, two concepts that are often confused in the satiation literature. Without including the relevant grammatical control sentences in the analysis, we can only test whether island-violating sentences satiate, which may or may not reflect whether the acceptability penalties due to island effects are ameliorated. We shall return to this issue in Section 3. [^]
  11. There is considerable debate in the literature about whether to report standardized effect sizes or effects in the original units in meta-analyses. One recommendation that has emerged is that, if the original units are meaningful, these can be more helpful (Baguley, 2009; Kelley & Preacher, 2012). In particular, the issue is that standardized effect sizes are standardized with respect to the variance in the data. Therefore, if the variance is different for different studies for reasons unrelated to the critical conditions (e.g., the types and the amount of filler sentences included), there would be different standardized effect sizes even when the actual magnitude of the effect is the same (i.e., a “reliability paradox”). We are especially worried about this in the case of satiation effects. [^]
  12. An anonymous reviewer pointed out that a few of the studies show satiation effects based on the effect estimates we obtained, even though the original authors reported no satiation effects. This discrepancy arises either because the current study and the original studies used different working definitions of satiation, or because the original studies used statistical models and hypothesis testing methods different from ours, which we could not implement, whether due to the lack of relevant information in the available data files or due to the need to remain consistent and use the same model to obtain satiation effect estimates for all studies included in the meta-analysis. [^]
  13. Context sentences used in satiation experiments are usually just the declarative form of the interrogative test sentences, presented along with the test sentences to participants. We follow Snyder (2022a) in calling these sentences “context sentences”. [^]
  14. These estimates should be taken with a grain of salt: the estimated acceptability gains are calculated based on the assumption that the effect of trial order is linear. This is likely to be an incorrect assumption – satiation effects tend to be strongest early on and plateau later in the experiment (Fine et al., 2010, 2013). In this case, we could be overestimating the acceptability increase after 11–12 repetitions. [^]
  15. Note that we are not arguing that subject islands or whether-islands have a special status, nor are we arguing for a grouping among the island types with subject island and whether-islands being the breakpoints. We simply take this result as evidence for varying rates of satiation among the island types investigated. [^]
  16. The funnel plot for subject island also appears asymmetric, but Egger’s test does not show significance. The high number of studies that fall outside the funnel area in Figure 11a could result from high cross-study heterogeneity. [^]
  17. Snyder (2000, 2022a) also tested the generalization of satiation effects between different lexical realizations of the same island type (e.g., between CNPC sentences with the matrix predicates believe the claim and accept the idea), but reported mixed results. [^]
  18. As noted earlier, these two sentence types are not traditionally considered islands in the syntax literature, but we follow Snyder (2000) and other previous work in the satiation literature in labeling them as island structures for the purpose of discussion. [^]
  19. For a comprehensive review of past proposals for the that-trace effect, see Pesetsky (2017). [^]
  20. See Nunes & Uriagereka (2000) for a more modern rendering of the CED in the minimalist framework. [^]

Data accessibility statement

All data files and analysis scripts are available at https://osf.io/arj8c/.

Acknowledgements

We would like to thank the authors of the analyzed studies for kindly making their data files available. Thanks to three anonymous reviewers, Rui Chaves, Dan Lassiter, William Snyder, Tom Wasow, members of the Interactive Language Processing Lab (The ALPS Lab) at Stanford University, and the audiences at LSA 2023 and CAMP 2023 for their valuable comments and feedback.

Competing interests

The authors have no competing interests to declare.

Author contributions

All authors contributed to the conceptualization of the study. JL acquired the data sets and conducted the analysis. All three authors contributed to the drafting and revision of the paper.

References

Abeillé, A., Hemforth, B., Winckel, E., & Gibson, E. (2020). Extraction from subjects: Differences in acceptability depend on the discourse function of the construction. Cognition, 204, 104293. DOI:  http://doi.org/10.1016/j.cognition.2020.104293

Abugharsa, A. F. (2016). The usefulness of explicit grammar teaching: An investigation of syntactic satiation effects and acceptability judgements in Libyan EFL contexts. (Unpublished doctoral dissertation), Middlesex University.

Baguley, T. (2009). Standardized or simple effect size: What should be reported? British Journal of Psychology, 100(3), 603–617. DOI:  http://doi.org/10.1348/000712608X377117

Bergmann, C., Tsuji, S., Piccinini, P. E., Lewis, M. L., Braginsky, M., Frank, M. C., & Cristia, A. (2018). Promoting replicability in developmental research through meta-analyses: Insights from language acquisition research. Child Development, 89(6), 1996–2009. DOI:  http://doi.org/10.1111/cdev.13079

Boas, H. C., & Sag, I. A. (2012). Sign-based construction grammar. CSLI Publications.

Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons. DOI:  http://doi.org/10.1002/9780470743386

Bošković, Ž. (2005). On the locality of left branch extraction and the structure of NP. Studia Linguistica, 59(1), 1–45. DOI:  http://doi.org/10.1111/j.1467-9582.2005.00118.x

Bošković, Ž. (2016). On the timing of labeling: Deducing Comp-trace effects, the subject condition, the adjunct condition, and tucking in from labeling. The Linguistic Review, 33(1), 17–66. DOI:  http://doi.org/10.1515/tlr-2015-0013

Braze, F. D. (2002). Grammaticality, acceptability and sentence processing: A psycholinguistic study. (Unpublished doctoral dissertation), University of Connecticut.

Bresnan, J. W. (1976). On the form and functioning of transformations. Linguistic Inquiry, 7(1), 3–40.

Brown, J. M., Fanselow, G., Hall, R., & Kliegl, R. (2021). Middle ratings rise regardless of grammatical construction: Testing syntactic variability in a repeated exposure paradigm. PLOS One, 16(5), e0251280. DOI:  http://doi.org/10.1371/journal.pone.0251280

Cao, A., & Lewis, M. (2022). Quantifying the syntactic bootstrapping effect in verb learning: A meta-analytic synthesis. Developmental Science, 25(2), e13176. DOI:  http://doi.org/10.1111/desc.13176

Cao, A., Lewis, M., & Frank, M. C. (2023). A synthesis of early cognitive and language development using (meta-) meta-analysis. In Proceedings of the Annual Meeting of the Cognitive Science Society, 45, 48–55. DOI:  http://doi.org/10.31234/osf.io/qn2t5

Chaves, R. P. (2022). Sources of discreteness and gradience in island effects. Languages, 7(4), 245. DOI:  http://doi.org/10.3390/languages7040245

Chaves, R. P., & Dery, J. E. (2014). Which subject islands will the acceptability of improve with repeated exposure? In Proceedings of the 31st West Coast Conference on Formal Linguistics (pp. 96–106). Cascadilla Proceedings Project.

Chaves, R. P., & Dery, J. E. (2019). Frequency effects in subject islands. Journal of Linguistics, 55(3), 475–521. DOI:  http://doi.org/10.1017/S0022226718000294

Chaves, R. P., & Putnam, M. T. (2020). Unbounded dependency constructions: Theoretical and experimental perspectives. Oxford University Press, USA. DOI:  http://doi.org/10.1093/oso/9780198784999.001.0001

Chomsky, N. (1964). Current issues in linguistic theory. De Gruyter Mouton.

Chomsky, N. (1973). Conditions on transformations. In S. R. Anderson & P. Kiparsky (Eds.), A festschrift for Morris Halle (pp. 232–286). Holt, Rinehart and Winston.

Chomsky, N. (1977). On wh-movement. In P. W. Culicover, T. Wasow & A. Akmajian (Eds.), Formal Syntax (pp. 71–132). Academic Press.

Chomsky, N. (1986). Barriers. MIT Press.

Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika, 37(3/4), 256–266. DOI:  http://doi.org/10.1093/biomet/37.3-4.256

Crawford, J. (2012). Using syntactic satiation to investigate subject islands. In Proceedings of the 29th West Coast Conference on Formal Linguistics (pp. 38–45). Cascadilla Proceedings Project.

Culicover, P. W., Varaschin, G., & Winkler, S. (2022). The radical unacceptability hypothesis: Accounting for unacceptability without universal constraints. Languages, 7(2), 96. DOI:  http://doi.org/10.3390/languages7020096

Do, M. L., & Kaiser, E. (2017). The relationship between syntactic satiation and syntactic priming: A first look. Frontiers in Psychology, 8, 1851. DOI:  http://doi.org/10.3389/fpsyg.2017.01851

Egger, M., Smith, G. D., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. The BMJ, 315(7109), 629–634. DOI:  http://doi.org/10.1136/bmj.315.7109.629

Erteschik-Shir, N. (1973). On the nature of island constraints. (Unpublished doctoral dissertation), Massachusetts Institute of Technology.

Fine, A. B., Jaeger, T. F., Farmer, T. A., & Qian, T. (2013). Rapid expectation adaptation during syntactic comprehension. PLOS One, 8(10), e77661. DOI:  http://doi.org/10.1371/journal.pone.0077661

Fine, A., Qian, T., Jaeger, T. F., & Jacobs, R. (2010). Syntactic adaptation in language comprehension. In Proceedings of the 2010 workshop on cognitive modeling and computational linguistics (pp. 18–26).

Fox, D., & Pesetsky, D. (2005). Cyclic linearization of syntactic structure. Theoretical Linguistics, 31(1–2), 1–45. DOI:  http://doi.org/10.1515/thli.2005.31.1-2.1

Francom, J. C. (2009). Experimental syntax: Exploring the effect of repeated exposure to anomalous syntactic structure–evidence from rating and reading tasks. (Unpublished doctoral dissertation), The University of Arizona.

Fukuda, S., Tanaka, N., Ono, H., & Sprouse, J. (2022). An experimental reassessment of complex NP islands with NP-scrambling in Japanese. Glossa: a journal of general linguistics, 7(1). DOI:  http://doi.org/10.16995/glossa.5737

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5(10), 3–8. DOI:  http://doi.org/10.2307/1174772

Goldberg, A. E. (2013). Backgrounded constituents cannot be “extracted”. In J. Sprouse & N. Hornstein (Eds.), Experimental syntax and island effects. Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139035309.012

Goodall, G. (2011). Syntactic satiation and the inversion effect in English and Spanish wh-questions. Syntax, 14(1), 29–47. DOI:  http://doi.org/10.1111/j.1467-9612.2010.00148.x

Gurevitch, J., Curtis, P. S., & Jones, M. H. (2001). Meta-analysis in ecology. Advances in Ecological Research, 32, 200–239. DOI:  http://doi.org/10.1016/S0065-2504(01)32013-5

Haegeman, L., Jiménez-Fernández, Á. L., & Radford, A. (2014). Deconstructing the subject condition in terms of cumulative constraint violation. The Linguistic Review, 31(1), 73–150. DOI:  http://doi.org/10.1515/tlr-2013-0022

Haidich, A.-B. (2010). Meta-analysis in medical research. Hippokratia, 14(Suppl 1), 29.

Hedges, L. V., & Vevea, J. (2005). Selection method approaches. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 145–174). Wiley Online Library. DOI:  http://doi.org/10.1002/0470870168.ch9

Hiramatsu, K. (2001). Accessing linguistic competence: Evidence from children’s and adults’ acceptability judgments. (Unpublished doctoral dissertation), University of Connecticut.

Hofmeister, P., Casasanto, L. S., & Sag, I. A. (2013). Islands in the grammar? Standards of evidence. In J. Sprouse & N. Hornstein (Eds.), Experimental syntax and island effects. Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139035309.004

Hofmeister, P., Jaeger, T. F., Arnon, I., Sag, I. A., & Snider, N. (2011). The source ambiguity problem: Distinguishing the effects of grammar and processing on acceptability judgments. Language and Cognitive Processes, 28(1–2), 48–87. DOI:  http://doi.org/10.1080/01690965.2011.572401

Hofmeister, P., & Sag, I. A. (2010). Cognitive constraints and island effects. Language, 86(2), 366–415. DOI:  http://doi.org/10.1353/lan.0.0223

Huang, C.-T. J. (1982). Logical relations in Chinese and the theory of grammar. (Unpublished doctoral dissertation), Massachusetts Institute of Technology.

Kelley, K., & Preacher, K. J. (2012). On effect size. Psychological Methods, 17(2), 137. DOI:  http://doi.org/10.1037/a0028086

Keshev, M., & Meltzer-Asscher, A. (2019). A processing-based account of subliminal wh-island effects. Natural Language & Linguistic Theory, 37(2), 621–657. DOI:  http://doi.org/10.1007/s11049-018-9416-1

Kim, B., & Goodall, G. (2016). Islands and non-islands in native and heritage Korean. Frontiers in Psychology, 7, 134. DOI:  http://doi.org/10.3389/fpsyg.2016.00134

Kirca, A. H., & Yaprak, A. (2010). The use of meta-analysis in international business research: Its current status and suggestions for better practice. International Business Review, 19(3), 306–314. DOI:  http://doi.org/10.1016/j.ibusrev.2010.01.001

Kluender, R. (1991). Cognitive constraints on variables in syntax. (Unpublished doctoral dissertation), University of California, San Diego.

Kluender, R., & Kutas, M. (1993). Subjacency as a processing phenomenon. Language and Cognitive Processes, 8(4), 573–633. DOI:  http://doi.org/10.1080/01690969308407588

Ko, H., Chung, H.-B., Kim, K., & Sprouse, J. (2019). An experimental study on scrambling out of islands: To the left and to the right. Language & Information Society, 37, 287–323. DOI:  http://doi.org/10.29211/soli.2019.37..008

Kush, D., Lohndal, T., & Sprouse, J. (2018). Investigating variation in island effects: A case study of Norwegian wh-extraction. Natural Language & Linguistic theory, 36, 743–779. DOI:  http://doi.org/10.1007/s11049-017-9390-z

Kush, D., Lohndal, T., & Sprouse, J. (2019). On the island sensitivity of topicalization in Norwegian: An experimental investigation. Language, 95(3), 393–420. DOI:  http://doi.org/10.1353/lan.2019.0051

L’Abbé, K. A., Detsky, A. S., & O’Rourke, K. (1987). Meta-analysis in clinical research. Annals of Internal Medicine, 107(2), 224–233. DOI:  http://doi.org/10.7326/0003-4819-107-2-224

Light, R. J., & Pillemer, D. B. (1984). Summing up: The science of reviewing research. Harvard University Press. DOI:  http://doi.org/10.4159/9780674040243

Lu, J., Lassiter, D., & Degen, J. (2021). Syntactic satiation is driven by speaker-specific adaptation. In Proceedings of the Annual Meeting of the Cognitive Science Society, Volume 43.

Lu, J., Thompson, C. K., & Yoshida, M. (2020). Chinese wh-in-situ and islands: A formal judgment study. Linguistic Inquiry, 51(3), 611–623. DOI:  http://doi.org/10.1162/ling_a_00343

Lu, J., Wright, N., & Degen, J. (2022). Satiation effects generalize across island types. In Proceedings of the Annual Meeting of the Cognitive Science Society, Volume 44.

Michaelis, L. A. (2013). Sign-based construction grammar. In T. Hoffman & G. Trousdale (Eds.), The Oxford handbook of Construction Grammar. DOI:  http://doi.org/10.1093/oxfordhb/9780195396683.013.0008

Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine, 151(4), 264–269. DOI:  http://doi.org/10.7326/0003-4819-151-4-200908180-00135

Myers, J. (2012). Testing adjunct and conjunct island constraints in Chinese. Language and Linguistics, 13(3), 437.

Nunes, J., & Uriagereka, J. (2000). Cyclicity and extraction domains. Syntax, 3(1), 20–43. DOI:  http://doi.org/10.1111/1467-9612.00023

Pesetsky, D. (2017). Complementizer-trace effects. In Blackwell companion to syntax. Wiley Blackwell. DOI:  http://doi.org/10.1002/9781118358733.wbsyncom108

Phillips, C. (2006). The real-time status of island phenomena. Language, 82(4), 795–823. DOI:  http://doi.org/10.1353/lan.2006.0217

Prasad, G., & Linzen, T. (2021). Rapid syntactic adaptation in self-paced reading: Detectable, but only with many participants. Journal of Experimental Psychology: Learning, Memory, and Cognition, 47(7), 1156. DOI:  http://doi.org/10.1037/xlm0001046

Pratt, T. C. (2010). Meta-analysis in criminal justice and criminology: What it is, when it’s useful, and what to watch out for. Journal of Criminal Justice Education, 21(2), 152–168. DOI:  http://doi.org/10.1080/10511251003693678

Rizzi, L. (1990). Relativized minimality. MIT Press.

Roettger, T. B. (2021). Preregistration in experimental linguistics: Applications, challenges, and limitations. Linguistics, 59(5), 1227–1249. DOI:  http://doi.org/10.1515/ling-2019-0048

Ross, J. R. (1967). Constraints on variables in syntax. (Unpublished doctoral dissertation), Massachusetts Institute of Technology.

Runner, J. T., Sussman, R. S., & Tanenhaus, M. K. (2006). Processing reflexives and pronouns in picture noun phrases. Cognitive Science, 30(2), 193–241. DOI:  http://doi.org/10.1207/s15516709cog0000_58

Sag, I. A. (1976). Deletion and logical form. (Unpublished doctoral dissertation), Massachusetts Institute of Technology.

Sag, I. A. (2012). Sign-based construction grammar: An informal synopsis. In H. C. Boas & I. A. Sag (Eds.), Sign-based construction grammar (pp. 69–202). CSLI Publications.

Schütze, C. T. (1996). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. University of Chicago Press.

Slavin, R. E. (1984). Meta-analysis in education: How has it been used? Educational Researcher, 13(8), 6–15. DOI:  http://doi.org/10.3102/0013189X013008006

Snyder, W. (2000). An experimental investigation of syntactic satiation effects. Linguistic Inquiry, 31(3), 575–582. DOI:  http://doi.org/10.1162/002438900554479

Snyder, W. (2022a). On the nature of syntactic satiation. Languages, 7(1), 38. DOI:  http://doi.org/10.3390/languages7010038

Snyder, W. (2022b). Satiation. In G. Goodall (Ed.), The Cambridge handbook of experimental syntax (pp. 154–180). Cambridge University Press. DOI:  http://doi.org/10.1017/9781108569620.007

Sommer, M. (2022). A further look: Is syntactic priming a plausible underlying mechanism for syntactic satiation? (Unpublished master’s thesis), Radboud University.

Sönning, L., & Werner, V. (2021). The replication crisis, scientific revolutions, and linguistics. Linguistics, 59(5), 1179–1206. DOI:  http://doi.org/10.1515/ling-2019-0045

Sprouse, J. (2007). Syntactic satiation: Toward an etiology of linguist’s disease. In 30th Annual GLOW Colloquium. University of Tromsø, Norway.

Sprouse, J. (2009). Revisiting satiation: Evidence for an equalization response strategy. Linguistic Inquiry, 40(2), 329–341. DOI:  http://doi.org/10.1162/ling.2009.40.2.329

Sprouse, J., Caponigro, I., Greco, C., & Cecchetto, C. (2016). Experimental syntax and the variation of island effects in English and Italian. Natural Language & Linguistic Theory, 34, 307–344. DOI:  http://doi.org/10.1007/s11049-015-9286-8

Sprouse, J., Wagers, M., & Phillips, C. (2012). A test of the relation between working-memory capacity and syntactic island effects. Language, 88(1), 82–123. DOI:  http://doi.org/10.1353/lan.2012.0004

Stepanov, A. (2007). The end of CED? Minimalism and extraction domains. Syntax, 10(1), 80–126. DOI:  http://doi.org/10.1111/j.1467-9612.2007.00094.x

Stepanov, A., Mušič, M., & Stateva, P. (2018). Two (non-) islands in Slovenian: A study in experimental syntax. Linguistics, 56(3), 435–476. DOI:  http://doi.org/10.1515/ling-2018-0002

Stromswold, K. (1986). Syntactic satiation. (Unpublished manuscript).

Turanovic, J. J., & Pratt, T. C. (2021). Meta-analysis in criminology and criminal justice: Challenging the paradigm and charting a new path forward. Justice Evaluation Journal, 4(1), 21–47. DOI:  http://doi.org/10.1080/24751979.2020.1775107

Vasishth, S., & Gelman, A. (2021). How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis. Linguistics, 59(5), 1311–1342. DOI:  http://doi.org/10.1515/ling-2019-0051

Vasishth, S., Mertzen, D., Jäger, L. A., & Gelman, A. (2018). The statistical significance filter leads to overoptimistic expectations of replicability. Journal of Memory and Language, 103, 151–175. DOI:  http://doi.org/10.1016/j.jml.2018.07.004

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. DOI:  http://doi.org/10.18637/jss.v036.i03

Yoshida, M., Kazanina, N., Pablos, L., & Sturt, P. (2014). On the origin of islands. Language, Cognition and Neuroscience, 29(7), 761–770. DOI:  http://doi.org/10.1080/01690965.2013.788196