Frontiers of Biogeography plant species in South American seasonally dry tropical forests responded differently to past climatic fluctuations

Seasonally dry tropical forests (SDTFs) are a main component of open seasonally dry areas in South America, and their biogeography is understudied compared to that of evergreen forests. In this work, we identify vascular plant species with long-distance disjunctions across SDTF patches of South America based on information available in online repositories and on species taxonomy and distribution, to explore species’ biogeographic patterns. Specifically, we combine distribution data from the Brazilian Flora 2020 Project (BFG) and the Global Biodiversity Information Facility (GBIF) to identify species with a peri-Amazonian distribution, and then use species distribution models to discuss possible scenarios of peri-Amazonian distributions under Pleistocene climatic fluctuations. We identified 81 candidate species for peri-Amazonian distributions in SDTFs, including shrubs, herbs, trees and lianas, and provided a summary of their main fruit dispersal syndrome based on the literature to identify prevalent dispersal patterns. The study species responded differently to Pleistocene climatic fluctuations, with both contractions and expansions through time at different rates, and did not show consistently larger distributions under past climate conditions. Our results show that a peri-Amazonian distribution is also present in growth forms other than trees. Also, the prevalence of species with long-distance dispersal strategies, such as wind or vertebrate dispersal, can suggest, although biased for Neotropical taxa, an alternative scenario of long-distance dispersal, possibly using stepping-stones of azonal vegetation. We argue that such an alternative scenario, especially for disjunct species with long-distance dispersal abilities, should be considered to test whether SDTF disjunctions are relics of a past widespread distribution or not.

• We provide a new list of flowering plant species disjunct in South American seasonally dry tropical forests, considering different types of lifeforms and dispersal syndromes.
• Species have responded differently to Pleistocene climatic fluctuations in different model conditions, with no consistent connections through time.
• There is a prevalence of species with good dispersal abilities that might evoke a scenario where long dispersal events might also explain the current observed disjunctions.

Introduction
Forest in coastal Brazil; Hoorn et al. 2010, Thode et al. 2019) and Andean ecosystems (Anthelme et al. 2014, Godoy-Bürki et al. 2014, Quintana et al. 2017). In contrast, the distribution and evolution of species in open seasonally dry formations remain understudied in many aspects (Werneck et al. 2012).
If explicitly considered in continental-scale analyses, the open seasonally dry formations are mostly treated as the background against which vicariant events occurred in groups associated with moist forests, rather than as entities in their own right (Werneck 2011, Pennington et al. 2018). This is problematic, given the high species richness and endemism observed in these areas (Rizzini 1997, Ribeiro and Walter 2008, Zizka 2019) and the importance of evolutionary connectivity among biomes for the evolution of South American diversity (see Simon et al. 2009, Antonelli et al. 2018).
The open seasonally dry formations of South America broadly include seasonally dry tropical forests (SDTFs) and savannas (sensu Olson et al. 2001). These two biomes can occur under similar climatic conditions but are differentiated by soil conditions and by fire frequency (savannas burn regularly, SDTFs usually do not; Pennington et al. 2018). In particular, SDTFs are also evolutionarily relatively isolated centers of diversity (Rizzini 1997, Ribeiro and Walter 2008). Currently, SDTFs form coherent blocks across South America, isolated from each other by large stretches of forest or mountainous habitats, forming the so-called "peri-Amazonian pattern", which surrounds the Amazonian forest from Northeastern Brazil through Paraguay to Colombia and Venezuela (Fig. 1a, Ducke & Black 1953, Prado & Gibbs 1993).
While most plant species in SDTFs are geographically restricted to individual blocks (Dryflor et al. 2016), peri-Amazonian disjunct distributions across distant SDTFs occur in animals (e.g., insects, Morrone and Coscarón 1996; reptiles, Azevedo et al. 2016; mammals, Courtenay and Maffei 2004) and woody plant species (Prado and Gibbs 1993, Prado 2000, Pennington et al. 2000, Dryflor et al. 2016). However, existing suggestions were taxonomically and functionally limited (e.g., to trees) and thus it is unclear how common peri-Amazonian disjunctions are in other plant lifeforms, where they occur, and how they originated. Peri-Amazonian disjunctions in SDTF species may be the result of recent or repeated long-distance dispersals from one SDTF region to another (Fig. 1b). Alternatively, they may represent relicts of past large distribution ranges when climatic conditions were more favorable for SDTF species (Fig. 1c). SDTFs may have changed their shape and extent through time, tracking climatic fluctuations in the Quaternary (Moritz et al. 2000, Costa et al. 2017) or even already in the Tertiary (Pennington et al. 2004), contrasting with the dynamics of evergreen forests. Particularly, during the Pleistocene, the climate in large parts of tropical South America was potentially drier (Pennington et al. 2004). Peri-Amazonian disjunctions might therefore reflect past floristic links of a once more continuous SDTF connecting today's "dry diagonal" (i.e., the Chaco + Cerrado + Caatinga) with peri-Andean and Venezuelan/Colombian SDTFs (Prado and Gibbs 1993, Prado 2000, Pennington et al. 2000).
Recent increases in the availability of plant distribution data from the digitization of collection records (Lavoie 2013, Schmidt-Lebuhn et al. 2013, Dryflor et al. 2016) and floristic treatments (e.g., BFG 2018) provide a novel opportunity to identify SDTF disjunct species and to cross-validate different data sources. Additionally, species distribution modelling based on environmental conditions offers a way to evaluate, within certain limitations (see Collevatti et al. 2012), the distribution of species through time, assuming niche conservatism in species distribution (Kramer-Schadt et al. 2013).
Here, we review distribution information from different data types to identify plant species with peri-Amazonian disjunct distributions and use species distribution models to reconstruct the distribution of the identified species through time to understand disjunction patterns. Specifically, we aim to answer two questions:

Identifying peri-Amazonian disjunctions
To identify SDTF species with disjunct distributions, we obtained species distribution information from two databases. First, we used the Brazilian Flora 2020 Project database (henceforth "BFG") to get a set of candidate species. BFG provides information on habitat, lifeform and taxonomy of algal, plant and fungal species of Brazil (BFG 2018), and all information for a species is filed and checked by taxonomic specialists. Brazil is the largest country in South America, and BFG therefore covers two of the three principal areas of SDTF, as well as information on whether species are endemic or also occur in other areas of the continent. Thus, we consider it a good proxy for assembling a list of species that might occur in different SDTF blocks.
In June 2020, we used BFG (BFG 2020) to identify potential candidate species with a peri-Amazonian distribution, by downloading a list of all species that fulfilled the following criteria: (1) vascular plants native to Brazil, at species level; (2) correct names, setting aside names with uncertain nomenclatural or taxonomic status; and (3) species assigned to occur in "Caatinga" and/or "Seasonally Deciduous Forests" vegetation, which includes the SDTF definition of this study.
In a second step, we downloaded georeferenced point-occurrence records from the Global Biodiversity Information Facility (www.gbif.org, one of the largest publicly available biodiversity repositories; see Robertson et al. 2014) for all candidate species identified in BFG. We used the "rgbif" v. 3.0.0 package (Chamberlain et al. 2020) in R (R Core Team 2020) to download records based on preserved specimens (since we consider them more reliable than observational data). Since GBIF records are error prone, we used the "CoordinateCleaner" v. 2.0-15 package for data cleaning, identifying duplicates, records whose coordinates were centroids, points in the sea, incomplete or inaccurate coordinates (i.e., degrees with no decimal information) and outliers. The occurrence database, as well as references for all downloaded occurrence datasets, is available as Supplementary Material (Appendix S1).
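The cleaning step described above was run with the R package CoordinateCleaner. As a rough illustration of the kind of checks involved, the following Python sketch flags three of the listed problems: exact duplicates, out-of-range coordinates, and coordinates without decimal information. The `Record` class, the `clean_records` function and the rule of dropping a record when either coordinate is a whole degree are assumptions made for this example, not the package's implementation; centroid, sea and outlier tests are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical occurrence record: species name plus decimal degrees."""
    species: str
    lat: float
    lon: float


def has_no_decimals(value: float) -> bool:
    """True when a coordinate is a whole number of degrees (likely rounded)."""
    return float(value).is_integer()


def clean_records(records):
    """Keep the first copy of each record and drop suspicious coordinates.

    Assumed rules (illustrative only): drop exact duplicates, coordinates
    outside the valid latitude/longitude range, and records where either
    coordinate carries no decimal information.
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec.species, rec.lat, rec.lon)
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        if not (-90.0 <= rec.lat <= 90.0 and -180.0 <= rec.lon <= 180.0):
            continue  # impossible coordinate
        if has_no_decimals(rec.lat) or has_no_decimals(rec.lon):
            continue  # degrees with no decimal information
        kept.append(rec)
    return kept


raw = [
    Record("A", -9.53, -40.12),
    Record("A", -9.53, -40.12),   # duplicate
    Record("A", -10.0, -40.5),    # whole-degree latitude
    Record("A", 95.2, -40.5),     # latitude out of range
    Record("B", -15.25, -47.75),
]
cleaned = clean_records(raw)
```

In this toy input only the first record of species "A" and the record of species "B" survive.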
We then combined the information from BFG and GBIF and retained only species that were truly restricted to South American SDTFs and that were disjunct in at least two SDTF blocks. To do so, we selected only species that met the following criteria: (1) had at least 10 valid occurrence records available, i.e., records with no coordinate issues; (2) had at least 90% of all records within the delimitations of SDTFs (sensu Olson et al. 2001), i.e., excluding species where more than 10% of records may have been in vegetation other than dry forests; (3) had at least 90% of the records in two or more blocks (or SDTF "regions" sensu Fig. 1).
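The three selection criteria can be sketched as a simple filter. The Python example below is a minimal illustration, assuming each cleaned record has already been assigned either to a named SDTF block or to `None` when it falls outside SDTFs; the function name, the input layout and the reading of criterion (3) as "the SDTF records occupy at least two blocks" are assumptions made for this example.

```python
from collections import defaultdict


def select_disjunct_species(records, min_records=10, min_sdtf_share=0.9,
                            min_blocks=2):
    """Return species names that pass the three (assumed) filters.

    records: iterable of (species, block) pairs, where block is the name of
    an SDTF block or None for a record outside SDTFs.
    """
    blocks_by_species = defaultdict(list)
    for species, block in records:
        blocks_by_species[species].append(block)

    selected = []
    for species, blocks in blocks_by_species.items():
        total = len(blocks)
        if total < min_records:
            continue  # criterion 1: too few valid records
        inside = [b for b in blocks if b is not None]
        if len(inside) / total < min_sdtf_share:
            continue  # criterion 2: too many records outside SDTFs
        if len(set(inside)) < min_blocks:
            continue  # criterion 3: present in a single block only
        selected.append(species)
    return sorted(selected)


toy = (
    [("sp1", "Caatinga")] * 6 + [("sp1", "Andean/Caribbean")] * 5      # passes
    + [("sp2", "Caatinga")] * 12                                        # one block
    + [("sp3", "Caatinga")] * 5 + [("sp3", "Andean/Caribbean")] * 4     # 9 records
    + [("sp4", "Caatinga")] * 8 + [("sp4", "Andean/Caribbean")] * 2
    + [("sp4", None)] * 3                                               # 77% in SDTF
)
candidates = select_disjunct_species(toy)
```

Keeping the thresholds as parameters makes it easy to check how sensitive the final list is to the 10-record and 90% cut-offs.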
From the remaining species, we summarized information on fruit dispersal syndrome, lifeform, and taxonomy, based on the information of the BFG and on the literature. Data for all selected species and each filtering step can be found in supplementary Table S1.

Present and past distribution ranges
To answer question 2 of this work and test whether peri-Amazonian disjunct species responded similarly to past climate fluctuations, we modeled the suitable habitat for all species under current and past climatic conditions. To reduce geographic sampling bias, we applied a spatial thinning procedure to our occurrence data set with the "spThin" package v. 2.5-1 (Aiello-Lammens et al. 2015), keeping only one record within a 10 km radius. We then downloaded nineteen bioclimatic variables for four time-slices from the Pliocene to the present, based on layers of the ecoClimate project (Lima-Ribeiro et al. 2015): (1) Present (0 kybp, kiloyears before present); (2) Holocene (6 kybp); (3) Last Glacial Maximum (LGM, c. 22 kybp); and (4) mid-Pliocene Warm period (ca. 3300 kybp). We did not include projections for the LIG (Last Interglacial Period) in this study because climatic data in the required resolution and consistent with the other time-slices (i.e., based on the same earth system models, see below) were not available. We used the "sdm" package v. 1.0-81 (Naimi and Araujo 2016) to model species distributions with five different modeling algorithms, regression-based or machine learning-based (support vector machines, random forest, boosted regression tree, multivariate adaptive regression splines and maximum entropy), performing five runs and 5-fold cross-validation for each model, and selecting 1000 random background points for prediction. Ideally, edaphic variables would significantly improve model accuracy because SDTFs favor high pH and relatively fertile soils (Ferreira-Nunes et al. 2013, Pennington et al. 2000). However, we did not include edaphic variables in the distribution models, since soil data in the required resolution are lacking to a large extent for the study area and are not available for past time slices. As general circulation models vary widely, especially in the Neotropics, as shown for the SDTFs (Collevatti et al. 2012), paleodistributions predicted from a single circulation model might differ strongly from those predicted from another. Thus, we performed niche modeling considering this variation, by running models with different general circulation models besides CCSM. We selected six other models (CNRM, FGOALS, IPSL, MIROC, MPI and MRI) and compared their major results to evaluate the potential differences in the species distribution projections. CCSM was the main model used to discuss our results, while results from the other models are shown in Supplementary Materials (Appendix S2 and S3).
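The spatial thinning step mentioned above (one record within a 10 km radius) can be illustrated with a short Python sketch. The study used the R package spThin, which relies on a randomized procedure; the version below is instead a simple greedy pass in input order, and the function names and the spherical-Earth radius are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def thin_records(points, min_dist_km=10.0):
    """Greedy thinning: keep a point only if it lies at least min_dist_km
    away from every point kept so far. The result depends on input order,
    unlike a randomized thinning run."""
    kept = []
    for lat, lon in points:
        if all(haversine_km(lat, lon, klat, klon) >= min_dist_km
               for klat, klon in kept):
            kept.append((lat, lon))
    return kept


points = [
    (-9.50, -40.50),
    (-9.52, -40.52),   # about 3 km from the first point, dropped
    (-9.50, -40.70),   # about 22 km from the first point, kept
    (-12.00, -45.00),
]
thinned = thin_records(points)
```

A greedy pass is enough to show the idea; a randomized procedure repeated several times would typically retain more records for the same minimum distance.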
Finally, we used the Area Under the Curve (AUC) for model evaluation, keeping only models with AUC > 0.7. To generate binary layers, we applied a 10th percentile training presence cut-off, i.e., omitting all areas with habitat suitability lower than the suitability values for the lowest 10% of the occurrence records. The range extent generated by each model was calculated and summarized in graphs to check the rate of variation and the expansion or contraction across past time slices.
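The thresholding and area summary can be sketched as follows. This Python example assumes suitability values are available as plain lists, that the threshold is the suitability of the first presence record above the lowest 10%, and that all grid cells have the same area; these choices, and all function names, are assumptions for illustration rather than the routine used in the study.

```python
def percentile_presence_threshold(presence_suitability, percentile=10):
    """Suitability value below which the lowest `percentile` percent of the
    presence records fall (integer arithmetic avoids rounding surprises)."""
    values = sorted(presence_suitability)
    if not values:
        raise ValueError("need at least one presence record")
    index = min((percentile * len(values)) // 100, len(values) - 1)
    return values[index]


def binarize(grid, threshold):
    """Turn a suitability grid (list of rows) into a 0/1 presence grid."""
    return [[1 if value >= threshold else 0 for value in row] for row in grid]


def suitable_area(binary_grid, cell_area_km2):
    """Total suitable area, assuming every cell covers the same area."""
    return sum(sum(row) for row in binary_grid) * cell_area_km2


def relative_change(area_old, area_new):
    """Proportional expansion (positive) or contraction (negative)."""
    return (area_new - area_old) / area_old


presence = [0.9, 0.2, 0.8, 0.5, 0.6, 0.7, 0.4, 0.3, 0.95, 0.1]
threshold = percentile_presence_threshold(presence)
binary = binarize([[0.1, 0.25], [0.2, 0.05]], threshold)
area = suitable_area(binary, cell_area_km2=100.0)
```

With ten presence records, the single lowest value (10% of the records) falls below the threshold, so the cut-off is the second-lowest suitability. Comparing `area` between consecutive time slices with `relative_change` gives the expansion or contraction rate per species.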

Identifying peri-Amazonian disjunct species
When searching for taxa occurring in dry deciduous formations according to the BFG, we found 2,302 species of vascular plants (see Table S1 in Supplementary Materials). However, after downloading data for these species and applying the filtering criteria to select only species from SDTF blocks, this number dropped to only 81 species, all of them flowering plants (Table 1). This final data set included both trees and herbs from different families, but families such as Leguminosae, Euphorbiaceae, Malvaceae and Rubiaceae prevailed. In addition, information on main dispersal syndromes revealed the prevalence of anemochory, autochory and zoochory in these plants (Table 1). All occurrence records, before and after data cleaning, can be found in Appendix S1 in the Supplementary Materials.

Present and past distribution ranges
Almost all species (78 of 81) yielded good models, with AUC > 0.7, and were considered for downstream analyses. The projected species distributions show a consistently disjunct pattern for all study species, although for the Pliocene the suitable area seems to enlarge towards connecting SDTF blocks in the Caatinga and Central-western South America (Fig. 2). All plots and binary model outputs, as well as information on thresholds and AUCs for all species, can be found in Supplementary Materials (Appendix S2 and S3). Similar scenarios were found when comparing CCSM to the alternative general circulation models, although results varied in some cases, especially for the oldest time slices (see Table S2 in Supplementary Materials).
The amount of suitable habitat differed among species, and the area of each species varied over time (Fig. 3). When considering area variation over time (Fig. 3a), variation is observed in almost all time slices, with species both contracting and expanding their area (Fig. 3b). A notable exception is the comparison between the Holocene and the LGM, where most species expanded their area (Fig. 3b).

Discussion
We found species occurrences that are congruent with the predicted SDTF disjunct blocks described in classical works (Prado and Gibbs 1993, Pennington et al. 2000), in trees, shrubs, and herbs (Table 1). Model projections, especially for past time slices, are to some extent congruent with the idea of Prado & Gibbs (1993) that a more widespread tropical dry forest once existed. However, the modelled species ranges under past climatic conditions varied among species, as did species responses (range expansion or contraction) to past climatic changes (see Fig. 4 for examples of this variation in four selected species). This suggests that species responded differently to paleoclimatic conditions, which would in turn contradict the idea of a past widespread distribution of SDTF species and a related relict nature of present-day peri-Amazonian disjunctions.
Frontiers of Biogeography 2021, 13.1, e49882 © the authors, CC-BY 4.0 license

Table 1. Selected species disjunct in SDTF fragments of South America based on BFG and GBIF. The presence of the species is according to the fragments presented in Fig. 1: (1) "Andean/Caribbean", (2) Central-western South America, and (3) Caatinga. Lifeform legends: S = shrubs/sub-shrubs; H = herbs; T = trees; L = lianas. Main dispersal syndromes were filled and categorized according to an extensive survey of the literature, cited in the "Reference" column. Categories are: An = anemochory; ZoB = zoochory (bird-mediated); ZoO = zoochory (other animals and epizoochory); Hy = hydrochory; Au = autochory; Ba = ballistic dispersal; ND = no data found for the species or the genus to which the species belongs. It should be noted that, in some cases, what is presented in this Table may differ from what is currently reported in the BFG, as we used the version of BFG from June 2020 (BFG 2020; see also Table S1 in Supplementary Material), when the database was still under construction or not updated.

Our two-level approach combining different types of distribution data from two up-to-date databases reinforces the support for such a pattern, described originally based only on woody plants from point occurrence records, which can be spatially biased for less collected areas. Hence, by using a collaboratively built dataset such as the BFG, we could recover part of SDTF diversity that overcomes previous functional limitations and demonstrates peri-Amazonian disjunctions not only for trees, but also for shrubs and herbs. We followed a conservative approach in identifying species from SDTFs, with the aim of identifying only unambiguous peri-Amazonian disjunctions. However, we consequently excluded some species iconic to SDTFs, for instance Anadenanthera colubrina (Vell.) Brenan (Leguminosae) (Dryflor et al. 2016). Anadenanthera colubrina was not selected because BFG did not explicitly consider it present in "Seasonally Deciduous Forests" at the time of our search, although it actually is (Pennington et al. 2000, Dryflor et al. 2016). This example illustrates that the number we present is likely a conservative estimate of the number of plant species with a peri-Amazonian disjunction, and we hope that this list may initiate the discovery of additional species with a similar pattern.
Still, seminal studies that have assessed disjunct patterns in SDTF blocks (Prado and Gibbs 1993, Pennington et al. 2000, Collevatti et al. 2012) have traditionally focused much more on the woody component (especially trees) than on the herbaceous stratum. Although trees are important components of SDTFs, they are embedded in a context of open seasonally dry formations, where the shrubby-herbaceous stratum prevails. Moreover, life history and distribution can differ significantly among lifeforms (Ehrendorfer 1970, Petit and Hampe 2006, Smith and Beaulieu 2009), and all strata should be considered when analyzing different habitats.

The role of long-distance-dispersal
Peri-Amazonian disjunct woody species described in previous works and identified in this study share attributes that can enhance long-distance dispersal events. For instance, some species have either winged fruits or seeds which are dispersed by birds, e.g., Anacardiaceae-Astronium, Loxopterygium spp., Schinopsis (Griz and Machado 2001, Leite 2002, Burnham & Carranco 2004, Villaseñor-Sánchez et al. 2010), Apocynaceae-Aspidosperma (Griz and Machado 2001, Vieira et al. 2008), Bignoniaceae-Bignonia (Vieira et al. 2008), Leguminosae-Amburana in our list, but also, Rhamnaceae-Crumenaria in our list, but also Zizyphus (Griz and Machado 2001, Alves 2008). While these species suggest some role of long-distance dispersal, a rigorous comparison using a null distribution based on a random selection of species from SDTFs will yield more robust results once distribution data for more species are available. Furthermore, such future work may include a comparison of the results of this study with the distribution and dispersal mode of the 2,000 non-endemic SDTF species. The abovementioned traits are likely related to increased long-distance dispersal abilities, and such dispersal might have occurred recently, as disjunct taxa have no apparent morphological differences at species level (Fryxell 1967). In addition, these species can have traits that facilitate indirect dispersal modes such as epizoochory, which is the case of, e.g., shrub-herbaceous Malvaceae (Colli-Silva & Pirani 2020) that we recovered in this study. High dispersal capacity might be an essential trait for species of open seasonally dry formations, since these areas are often themselves a mosaic of different vegetation types, including deciduous forests (where most disjunct species seem to prevail), but also savannas and grasslands (e.g., Prado and Gibbs 1993, Prado 2000, Whitlock et al. 2011).
Minor patches of azonal vegetation within forest biomes (e.g., the well-known open savanna patches within Amazonian rainforest) may act as stepping-stones, reducing the dispersal distance among major parts of open/seasonally dry formations (Olmstead 2012).

Responses to Pleistocene climatic fluctuations
Forests may have played an important role as environmental barriers, keeping species from open/dry habitats fragmented at least since the LGM (Costa et al. 2017). An analogous situation can be traced with the dynamics of South American tropical rainforests. Current disjunctions of related lineages (i.e., one from the Atlantic Forest and another from Amazonia) suggest at least two main past corridors through these forests (Bigarella et al. 1975, Costa 2003, Thode et al. 2019). Quaternary dynamics of forests might have led to the observed disjunctions of forest species, separating related lineages (Ledo and Colli 2017, Thode et al. 2019). For plant species in SDTFs, disjunction would have been more recent than the disruption of South American rainforests (Werneck 2011, Costa et al. 2017), and mounting evidence indicates that evolutionary lineages of open seasonally dry environments, including SDTFs, have rapidly diversified since the Pliocene (e.g., Simon et al. 2009, Werneck 2011, Hughes et al. 2013, Vasconcelos et al. 2020). In our study, although we extrapolate to earlier time slices such as the Pliocene, when some species may not yet have existed, our results of a varying distribution of SDTF species under past climatic conditions and a varying response to past climate change contradict the idea of a past widespread distribution of SDTF species and of peri-Amazonian distributions as relictual.
See Appendix S2 and S3 in Supplementary Materials for files for other models and for each species.
Furthermore, our results agree with palynological evidence in not corroborating a corridor of open seasonally dry vegetation in the middle of the Amazonian Basin (Quijada-Mascareñas et al. 2007). Such a corridor is unlikely for SDTFs, as they do not spread through Amazonian rainforests because of the unsuitable edaphic conditions, which in turn would favor species from savannas with similar soil conditions (Pennington et al. 2000, Ferreira-Nunes et al. 2013, Bueno et al. 2016). Conversely, the apparent tendency in species range expansions observed from the Holocene to the LGM is related to a shift of forest distributions in South America. As observed by Allen et al. (2020), areas occupied by evergreen forests were instead occupied by open seasonally dry vegetation to different extents, suggesting expansion at least in some places where SDTF blocks are found.
Phylogeographic methods using the genetic population structure of SDTF (disjunct) lineages, in association with lifeforms and dispersal modes, would certainly help to elucidate past distribution and dispersal scenarios and the evolutionary history of these lineages. However, such studies are scarce for SDTF species and, for plants, inconclusive. Naciri et al. (2006), for instance, observed different differentiation events when comparing the population structure of two equally disjunct species in seasonally dry tropical forests, but the question of whether the main drivers of disjunctions are long-distance dispersal events rather than relics of past connected areas remained open.

Supplementary Materials
The following materials are available as part of the online article at https://escholarship.org/uc/fb:
Appendix S1. Occurrence records downloaded from GBIF and after each data cleaning step. Also including all cited occurrence data downloads.
Appendix S2. Model performance metrics for SDM models run for all species, including AUC and other threshold metrics for generating binary models.
Appendix S3. SDM outputs for all species and models, including raster files.
Table S1. Full list of species distribution from South America SDTFs, including selected species for modeling and selection criteria.
Table S2. Full list of the areas of binary raster files from SDM models for all selected species and time frames.