Frontiers of Biogeography Lineages through space and time plots: Visualising spatial and temporal changes in diversity

. During the radiation of a clade, diversification rates can show temporal patterns such as a speedup or slowdown, which might relate to different ecological and evolutionary mechanisms. The temporal dynamics of diversification of whole clades are often visualised as a lineage-through-time (LTT) plot, which traces the number of reconstructed lineages at different time points. However, clades do not radiate evenly across space and may show different temporal dynamics in different regions. As such, a biogeographic approach is required to more completely understand temporal diversification dynamics. Here, I present a tool to extract temporal diversity information across different biogeographic regions from the output of commonly used ancestral range estimation models implemented in the R package BioGeoBEARS. The lineages through space and time (LTST) plot allows for visualisation of diversification dynamics in different regions, formatted in an accessible way which can be used for further quantitative analysis.


Introduction
The temporal dynamics of the build-up of diversity in different regions is key to understanding the emergence of biogeographic patterns in diversity. A common method to explore temporal patterns of diversification within whole clades is to plot the logarithm of the reconstructed number of lineages in different time-slices drawn through a time-calibrated molecular phylogeny; a lineages-through-time plot (LTT; Figure 1A; Nee et al. 1992;Harvey et al. 1994). LTTs can be used to visually explore when diversification is deviating from a straight line as expected under a constant rates model, for example as the net diversification rate of a clade slows down during the late stages of an adaptive radiation (Nee et al. 1992;Phillimore and Price 2008). Temporal diversification information can be analysed quantitatively to see when diversification deviates from the expectation of constant rates (e.g., Pybus and Harvey 2000), and the shape of LTT curves have recently been shown to be a useful summary metric for diversification diagnostics (Janzen et al. 2015). LTTs have been used extensively in evolutionary biology, forming a standard visualisation procedure in the toolbox of evolutionary biologists. However, diversification dynamics are not spatially homogenous and within a single clade diversity may accumulate more or less rapidly in different regions. To visualise the spatial context of diversification through time first requires an estimate of the geographic distribution occupied by each reconstructed lineage in the phylogeny.
A suite of ancestral range estimation (ARE) methods are available to infer when in history lineages have occupied different regions based on models of geographic range evolution (Ronquist 1997;Ree and Smith 2008;Landis et al. 2013;Matzke 2013a). These methods model range dynamics by taking geographic range information, as measured by the present-day occupancy of species in discrete regions (often coded broadly at the scale of ecoregions, biomes, or biogeographic realms), as well as a time-calibrated phylogeny, and fitting models of geographic range evolution to these data. Different implementations of AREs model range shifts via dispersal between regions, speciation amongst or between regions, or extirpation from a region. ARE models can be constrained to account for the effect of geographic or environmental distance on dispersal (Van Dam and Matzke 2016), trait-based dispersal capacity of species (Klaus and Matzke 2019), historical information on geographic ranges from the fossil record (Matzke 2013a), or can be time-stratified to account for the ontogeny of different regions, such as the formation of islands or rise of mountains (Matzke 2014).
Given this flexibility and suitability of ARE models to a range of diverse systems, the application of Frontiers of Biogeography 2019, 11.2, e42954 © the authors, CC-BY 4.0 license 2 ARE methods is prolific in the field of biogeography and many different models, including variants of the dispersal-extinction-cladogenesis model (DEC; Ree and Smith 2008), dispersal-vicariance-analysis (DIVA; Ronquist 1997), and BayArea model (Landis et al. 2013), are commonly implemented in the R package BioGeoBEARS (Matzke 2013b(Matzke , 2014. However, despite the advancement of these analytical methods for ARE, methods to visualise the temporal dynamics of range evolution and diversification tends to be limited to painting the branches of a phylogeny based on the inferred ancestral condition according to a model of range evolution (e.g., Figure 2 in Matzke 2014). This limitation generally makes it difficult to extract the temporal information contained in the output of these models. In this paper, I present a method for post-hoc data extraction and visualisation of temporal species diversity information contained in the output of a BioGeoBEARS analysis. These lineage-through-space-and-time (LTST) plots form a bridge between clade-level LTTs and infra-clade regional diversity dynamics.

Data collection and ancestral range estimation background
The method for generating a LTST is presented in the R package ltstR 1 and described below using a global radiation of agamid dragons (family Agamidae) as an example. The agamid dragons are a large clade of over 400 species, which are broadly distributed through much of the Australasian, Indomalayan, Afrotropical, and Palearctic biogeographical realms ( Figure 1C). This radiation is over 90 million years old and appears to be showing a slowdown in diversification towards the present ( Figure 1A), which is confirmed by estimating 1 Available at github.com/alexskeels/ltstR the γ statistic on this group (γ = -2.624; Pybus and Harvey 2000). The following analysis was performed on a randomly selected phylogeny from the post-burn-in posterior distribution of phylogenies from Tonini et al. (2016; Figure 1B) and spatial information obtained from Roll et al. (2017; Figure 1C).
The first step towards generating a LTST is to perform an ARE analysis using BioGeoBEARS. BioGeoBEARS implements user-specified ARE models from which the probability that each node and internal branch in a phylogeny occupies a given state (geographic range) can be estimated. Biogeographic stochastic mapping (Dupin et al. 2016) uses stochastic simulations based on the phylogeny and the specified ARE model and parameters to assign a state to internal branches and nodes. Because they are probabilistic, stochastic maps will tend to differ from one another and account for alternative biogeographic histories based on the ARE model and parameters. As such, LTSTs require at least a single stochastic map and, to account for biogeographic uncertainty LTSTs, can be produced using a distribution of stochastic maps. Phylogenetic uncertainty is another source of variation for LTSTs as ARE is typically performed on a representative phylogeny such as a maximum clade credibility phylogeny from a Bayesian posterior. Phylogenetic uncertainty can be accounted for by replicating ARE estimation across, for example, alternative phylogenetic trees from a Bayesian posterior. However, replicating AREs is likely to be computationally prohibitive for many large data sets. Therefore, it is advised that users acknowledge this source of uncertainty when interpreting LTSTs. For the agamid dragons, I scored the biogeography of species by categorising them as present or absent from the major biogeographical realms of the world. The agamids were present in four of these, the Afrotropical, Indomalayan, Australasian, and Palearctic realms, which we used to fit a DEC model to illustrate the application of LTST. From the maximum likelihood estimates of the DEC model parameters, we obtained 50 biogeographic stochastic maps from which we generated LTST plots (Figure 2).
LTSTs are generated with the R package ltstR 1 or with R scripts found in the Supplementary Appendix S1 and requires four objects: 1) a time-calibrated phylogeny, 2) a BioGeoBEARS geography file, and two objects from the output of biogeographic stochastic mapping under an ARE model with BioGeoBEARS; the 3) anagenetic events data table, and 4) cladogenetic events data table. The first two objects are required input for a BioGeoBEARS analysis, and the last two are the standard output from biogeographic stochastic mapping with BioGeoBEARS. Users having performed biogeographic stochastic mapping with BioGeoBEARS will have all the necessary components.
Step-by-step guide to producing LTSTs The following outlines a step-by-step guide to producing LTST plots. A vignette and associated data to replicate this analysis for the agamid as well as further details on the R code can be found within the Supplementary Appendix S1 or on GitHub 1 .
1) The method works by first going through the cladogenetic and anagenetic event tables and extracting all the times in which a state change occurs on the phylogeny for a given stochastic map (getEventTiming function). The output from this function is the events timing table, a data frame which gives the time of each event, the node at which the event is located on the phylogeny, whether the event was cladogenetic (occurring at a speciation event, at a node) or anagenetic (occurring in between speciation events, along a branch), the number of species extant at that time, and a running total of the number of species present in each unique state combination. From the events timing table we can infer whether the diversity of a region increases through immigration or speciation or decreases through emigration or extirpation, to keep a record of diversity changes through time.
2) Lineages can occupy multiple states (be present in multiple regions simultaneously), and BioGeoBEARS gives each state combination a unique identifier. For example, in the agamid lizards, a lineage that occupies Australasia is given state 2 while a lineage that occupies Australasia and Indomalaya simultaneously is given state 8. An intermediate step is required which matches the state combination identifier to the actual geographic ranges. The function getRangeStates generates a lookup table (Table 1) to perform this matching.
3) Next, to obtain the LTST data for each geographic region separately we employ the functions getLTSTDataTable, for a single stochastic map, or getLTSTFromStochasticMaps, for multiple stochastic maps. These functions take the events timing table from getEventTiming and the lookup table from getRangeStates to produce the LTST data table, a data frame (or list of data frames for multiple stochastic maps) which contain the number of species present in each geographic range at each event time (Table 2). At this stage the LTST data table for a single stochastic map can be plotted (see vignette in the Supplementary Appendix S1 or on GitHub 1 ; Figure 2A). However, comparing LTSTs across multiple stochastic maps requires further processing because the timing and number of events will be different between iterations of the stochastic mapping and therefore not align perfectly.

4)
To account for the discrepancy in event timing between stochastic maps, the function slidingWindowForStochasticMaps performs a sliding window analysis to get the number of species if the phylogeny is measured in units of million years, then 0.1 represents 100,000 years. If the depth of the phylogeny is, for example, 30 million years, then this may be a fine enough resolution to assess meaningful patterns, but if the depth of the phylogeny is only 1 million years, then it may make sense to narrow the window size (e.g., 0.01).
The output is a time-standardised LTST data table.
5) To visualise the LTSTs from multiple stochastic maps we then summarise the variability of species diversity values for each region at each time point in the list of time-standardised LTST data tables. The function timeRangeAcrossStochasticMaps finds the upper (97.5%) and lower (2.5%) quantiles for each region at each time step and returns this in the LTST quantile data table which can be used to plot the distribution of diversity values for a given LTST (see vignette in the Supplementary Appendix S1 or on GitHub 1 ; Figure 2B).
Importantly, not only does this method provide a tool for plotting this biogeographic, temporal-diversification information but it also returns it in a format that is user friendly, making quantitative analysis of these data more accessible to many users. For example, logistic growth models can be fit to the temporal diversity data contained within the LTST data tables to estimate the diversity asymptote, which might be interpreted as the regional diversity carrying capacities (Skeels and Cardillo 2019).

Benefits of a biogeographical approach to temporal lineage diversification dynamics
LTSTs and a biogeographical approach to investigating the temporal dynamics of diversification are important to understand the evolution of biodiversity. Diversity is not evenly distributed, with some regions being more diverse than others, for example, the Indomalayan realm contains a greater amount of agamid diversity (Figure 2). This uneven distribution of diversity may be because clades have been, 1) diversifying more rapidly, 2) diversifying for longer, or 3) have a greater carrying capacity in particular regions. LTSTs allow us to more easily visualise these kinds of spatiotemporal dynamics of diversification within clades. In the agamids we can see that diversification appears to have been occurring much earlier in the Indomalayan region, followed by the Australasian, Afrotropical, and Palearctic regions. We can also see that diversity began to build in the Palearctic, Afrotropics and Australasian regions at a similar time, but different temporal dynamics have led to stark differences in diversity between these regions. Notably, we can see that the accumulation of diversity was more rapid in Australasia during the period of between roughly 60-15myr, before slowing towards the present. Palearctic diversity, on the other hand, has been increasing more steadily in the past 20myr and at some point around 10myr overtook Australasia in species diversity. Comparatively, the Afrotropics have had a lower rate of species accumulation. Investigating regional patterns in diversification may also help clarify processes that are lost when looking at clade-level temporal diversification dynamics. For example, in the Agamidae as a whole we see a diversification slowdown towards the present ( Figure 1A). This pattern appears to be repeated in differing degrees within the different biogeographic realms. In the Palearctic, however, we can see that diversification appears to be still increasing steadily or only showing the earliest beginnings of a slowdown ( Figure 2B), suggesting ongoing steady diversification from a relatively late-start in this region. This interesting aspect of agamid diversification is overlooked without an explicitly biogeographical approach.

Conclusion
Lineage-through-space-and-time plots contain information that is useful for interpreting how diversity has arisen in different biogeographic regions. Clade-level temporal diversification dynamics such as understood using the classic lineages through time plot may miss key dynamics that occur within and between different regions. Differences in the tempo of lineage accumulation in different regions is particularly obvious in the case of agamid lizards but is likely to be true in many systems across different spatial and phylogenetic scales. LTSTs offer a visualisation tool for post-hoc exploration of these dynamics, and the method presented here also allows users to extract this temporal diversity data which can be used for novel applications (e.g., Skeels and Cardillo 2019).