Frontiers of Biogeography
On dendrograms, ordinations and functional spaces: methodological choices or pitfalls?

A number of concerns persist regarding (i) how functional spaces should be quantified, (ii) how phylogenetic richness should be calculated, and (iii) how functional beta diversity should be calculated. Because all current methods have their shortcomings, we think that analytical choices are largely a matter of knowing the limitations of the data and the working hypothesis. Only then can one make a personal choice, weighing up the shortcomings of different methods that, at the end of the day, usually produce qualitatively similar results.

We recently demonstrated that, on islands, introduced bird species do not contribute to functional and phylogenetic diversity in a way equivalent to extinct bird species (Sobral et al. 2016). Therefore, the balance of extinction and colonization is not a zero-sum game, and island biotas composed of introduced species will neither fulfill the ecological roles nor represent the evolutionary histories of pre-disturbance communities. We also highlighted the importance of evaluating changes in alpha and beta diversity concurrently in order to identify "true" compensation scenarios (Figure 1 in Sobral et al. 2016). Despite corroborating our findings, Villeger et al. (2017) argued that Sobral et al. (2016) was beset by methodological pitfalls. Their chief concerns revolved around: (i) how functional space should be quantified, (ii) how phylogenetic richness should be calculated, and (iii) how functional beta diversity should be calculated. These questions reflect a wider lack of consensus on how these metrics should be generated (Pavoine and Bonsall 2011), and herein we use the arguments of Villeger et al. (2017) as a convenient opportunity to discuss divergent opinions on analytical best practice.
Most authors studying functional diversity consider only those species that belong to the regional pool in functional space (e.g., Boersma et al. 2016, de Bello et al. 2012). In Sobral et al. (2016) we used each island as a replicate and compared differences within each island for our past, native, and present scenarios. Thus, we did not compare between assemblages (islands) and, therefore, we did not compare different functional spaces. Because functional diversity (FD) studies at larger spatial scales are becoming more common (Cianciaruso 2011), there will often be several potential 'regional' pools to choose from, and the consequences of a poor definition of the resultant functional space need to be better discussed. According to Villeger et al. (2017), "when comparing functional diversity between assemblages, a single functional space should be computed". In Sobral et al. (2016) we considered 32 islands across the globe where 92% of the species occur on just one or two islands. We therefore find it odd to group together, in a single functional space, species that have never coexisted. Overestimating the species pool in this way, by including species that do not belong to the regional pool, can produce serious biases in analyses of functional diversity (de Bello et al. 2012). Indeed, calculating multiple functional spaces can result in different distances between the same species (Figure 1a in Villeger et al. 2017, but see Figure 1 below). In our opinion, this seems to be a case of disagreeing on how things should be done, rather than a "right or wrong" issue. We are not alone in disagreeing with Villeger et al. (2017) on how functional spaces should be quantified (see e.g. Podani and Schmera 2006, Poos et al. 2009, Martins et al. 2012).
While Villeger et al. (2017) are concerned that "distances between the same species varies between spaces", we are more concerned with the fact that including species that never coexist in a single functional space will alter the distances among all species (including those that do coexist, Figure 1). In our opinion, such distances are not meaningful, especially for the type of questions that we asked in Sobral et al. (2016).
Calculating a single functional matrix including all species found in all study regions, or even in a grid at the global scale, is the default choice in large-scale studies. In fact, some of us have already used this technique (see Safi et al. 2011 for a global example with terrestrial mammals). However, as discussed above, this can result in species that never coexist affecting the ecological distances of species that do coexist (Figure 1). Considering that functional spaces are to some extent a tentative construction of the Eltonian niche space, does it make any sense to let a kangaroo modify the niche overlap between a tapir and a jaguar? Conceptually, at least, that does not make any biological sense. That said, there might be a good statistical reason why one would want to standardize non-related assemblages to a global functional pool, for example if one wants to investigate the relationship between island features (size, habitat number, elevation etc.) and functional diversity. In that case, FD values calculated independently for each island's pool are not directly comparable in the same way that island size is. This is because the functional space can use completely different units along different axes for each island (see de Bello et al. 2012 for the only exercise on this matter published so far). A proper definition of the "functional pool" is indeed difficult, and both the statistical and biological consequences of including species that never coexist in a single functional space should be better explored in future studies.
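The effect of pool definition on pairwise distances can be illustrated with a minimal sketch. It assumes a Gower-style distance for quantitative traits, in which each trait difference is divided by the range of that trait in the pool; this standardization is an assumption made for illustration, not a description of the procedure used in Sobral et al. (2016) or Villeger et al. (2017). The species names (`A`, `B`, `C`, `X`), the trait values, and the function name `gower_distances` are all hypothetical.

```python
def gower_distances(traits):
    """Pairwise Gower-style distances for quantitative traits.

    `traits` maps a species name to a tuple of trait values. Each trait
    difference is divided by the range of that trait within the pool, so
    the result depends on which species are included in the pool.
    """
    names = sorted(traits)
    n_traits = len(traits[names[0]])
    # Range of every trait across the whole pool (hypothetical data).
    ranges = [max(traits[s][k] for s in names) - min(traits[s][k] for s in names)
              for k in range(n_traits)]
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            parts = [abs(traits[a][k] - traits[b][k]) / ranges[k]
                     for k in range(n_traits) if ranges[k] > 0]
            dist[(a, b)] = sum(parts) / len(parts)
    return dist


# Three coexisting species (hypothetical values for two traits).
local_pool = {"A": (1.0, 2.0), "B": (3.0, 4.0), "C": (5.0, 2.0)}
# Same species plus one that never coexists with them and is extreme in trait 1.
global_pool = dict(local_pool, X=(21.0, 3.0))

local = gower_distances(local_pool)
pooled = gower_distances(global_pool)
for pair in sorted(local):
    print(pair, round(local[pair], 3), round(pooled[pair], 3))
```

With these made-up values, the A-B distance changes from 0.75 to 0.55 while the A-C distance changes from 0.5 to 0.1, so the added species does not simply rescale all distances by a common factor: it changes the relative distances among the species that do coexist.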
Figure 1 (caption). Notice how the inclusion of a single species that does not belong to the regional pool alters all the pairwise distances among species that do coexist. Solid lines represent deviations from the 1:1 relationship (red dashed line). According to Villeger et al. (2017), it is not an issue to let a species that never coexisted (nor will do) with the native fauna of a given region alter the ecological distances among these species. Defining an ecologically relevant distance is a matter for debate, but a good starting point would be to present a logical framework that justifies the inclusion of all possible species in a given functional space, without any biological criteria, in order to build the "correct" functional space. A potential issue with this is the implication that all functional spaces ever built (and to be built) are wrong because, by definition, it is impossible to include all species (no matter which scale or taxonomic group one studies).
Villeger et al. (2017) suggested that the correct way to calculate phylogenetic richness is with Faith's phylogenetic diversity (PD). However, they ignore the widely acknowledged problem that PD is biased by species richness: the higher the number of species, the greater the likelihood of having a higher PD (Pavoine and Bonsall 2011, Chao et al. 2015). Thus, it is not surprising that they found lower PD with species extinctions (fewer species) and higher PD with species introductions (more species). Ignoring the fact that some PD and FD metrics are strongly correlated with species richness may lead one to interpret phylogenetic or functional patterns that are not independent of species richness patterns (Figure 2). Nevertheless, when one is interested in evaluating whether the observed phylogenetic diversity of an assemblage is lower than, higher than, or equal to a random expectation, Faith's PD is a good choice (see Miller et al. 2016). In such cases, the use of appropriate null models removes the effect of species richness (but see Sandel [in press] for some good criticism of that assumption).
Finally, according to Villeger et al. (2017), "there is no rationale behind representing functional relatedness between species using a dendrogram". This, they maintain, is because dendrograms are less related to the original distance matrix than ordinations (Maire et al. 2015). This is not new, but the important question is: how much less? Claiming that our dendrogram is poor because it has an absolute deviation of > 10% from the original distance matrix, and that it therefore should not be used, is similar to saying that a linear model with r² = 0.90 is poor. In fact, in a large number of simulations Maire et al. (2015) found absolute deviations of around 5%, on average. Villeger et al. (2017) found no qualitative difference between using ordinations and dendrograms to calculate functional beta diversity (neither did Weinstein et al. 2014). The preferred method of Villeger et al. (2017) to calculate functional beta diversity, which is based on ordinations, is however not problem-free: convex hulls are sensitive to extreme points (a few species that are very functionally distinct may dramatically affect the functional volume), and they also ignore holes or disjunctions in functional space (see Podani 2009, Blonder et al. 2014).
Thus, when used to calculate beta diversity, convex hulls may also superimpose regions that are not occupied by any species (empty spaces). Finally, convex hulls cannot be calculated when there are too few species in a given assemblage. This was the case for Sobral et al. (2016) because, in comparing extinct, native and introduced species on several islands, the number of extinct or introduced species is often quite low. As a consequence of these low sample sizes, we would have had to exclude several islands in order to use convex hulls.
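Two of the convex hull limitations mentioned above, sensitivity to extreme points and the need for a minimum number of species, can be seen in a small two-dimensional sketch. It uses a generic hull-area routine (Andrew's monotone chain followed by the shoelace formula), not the implementation used by Villeger et al. (2017), and all coordinates are hypothetical positions of species on two functional axes. In two dimensions at least three non-collinear species are needed to enclose any area; in a space with T axes the corresponding minimum is T + 1 species.

```python
def convex_hull_area(points):
    """Area of the 2-D convex hull of `points`.

    Uses Andrew's monotone chain to find the hull vertices and the
    shoelace formula for the area. Returns 0.0 when fewer than three
    distinct points are supplied, because no area can be enclosed.
    """
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        hull = []
        for p in seq:
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull[:-1]  # last point is the first point of the other half

    hull = half_hull(pts) + half_hull(reversed(pts))
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % len(hull)][1]
        - hull[(i + 1) % len(hull)][0] * hull[i][1]
        for i in range(len(hull))
    )
    return abs(twice_area) / 2.0


# Hypothetical assemblage of five species on two functional axes.
assemblage = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
# Same assemblage plus one functionally extreme species.
with_outlier = assemblage + [(5.0, 5.0)]
# An assemblage with only two species, e.g. the few extinct species of an island.
two_species = [(0.0, 0.0), (1.0, 1.0)]

area_base = convex_hull_area(assemblage)
area_outlier = convex_hull_area(with_outlier)
area_two = convex_hull_area(two_species)
print(area_base, area_outlier, area_two)
```

In this made-up example a single extreme species increases the hull area fivefold, and most of the added area contains no species at all, while the two-species assemblage has no measurable hull.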
Our more general point is that all current methods have their shortcomings, and the analytical choice should first be appropriate to the question and only then a matter of personal choice between equally appropriate options. In so far as we can see, and as demonstrated by Weinstein et al. (2014) and Villeger et al. (2017), there is no qualitative difference between using ordinations and using dendrograms to calculate functional beta diversity.