Frontiers of Biogeography Cryptic diversity of Rhinolophus lepidus in South Asia and differentiation across a biogeographic barrier

volant mammals across the Palghat Gap. Abstract Peninsular India is an important region for mammalian diversification and harbors major biogeographic barriers. However, little is known about the role of this region in the diversification of bats though it harbors high chiropteran diversity. In this study, we used phenotypic, acoustic, and genetic markers to assess the diversification of Rhinolophus lepidus bats in South Asia. We first investigated if peninsular India is associated with speciation of R. lepidus . Further, we tested if the Palghat Gap acts as a biogeographic barrier to gene flow in this species. Our results revealed cryptic genetic diversity in peninsular India suggesting that this region holds at least one endemic species level lineage of the R. lepidus species complex. Analyses of populations of R. lepidus across the Palghat Gap in the Western Ghats revealed clinal variation in phenotype, with bats south of this barrier being bigger and emitting echolocation calls of higher frequency. We also observed that populations on either side of the Palghat Gap have


Introduction
South Asia encompasses several distinct biogeographic zones and biodiversity hotspots (Myers et al. 2000, Tamma et al. 2016. While much of the mammalian diversity is concentrated in the hotspots of northeastern India and the Western Ghats, mammalian endemism is largely concentrated in peninsular India, specifically the Western Ghats (Tamma et al. 2016). Although bats follow these general mammalian trends, much of their endemism in peninsular India is currently considered to be at the subspecific level, and often as part of large, polytypic super-species composites that may contain high degrees of cryptic species-level diversity (Bates and Harrison 1997, Chattopadhyay et al. 2012, Chattopadhyay et al. 2016. In this context, Rhinolophus lepidus presents an interesting case. It harbors considerable cryptic diversity, and species-level lineages across its range in South and Southeast Asia have been discovered recently (Csorba et al. 2003, Soisook et al. 2016. In the discussion of its taxonomy, we follow the treatment of Csorba et al. (2003).
Frontiers of Biogeography 2021, 13.4, e49625 © the authors, CC-BY 4.0 license 2 Peninsular India is known to harbor at least two taxa of small horseshoe bats, R. pusillus gracilis and R. lepidus lepidus (Bates and Harrison 1997), which are sympatric in the mountain ranges of the southern Western Ghats, often co-inhabiting the same cave and can be morphologically difficult to differentiate (Bates and Harrison 1997). In general, in the Western Ghats smaller bats (forearm length: 34.9-37.8mm) are identified as R. pusillus gracilis and larger specimens (forearm length: 37.0-41.8mm) are identified as R. lepidus lepidus, although intermediate phenotypes are encountered frequently (Bates and Harrison 1997). Genetic data from other regions, e.g. Indochina (Soisook et al. 2016), suggest that the R. lepidus complex may consist of multiple species-level lineages with overlapping morphology and echolocation frequencies. In light of these findings, it becomes important to study the evolutionary dynamics of the lineages of these bats in peninsular India, assess their affinities with Southeast Asian lineages, and address differentiation in the Western Ghats.
In the present study, we aimed to study the evolution of R. lepidus from southern India using bioacoustic, morphological, phylogenetic, and population genetic methods. R. lepidus is a forest species, generally associated with hilly areas in peninsular India Harrison 1997, Csorba et al. 2003). We sampled R. lepidus bats across the Palghat Gap (Figure 1a), a major biogeographic barrier (Robin et al. 2015) in the southern Western Ghats mountain range to assess phenotypic and genetic differentiation and to reconstruct their phylogeny. Palghat Gap is a ~500 million year old, low elevation area (average elevation ~140m, ~40km width at 11 o N) in the Western Ghats (Robin et al. 2010). It is a known biogeographic barrier for many species (Robin et al. 2010, Vidya et al. 2014, Robin et al. 2015, Vijayakumar et al. 2016, Joshi et al. 2018).

Sampling
We concentrated our sampling efforts on the Western Ghats mountain range, sampling three sites, one south of the Palghat Gap (Kalakkad Mundanthurai Tiger Reserve, henceforth KMTR: 8.53N, 77.45E) and the other two north of the Palghat Gap (Gudalur: 11.47N, 76.42E and Karkala: 13.07N, 74.99E) (Figure 1a). We captured bats in mist nets (Avinet, USA) close to their day roosts either during emergence (Karkala) or in foraging areas in KMTR and Gudalur. We determined their age (juvenile or adult by looking at the epiphyseal gap) (Brunet-Rossinni and Wilkinson 2009) and sex by visual observation. For each individual captured, we weighed body mass using a spring balance (Pesola, Switzerland) and measured length of forearm, ear and tibia using a dial caliper (Avinet, USA) (Table 1). Handling time was minimal and bats were released within 15 minutes of capture. We captured 27 bats that were identified as R. lepidus lepidus in the field based on external morphology, specifically the  (Table 1) following Bates and Harrison (1997). For genetic sampling, we obtained 4 mm of tissue from biopsy punches from the plagiopatagia and stored them in absolute ethanol. We recorded vocalizations during rest (handheld or hanging stationary after release) as well as in flight (post-release) using a Petersson 1000X Bat Detector (Pettersson Elektronik, Sweden). Resting echolocation calls (Figure 1b and S1) were analyzed following Chattopadhyay et al. (2012). In brief, we measured the frequency of highest intensity (principal frequency) using a sampling rate of 500 kHz in BatSound 4.03 (Pettersson Elektronik, Sweden) with 90% overlap in the Hanning window. For each bat we measured the principal frequency across ten calls with the best available signal to noise ratio. In order to assess the phenotypic variation across samples, we initially visualized the spread of each morphological variable as well as that of the principal frequency alongside measurements of the same parameters from other lineages of closely related Asian horseshoe bats (Csorba et al.2003, Soisook et al. 2016 to understand the extent of character overlap. To better understand the morphological and bioacoustic diversity across the Western Ghats populations and possible sexual dimorphism, we performed two-way multivariate analysis of variance (MANOVA) using a combination of vocal data (principal frequency) and morphological data (forearm length and body mass) in R (R Core Team 2016). MANOVA requires the variable to be distributed normally. Hence, we performed a Shapiro-Wilk test in R, ensuring that morphological and principal frequency parameters were normally distributed (p value > 0.05). After MANOVA, we performed a post-hoc Tukey's test for variables with significant differences between groups. Measurements of ear length and tibia length were removed from analysis, as these were not recorded for the Gudalur population.  [TG]). These loci have been used widely in bats for both phylogenetic and population genetic analyses (Li et al. 2006, Stoffberg et al. 2010. Sequences of the primers used for each gene are provided in Table S1. We used Multiplex Mastermix (QIAGEN, Germany) for amplification of all genes (PCR conditions provided in Table S1) and ExoSap-IT (USB) to clean PCR products. Samples were sequenced on an ABI3100XL Genetic Analyzer. We edited the chromatograms manually using FinchTV (http://www.geospiza.com/Products/ finchtv.shtml) and aligned sequences in MEGA 7.0 ). We did not observe any stop codons in the mitochondrial gene sequences. The nuclear data were phased in PHASE (Stephens et al. 2001, Stephens andDonnelly 2003) as implemented in DnaSP v5 (Librado and Rozas 2009). All nuclear analyses were performed with phased data. We also calculated the number of variable and parsimony informative sites in MEGA.

Genetic distance
We computed both the net p-distance as well as the Kimura 2 Parameter distance, including sequences of various taxa closely related to Rhinolophus lepidus (Csorba et al. 2003, Soisook et al. 2016. For all lineages for which there are reported nuclear gene sequences on GenBank (Table S2), we also computed nuclear genetic distances at each locus and averaged across all loci.

Phylogenetic reconstruction
Different approaches can be used for multi-locus phylogenetic reconstructions, including concatenation of multiple loci into a supermatix as well as coalescent species tree approaches (Edwards 2009). However, Genbank sequences of the two nuclear introns were only available for R. monoceros (Stoffberg et al. 2010). In order to improve taxonomic coverage within our dataset we performed phylogenetic reconstructions using concatenation for the mtDNA dataset (given that cyt b and CR are in close linkage disequilibrium) and species tree reconstruction for the subset of species for which nuclear gene sequences were available. We used the species tree reconstruction as a topological guide for population genetic modelling and also to obtain divergence times for R. lepidus lepidus. For the mtDNA dataset, we performed two separate phylogenetic reconstructions, one with 639bp of cyt b and another with concatenated cyt b (1075bp) and CR (445bp) sequences using Bayesian inference as implemented in MrBayes 3.2.6 (Ronquist et al. 2012) with two independent runs each at 2,000,000 generations and four chains. We kept the swapping frequency and temperature at default and obtained the best substitution model for each locus from jModeltest 2.1.10 (Darriba et al. 2012) based on corrected AIC values (AICc) (see Table S3 for the best substitution model). Trees were sampled every 500 th generation and diagnostics recorded every 5,000 th generation, ensuring that the standard deviation of the split frequency had reached below 0.01 by the end of each run. We checked parameter convergence in Tracer v1.6 (Rambaut et al. 2014) and obtained consensus trees from MrBayes after discarding a 25% burnin. Phylograms were viewed in FigTree v1.4.2 (Rambaut 2015). Sequences were concatenated using Sequence Matrix v1.7.8 (Vaidya et al. 2011). In addition, we also constructed a Bayesian tree for the concatenated nuclear loci only using MrBayes. We used the abovementioned settings for the tree construction except for the run length, wherein for the nuclear tree the MCMC run was performed for 8,000,000 generations.
We further performed species tree reconstruction in *BEAST 1.8.3 (Drummond et al. 2012). Species trees account for reticulations resulting from incomplete lineage sorting and are thus known to be more powerful in reconstructing phylogenies from multiple loci (Edwards et al. 2007). We used a relaxed clock model (lognormal distribution) to obtain divergence times of various lineages present in our dataset. We used one calibration point of 39 million years for the Rhinolophus-Hipposideros divergence from the previous literature (Teeling et al. 2005) and performed 100 million iterations, sampling every 100 th iteration. Two independent runs were conducted with a 50% burnin and combined. For each run we checked for parameter convergence in Tracer, obtained a maximum clade credibility tree keeping median heights (at 50% burnin) in TreeAnnotator 1.8.3 and viewed the species tree in Figtree. For the combined run we performed a 10% burnin in TreeAnnotator. GenBank accession numbers for all sequences are provided in Table S2.

Modeling gene flow and vicariance
We performed coalescent simulations and used an Approximate Bayesian Computation (ABC) approach as previously implemented by Lopes et al. (2009), using both nuclear and mtDNA to model gene flow. We first confirmed the topology observed in the species tree reconstruction using both the ABC and phylogenetic approaches (supplementary material). Further, using this topology (Karkala and Gudalur are sister populations) we then tested five different models ( Figure 2) based on our understanding of population segregation within R. lepidus lepidus in the Western Ghats and performed one million simulations for each model, calculating thirty summary statistics in popABC for each simulation (Table S4). The five scenarios of gene flow and isolation simulated were, a) no gene flow model (model A); b) full migration model (model B); c) low migration model, a subset of the full migration model wherein we modeled lower gene flow from KMTR to Gudalur and Karkala (model C); d) partial migration model, providing for migration between Gudalur and Karkala populations only (model D); and e) divergence model, where Gudalur and Karkala recently diverged and KMTR diverged earlier from the ancestor of Gudalur and Karkala (model E) (Figure 2). The prior space for divergence time for model E was more restricted compared to other models. We restricted the divergence time between Gudalur and Karkala to less than or equal to 100 generations and for the ancestor of Gudalur, Karkala and KMTR to less than or equal to 30,000 generations. For the low migration model, KMTR contributes only 10% of the immigrants to Gudalur and Karkala (model C; Figure 2). We considered a generation time of one year for our simulations. We used a 0.01 tolerance level to determine the best model. We performed model selection using the multinomial logistic regression method in the abc package (Csilléry et al. 2012). Prior to model selection, we quantified the power of summary statistics to discriminate among models using both soft and hard classifications. We obtained posterior probabilities of models and calculated their Bayes factor to determine the best model (Csilléry et al. 2012). In addition, each model's goodness of fit was evaluated to test if the observed summary statistic falls within the prior distribution for the best model. We further performed five million simulations for the best model and used the neuralnet method to estimate parameters.

Sampling and genetic data
We captured a total of 27 individuals of R. lepidus ( Figure 1a) and generated 2.5kb of sequence data for each individual (cyt b: 1075bp, CR: 482bp, PRKC1: 340bp, TG: 664bp). All sequences generated in this study are available on GenBank (MT640147-MT640254, see Table S2). We did not detect any signal of recombination or deviation from neutrality within our sequence dataset from the Western Ghats (Table S5). The number of variable sites for each locus are provided in Table S6.

Vocal and morphological differentiation
A visual inspection of the morphological and vocal data revealed considerable overlap between R. lepidus and other closely related horseshoe bat lineages (Table 1). We observed that all three phenotypic variables (principal frequency, forearm length and body mass) were significantly different across populations of R. lepidus (approximate F: 5.83; p value: 0.000253). There was no sexual dimorphism within our dataset (approximate F: 0.92; p value: 0.45). Further, Tukey's test and boxplots of each variable revealed that the southernmost population (KMTR) of R. lepidus lepidus was distinct from the other populations based on body mass and peak frequency ( Figure 3). Overall, we observed a clinal (north to south) pattern in all three phenotypic variables wherein bats from the southernmost population (KMTR) were larger with higher peak frequencies (Figure 3a), bats from the northernmost population smallest with lowest peak frequencies and bats from Gudalur intermediate in size as well as peak frequency.

Phylogenetic reconstruction
All three Bayesian trees, the cyt b tree (Figure 4a), concatenated mtDNA tree (Figure 4b) and the concatenated nuclear tree ( Figure 5) revealed that the southern Indian samples of R. lepidus lepidus form a monophyletic group. For the cyt b tree with wider taxonomic coverage, posterior probabilities were generally lower (Figure 4a) than for the 2-locus tree (Figure 4b). On both mitochondrial trees, one individual from Karkala (RRMK1, indicated by black arrow) emerges as basal to all other individuals from the South Indian lepidus population (Figure 4a and 4b). However, we did not observe this for the concatenated nuclear tree, where the sample was embedded in a clade with multiple samples (Figure 5). In general, we observed monophyly of all bat lineages closely related to the R. lepidus complex (Csorba et al.2003, Soisook et al.2016) that were a part of our study panel (Figure 4, 5 and 6) without resolving their internal relationships especially based on the mitochondrial tree (Figure 4a and 4b).
The species tree agreed that R. lepidus lepidus is a distinct monophyletic lineage ( Figure 6). The species tree also revealed a sister relationship of the two northern populations, Karkala and Gudalur, with the southernmost population (KMTR) basal to this clade. This internal topology of R. lepipus lepidus was used as a guide for population genetic modelling.

Genetic distances
In general, pairwise cyt b genetic distances between various lineages of R. lepidus and R. pusillus groups of bats ranged between ~2 to 4% (Table S7). Both Kimura 2 Parameter and p-distance measures returned similar values (Table S7).
Within our Indian study populations, most individuals exhibited shallow genetic divergence. One individual from Karkala was exceptional in that it emerged as separate in mitochondrial phylogenetic reconstructions and displayed deep mitochondrial divergence (individual indicated by black arrow in Figure 4a and 4b). However, the nuclear genetic distance of this individual to other samples of the south Indian group remained low and comparable to pairwise distances within the R. lepidus lepidus Southern India lineage (see Supplementary Material Figure S2b). Given its deep mtDNA divergence from other samples of R. lepidus lepidus we removed this individual from all further analyses within R. lepidus lepidus.

Modeling gene flow and vicariance within southern Indian R. lepidus lepidus
The posterior probability of the divergence model (model E) was the highest based on both the rejection (0.59) and multinomial logistic regression (0.91) criteria ( Table 2). The observed Bayes factor for the divergence model (model E) was also high (above 15) when compared to other models ( Table S8). The summary statistics differentiated among the five models based on both  Table S2. soft and hard classification (Table S9 and S10). For the divergence model, we performed five million simulations and estimated parameters using the neuralnet method at 0.01 tolerance. The divergence time between Gudalur and Karkala was estimated at approximately 65 years ago (confidence interval 6-99 years) and between the Gudalur/Karkala ancestor and KMTR approximately 4,700 years ago (confidence interval 1,187-14,414 years) ( Table 3). The effective population size of Karkala was around 53,000 (confidence interval 5,323-97,528), much higher than Gudalur and KMTR (Table 3).

Peninsular India harbors a distinct species-level lineage of Rhinolophus lepidus
Rhinolophus lepidus has experienced numerous taxonomic revisions in the past century, owing to the phenotypic diversity across its presumed range in South and Southeast Asia. While traditional taxonomic treatments included five subspecies (nominate in peninsular India, monticola in northwestern India to Afghanistan, shortridgei in Myanmar, feae in Indochina and refulgens in Peninsular Malaysia), recent evidence from phenotypic data as well as DNA-based studies have elevated shortridgei and refulgens to species level (Csorba et al. 2003, Stoffberg et al. 2010, and additionally unearthed the presence of multiple cryptic lineages in Indochina (Soisook et al. 2016). However, these studies were unable to make definitive statements about the taxonomy of the complex without the inclusion of representatives of the nominate lineage in South Asia. Our present investigation is the first in this regard.
While the phenotypic parameters herein investigated did not exhibit any differences between peninsular Indian R. lepidus lepidus and other closely related lineages of horseshoe bats (Table 1), the former can be differentiated from its sympatric congener R. pusillus gracilis by its longer forearm (Bates and Harrison 1997). Our genetic data suggest that R. lepidus lepidus is a comparatively deeply differentiated lineage (Figures. 4, 5 and 6), which probably diverged from other horseshoe bats approximately one million years ago (Figure 6), although there is substantial uncertainty surrounding this estimate. Mitochondrial distance estimates were within the range of other sister-species relationships observed in the genus (Figures 4 and 6, Table S7). Considering species-level mitochondrial divergence and phylogenetic distinctiveness of the R. lepidus lepidus lineage of peninsular India, we propose that it should be afforded species rank (Figures 4, 5 and 6, Table S7), with the resulting separation of Southeast Asian forms as separate species. Comparative genetic data are not available for the western Himalayan subspecies monticola, which at present is the only other named taxon from South Asia besides the nominate. Our results add another example of endemism of bats in peninsular India and point towards this region as a diversification hotspot Harrison 1997, Chattopadhyay et al. 2012, present study). We recommend that genetic assessment Extent of mutation along branches is represented by a scale bar of substitutions below each tree. Black arrow denotes one individual from Karkala which revealed mito-nuclear discordance. Accession number for all the sequences used for phylogenetic analysis is provided in Table S2.  Table S2. of R. lepidus populations from northeastern India should be prioritized, specifically to determine the taxonomic status of these bats in this biodiversity hotspot. Finally, future studies should preferably be based on a multi-locus approach (both mtDNA and genome-wide nuclear loci) including all the known taxa to shed light on the evolution of the R. lepidus complex.
We noticed that a single individual of R. lepidus from the population of Karkala harbored a mtDNA sequence far diverged (~3%) from other peninsular Indian R. lepidus samples ( Figure S2). It is possible that this individual exhibits an introgressed mtDNA haplotype from the closely related and sympatric congener, R. pusillus gracilis, present in the Western Ghats, which often co-inhabits the same day roosts. As we lacked comparative genetic material of R. pusillus gracilis, we cannot make any inference regarding the events which might have resulted in this puzzling haplotype. However, it should be noted that this sequence was also far removed from any other known R. pusillus lineages. Given that R. pusillus, too, has experienced recent major taxonomic revisions due to the discovery of cryptic genetic diversity through DNA markers, we recommend a thorough genetic and phenotypic investigation of R. pusillus populations in the Western Ghats as well as in the rest of its range in South Asia.

Palghat Gap potentially drives subdivision within R. lepidus
Climatic fluctuations and associated habitat change can produce distributional changes and Table 2. Posterior probabilities for all models based on rejection and multinomial logistic regression. Model with the highest posterior probability is shown in bold font. For more information about the models see Figure 2.

Model
Model code  Table 3. Parameter estimation from the best supported model (divergence model, model E, Figure 2). TMRCA-Time to most recent common ancestor, KMTR-Kalakkad Mundanthurai Tiger Reserve.  (Hewitt 2000, Chattopadhyay et al. 2017. Within R. lepidus populations in the Western Ghats we observed evidence of clinal variation (north-south gradient) in phenotypic data (Figure 3). While latitudinal clinal variation is quite well documented in many bat species, there are exceptions as well. For example, in R. pusillus from China and Carollia perspicillata from South America, an inverse cline effect was observed (Barros et al. 2014, Wang et al. 2020. The results of our study point towards a trend indicative of inverse clinal variation; sampling of more populations across this clinal gradient in the Western Ghats might shed more light into the causes and consequences of this pattern. KMTR populations were generally larger in size with higher peak frequencies ( Figure 3) and showed evidence of genetic isolation from populations north of the Palghat Gap since the mid-Holocene (Figure 2, Tables 2 and 3). While clinal variation in phenotype was also reported at least for the fruit bat Cynopterus sphinx across peninsular India (Storz et al. 2001) our study is apparently the first report in volant mammals showing isolation across the Palghat Gap. However, it is important to note that the relative isolation of the KMTR population might be a cumulative effect of both the Palghat Gap and the Shencottah Gap, which is situated immediately north of the KMTR population. Despite being much narrower than the Palghat Gap, the Shencottah Gap (7.5 km wide) is also an effective biogeographic barrier and may have played an important role in biotic diversification within the Western Ghats (Robin et al. 2010;Robin et al. 2015).

TMRCA of
In the case of R. lepidus, the genetic divergence across biogeographic barriers in the Western Ghats is quite recent and can be attributed to a change in climate during the mid-Holocene. At this time, peninsular India experienced increased aridity and reduction in moist forest cover (Sukumar et al. 1993, Rajagopalan et al. 1997. R. lepidus likely retreated higher up into the mountains during this period, isolating populations across the gap. This period is also associated with demographic fluctuations of some bats in peninsular India (Storz and Beaumont 2002). Future studies comprising more sampling localities from either side of both the Palghat Gap and the Shencottah Gap can shed more light into the relative importance of these barriers in driving population divergence of this species within the Western Ghats. R. lepidus is widely distributed across peninsular India and our sampling regime only consisted of a subset of this distribution. Hence range-wide sampling, genome-wide markers and isolation-migration models considering historical demography should be used to provide more clarity about the diversity and phylogeography of R. lepidus.

Author Contributions
BC and UR designed the study; BC, KMG, SK, DPS and AKVK performed sampling; BC and KMG generated the data; BC and KMG performed all analyses with critical inputs from UR and FER; BC wrote the manuscript with significant contributions from all coauthors.

Data Accessibility
All sequences generated in this study are available on GenBank (Accession ID: MT640147-MT640254).

Supplementary Material
The following materials are available as part of the online article at https://escholarship.org/uc/fb Figure S1. Spectrogram depicting representative echolocation call attributes for a) Karkala, b) Gudalur, and c) KMTR populations. Figure S2. Boxplots depicting the distribution of p-distances a) mitochondrial markers and b) nuclear markers. Table S1. Primer information and annealing temperatures for each locus Table S2. GenBank accession numbers for the sequences used for phylogenetic analyses. All sequences generated in this study are presented in bold font Table S3. Substitution model selected for each gene based on corrected AIC (AICc) values in jModeltest 2.1.10 (Darriba et al. 2012). AIC: Akaike's Information Criteria Table S4. Summary statistics used for Approximate Bayesian Computation  Table S5. Results of test for significant deviation from neutrality Table S6. Sequence length, number of variable sites and number of parsimony informative sites for the locus sequences for southern Indian populations of R. lepidus lepidus Table S7. Net pairwise genetic distances between various rhinolophid lineages based on 639 bp of cyt b. Kimura 2-parameter distances (Kimura, 1980) are presented in the lower right corner and p-distances in the upper right corner  Table S8. Bayes factors for various models based on multinomial logistic regression. For details about the models refer to Figure 2 and Table 2  Table S9. Mean model posterior probabilities. For details about the models refer to Figure 2 and Table 2  Table S10. Confusion matrix for various models. For details about the models refer to Figure 2 and Table 2  Table S11. Comparison of tree topologies using the approximately unbiased (AU) test and model selection using Approximate Bayesian Computation