Fast evolutionary genetic differentiation during experimental colonizations

Founder effects during colonization of a novel environment are expected to change the genetic composition of populations, leading to differentiation between the colonizer population and its source population. Another expected outcome is differentiation among populations derived from repeated independent colonizations starting from the same source. We have previously detected significant founder effects affecting rate of laboratory adaptation among Drosophila subobscura laboratory populations derived from the wild. We also showed that during the first generations in the laboratory, considerable genetic differentiation occurs between foundations. The present study deepens that analysis, taking into account the natural sampling hierarchy of six foundations, derived from different locations, different years and from two samples in one of the years. We show that striking stochastic effects occur in the first two generations of laboratory culture, effects that produce immediate differentiation between foundations, independent of the source of origin and despite similarity among all founders. This divergence is probably due to powerful genetic sampling effects during the first few generations of culture in the novel laboratory environment, as a result of a significant drop in N e. Changes in demography as well as high variance in reproductive success in the novel environment may contribute to the low values of N e. This study shows that estimates of genetic differentiation between natural populations may be accurate when based on the initial samples collected in the wild, though considerable genetic differentiation may occur in the very first generations of evolution in a new, confined environment. Rapid and significant evolutionary changes can thus occur during the early generations of a founding event, both in the wild and under domestication, effects of interest for both scientific and conservation purposes.


Introduction
Population size plays a key role in determining the relative importance of natural selection and genetic drift, with small isolated populations more exposed to stochastic loss of genetic variability, potentially reducing their subsequent response to selection (Robertson 1960). During a colonization event, a population may experience a considerable reduction in size. The effect of a census population-size bottleneck on effective population size (N e ) is expected to be strong even when a population expands quickly after the initial colonization event (Wade and McCauley 1988). Stochastic effects are also expected to lead to genetic differentiation when several independent colonizations from the same source take place. When populations colonize a novel environment, high variance in reproductive success may contribute to low N e (Hedrick 2005), further increasing the importance of drift. Natural selection can thus augment the effects of genetic drift during colonization of a novel environment.
There is abundant evidence for the loss of variability among neutral genetic markers during colonization events (e.g. Pascual et al. 2007). Both theory and simulations have shown that when populations grow quickly after a founder event, differences in founding gene frequencies tend to be maintained even in the presence of gene flow (Boileau et al. 1992). Only over long time scales will gene flow from the source population partly compensate for initial losses of genetic diversity (Dlugosch and Parker 2008). Moreover, rare alleles may get amplified to high frequencies by a combination of founder effects and rapid population growth (surfing), contributing to genetic differentiation between populations (Excoffier and Ray 2008). The effect of initial historical events may thus play an important role in the evolution of differences between populations derived from independent colonization events, even when a common source population is involved.
While the success of colonization does not depend on genetic variation among neutral markers, the latter can still provide a measure of how strong initial bottleneck drift effects are. Colonization history inferred from neutral genetic markers is thus a useful tool when testing the context for adaptive evolution, in particular when population sizes are expanding and have not reached equilibrium between gene flow and selection (Keller et al. 2009).
Reduced population size can have two well-defined genetic sampling effects on a newly founded population: a reduction in allele number (Nei et al. 1975) and a bias in allele frequencies (Waples 1998). Even high frequency alleles can be underrepresented in the new population, and relative allele frequencies will be rearranged (Ryman 1997; Palm et al. 2003). This can then result in a shift in the mean phenotypes of the colonizers relative to the source population (Keller and Taylor 2008). Differences in the evolutionary dynamics among multiple new colonies derived from the same ancestral population may then result.
Ultimately, laboratory populations of model organisms are founded from collections in the wild, with a limited number of founder individuals that yield lab N e values which can be orders of magnitude smaller than those of the natural populations of origin. Thus, the initial generations in the laboratory can be seen as a type of colonization process (Matos et al. 2000). Since this scenario of both reduced initial population size and new selective pressures will lead to lower effective population size, it is a general expectation that multiple laboratory introductions will lead to genetic differentiation among lab populations, even when they share a common source and undergo parallel selection.
Genus Drosophila is widely studied in the laboratory not only for genetics, for which it has long been a leading model organism, but also in experimental evolution (e.g. Prasad and Joshi 2003). Disparities among the experimental results found among different laboratories or with different experimental stocks in the same laboratory may be partly due to stochastic events that took place during the early steps of founding the various Drosophila populations that have been employed. This should be taken into account before offering more elaborate interpretations (cf. Ackermann et al. 2001). This does not mean that laboratory studies are useless, in terms of their value for generalization. The fact that laboratory evolution may be a 'local' process (vid. Rose et al. 2005) illustrates material complexities of evolutionary dynamics that will often be important for the correct interpretation of both genetic and evolutionary research, even in laboratory models.
We have repeatedly performed studies of evolutionary trajectories during laboratory adaptation of Drosophila subobscura populations derived from collections in the wild. Two sets of populations derived from geographically close locations revealed significant differences in the laboratory evolution of fitness-related traits after their founding from the wild (Simões et al. 2007). Further, these founding populations were differentiated at microsatellites by generation three of laboratory culture (Simões et al. 2008a). Nevertheless, the design of that study did not allow us to determine whether this differentiation was due to distinct geographic locations or initial sampling effects, whether at the founding generation or during the next two generations. To disentangle these two possible effects at the level of life-history traits, we sampled two new sets of populations from each of the locations previously used. These new populations again exhibited location-dependent differences in the laboratory evolution of life-history traits. Moreover, we also detected some sampling effects within locations both at the start of adaptation and in initial evolutionary rates, for weakly selected traits (Simões et al. 2008b). An important analysis still missing from that study is the systematic characterization of the genetic differentiation of neutral markers between populations, whether among the first founders or at an early generation after laboratory introduction. Data of this kind will allow the estimation of the possible effects of founder events on genetic variability and on differentiation that may reveal the role of early colonization events on laboratory adaptation.
Here we deepen a previous analysis of the genetic variability and differentiation of these six foundations both at the founders and at laboratory generation three (Santos et al. 2012), taking into account their natural sampling hierarchy involving two locations, two years and two samples from each location in one of the years, each one three-fold replicated by generation two. We show that striking stochastic effects occur in the first few generations of laboratory culture, effects that produce immediate differentiation between populations derived from different foundations, independently of their source of origin and despite similarity among all founders.

Sampling design
To screen the occurrence of sampling effects in the founders and during the early stages of laboratory evolution, we derived two independent foundations of D. subobscura a few days apart in 2005, in each of two natural sites: Adraga, in Sintra (from here on referred to as FWA and FWB), and Arrábida (here named NARA and NARB). These are the same Portuguese locations sampled in 2001, when samples of D. subobscura gave rise to the TW (from Sintra) and AR (from Arrábida) foundations (Simões et al. 2007, 2008a) (figure 1). The sampling locations are 50 km apart and are separated by the Tagus river. Fermented fruit baits were used for all collections. To ensure that female founders were fertilized (less than 30% being virgin upon collection), groups of around five females and two males, derived from the same collections, were formed and maintained in vials. The TW foundation was collected on October 12th and 13th of 2001, being composed of 110 females and 44 males, while the AR foundation derived from 59 females and 24 males collected on 10th, 14th and 15th of October of the same year. All 2005 collections presented more males than females, the excess being randomly discarded. The FWA foundation was composed of 60 females and 28 males, collected from 5th to 7th of April, while FWB derived from 75 females and 30 males collected on 9th and 10th of April. In Arrábida, 55 females and 24 males collected on April 4th originated the NARA foundation, and 68 females and 30 males collected on April 8th gave rise to the NARB foundation (see figure 1). All foundations were three-fold replicated during the collection of the eggs that gave rise to the second generation. Maintenance conditions were the same for all populations, as described in Simões et al. (2008b), involving discrete generations of 28 days, reproduction timed to be close to the age of peak fecundity, a controlled temperature of 18 °C and a 12 : 12 h L : D cycle.
Flies were kept in vials with David axenic medium stained with animal charcoal. Density was controlled for both eggs and adults at about 80 and 50 individuals per vial, respectively. This corresponded to 24 vials with 80 eggs each, at the start of each generation, and to a variable number of adults as a function of developmental success. Census population sizes were generally between 600 and 1200 adult individuals distributed among 12 to 24 vials. In each generation, adults that emerged from several vials corresponding to each population were mixed together with CO 2 anesthesia before the collection of eggs for the next generation.

DNA extraction and amplification
All 18 laboratory populations were genotyped for nine microsatellites at the third generation after the founding event (TW 1−3, AR 1−3, FWA 1−3, FWB 1−3, NARA 1−3 and NARB 1−3). For each of the six foundations, founders collected from the wild (generation zero) were also genotyped for the same markers. This is an adequate number of markers, as Spencer et al. (2000) showed that eight highly polymorphic microsatellites are enough to detect bottleneck effects. Our markers' sequences have the accession numbers GU732209-GU732280 at GenBank. The markers used (dsub01, dsub02, dsub05, dsub10, dsub19, dsub20, dsub21, dsub23 and dsub27) were previously identified and characterized by Pascual et al. (2000), and cytologically localized in the five D. subobscura chromosomes (Santos et al. 2010). For each population, 30 randomly picked female flies were analysed. Altogether, the DNA of 720 females was thus extracted and amplified following the general protocol described in Simões et al. (2008a). Fragment analysis was always carried out in the same ABI PRISM 310 sequencer (Applied Biosystems, Foster City, USA). Allele sizes were estimated by comparison with the standard GeneScan-500 ROX, using the software GeneMapper ver. 3.7 (Applied Biosystems). In a separate publication (Santos et al. 2012) we used these genotypic data together with other sets of data to analyse how initial variability and its decline may affect subsequent evolution of populations. The data presented here are deposited in the Dryad repository with doi:10.5061/dryad.0fm71.

Data analysis
Tests of Hardy-Weinberg equilibrium: Deficit of heterozygotes per locus and sample was estimated by F IS coefficients and tested based on 4320 randomizations in FSTAT 2.9.3 (Goudet 1995). False discovery rate (FDR) corrections for multiple testing were carried out following theorem 1.3 of Benjamini and Yekutieli (2001). The adjusted α was determined as $\alpha / \sum_{i=1}^{m} (1/i)$, m being the number of tests.
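The FDR threshold mentioned above can be illustrated with a short calculation. This is a minimal sketch, assuming the single-threshold form commonly associated with theorem 1.3 of Benjamini and Yekutieli (2001), in which the nominal α is divided by the sum of 1/i for i = 1, ..., m; the function name and the example values of m are illustrative and not taken from the original analysis.

```python
def by_adjusted_alpha(alpha: float, m: int) -> float:
    """Adjusted threshold alpha / sum_{i=1..m}(1/i) for m tests.

    Assumption of this sketch: one common threshold is applied to
    all m P-values, rather than a step-up procedure.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    harmonic_sum = sum(1.0 / i for i in range(1, m + 1))
    return alpha / harmonic_sum


if __name__ == "__main__":
    # Illustrative numbers of tests only (e.g. 9 loci, 54 comparisons).
    for m in (1, 9, 54):
        print(m, by_adjusted_alpha(0.05, m))
```

The threshold shrinks only logarithmically with m, which is why this correction is less severe than a Bonferroni division by m.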

Microsatellite variability: Microsatellite variability was characterized by estimating allelic richness (A) and gene diversity (or expected heterozygosity, H E ) with the software FSTAT 2.9.3 in the founders and at generation three. The variability of the six foundations in each generation was analysed by a Friedman's ANOVA (analysis of variance) across nine loci. Wilcoxon tests were also performed to test for differences between pairs of foundations at the same generation (averaging data of the three replicates of each foundation at generation three). At generation three, we performed simple ANOVAs to test differences in each of the variability estimates across foundations using data of the three replicates within each foundation as individual data. Variability estimates were also tested across years and locations by a bi-factorial ANOVA, with year corresponding to the years of founding, 2001 and 2005, and location for both sites of origin of foundations, Sintra and Arrábida.
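As an illustration of what gene diversity measures, the sketch below computes expected heterozygosity for a single locus from diploid genotypes. It assumes Nei's (1978) unbiased estimator, which may differ in detail from the estimator implemented in FSTAT; the function name and the toy genotypes are hypothetical.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Unbiased expected heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid
    individual. Uses H = 2n/(2n-1) * (1 - sum(p_i^2)), an assumption
    of this sketch rather than the exact FSTAT estimator.
    """
    alleles = [allele for pair in genotypes for allele in pair]
    n_genes = len(alleles)
    if n_genes < 2:
        raise ValueError("need at least one diploid individual")
    counts = Counter(alleles)
    sum_p2 = sum((c / n_genes) ** 2 for c in counts.values())
    return n_genes / (n_genes - 1) * (1.0 - sum_p2)


if __name__ == "__main__":
    # Hypothetical allele sizes (in base pairs) for four females.
    sample = [(120, 124), (120, 120), (124, 128), (132, 120)]
    print(gene_diversity(sample))
```

Allelic richness is not covered here, since it additionally requires rarefaction to a common sample size.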
The experimental design of the 2005 foundations involves two independent founder events from each of the two locations (FWA and FWB, in Sintra and NARA and NARB, in Arrábida). To test for the effect of such founder events on the genetic differences of variability at generation three, a nested ANOVA was used with the two foundations (each three-fold replicated) nested inside each location (Foundation{Location}). The significance of the term Foundation{Location} indicates the occurrence of founder effects between the two events of founding within each location. When differences between the two foundations derived from different sampling events within location were not significant, simple ANOVAs were performed pooling together the six populations inside each location as if they were derived from the same founder event.
For each foundation, allelic richness and gene diversity were compared between founders and populations at generation three using a Wilcoxon matched pair-test performed across the nine loci. Decline in both variability measures in this period (dA for allelic richness and dH E for gene diversity) was calculated by subtracting from unity, the ratio between variability of each population at generation three and variability in the respective founders. To test the differences in decline across foundations, year and location, as well as due to sampling, the same four ANOVA models described above for generation three were used.
Genetic differentiation: Genetic differentiation was estimated by determining the fixation indexes both between foundations (F CT ) at generation three and between populations (F ST ) at generations zero and three. The differentiation between foundations (F CT ) was estimated by AMOVA (analysis of molecular variance) with the software Arlequin 3.5.1.2 (Excoffier et al. 1992), which follows the Weir and Cockerham (1984) framework. The analysis allows the definition of groups of populations, generating a hierarchical structure with the total variance partitioned into covariance components due to different levels of variation: individuals, populations (each replicate is a population) and groups of populations (foundations i.e., groups of three replicates). F CT significance was tested after 1000 permutations of populations among foundations and confidence intervals were obtained by 20,000 bootstraps over loci.
Genetic differentiation at the third generation involving data of the 2005 foundations allowed testing for founder effects due to differences between locations, taking into account the possible role of sampling effects of foundation within locations. To perform this test, these foundations were analysed with a design involving nesting of foundations (samples) inside each location (FWA and FWB, from Sintra and NARA and NARB, from Arrábida). Differentiation between locations was estimated by defining a four-level hierarchical AMOVA with individuals, populations (replicates), foundations (A and B) and locations (Sintra and Arrábida), using the software GDA 1.1 (Lewis and Zaykin 2001). Differentiation relative to the variance at the more inclusive level (location) was given by a Theta-P (θ -P) value, equivalent to F CT , whose confidence intervals were obtained after 5000 bootstraps over loci. To understand how founder effects may affect differentiation we also calculated genetic differentiation between locations with a three-level hierarchical analysis pooling together the six populations of the two samples inside each location as if they were derived from a single sampling effort.
Pairwise F ST values were also obtained for all populations, including the founders, using FSTAT 2.9.3 software (Goudet 1995). Exact G-tests were performed using GenePop ver. 4.0 (Rousset 2008). FDR correction was applied for multiple testing, considering the number of comparisons inherent to each question, such as: Are the 2005 populations derived from the four foundations differentiated between foundations at generation three (54 comparisons)? A principal coordinate analysis using pairwise F ST values and including generation three as well as the founders of the 2005 populations was performed in GenAlex 6.4 (Peakall and Smouse 2006).
Given the recent controversy as to what is the best way to estimate genetic differentiation between populations using highly variable molecular markers such as microsatellites (Jost 2008; Ryman and Leimar 2009), we also estimated pairwise Jost's D est values using SMOGD (Crawford 2010). After FDR adjustment, the average D est across the nine loci was estimated in each comparison, with significance tested by t-test (with differences between microsatellites as source of error). Significance of correlations between F ST and D est matrices was also estimated (by Mantel test) using GenAlex 6.4 software (Peakall and Smouse 2006).
Effective population size: Effective population sizes (N e ) were estimated both for founder populations and populations at generation three by the linkage disequilibrium (LD) method, using the software NeEstimator 1.3 (Ovenden et al. 2007). Differences between the effective population size estimates in the founders and mean values between replicates at generation three were tested using the Wilcoxon matched pair-test.
The decline in N e between generations zero and three was estimated from dN e = (1 − N e(3) /N e(0) ) * 100. The ratio N e /N (with N being the census size) was also calculated at generation three.
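To make the decline formula concrete, the sketch below applies it to the overall mean N e values reported in the Results (314.7 breeders in the founders and 96.9 at generation three). This is an illustration only: it gives the decline of the means, which need not equal the mean of the per-foundation declines used in the analysis, and the function name is hypothetical.

```python
def percent_decline(value_start: float, value_end: float) -> float:
    """Return (1 - value_end / value_start) * 100."""
    if value_start <= 0:
        raise ValueError("starting value must be positive")
    return (1.0 - value_end / value_start) * 100.0


if __name__ == "__main__":
    mean_ne_founders = 314.7  # mean N e at generation zero (Results)
    mean_ne_gen3 = 96.9       # mean N e at generation three (Results)
    print(round(percent_decline(mean_ne_founders, mean_ne_gen3), 1))  # 69.2
```

The same function, without the factor of 100, corresponds to the "one minus the ratio" form used for the declines in allelic richness and gene diversity.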
ANOVA tests for detecting effects of foundation, year and location using the models described above were performed for the estimates of effective population size at generation three, its decline since founding (dN e ), and its ratio with census size (N e /N).
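The Mantel test used above to compare the F ST and D est matrices can be sketched as a permutation procedure. This is a minimal, one-tailed version written with the standard library only; it is not the GenAlex implementation, and the function names, the number of permutations and the toy matrices are assumptions made for illustration.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after relabelling rows/columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """Return (r, P): matrix correlation and one-tailed permutation P-value.

    Populations of m2 are relabelled at random, permuting rows and
    columns together so that the matrix stays a valid distance matrix.
    """
    rng = random.Random(seed)
    x = upper_triangle(m1)
    r_obs = pearson(x, upper_triangle(m2))
    n = len(m1)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if pearson(x, upper_triangle(m2, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical pairwise distances among four populations.
    fst = [[0, 0.010, 0.020, 0.015],
           [0.010, 0, 0.012, 0.022],
           [0.020, 0.012, 0, 0.008],
           [0.015, 0.022, 0.008, 0]]
    dest = [[0, 0.08, 0.17, 0.12],
            [0.08, 0, 0.10, 0.19],
            [0.17, 0.10, 0, 0.06],
            [0.12, 0.19, 0.06, 0]]
    print(mantel(fst, dest))
```

Permuting whole populations, rather than individual matrix cells, is what distinguishes a Mantel test from an ordinary correlation test, because the pairwise values in a distance matrix are not independent.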

Hardy-Weinberg equilibrium
In general, most loci were in Hardy-Weinberg equilibrium (HWE) in the founders, with some exceptions. Locus dsub02 gave an indication of heterozygote-deficit in TW and in NARA. At generation three, most populations significantly deviated from HWE for one or more loci, with the exception of AR 1 , AR 2 , FWA 3 , NARA 1 and NARB 1 . Nevertheless, only dsub02 in TW and dsub10 in FWB showed consistent heterozygote deficits for all replicates at generation three (see table 1 in electronic supplementary material at http://www.ias.ac.in/jgenet/).

Microsatellite variability in the founders and at generation three
Variability levels were globally high both in the founders and at generation three (table 1; details per microsatellite are presented in Santos et al. 2012). Overall, the six foundations did not differ significantly in allelic richness (A) or in gene diversity (H E ), either in the founders or at generation three, as tested by Friedman ANOVA. Only the comparison between AR and FWA founders gave significantly higher allelic richness for the former (Wilcoxon Z = 2.19, P = 0.028). However, at generation three, significant differences in allelic richness were found across foundations by parametric ANOVA (F (5,12) = 4.71, P = 0.013).
When tested for year and location effects at the third generation, a significant effect of year and of the interaction term year × location was found for allelic richness (F (1,14) = 11.40, P = 0.0045; and F (1,14) = 6.78, P = 0.021, respectively) but not for gene diversity. Tukey tests showed a significantly lower allelic richness of AR when compared to FWA, NARA and NARB (df = 12, P = 0.018, P = 0.019 and P = 0.026, respectively). ANOVA tests on the 2005 foundations taking into account the hierarchical structure (the two foundations in each location) gave no significant differences either between or within locations, at generation three for both allelic richness and gene diversity. Most foundations showed a significant reduction in allelic richness across loci between generations (table 1). For some foundations, gene diversity also declined significantly. Both the decline of allelic richness and of gene diversity were significantly different across foundations (F (5,12) = 15.24, P = 0.0001; and F (5,12) = 3.38, P = 0.039, respectively). The decline in allelic richness was also significantly different across years as well as across locations (F (1,14) = 62.49, P = 0.0001; and F (1,14) = 20.25, P = 0.0005, respectively). The 2005 foundations suffered a mean decline of allelic richness of 3.06% per generation compared to the 5.76% observed for the 2001 foundations. Further, the Sintra foundations showed a mean decline of 3.25% per generation in allelic richness while Arrábida foundations had a decline of 4.67%. There was also a significant difference in decline of allelic richness between locations with the 2005 data alone (F (1,10) = 11.07, P = 0.008). Differences between locations remained significant taking into account the hierarchical structure involving the two foundations in each location (F (1,2) = 22.40, P = 0.042), while foundations within location did not differ significantly (F (2,8) = 0.439, P = 0.659). 
No significant effect of year or location on the decline of gene diversity was detected, either with the whole set of foundations or with the 2005 data alone.

Patterns of genetic differentiation
There was a high concordance between estimates of genetic differentiation using F ST versus Jost's D est , as shown by the significant Mantel test correlations between F ST and D est matrices among all populations (r = 0.977, P < 0.001), between founders (r = 0.751, P = 0.003) and between populations at generation three (r = 0.971, P < 0.001) (see figure 2). Thus, in spite of the bigger values obtained with D est , the rankings of the estimates of genetic differentiation across population pairs were very similar using either index. Nevertheless, few cases of significant differentiation were detected using D est , probably due to the lower statistical power of the t-tests used in this case (cf. table 2, and tables 3 and 4 with tables 5-7 in electronic supplementary material). Below we will focus on the results obtained with F ST (table 2a in electronic supplementary material). Moreover, populations at generation three were generally not significantly differentiated from the founders of origin with the exception of the three AR replicates (table 2). Nevertheless, there were significant differences between some of the founders and populations of other foundations at generation three (table 2). The same general conclusions are suggested by the D est estimates (cf. tables 2b and 5 in electronic supplementary material).
At generation three, the two foundations of 2001 were significantly differentiated, whether estimating pairwise F ST between populations of distinct foundations (F ST = 0.022; P < 0.0001) or F CT between the two groups (F CT = 0.0139; P < 0.0001). By contrast, of the six pairwise comparisons between replicates within foundations, only AR 1 versus AR 2 showed significant differentiation (F ST = 0.0086; P < 0.002). Similar results were obtained when estimating differentiation by D est (data not shown).
When comparing the 2005 populations at the third generation, a four-level hierarchical model was used to test for founder effects between locations taking into account sampling effects within each location. For this purpose, we defined an AMOVA hierarchy with the following levels: individuals, replicate populations within foundations (e.g. FWA 1−3 ), the two foundations within each location (i.e. FWA 1−3 versus FWB 1−3 and NARA 1−3 versus NARB 1−3 ; see figure 1), and locations (FWA and FWB versus NARA and NARB). When using this hierarchical analysis, no significant differentiation was observed between locations (θ -P = 0.0011; 95% CI (−0.00092, 0.00305)). Nevertheless, low but significant differentiation between locations was detected when pooling together the six populations in each location (i.e. six Sintra populations versus six Arrábida populations; θ -P = 0.0038; 95% CI (0.00196, 0.00568)). Of the 12 pairwise comparisons between replicates of the same foundation, only three showed significant pairwise differentiation (FWA 1 versus FWA 2 , FWB 1 versus FWB 3 and NARB 1 versus NARB 2 ; see table 3 in electronic supplementary material). By contrast, all the pairwise generation-three comparisons between populations from different foundations were significantly differentiated, both when comparing populations from Sintra and Arrábida and when comparing populations derived from different foundations of the same location (e.g. NARA 1 versus NARB 3 ). The few significant results obtained using D est were also between populations of different foundations (table 6 in electronic supplementary material). To visualize the results obtained from all populations derived in 2005, a principal coordinate analysis using pairwise F ST values involving all founders and populations at the third generation was carried out (figure 3).
The first two axes explain 82.72% of the variation, with the four founder populations located closely together in a central position, and the derived replicated populations dispersed around them. FWA and FWB are farther apart than NARA and NARB foundations mostly due to FWA, which is farther away from the other foundations.
When comparing the third generation of populations derived from different years (72 pairs), highly significant genetic differentiation was always obtained (F ST ranging from 0.0058 to 0.0241, P < 0.001; see table 4 in electronic supplementary material). Significant D est values were obtained in 38 of the 72 tests (see table 7 in electronic supplementary material).

Effective population size in the founders and at generation three
No significant differences in the initial effective population sizes were found across foundations (see table 8 in electronic supplementary material). A significant decline in effective population size (dN e ) was found between generations zero and three across foundations (Z = 2.20; P = 0.028), with N e changing from a mean number of 314.7 (standard error of the mean (SEM) 74.3) breeders in the founder populations to 96.9 (SEM 10.6) breeders at the third generation (see table 8 in electronic supplementary material). No significant differences were detected in the N e decline either between foundations, years, or locations. Moreover, no significant differences were detected between the 2005 foundations either between or within locations.
Figure 3. Principal coordinate analysis of the founders and generation-three populations derived in 2005. Pairwise F ST values (Weir and Cockerham 1984) were used as the genetic distance estimate. f refers to founder populations; replicate populations at generation three are labelled from one to three.

High and consistent genetic variability among initial samples collected from the wild
Genetic variability was very high for all founder populations and overall did not differ significantly between them, indicating negligible initial sampling effects across multiple collections from the wild. Moreover, these data suggest high stability in the genetic variability of the natural source populations, across both space and time. The high mean genetic diversity levels estimated (from 0.894 to 0.914) place these D. subobscura natural populations among the most diverse for this species, when compared to the analysis of similar sets of loci from other European populations (Pascual et al. 2001).
Moreover, the lack of differentiation between founders (the initial samples of wild individuals) from Sintra and Arrábida suggests that there is extensive gene flow between the two locations and, in turn, that they form a single deme. This natural population has maintained a remarkably stable genetic composition for neutral markers across the years and seasons involved in the two sampling points of this study (autumn 2001 and spring 2005).

Changes in genetic variability during the first generations of a founding event
During the first three generations after introduction, there was a significant decline in genetic variability across all our laboratory populations, an average decline of 4.00% per generation in allelic richness and 0.59% per generation in gene diversity. The stronger decline in allelic richness is in accordance with expectations, as it is a more sensitive measure of the loss of low frequency alleles when population size is reduced (Nei et al. 1975).
In fact, from generation zero to three, the number of alleles with frequency lower than 5% decreased more than the number of alleles at higher frequencies, though more frequent alleles were also affected. This observation is in accordance with studies of the loss of alleles during the colonization of America by D. subobscura (Pascual et al. 2007). However, there is a 4.6% to 36% probability that some alleles were not lost, but instead were not detected due to their low frequency, given the use of a generation-three sample size of 30 individuals genotyped per population (Gregorius 1980). Nevertheless, since the founders and third generation were characterized using the same sample size, this effect is not sufficient to explain all of the difference in the distribution of frequencies of alleles between generations. Thus, we conclude that loss of lower frequency alleles has occurred during the early steps of the founding event, although the present quantitative estimates of the magnitude of that loss are biased upward.
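The range of non-detection probabilities quoted above can be reproduced with a simple binomial argument. This is a sketch under the assumption that the 60 gene copies carried by 30 diploid females are independent draws; the original calculation follows the cited reference and may differ in detail, and the function name is hypothetical. Under this assumption, an allele at 5% frequency is missed with probability of about 4.6%, and an allele at a frequency of 1/60 (a single expected copy) with probability of about 36%.

```python
def prob_allele_missed(freq: float, n_diploid: int) -> float:
    """Probability that an allele at frequency freq is absent from a
    sample of n_diploid individuals, treating the 2 * n_diploid gene
    copies as independent draws (an assumption of this sketch)."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    return (1.0 - freq) ** (2 * n_diploid)


if __name__ == "__main__":
    for freq in (1 / 60, 0.05):
        # prints 0.365 and 0.046, respectively
        print(f"{freq:.4f} -> {prob_allele_missed(freq, 30):.3f}")
```

Because the miss probability falls off geometrically with frequency, sampling error of this kind mainly affects the rarest alleles, which is consistent with the caveat that the estimated loss is biased upward.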
Due to this stochastic loss of alleles after as few as three generations, these newly-founded laboratory populations diverged in allelic richness, though not in gene diversity. In particular, there was a clear effect of year, and year × location interaction, for allelic richness, mainly due to the low values for the three replicate populations derived from the Arrábida foundation in 2001. This was also the foundation with the highest decline in allelic richness (6.71% per generation) from initial collection to generation three, and the foundation with the slowest subsequent rate of adaptation to laboratory conditions (Simões et al. 2008b; Santos et al. 2012). Yet, this was initially one of the most genetically variable founder populations, despite having the smallest effective population size. Drift effects may explain all these observations, in that a stronger bottleneck effect during the first laboratory generations of these populations may have led to a higher loss of low frequency alleles with a corresponding but smaller impact on gene diversity (Nei et al. 1975). Differences in the decline of allelic richness were also found between the two locations in 2005, though not between the two foundations from the same location. This particular case suggests that initial founder effects involving different locations or years can inflate later divergence during laboratory evolution.
Thus, our results illustrate the point that multiple populations which derive from the same source population may develop genetic differentiation quickly as a result of an initial bottleneck (cf. Wade and McCauley 1988; Charlesworth 2009), particularly if gene flow from the source populations or among the derived populations does not occur after introduction (Boileau et al. 1992; Dlugosch and Parker 2008). Our data thus demonstrate the relevance of stochastic effects for the genetic variability of neutral markers during the establishment of new populations, which in turn provides an important 'null model' for the interpretation of observed genetic changes in terms of adaptive evolution (Keller et al. 2009) when changes in population structure occur.

Do F ST and D est tell the same story?
F ST and related measures may underestimate genetic differentiation when highly polymorphic molecular markers, such as microsatellites, are used, leading to the development of alternative estimators such as D est (Jost 2008). Other problems may arise, however, when estimating genetic differentiation from D est, particularly when the goal is to describe the average amount of differentiation observed over multiple loci (e.g. see Ryman and Leimar 2009, also Leng and Zhang 2011). The fact that F ST may increase as polymorphism drops is relevant to our study, since genetic drift is expected to cause a drop in gene diversity, which may also inflate the expected increase in genetic differentiation between populations. Here we found a high concordance between differentiation measured by F ST and D est across all population pairs (see figure 2), despite the lower statistical significance of the latter estimates. Thus, the two measures of genetic differentiation tell a similar story with respect to early differentiation among our populations. We will chiefly base our discussion on F ST estimates.
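The sensitivity of F ST-type measures to within-population polymorphism can be seen in a small numerical example. The sketch below computes the parametric forms G ST = (H T - H S)/H T and Jost's D = [(H T - H S)/(1 - H S)] × n/(n - 1) for equally weighted populations; it omits the sample-size corrections that define the estimator D est, and the allele frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def gst_and_jost_d(pops):
    """Return (G_ST, Jost's D) for a list of populations.

    Each population is a dict mapping allele name -> frequency.
    Populations are weighted equally; no sampling-bias correction."""
    n = len(pops)
    alleles = set().union(*pops)
    h_s = sum(heterozygosity(pop.values()) for pop in pops) / n
    mean_freqs = [sum(pop.get(a, 0.0) for pop in pops) / n for a in alleles]
    h_t = heterozygosity(mean_freqs)
    g_st = (h_t - h_s) / h_t
    jost_d = (h_t - h_s) / (1.0 - h_s) * n / (n - 1)
    return g_st, jost_d


# Two hypothetical populations, each with 10 equally frequent alleles
# and no alleles in common: complete differentiation in allelic identity.
pop_a = {f"a{i}": 0.1 for i in range(10)}
pop_b = {f"b{i}": 0.1 for i in range(10)}

g_st, jost_d = gst_and_jost_d([pop_a, pop_b])
print(f"G_ST = {g_st:.4f}, Jost's D = {jost_d:.4f}")
```

Although the two populations share no alleles, G ST is only about 0.05 because H S is already 0.9, whereas D reaches 1. As drift lowers within-population diversity, G ST has more room to rise, which is the effect noted above.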

Disentangling sources of differentiation between wild and laboratory populations
After three generations of culture in the laboratory environment, populations derived in 2001 from Sintra foundations were highly differentiated genetically from populations derived from Arrábida (Simões et al. 2008a). We also detected genetic differentiation at the third generation between the two sets of populations collected in the same location in 2005. Our present results suggest that such differences between foundations could have been caused by sampling effects during the collection from the wild, a problem which has been emphasized by the authors of other studies (Waples 1998; Keller and Taylor 2008). If this is generally the case, laboratory populations derived from the wild may not be representative of their source population, leading to overestimation of differentiation between natural populations. Nevertheless, as we have found no significant differentiation between different sets of founders collected a few days apart at the same location, we conclude that overestimation of genetic differentiation due to initial-collection sampling effects is not relevant when analysing as few as 30 individuals from natural populations. Our findings suggest instead that genetic differentiation is more likely to occur during the first generations of evolution in the laboratory due to the effects of low effective population size. This fits the model of Wade and McCauley (1988) for evolution after colonization of a new patch. According to their model, both the number of founders and subsequent effective population size (N e) across generations are major factors causing genetic differentiation between independent recently founded colonies, with the first factor having a stronger weight. Applying the authors' equations (see equations 1-3, p. 998 in Wade and McCauley 1988) to our data for the number of founders and effective population sizes, we obtained similar values to those observed in our microsatellite analysis (table 9 in electronic supplementary material).
We conclude that the pairwise differentiation we encountered at generation three is easily explained by neutral drift effects during the first three generations of laboratory culture. However, the fact that these populations are adapting to a novel lab environment has probably augmented drift effects, as high variance in reproductive success can result in very low effective population size even in quickly expanding populations (Hedrick 2005).
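To give a sense of the magnitude of differentiation that pure drift can produce in so few generations, the sketch below uses the textbook recursion for the expected fixation index among isolated replicate populations drawn from one source: a founding sample of k diploid individuals followed by t generations at effective size N e gives F = 1 - (1 - 1/(2k)) × Π (1 - 1/(2 N e,i)). This is a simplified stand-in, not a transcription of the Wade and McCauley (1988) equations, and the founder numbers and N e values are hypothetical placeholders rather than estimates from this study.

```python
def expected_fst(n_founders, ne_per_generation):
    """Expected F_ST among isolated replicate populations after a founding
    sample of `n_founders` diploids and one generation of drift at each
    effective size listed in `ne_per_generation` (no migration, no selection)."""
    retained = 1.0 - 1.0 / (2 * n_founders)       # founding sample
    for ne in ne_per_generation:
        retained *= 1.0 - 1.0 / (2 * ne)          # one generation of drift
    return 1.0 - retained


# Hypothetical scenarios: same founders, different early N_e.
fst_moderate = expected_fst(n_founders=100, ne_per_generation=[50, 50, 50])
fst_severe = expected_fst(n_founders=100, ne_per_generation=[15, 15, 15])

print(f"N_e = 50 for 3 generations: expected F_ST = {fst_moderate:.4f}")
print(f"N_e = 15 for 3 generations: expected F_ST = {fst_severe:.4f}")
```

With these placeholder values, lowering N e from 50 to 15 roughly triples the expected differentiation after three generations, illustrating how high variance in reproductive success could augment drift even when census numbers are large.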

Is there a signature of history in the initial laboratory evolution of differentiation?
Though no consistent differentiation was observed between the six initial founding samples, by the third generation the derived populations had significantly differentiated from other founders but not from their own founding sample, except for the case of the AR 1−3 populations previously discussed. These results suggest the presence of a signature of the founders on the gene pool of some of the derived populations at generation three. Thus, although sampling effects did not lead to statistically inferable differentiation at the time of first collection, genetic differentiation aligning each new laboratory population with its founder nonetheless arose. This was probably because founder populations differ enough in lower-frequency alleles, which are lost preferentially during the first generations of laboratory evolution, to have later effects.

The impact of initial genetic drift on subsequent responses to selection
We have concluded thus far that genetic sampling effects during a founding event can have repercussions on the genetic variability of newly-founded populations. These evolutionary changes can be misleading if taken as a signature of selection. Moreover, these chance effects can impact the subsequent evolutionary potential of populations, if loss of genetic variability due to bottlenecks affects additive genetic variation in traits relevant for fitness. As an illustration of possible associations between drift events and subsequent response to selection, it is interesting to note that the 2001 Arrábida populations, which had (i) the lowest N e values, (ii) the fastest decline of genetic variability, as well as (iii) the greatest genetic differentiation after three generations of laboratory culture, were also the populations that had the slowest subsequent rate of change in the most relevant fitness traits during laboratory evolution (Simões et al. 2007, 2008b). This leads us to suggest that the differences in the experimental evolution of life-history characters inferred in our earlier publications might have been due to stochastic sampling effects that arose in the first few generations of laboratory culture, and not, as we concluded before, due to differences between the wild source populations (cf. Simões et al. 2008b). Our finding of a significant association between early genetic variability and subsequent laboratory adaptation supports this scenario (see Santos et al. 2012).

General significance for conservation and experimental biology
Most biologists and many conservationists do not reflect much on the evolutionary genetic upheaval that taking individuals out of wild populations necessarily causes. Here we have shown that this upheaval begins immediately within the first three generations of sampling from the wild. In particular, even if a reasonably large sample of organisms is first collected, we have shown here that the much smaller N e values that will almost always prevail in laboratories and zoos are likely to cause an immediate evolutionary shift away from the population-genetic state of the initial sample.
Some might argue that equalization of family contribution, frequently recommended in captive breeding programmes with conservation purposes, might have led to contrasting results. Such a management procedure is expected to approximately double the effective population size in a simple neutral model, while further benefiting N e by removing the between-family component of selection (Rodríguez-Ramilo et al. 2006). In particular, this method has been shown to improve founder representation in the initial generations (Loebel et al. 1992). Nevertheless, a big contrast between the census size of the natural population and the recent colonizer must still occur, leading to the expected divergence between populations in the early steps of colonization.
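The approximate doubling of N e under equalized family contributions follows from the standard expression relating effective size to the variance in family size, N e = (4N - 2)/(V k + 2), which holds for a diploid population of constant size N (mean of two offspring per parent). The sketch below evaluates it for a hypothetical census size of 100; the variance values are illustrative assumptions, not measurements from these populations.

```python
def effective_size(census_n, family_size_variance):
    """Effective size for a diploid population of constant census size,
    N_e = (4N - 2) / (V_k + 2), where V_k is the variance in the number
    of offspring contributed per parent (mean fixed at 2)."""
    if family_size_variance < 0:
        raise ValueError("variance cannot be negative")
    return (4 * census_n - 2) / (family_size_variance + 2)


N = 100  # hypothetical census size

ne_equalized = effective_size(N, 0.0)    # every parent contributes exactly 2
ne_poisson = effective_size(N, 2.0)      # random (Poisson) contributions
ne_skewed = effective_size(N, 10.0)      # highly unequal reproductive success

print(f"equalized families: N_e = {ne_equalized:.1f}")
print(f"Poisson variance:   N_e = {ne_poisson:.1f}")
print(f"skewed variance:    N_e = {ne_skewed:.1f}")
```

Equalization gives N e close to 2N, random contributions give N e close to N, and strongly skewed reproductive success pushes N e well below the census size, which is the situation suspected during adaptation to the novel laboratory environment.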
A different procedure that might reduce the decoupling between the natural population and derived colonizers, at least for the purpose of scientific inference, is to create isofemale lines during the first generations of the founding event. While this is a common practice in order to preserve molecular genetic variability over the entire ensemble of isofemale lines (e.g. Kauer et al. 2003), it is unclear what impact such intense inbreeding might have on polygenic traits (Hoffmann and Parsons 1988). Thus, creating isofemale lines too could generate potentially serious problems of its own even for the purpose of scientific inference.
This makes laboratory populations inherently suspect as guides to the properties of wild populations, even their proximally ancestral wild populations. Contrary to the assertions or hopes of Harshman and Hoffmann (2000), we do not feel that many specific inferences about the properties of wild populations are likely to be obtainable from the study of their laboratory derivatives (vid. Matos et al. 2000), just as many results obtained from specific laboratory evolution experiments may not generalize to other laboratory populations (e.g. Rose et al. 2005). Thus laboratory populations, in our opinion, are best viewed as specific instances of a broad range of possible evolving populations. This makes them useful for strong-inference tests of wide-ranging Popperian scientific theories, but not notably useful as material for the inductive study of innumerable properties of the wild populations from which they were derived.
But, more constructively, studies like the present one provide an opportunity to study the evolutionary possibilities that can arise from vicariance events and large-scale dispersal to novel environments. While the study of the colonization of the New World by a European D. subobscura population has provided an important window on the long-term patterns of such events (Gilchrist et al. 2004; Pascual et al. 2007), experimental evolutionary studies allow us to look in much greater detail at colonization events, with respect to both the evolution of components of fitness (vid. Simões et al. 2007, 2008b) and, as provided here, the population genetics at a fine scale of temporal resolution. With the application of genomewide sequencing to such experimental evolution, as well as greater replication of source and derived populations, we may soon be able to determine experimentally just how such key evolutionary processes as adaptation and speciation work genetically as population structure ramifies across geographical landscapes.