LIVESTOCK TRADE HISTORY, GEOGRAPHY, AND PARASITE STRAINS: THE MITOCHONDRIAL GENETIC STRUCTURE OF ECHINOCOCCUS GRANULOSUS IN ARGENTINA

A sample of 114 isolates of Echinococcus granulosus (Cestoda: Taeniidae) collected from different host species and sites in Argentina has been sequenced for 391 bp from the mitochondrial cytochrome c oxidase subunit I gene to analyze genetic variability and population structure. Nine different haplotypes were identified, 5 of which correspond to already characterized strains. Analysis of molecular variance and nested clade analysis of the distribution of haplotypes among localities within 3 main geographic regions indicate that geographic differentiation accounts for the overall pattern of genetic variability in E. granulosus populations. Significant geographic differentiation is also present when the sheep strain alone is considered. Our results suggest that geographic patterns are not due to actual restricted gene flow between regions but are rather a consequence of past history, probably related to the time and origin of livestock introduction in Argentina.

Echinococcus (Cestoda: Taeniidae) includes a group of 4 species, i.e., Echinococcus vogeli, E. oligarthrus, E. multilocularis, and E. granulosus. They require 2 hosts to complete their cycles, i.e., a carnivore, in which the adult occurs, and a herbivore, in which the metacestode develops. Hermaphroditic adult parasites are believed to reproduce mainly by self-fertilization (Thompson, 1995), although there is some evidence of cross-fertilization (Lymbery et al., 1997; Haag et al., 1998). The virtual absence of outcrossing, associated with the great asexual proliferation potential of the metacestode, may lead to a high degree of strain differentiation, which is well known in at least 1 of the species, i.e., E. granulosus (Thompson and McManus, 2002).
Echinococcus granulosus strains have different intermediate host specificities. Whereas adults consistently infect canids, metacestodes from different strains appear to be adapted to distinct species of domestic and wild herbivorous hosts, including sheep, cattle, pigs, horses, and wild cervids. However, the degree of host specificity varies enormously among strains. The sheep strain is the least specific, using a wide range of intermediate host species, including several domestic ungulates, humans, and macropods, although fertile cysts (metacestodes undergoing active asexual reproduction) are more frequent in sheep. It is also the most geographically widespread E. granulosus strain, occurring on all continents. Some Echinococcus strains are highly divergent, as has been shown by genetic, morphologic, and developmental characters (see Eckert and Thompson, 1997 for a review), leading to the notion that they should be regarded as different species (Bowles et al., 1995; Thompson et al., 1995; Thompson and McManus, 2002).
Cystic hydatid disease (the infection caused by the metacestode of E. granulosus) is considered a reemerging zoonosis in several countries, such as Bulgaria, Kazakhstan, and the Republic of China (Eckert et al., 2000). Although South America has several endemic areas (Cabrera et al., 1995; Schantz et al., 1995; Eckert et al., 2000), little is known about parasite genetic variation in these regions. In Argentina, where most of the present data have been collected, hydatid disease is endemic in several provinces, affecting not only livestock such as cattle (7% infected), sheep (12.5%), pigs (9.8%), and goats (6.0%) (data provided by Servicio Nacional de Sanidad y Calidad Agropecuaria [SENASA], 1987-1996) but also wild animals such as hares (Schantz and Lord, 1972). In some regions of Neuquen, Chubut, and Rio Negro Provinces, human prevalences of 6,000-14,900 per 100,000 inhabitants have been reported (Larrieu et al., 1999). Kamenetzky et al. (2002) have shown great genetic variability among 147 E. granulosus isolates collected from different host species distributed through the south of South America, but mainly within Argentina. Our interpretation suggests the presence of at least 5 strains plus some minor genetic variants. In this article, we analyzed a 391-bp-long fragment of the mitochondrial cytochrome c oxidase subunit I (CO1) from the same Argentine isolates within a population genetics framework. Our goals were to evaluate the genetic variability and population structure of E. granulosus in the region and to ascertain some of their possible causes.

Sampling and sequencing
One hundred and fourteen E. granulosus parasites (101 metacestodes and 13 adults) were isolated from different hosts and provinces in Argentina. For some purposes, isolates were grouped into 3 geographic areas (south, central, and north; see Fig. 1), according to the information received from field workers during the collection of parasites from their hosts. The distribution of isolates among hosts and geographic areas is shown in Table I. After DNA extraction, a 391-bp segment of the mitochondrial CO1 gene was amplified by polymerase chain reaction (PCR), purified, and sequenced automatically, as described previously (Kamenetzky et al., 2002).
In this article, the term "strain" always refers to a genetically homogeneous group of isolates belonging to the same species, i.e., E. granulosus; "haplotype" designates the mitochondrial sequence used to discriminate strains, and "variants" are haplotypes with minor genetic differences within a strain.

Statistical analyses
Sequences were aligned using the Pileup program from the GCG Package (Genetics Computer Group, Version 9.1). Polymorphism estimates were calculated with Arlequin Version 2.0 (Schneider et al., 2000). The total mitochondrial CO1 genetic variability in the whole sample of 114 isolates and also in some subsamples (see Results) was tested for significant genetic differentiation among geographic regions. For this, we used the analysis of molecular variance (AMOVA) statistics (Excoffier et al., 1992) implemented in Arlequin Version 2.0 (Schneider et al., 2000). A second approach was followed to decompose the total genetic variability, using nested clade analysis (reviewed in Templeton, 1998). Networks were estimated by a statistical parsimony procedure using the TCS Version 1.13 software (Clement et al., 2001), nested by the rules described in Crandall (1996), and tested for geographical associations with the program GeoDis Version 2.0 (Posada et al., 2000), using the geographic coordinates of the 36 localities shown in Figure 1.

FIGURE 1. Geographic distribution of the 36 Echinococcus granulosus sampling sites (indicated by stars). Their geographic coordinates were used in the nested clade analysis, but for other purposes, the 36 sites were grouped into 3 "populations", delimited by their latitude: South > 37°S 00′ > Central > 28°S 50′ > North.

FIGURE 2. Multiple alignment of the 47 polymorphic sites among the 9 CO1 sequences obtained in this study. Haplotypes corresponding to the already known strains are represented by G1-G7 as in Bowles et al. (1992), and their minor variants are indicated by letters (G1a-G1c, and G7a). Vertically oriented numbers indicate the site position.
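The latitude rule given in the legend of Figure 1 for grouping sites into 3 populations can be illustrated with a minimal Python sketch. The function name, the treatment of sites lying exactly on a boundary, and the example latitudes are assumptions made for illustration; they are not taken from the sampling data.

```python
def region_for_latitude(lat_south_deg: float) -> str:
    """Assign a site to South, Central, or North from its latitude in degrees south."""
    south_limit = 37.0              # 37°S 00' (boundary between South and Central)
    north_limit = 28.0 + 50 / 60    # 28°S 50' (boundary between Central and North)
    if lat_south_deg > south_limit:
        return "South"
    if lat_south_deg > north_limit:
        return "Central"
    # Assumption: a site exactly on a boundary falls in the more northern group.
    return "North"


# Hypothetical site latitudes (degrees south), for illustration only.
example_sites = {"site_a": 40.0, "site_b": 33.0, "site_c": 26.8}
example_regions = {name: region_for_latitude(lat) for name, lat in example_sites.items()}
```

Latitudes are expressed here as positive degrees south, so larger values lie further south.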

RESULTS
Five of the 9 CO1 sequences found in the whole sample are identical to already known haplotypes (Bowles et al., 1992), corresponding to 5 distinctive E. granulosus strains. The other 4 sequences included minor genetic differences and, therefore, were referred to as variants of the 5 already known haplotypes (see Fig. 2 and Table II for details). Our data show that whereas the southern and central populations have a predominance of the sheep strain G1 haplotype, the northern population has a higher frequency of haplotypes G1c (sheep strain variant) and G2 (Tasmanian sheep strain). Moreover, no pig strain isolate was found in the north.
Polymorphism estimates were calculated not only from the whole sample but also from 2 subsamples comprising (1) all isolates collected from human hosts and (2) all haplotypes related to the sheep strain (G1, G1a, G1b, G1c, and G2). The results are shown in Table III. The whole sample comprises few, but divergent, haplotypes, representing the strains (see also Table II). When the major haplotype differences due to strain differentiation are excluded from the analysis, i.e., considering only the sheep strain subsample, nucleotide diversities decrease by 1-2 orders of magnitude and haplotype diversities are reduced to about half. However, it is interesting to note that the northern population appears to be less affected when only the sheep strain subsample is considered. Human hosts from all over Argentina harbor a great variety of parasites. The polymorphism estimates of the human subsample (n = 63) are comparable with those obtained for the entire sample (n = 114). Indeed, humans are potential hosts for most E. granulosus strains found in this study, with the exception of the pig strain.
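For readers unfamiliar with the 2 polymorphism measures discussed above, the following Python sketch shows how haplotype diversity and nucleotide diversity are commonly computed from a set of aligned sequences. It assumes the standard unbiased estimators (haplotype diversity as n/(n − 1) times 1 minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean proportion of differing sites over all pairs of sequences). The estimates in Table III were obtained with Arlequin, not with this code, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    freqs = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical 4-bp toy sample: 3 copies of one haplotype, 1 copy of another.
toy_sample = ["AAAA", "AAAA", "AAAA", "AATT"]
toy_h = haplotype_diversity(toy_sample)
toy_pi = nucleotide_diversity(toy_sample)
```

The sketch illustrates why removing the divergent strains lowers nucleotide diversity more than haplotype diversity: the former depends on how many sites separate haplotypes, whereas the latter depends only on their frequencies.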
As a first approximation to the mitochondrial genetic structure of E. granulosus in Argentina, we ran AMOVA using the same sample and subsamples of isolates described previously. As shown in Table IV, geographic differentiation in Argentina is high, but some within-region substructuring occurs (most variation is found within populations), which might represent the presence of several divergent strains. However, we believe that the highly significant FST values are not merely related to the use of different host species in each geographic region because when only the sheep strain is considered, FST becomes even higher.
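The logic of partitioning variance among and within populations can be sketched in Python as a simplified, single-level AMOVA. This is an illustrative assumption-laden sketch, not the Arlequin procedure used for Table IV: it uses the number of pairwise nucleotide differences as the squared distance between haplotypes, considers only 1 hierarchical level, performs no permutation test, and does not guard against samples with no variation at all. The toy populations are hypothetical.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def sum_of_squares(seqs):
    """Sum of squared deviations: pairwise distances summed over pairs, divided by n."""
    total = sum(pairwise_differences(a, b) for a, b in combinations(seqs, 2))
    return total / len(seqs)


def phi_st(populations):
    """Single-level AMOVA: proportion of variance found among populations."""
    pooled = [s for pop in populations for s in pop]
    n_total, n_pops = len(pooled), len(populations)
    ssd_within = sum(sum_of_squares(pop) for pop in populations)
    ssd_among = sum_of_squares(pooled) - ssd_within
    msd_within = ssd_within / (n_total - n_pops)
    msd_among = ssd_among / (n_pops - 1)
    # Average sample size, corrected for unequal population sizes.
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0
    return var_among / (var_among + msd_within)


# Hypothetical toy data: fully differentiated versus well-mixed populations.
fixed = [["AAAT"] * 3, ["AAAA"] * 3]
mixed = [["AAAA", "AAAA", "AATT"], ["AATT", "AATT", "AAAA"]]
phi_fixed = phi_st(fixed)
phi_mixed = phi_st(mixed)
```

In the toy example, populations that share no haplotypes give a value of 1, whereas populations with similar haplotype compositions give a value near or below 0.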
To further evaluate the geographic genetic structure of E. granulosus populations in Argentina and in an effort to identify the causes of genetic differentiation, we conducted a nested clade analysis (Fig. 3). First, we obtained a network in which the number of steps (connections) was calculated according to a statistical 95% confidence parsimony criterion. Connections between haplotypes requiring more than 8 steps have a lower than 95% parsimony probability and, therefore, were excluded from the total network, resulting in 3 independent subnetworks (clades 2-1, 2-2, and 2-3). Clade 2-1 includes the sheep and Tasmanian sheep strains, clade 2-2 the camel and pig strains, and clade 2-3 is the cattle strain.
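The effect of the connection limit on the network can be illustrated with a short Python sketch. It only mimics the thresholding step: haplotypes are linked when they differ by no more than a given number of steps, and the resulting connected groups correspond to independent subnetworks. This is a simplification under stated assumptions; TCS estimates the full network, including inferred intermediate haplotypes, and derives the limit from the 95% parsimony criterion. The haplotype names, sequences, and limit used below are hypothetical.

```python
from itertools import combinations


def differences(a, b):
    """Number of mutational steps (differing sites) between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def subnetworks(haplotypes, limit):
    """Group haplotypes that can be linked through connections of at most `limit` steps."""
    names = list(haplotypes)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if differences(haplotypes[a], haplotypes[b]) <= limit:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical 8-bp haplotypes; a limit of 2 steps is used only for this toy case.
toy_haplotypes = {
    "h1": "AAAAAAAA",
    "h2": "AAAAAAAT",
    "h3": "TTTTAAAA",
    "h4": "TTTTAAAC",
}
toy_groups = subnetworks(toy_haplotypes, limit=2)
```

With the toy data, h1 and h2 form one group and h3 and h4 another, because every connection between the 2 groups exceeds the limit.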
Second, in an attempt to identify the causes of these geographic associations (historical vs. demographic causes), we performed a distance analysis of the nested cladogram. Briefly, the geographic coordinates of each sampling location were determined (see Fig. 1), and the network clades were separated by their specific location. Two geographic distance estimates were then calculated: (1) Dc, or clade distance, which indicates how geographically widespread the individuals bearing haplotypes from a particular clade are and (2) Dn, or nested clade distance, representing how far individuals from 1 particular clade are from all other clades of a nested category (Templeton et al., 1995). If some geographic restriction exists, and if it is due exclusively to small-distance dispersal on a generation basis, then all significantly restricted clades will be geographically close to their evolutionary neighbors. On the other hand, if the restriction is due to long-distance or historical movements, then the restricted clades can be found geographically far away from some of their evolutionary sister clades. Long-distance movements are inferred when there is a significant discrepancy in the patterns of Dc versus Dn distances, whereas a pattern of concordance between these 2 measures implies short-distance movements.
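The 2 distance measures can be made concrete with the following Python sketch, which follows the usual formulation of Templeton et al. (1995): the clade distance is the mean distance of the individuals of a clade from the geographic center of that clade, and the nested clade distance is their mean distance from the center of the whole nesting clade. Several details are assumptions made for illustration: the center is taken as the simple mean of latitude and longitude, distances are great-circle distances on a sphere of radius 6,371 km, and the coordinates are hypothetical rather than those of the sampled localities.

```python
from math import asin, cos, radians, sin, sqrt


def great_circle_km(p, q):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def center(points):
    """Geographic center, approximated as the mean latitude and mean longitude."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def clade_distances(clade_points, nesting_points):
    """Return (Dc, Dn) for one clade; each point is the location of one individual."""
    own_center = center(clade_points)
    nesting_center = center(nesting_points)
    dc = sum(great_circle_km(p, own_center) for p in clade_points) / len(clade_points)
    dn = sum(great_circle_km(p, nesting_center) for p in clade_points) / len(clade_points)
    return dc, dn


# Hypothetical locations: a tip clade found at a single northern site,
# nested together with an interior clade sampled across 3 sites.
tip = [(-27.0, -66.0)] * 3
interior = [(-39.0, -68.0), (-34.0, -64.0), (-27.0, -66.0)]
tip_dc, tip_dn = clade_distances(tip, tip + interior)
```

In this toy case the tip clade has a clade distance of 0 but a nested clade distance of several hundred kilometers, which is the kind of discrepancy between the 2 measures described above.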
As shown in Table V, significantly small clade (Dc) distances occur for G1c, 1-2 (G2), and G7a, which are all tip clades in their nested categories (see Fig. 3), whereas a significantly small nested clade distance (Dn) occurs for the average interior minus average tip clade distance (I − T) within category 2-1. A significantly large nested clade (Dn) distance exists for G2 and for the average interior minus average tip (I − T) clade distances (Dc) of 2-1 and 2-2. All other estimated distances are not statistically significant, as tested by a randomization procedure based on 1,000 resamples.
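The idea behind the randomization procedure can be sketched in Python as a permutation test: clade labels are shuffled among the individuals of a nesting clade, the distance statistic is recomputed for each shuffle, and the proportion of shuffles giving a value as small as the observed one is taken as the probability of a significantly small distance. The sketch is a simplified illustration, not the GeoDis implementation. It uses the mean absolute deviation of latitude as a 1-dimensional stand-in for the clade distance, and the latitudes, labels, and seed are hypothetical.

```python
import random


def spread(latitudes):
    """Mean absolute deviation from the mean latitude (a 1-D stand-in for Dc)."""
    mean = sum(latitudes) / len(latitudes)
    return sum(abs(x - mean) for x in latitudes) / len(latitudes)


def p_value_small(latitudes, labels, clade, resamples=1000, seed=1):
    """Proportion of label shuffles whose statistic is <= the observed one."""
    rng = random.Random(seed)

    def statistic(current_labels):
        return spread([lat for lat, lab in zip(latitudes, current_labels) if lab == clade])

    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(resamples):
        rng.shuffle(shuffled)  # reassign clade labels to locations at random
        if statistic(shuffled) <= observed:
            hits += 1
    return observed, hits / resamples


# Hypothetical nesting clade: 3 tip individuals at one site, 7 interior individuals elsewhere.
toy_latitudes = [-27.0, -27.0, -27.0, -31.0, -33.5, -35.0, -38.0, -39.5, -41.0, -43.0]
toy_labels = ["tip"] * 3 + ["interior"] * 7
toy_observed, toy_p = p_value_small(toy_latitudes, toy_labels, "tip")
```

Because only 1 of the 120 possible placements of the 3 tip labels reproduces the observed clustering, the estimated probability in this toy case is small.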

DISCUSSION
The intraspecific genetic variability of flatworms has been little explored. Most genetic studies have focused on strain differences because of their implications in parasite diagnostics and disease control. Moreover, most of our knowledge of E. granulosus genetic variability comes from Australian populations, with a long history of control programs (Thompson and McManus, 2002). One of our previous studies in E. granulosus populations of Argentina (Rosenzvit et al., 1999) showed, by mitochondrial DNA sequences and ribosomal DNA PCR-restriction fragment length polymorphism, the presence of 4 distinct strains in human and domesticated intermediate hosts, namely the sheep, Tasmanian sheep, camel, and pig strains. Recently, a larger dataset has allowed us to confirm the previous results (Kamenetzky et al., 2002) and to find additional variation by sequencing a part of the mitochondrial CO1 gene. In this article we report the analysis of population genetic structure based on these data. Some of our findings are not new. Thus, the existence of more than 1 E. granulosus strain in a single geographic area has already been reported, e.g., in northwestern China, where the sheep and camel strains were identified (Zhang et al., 1998), and in southern Brazil, where the simultaneous occurrence of sheep and cattle strains was shown (Haag et al., 1998). Although our isolates were grouped somewhat arbitrarily into geographic regions, which do not represent real populations in the genetic sense, such a high number of different haplotypes and a marked geographic differentiation have never been reported for E. granulosus. Lymbery et al. (1997) suggest that a possible cause for the lack of genetic structure in E. granulosus populations from Australia is the high mobility of its hosts. The superposition of a sylvatic cycle (maintained by dingos and macropods) and a domestic cycle (with dogs and mainly sheep as hosts) could also contribute to the genetic homogeneity of populations by spreading parasite genes throughout the continent.
In other macroparasites for which population genetic data have been obtained, both the absence and the presence of geographic structure have been explained by the extent of gene flow. Restricted gene flow in genetically structured populations has been inferred from the sedentary habits of hosts such as soil-dwelling insects (Blouin et al., 1999) and snails (Dybdahl and Lively, 1996; Sire et al., 2001), or from their limited dispersal capacities, as in deer compared with domestic ruminants (Blouin et al., 1995). The absence of correlation between geographic and genetic distances for the CO1 gene in populations of hookworms from China was attributed to uneven human host movements and variable effective sizes (Hawdon et al., 2001). Even though historical and demographic hypotheses have been suggested to explain population genetic patterns, no statistical tool has been used to discriminate between them.
Three causes might explain the geographic pattern of E. granulosus genetic differentiation in Argentina: (1) regional differences in livestock, (2) actual restricted gene flow, and (3) historical causes, i.e., variations in time of domestic animal (along with their parasites) introductions and regional peculiarities in animal trade or husbandry practices during the last few centuries. The first cause can never be ruled out because regional differences indeed exist. For example, most of our pig strain isolates are found in the central region (see Table II), where pigs are traditionally raised in Argentina. But this does not account for the significant geographic differentiation found within the sheep strain, unless the sheep strain variants are adapted to different livestock species or strains. We did not consider natural selection on CO1 as a plausible explanation because the nucleotide differences present in the sheep strain haplotype variants do not lead to amino acid substitutions, except for G1b, in which an isoleucine is replaced by a valine. Because the mitochondrial genome is inherited as a single unit, these substitutions could nevertheless be favored by hitchhiking and, for simplicity, we define this possibility as a historical cause, together with random processes such as bottleneck events, or human-directed causes such as parasite introduction with domestic animals. The geographic distribution of CO1 haplotypes in this study shows that the derived haplotypes G1c and G2 are restricted to the northern population, whereas the ancestral G1 haplotype is more frequent in the central and southern regions. These geographic associations could be best explained by historical causes, as indicated by the discrepancy between Dc and Dn of these tip clades in the nested clade analysis (see Table V).
Moreover, the significantly large Dc and small Dn for the average interior minus average tip clades (I − T) within category 2-1 strongly support the idea that the derived clades occur far from their ancestor. If these regional differences are not a mere sampling artifact, they imply long-distance, historical movements. Most of our northern samples come from an isolated mountainous region called Valle del Tafi and Cumbres Calchaquies, where livestock was introduced a long time ago. These domestic animals probably remained isolated for decades because of their low economic value. Animal husbandry in this region is used exclusively for subsistence.
It would be difficult to date exactly when that isolation occurred and which host species or strains (together with their parasite variants) have been introduced, but there is a fair possibility that the E. granulosus G2 haplotype came with sheep imported from Australia at the beginning of the 20th century. The G1c variant could have arisen by mutation from the common sheep strain haplotype (G1), which might have been introduced with European cattle much earlier (the earliest cattle trades occurred in the 16th century), and increased in frequency by genetic drift or selection acting on other mitochondrial genes. The frequency of genetic variants can increase rapidly in Echinococcus spp. populations because of their asexual amplification in the metacestode stage, followed by egg dispersion via domestic or wild dog movements. Apparently, because neither dogs nor ruminants from this isolated region disperse to the south, the range of these parasite variants remains restricted.
The explanation for the restricted G7a (pig strain variant) haplotype distribution is less clear. It may be mainly related to the high intermediate host specificity of this strain and to some geographic peculiarities of pig husbandry in Argentina. An analysis of a larger set of pig strain isolates would be necessary to confirm the geographic association and to fully understand its causes. We are now working on the development of nuclear microsatellite markers, which might be useful to analyze population structure and transmission within single strains. Studying the pattern and the causes of the apparently complex genetic structure of E. granulosus will permit a better understanding of the disease process and diagnosis as well as enable control measures adapted to each parasite strain.

ACKNOWLEDGMENTS
Thanks to PADCT, FAPERGS, CNPq, CAPES, CBAB, CABBIO, and CONICET for supporting our work. We are also indebted to Aldo Mellender de Araújo, John Hawdon, and Don McManus for the review of this manuscript.