Phylogeny of Drosophila and related genera inferred from the nucleotide sequence of the Cu,Zn Sod gene

The phylogeny and taxonomy of the drosophilids have been the subject of extensive investigations. Recently, Grimaldi (1990) has challenged some common conceptions, and several sets of molecular data have provided information not always compatible with other taxonomic knowledge or consistent with each other. We present the coding nucleotide sequence of the Cu,Zn superoxide dismutase gene (Sod) for 15 species, which include the medfly Ceratitis capitata (family Tephritidae), the genera Chymomyza and Zaprionus, and representatives of the subgenera Dorsilopha, Drosophila, Hirtodrosophila, Scaptodrosophila, and Sophophora. Phylogenetic analysis of the Sod sequences indicates that Scaptodrosophila and Chymomyza branched off the main lineage before the major Drosophila radiations. The presence of a second intron in Chymomyza and Scaptodrosophila (as well as in the medfly) confirms the early divergence of these two taxa. This second intron became deleted from the main lineage before the major Drosophila radiations. According to the Sod sequences, Sophophora (including the melanogaster, obscura, saltans, and willistoni species groups) is older than the subgenus Drosophila; a deep branch splits the willistoni and saltans groups from the melanogaster and obscura groups. The genus Zaprionus and the subgenera Dorsilopha and Hirtodrosophila appear as branches of a prolific “bush” that also embraces the numerous species of the subgenus Drosophila. The Sod results corroborate in many, but not all, respects Throckmorton's (King, R.C. (ed) Handbook of Genetics. Plenum Press, New York, pp. 421–469, 1975) phylogeny; are inconsistent in some important ways with Grimaldi's (Bull. Am. Museum Nat. Hist. 197:1–139, 1990) cladistic analysis; and also are inconsistent with some inferences based on mitochondrial DNA data. 
The Sod results manifest how, in addition to the information derived from nucleotide sequences, structural features (i.e., the deletion of an intron) can help resolve phylogenetic issues.


Introduction
The taxonomy and systematics of Drosophila have been the subject of many investigations. A few landmarks are the monographs by Sturtevant (1921), Patterson and Stone (1952), Throckmorton (1975), and Wheeler (1981, 1986). Important recent contributions include a cladistic and revisionist monograph by Grimaldi (1990), and several molecular studies, the most notable and inclusive of which is DeSalle and Grimaldi (1991). Throckmorton's (1975) assessment of previous taxonomic, phylogenetic, and biogeographic studies moved him to conclude that the genus Drosophila originated in the Old World tropics, probably in Asia. Throckmorton's other important conclusions include that (1) the first major radiation of the genus is represented by the subgenus Scaptodrosophila, primarily distributed throughout the Old World tropics from Africa to Australia and the Pacific, although some species groups (including subtilis and victoria) occur in the New World; (2) the radiation of the subgenus Sophophora (comprising the melanogaster, obscura, saltans, and willistoni groups) preceded the radiation of the subgenus Drosophila; (3) the genus Chymomyza is part of the Sophophora radiation; and (4) the genus Zaprionus emerged as part of the Drosophila subgenus radiation (which also includes the subgenera Hirtodrosophila and Dorsilopha). Grimaldi (1990) has carried out a cladistic analysis of morphological characters and produced a phylogeny that challenges Throckmorton's conclusions in important respects; in particular, Grimaldi places Chymomyza, Zaprionus, and Hirtodrosophila outside the lineage of the genus Drosophila. He also places Scaptodrosophila outside the Drosophila-genus lineage (thus agreeing with Throckmorton) and raises it (as well as Hirtodrosophila) to the genus category. DeSalle and Grimaldi (1991) as well as DeSalle (1992) have shown that molecular data (derived particularly from mitochondrial DNA) disagree with some of Grimaldi's (1990) conclusions.
We present here the DNA coding sequence of the gene Sod (which codes for the Cu,Zn superoxide dismutase) in 15 species representing the drosophilid genera and subgenera just mentioned. Our results are largely consistent with the phylogenetic relationships proposed by Throckmorton (1975)--more so, in fact, than with those proposed by Grimaldi (1990) or DeSalle and Grimaldi (1991). The propitious discovery of a second intron, present in the medfly Ceratitis capitata (family Tephritidae) as well as in Scaptodrosophila and Chymomyza, places the latter two taxa outside the genus Drosophila. The absence of this second intron from Hirtodrosophila also locates the branching of this taxon after the split of Chymomyza from the genus Drosophila, thus contradicting the mtDNA-based conclusion of DeSalle and Grimaldi (1991) and DeSalle (1992).

CACCCTTGCCCAGATCATC; and O, ACGGAAGTCTAGAAGGGCTTTTTGGGCTTTGCCACCTG. Three additional oligonucleotides were used only for sequencing: I, GACATGCAGCCATTGGTGTTGTC; IR, GACAACACCAAYGGCTGCATGTC; and CR, CAAGGGTGGACACGAGCTGAGCAAG. The IR primer failed in three species (C. procnemis, Z. tuberculatus, and D. lebanonensis), for which it was replaced by IR140 (TGTACCTTCGGCACGTCTGG). In addition we used standard M13 sequencing primers. In some cases (nine species) primers that were different for different species (but are all represented by A, B, and D in Fig. 1) were designed using noncoding gene regions so as to sequence the coding fragments from both DNA strands. All compressions and ambiguities were resolved by multiple sequencing of both strands.
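The relationship between the two internal sequencing primers can be checked mechanically. The sketch below assumes that the spaces and hyphens in the printed primer sequences are typesetting artifacts, and that the primer listed immediately before IR reads GACATGCAGCCATTGGTGTTGTC once they are removed. Under those assumptions, IR is the reverse complement of that primer, with one degenerate position (Y = C or T). The function names are illustrative and are not taken from the software used in the study.

```python
# Minimal sketch: reverse-complement check for two sequencing primers.
# Assumption: the primer strings are the printed sequences with spaces
# and line-break hyphens removed.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# IUPAC codes needed here: each code maps to the bases it can stand for.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "N": {"A", "C", "G", "T"},
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def matches_degenerate(degenerate: str, concrete: str) -> bool:
    """True if every base of `concrete` is allowed by the IUPAC code at the
    same position of `degenerate`."""
    if len(degenerate) != len(concrete):
        return False
    return all(c in IUPAC[d] for d, c in zip(degenerate.upper(), concrete.upper()))


PRIMER_FORWARD = "GACATGCAGCCATTGGTGTTGTC"   # primer listed just before IR
PRIMER_IR = "GACAACACCAAYGGCTGCATGTC"        # IR, with degenerate Y

rc = reverse_complement(PRIMER_FORWARD)
print(rc)
print(matches_degenerate(PRIMER_IR, rc))
```

A match of this kind is what one would expect if the two primers were designed to read the same internal region from opposite strands.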
Computer-Assisted Sequence Analysis. DNA and protein sequences were assembled and analyzed using the Darwin package written by Mr. Robert Tyler from our laboratory. Phylogenetic analyses were made with the PHYLIP 3.4 and 3.5c package (Felsenstein 1989). The codon usage table was computed with the CODONS program (Lloyd and Sharp 1992).

Structure of the Sod Gene
The structure of the Sod gene is outlined in Fig. 1. In all species the coding sequence is interrupted after the 22nd codon by an intron 300-700 bp in length. Several Drosophila species that we had earlier sequenced exhibit no other intron; but a second short intron (< 100 bp), between codons 95 and 96, occurs in Chymomyza and in the medfly, Ceratitis capitata (Kwiatowski et al. 1992a,b), which belongs to a different dipteran family.
D. lebanonensis, a species of the subgenus Scaptodrosophila, also exhibits the second intron, which is, however, absent from Zaprionus tuberculatus as well as from all other Drosophila species now sequenced.

Materials and Methods
Species. The 15 species studied are listed in Table 1 Kwiatowski et al. (1992a).
DNA Preparation, Amplification, Cloning, and Sequencing. We prepared genomic DNA from about 10-20 flies following the method of Kawasaki (1990). The Sod gene was amplified by the high-fidelity PCR technique and cloned into plasmids (pUC19 or pUC21) (Kwiatowski et al. 1991b). Double-stranded DNA templates were sequenced as described earlier (Kwiatowski et al. 1992a). The primers for PCR amplification were designed by comparing available dipteran Sod se-
The noncoding regions are not shown in Fig. 2. They were not used for phylogenetic analysis because they are so highly diverse that their alignment becomes uncertain in many cases. The primer sequences are not shown either. The complete coding region is amplified by means of the N and O primers (see Fig. 1), which yield single or multiple PCR bands 1,150-1,850 bp in size. The sequences have been obtained by double-strand sequencing of single clones. This procedure fixes PCR-derived nucleotide errors. However, we have determined in a preliminary experiment (Kwiatowski et al. 1991b) that the cumulative transversion-plus-transition error generated by our procedures is 3 × 10⁻⁴. This rate would be expected to yield fewer than two erroneous nucleotide determinations in the whole data set given in Fig. 2, which is trivial compared to the average of about 100 bp differences between species pairs.
Table 2 gives the number of pairwise nucleotide differences out of the 439 bp sequenced in the 13 species for which the sequences are complete. The table also gives the number of inferred amino acid differences out of the 153 encoded by the gene. (The first seven amino acids covered by the N primer are identical in the six species for which sequences have been published, including C. capitata, except for C. amoena, which differs by one amino acid from all others.) Table 2 shows that D. lebanonensis is more different from Zaprionus and all other Drosophila species than any of these is from the rest. For example, the average number of nucleotide differences between D. lebanonensis and the other Drosophila species plus Zaprionus is 107, whereas it is less than 100 between any of these other species. This result is consistent with the presence, noted above, of a second intron in the D. lebanonensis Sod gene, which places Scaptodrosophila (as well as the genus Chymomyza, which also has the second intron) outside the subgenera Sophophora and Drosophila. Since the genus Zaprionus shares with all other Drosophila species the lack of the second intron, it can be concluded that Zaprionus is closer to the subgenera Sophophora and Drosophila than either Chymomyza or Scaptodrosophila is. The distances shown in   Table 1.
Table 3 gives the G + C content for the 343 bp sequenced in all 15 species (column labeled b), as well as for the complete set of nucleotides sequenced in 13 of the species (column a). The percent G + C of any given species is virtually identical in both sets. When all codon sites are taken into account, the G + C content is about 50% in all species, except melanogaster, simulans, subobscura, and virilis, for which it is about 60%. The difference in G + C content between the two sets of species becomes larger when only the third coding position is considered: the four exceptional species noted have 69-80% G + C, whereas the average is about 50% for the other species. (The G + C content is particularly low in Ceratitis, Chymomyza, and Zaprionus.) Variation in Sod G + C content among dipterans has been noted earlier (Kwiatowski et al. 1992a,b). Similar heterogeneity occurs in the Adh gene (Starmer and Sullivan 1989), and it reflects a bias in codon preferences that is generally encountered in well-expressed genes of other taxa, ranging from bacteria to humans (Sharp et al. 1988).
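The two G + C figures discussed above (all codon sites versus third codon positions only) can be computed as in the following sketch. It assumes an in-frame coding sequence that starts at the first base of a codon. The example sequence is a hypothetical toy string, not a Sod sequence from this study, and the function names are illustrative.

```python
# Minimal sketch: G + C content over all sites and over third codon positions.
# Assumption: the input is an in-frame coding sequence (position 0 is the
# first base of a codon). The toy sequence below is hypothetical.

def gc_content(seq: str) -> float:
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def gc_third_positions(coding_seq: str) -> float:
    """G + C fraction restricted to the third base of each codon."""
    return gc_content(coding_seq[2::3])


toy = "ATGGTGGTTAAAGCC"  # codons: ATG GTG GTT AAA GCC (hypothetical)
print(f"all sites:       {gc_content(toy):.3f}")
print(f"third positions: {gc_third_positions(toy):.3f}")
```

Reporting the two values side by side is what makes a codon-preference bias visible: a species can be near 50% overall and still differ sharply at third positions.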

Phylogenetic Analysis
Differences in the transition/transversion ratio and G + C composition are a potential problem when inferring phylogenies from sequence data. It is difficult to assess the transition/transversion bias for the Sod coding region in our data since the taxa used are fairly distant in most cases. For the two most closely related species, D. melanogaster and D. simulans, the ratio is 1.2. For the more distantly related pairs, virilis/hydei and willistoni/saltans, the ratio is 1.6 and 1.9, respectively. The within-species ratio for 11 Sod alleles sequenced in D. melanogaster is 2.5 (Hudson et al. 1994). For a set of closely related species and subspecies of the D. willistoni group the ratio is about two (Antezana et al. unpublished data from our laboratory). Likelihood analysis (see below) suggests a transition/transversion ratio of two, and this ratio is used in calculating differences between sequences. In any case, the results of our analysis are not sensitive to the value of this ratio: a broad range (from one to 20) of transition/transversion ratios yields very similar trees.
Table 3 footnote: a, includes the complete set of 439 nucleotides sequenced in 13 species; b, subset of 345 nucleotides sequenced in all 15 species. The two initial nucleotides derived from the N primer, which are conserved in all species, have been added to the sequences for estimating G + C content.
Figure 3 shows the Sod phylogenetic relationships obtained by the distance method (Fitch and Margoliash 1967; FITCH algorithm  Zaprionus. This early separation of D. lebanonensis and Chymomyza is consistent with the presence of the second intron noted earlier, which became deleted in the short evolutionary interval that preceded the divergence of any other species in the remaining clade. The subgenus Sophophora appears in Fig. 3 as polyphyletic, with the willistoni and saltans groups as the sister clade of a grouping that includes the melanogaster and obscura groups as one clade and the rest of the species as the other clade. This latter clade appears as a bush that includes the genus Zaprionus as well as D. busckii (Dorsilopha), D. pictiventris (Hirtodrosophila), and species of the subgenus Drosophila. This particular clade appears as monophyletic in Fig. 3 in a relatively high number of replications (59% in A and 75% in B). The alternative grouping (which places Hirtodrosophila and/or Zaprionus outside the subgenera Sophophora and Drosophila) occurs with much lower frequencies (1% in A and 4% in B). Therefore, the distance phenetic analysis of the Sod sequences supports Throckmorton's rather than Grimaldi's hypothesis concerning the phylogeny of these species. However, our bootstrap values for this grouping (59 and 75% for A and B, respectively) are far from robust, since bootstrap results below 70% are suspect (Hillis and Bull 1993).
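The observed transition/transversion ratios quoted in this section are counts of the two kinds of difference between a pair of aligned sequences. A minimal sketch of that count follows. It assumes two aligned sequences of equal length and skips positions with gaps or ambiguous bases. The sequences shown are hypothetical toy strings, and the function name is illustrative. The count is uncorrected for multiple substitutions at the same site, which is one reason raw ratios are hard to interpret for distant taxa.

```python
# Minimal sketch: counting observed transitions and transversions between
# two aligned sequences. Assumption: equal-length, pre-aligned input; sites
# with gaps or ambiguous bases are ignored. The sequences are hypothetical.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (transitions, transversions) observed between two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue  # identical site, gap, or ambiguous base
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


ts, tv = count_ts_tv("ACGTACGTAC", "GCGTATGTCA")
print(ts, tv, ts / tv if tv else float("inf"))
```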
A deep split between the two Sophophora clades (the willistoni plus saltans and the melanogaster plus obscura groups) had been recognized by Throckmorton (1975 ribosomal RNA subunit, which shows, as in Fig. 3, the willistoni and saltans groups as a clade that has as a sister clade the other Sophophora species as well as the Drosophila subgenus. (See also Pelandakis et al. 1991.) However, the neighbor-joining method (Saitou and Nei 1987; result not shown), as well as other phylogenetic analyses of our data (see below), shows the Sophopho-  The trees on the left use all sites; the two on the right exclude the third codon bases so as to allow evaluation of the possible bias introduced by differences in G + C composition and by superimposed substitutions at particular sites. The trees in Fig. 4 are among the most parsimonious ones obtained with the DNAPARS algorithm; they also are quite similar to one another and to the distance trees shown in Fig. 3.
The bootstrap results show fairly high confidence (617 to 810 times out of 1,000 replicates) for the deep- the radiation of all the species in the latter cluster as a sister group to Sophophora. It manifests as well the deep split between the two Sophophoran clades: willistoni + saltans vs melanogaster + obscura. The four trees in Fig. 4 differ with respect to the branching order within the cluster that includes Zaprionus, Hirtodrosophila, and subgenus Drosophila. Figure 5 shows a maximum likelihood phylogeny obtained with the program DNAML (Felsenstein 1981). Approximate confidence limits of branch lengths (assuming a transition/transversion ratio = 2) show that these are significantly positive (P < 0.01) in all cases except for the one between Chymomyza and the large clade that includes most Drosophila species, but this split is confirmed by the deletion of the second intron in all species of the larger clade. The two deepest branches (D. lebanonensis and Chymomyza) are the same, and in the same order, as in all previous trees. Also, the Sophophora species appear as a cluster, distinct from the cluster that includes the subgenus Drosophila, Hirtodrosophila, and Zaprionus. The relationships just mentioned also persist in the phylogeny obtained with the FITCH program for distance data, provided with the trees obtained with DNABOOT and DNAML (Fig. 6).
We have tested alternative hypotheses of the Drosophilidae phylogeny by statistical evaluation of pertinent trees. For simplicity we have pruned our trees so that they only include six species of representative taxa: Scaptodrosophila (D. lebanonensis), Zaprionus (Z. tuberculatus), Hirtodrosophila (D. pictiventris), subgenus Drosophila (D. virilis), and Sophophora (D. willistoni and D. melanogaster). The trees are shown in Fig. 7.
Tree 2 is advanced by Grimaldi (1990) on the basis of morphology and by the mtDNA hypotheses (DeSalle and Grimaldi 1991; DeSalle 1992), whereas trees 5 and 6 are consistent with the hypotheses proposed by Throckmorton (1975), Beverley and Wilson (1984, based on larval hemolymph), and Thomas and Hunt (1993, based on the Adh gene), and with the Sod results herein presented. We have used the methods of Kishino and Hasegawa (1989) and Templeton (1983; Felsenstein 1985a) to calculate the mean and variance of log likelihood, and step differences between trees, respectively. According to these tests, none of the six trees is significantly worse than the best one when maximum likelihood and maximum parsimony analyses are performed on all 343 sites. However, when only the first and second codon positions are considered for maximum parsimony analysis, trees 1, 2, and 3 are significantly worse than tree 6, which is the best (Table 4). Similar results are obtained with the full set of 13 sequences. Here again, the two trees (similar to 1 and 2) with Hirtodrosophila and Zaprionus branches outside the Drosophila clade (subgenera Drosophila and Sophophora) are rejected relative to the best tree (similar to 6), which has the Hirtodrosophila and Zaprionus branches inside, as part of a monophyletic cluster that includes the subgenus Drosophila. Similar results with respect to the position of Zaprionus have been recently obtained with Adh sequences (Thomas and Hunt 1993). A molecular study based on immunological distances of larval hemolymph protein also places Hirtodrosophila within a clade that includes the subgenus Drosophila, but not Sophophora (Beverley and Wilson 1984).
Table 5 shows time estimates for various phylogenetic events. The relevant nodes are labeled in Fig. 6. The estimates are based on the number of amino acid replacements given in Table 2. We use amino acid rather than nucleotide differences, because the former are likely to be more reliable in this case for two reasons.
One is that the substantial differences observed in G + C composition among species, particularly in the third codon sites, introduce a bias that is difficult to evaluate as to its magnitude and significance (Gillespie 1986; Woese 1991). The other reason is that the rate of amino acid substitutions in the Cu,Zn SOD of diverse organisms has been shown to be fairly constant during the last 60 MY (million years); the rate is approximately 15 aa/100 aa/100 MY for PAM-corrected data (Kwiatowski et al. 1991a, 1992a; for the PAM correction, see Dayhoff 1978). The evolutionary time estimates given in Table 5 are only rough estimates, because they not only depend on the assumption of a molecular clock, but also on the particular rate previously established for SOD, and on the limited amount of information provided by a dozen or so Sod sequences. Nevertheless, the estimates shown in Table 5 may be as reliable as any currently available in the literature, since sequence data are more precise than data sets based on immunological distances, two-dimensional protein electrophoresis, and restriction analysis.
Table 5 footnotes: b, PAM is the estimated percent of amino acid differences corrected for superimposed and back replacements; c, Zaprionus, Dorsilopha, and Hirtodrosophila are included in the Drosophila subgenus radiation.
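The clock calculation behind estimates of this kind is a simple proportion, sketched below. The sketch assumes that the quoted rate of about 15 aa/100 aa/100 MY applies directly to the PAM-corrected difference between two sequences. If the rate were instead a per-lineage rate, the resulting time would have to be halved. The input value in the example is hypothetical and is not taken from Table 5, and the function name is illustrative.

```python
# Minimal sketch: divergence time from a PAM-corrected amino acid distance
# under a constant-rate (molecular clock) assumption.
# Assumption: the rate is expressed as percent difference accumulated
# between two sequences per 100 MY; halve the result for a per-lineage rate.

def divergence_time_my(pam_percent: float,
                       rate_percent_per_100_my: float = 15.0) -> float:
    """Estimated time since divergence, in millions of years (MY)."""
    if rate_percent_per_100_my <= 0:
        raise ValueError("rate must be positive")
    if pam_percent < 0:
        raise ValueError("distance cannot be negative")
    return pam_percent / rate_percent_per_100_my * 100.0


# Hypothetical PAM-corrected distance of 7.5% between two species.
print(divergence_time_my(7.5))
```

Because the estimate scales linearly with both the distance and the inverse of the rate, any uncertainty in the calibrated rate carries over proportionally to the dates.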

Divergence Time
Our time estimates are somewhat lower than those of Collier and MacIntyre (1977) based on microcomplement fixation studies of alpha-glycerophosphate dehydrogenase, and those of Spicer (1988), based on two-dimensional protein electrophoresis. Collier and MacIntyre (1977) estimate the Tephritidae radiation at 90 MY (our estimate, 77 MY). Spicer (1988) estimates the Drosophila genus radiation at about 60 MY (ours, 44). The radiation of the Drosophila subgenus is estimated by Collier and MacIntyre (1977) as well as by Spicer (1988) at 50 MY (ours, 33 MY). The estimates of Beverley and Wilson (1984), based on immunological distances for a larval hemolymph protein, also are somewhat higher than those shown in Table 5. However, lower numbers than ours are estimated by Thomas and Hunt (1993) based on the nucleotide sequence of the Adh gene: Scaptodrosophila divergence, approximately 45 MY (ours, 56 MY); Drosophila genus radiation, 40 MY (ours, 44); Drosophila subgenus radiation, 27 MY (ours, 33 MY).

Discussion
The Cu,Zn superoxide dismutase is an abundant enzyme in eukaryotic organisms, with highly specific superoxide dismutation activity that protects aerobic cells against the harmful effects of free oxygen radicals (Fridovich 1986). Cu,Zn SOD is distinctly interesting for investigating phylogenetic issues, because (1) it is apparently present in all eukaryotes, and (2) it evolves at a fairly high rate, so as to be informative for recent evolutionary events, i.e., within the last 100 MY (Lee et al. 1985; Ayala 1986; see Table 2). Yet (3), it is well conserved over long time spans, so 60% of the amino acid residues remain identical between organisms from different kingdoms, such as humans and yeasts (Ayala 1986; Kwiatowski et al. 1991a). In higher Diptera the Sod gene consists of a 462-bp coding region interrupted by one or two introns (Kwiatowski et al. 1992a; see Fig. 1).
The Drosophilidae are well-studied organisms with respect to genetics and systematics. Yet many issues remain controversial. The commonly accepted taxonomy (Wheeler 1981, 1986) and the evolutionary account of Throckmorton (1975) have been recently challenged by Grimaldi (1990) in important respects. A profusion of molecular investigations have failed to settle the issues and have often yielded incongruous outcomes. The results conveyed in the present paper provide helpful evidence toward resolving some issues.
On the basis of biogeographical, morphological, and other considerations, Throckmorton (1975) has argued that the divergence of Scaptodrosophila (represented in our study by D. lebanonensis) precedes the first major radiation of the genus Drosophila. This is supported by all our data. The presence of the second Sod intron in Scaptodrosophila (as well as in Ceratitis and Chymomyza), and its absence in the species of the Sophophora and Drosophila subgenera, definitely places the phylogenetic divergence of Scaptodrosophila before the Drosophila radiations (a position also endorsed by Grimaldi 1990; and DeSalle and Grimaldi 1991; but see Villarroya and Juan 1991). Grimaldi (1990) has accordingly raised Scaptodrosophila to the taxonomic status of "genus." The presence of the second Sod intron in the two Chymomyza species investigated, and its absence from the species of the subgenera Sophophora, Drosophila, Hirtodrosophila, and Dorsilopha (as well as from the genus Zaprionus), places the Chymomyza lineage outside the Drosophila radiations. This is also supported by the analysis of the Sod sequence data. Our results therefore contradict Throckmorton's inclusion of Chymomyza as a member of the Sophophora radiation and support the phylogenetic position of Chymomyza proposed by Grimaldi (1990, his Fig. 542, p. 100) and DeSalle and Grimaldi (1991).
The Sod sequence data indicate that Scaptodrosophila and Chymomyza diverged from the Drosophila lineage within a short time interval (between 56 and 55 MY ago, according to our date estimates). The data are thus insufficient to decide which one of the two lineages is the sister clade of the Drosophila clade. The radiation of the genus Drosophila happened shortly afterward (estimated at 44 MY in Table 5); but it was during this brief time preceding the Drosophila radiation that the Drosophila lineage lost the second Sod intron.
The absence of the second Sod intron from D. pictiventris (subgenus Hirtodrosophila) excludes the position of Hirtodrosophila outside the Chymomyza + Drosophila clade, as proposed by DeSalle (1992) on the basis of mtDNA data. (See also DeSalle and Grimaldi 1991.) The Sod sequence data support Throckmorton's (1975) position of Hirtodrosophila within the radiation of the subgenus Drosophila (sensu lato, i.e., inclusive also of the genus Zaprionus and the subgenus Dorsilopha) and are inconsistent with Grimaldi's (1990) opposite conclusion (as well as DeSalle's).
The Sod sequence data also support Throckmorton (1975) on the phylogenetic position of the genus Zaprionus, which he sees as part of the subgenus Drosophila (s.l.) radiation. The Sod phylogenies contradict Grimaldi (1990), DeSalle and Grimaldi (1991), and DeSalle (1992), who place Zaprionus outside the clade comprising the subgenera Drosophila and Sophophora. Throckmorton's (1975) proposal that the Sophophora radiation preceded the radiation of the subgenus Drosophila is also supported by the Sod sequence data.
Several systematists, Throckmorton (1975) among them, have noted that the evolution of the drosophilids is modulated by rapid radiations, or bursts of cladistic expansion. The short time spans between cladistic events that follow one another in rapid succession are unlikely to leave conspicuous traces in the organisms' morphology or genetic makeup. The sequence of phylogenetic events may then be difficult to determine. This hardship is further intensified in the evolution of the drosophilids by the relative scarcity of fossil specimens, substantial conservation of morphology, and occasional homoplasy. It is thus not surprising that the systematics of the Drosophilidae has remained controversial in the face of extensive and authoritative investigations.
Will molecular information eventually provide the definitive answers concerning phylogenetic matters? Nucleotide sequence data have indeed the potential to do so. The DNA of an organism has a record of its evolutionary history. There are many genes (and other DNA sequences) in each organism, so more and more data can in principle be accumulated until a particular phylogenetic issue of interest is settled. But possibility in principle and securing the data are very different matters. Obtaining a DNA sequence is a laborious process (compare it with a morphological observation such as eye color or wing length), so at best only a few relevant DNA sequences (or other highly informative molecular data) are known for most groups of organisms. Whenever the molecular data are very limited, as currently they are in most cases of interest, variance in evolutionary rates, homoplasy, and other difficulties can yield erroneous conclusions when taken at face value. (Homoplasy is a particularly nagging problem, since the nucleotide bases provide only four possible alternatives at each particular site in a DNA sequence.) The present investigation illustrates some of the virtues and potential pitfalls of molecular data. The pitfalls are apparent in the variation of outcomes concerning details of the phylogeny obtained by different analytical methodologies: compare Figs. 3-6. An attempt to attenuate the difficulties of homoplasy, by ignoring the third codon positions, increases only slightly the stability of branches in the phylogeny. (Compare B and D with A and C in Fig. 4.) The contributing reasons are three: (1) the twofold degeneracy of the genetic code implies that silent sites occur in first codon positions, not only in the third positions; (2) the evolution of codon preferences, reflected in G + C content in the third positions, is itself phylogenetically informative; (3) the data set becomes reduced when one-third of the nucleotides are excluded from consideration.
The potential pitfalls of molecular data are illustrated de facto by the observation that mtDNA sequence data (DeSalle and Grimaldi 1991; DeSalle 1992) yield conclusions that are inconsistent with the Sod sequence data.
One virtue of DNA sequences that is illustrated by the Sod data is that phylogenetic information derives not only from the direct comparison of nucleotides at particular sites in a sequence, but also from the organization of the DNA sequences. Deletions of well-defined DNA segments may be particularly informative, as exemplified by the deletion of the second Sod intron, owing to the low probability of independent occurrence or restoration of such an event within a defined phylogeny; that is, homoplasy is particularly unlikely.
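The argument from the second intron can be made concrete by treating intron presence as a binary character and counting the minimum number of changes each topology requires, here with Fitch's small-parsimony procedure. The two trees below are deliberately simplified sketches, not the trees of Figs. 3-7. The first follows the branching order described in the text. The second places Hirtodrosophila outside a Chymomyza + Drosophila clade, as the text attributes to the mtDNA analysis; the placement of the other taxa in that tree is an assumption made for illustration. The intron states are those reported in the text.

```python
# Minimal sketch: Fitch parsimony count for one binary character
# (second Sod intron: 1 = present, 0 = absent) on two simplified topologies.
# Assumption: trees are rooted, strictly binary, written as nested tuples;
# both topologies are illustrative simplifications.

INTRON = {
    "Ceratitis": 1, "Scaptodrosophila": 1, "Chymomyza": 1,
    "Sophophora": 0, "Hirtodrosophila": 0, "Drosophila": 0,
}


def fitch(tree, states):
    """Return (possible state set at this node, minimum number of changes)."""
    if isinstance(tree, str):                 # leaf: state is observed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                                # children agree: no new change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# Branching order as described in the text (simplified).
sod_like = ("Ceratitis", ("Scaptodrosophila", ("Chymomyza",
            ("Sophophora", ("Hirtodrosophila", "Drosophila")))))

# Hirtodrosophila outside a Chymomyza + Drosophila clade (simplified).
alternative = ("Ceratitis", ("Scaptodrosophila", ("Hirtodrosophila",
               ("Chymomyza", ("Sophophora", "Drosophila")))))

print(fitch(sod_like, INTRON)[1])     # changes required on the first tree
print(fitch(alternative, INTRON)[1])  # changes required on the second tree
```

On these sketches the first topology needs a single loss of the intron, whereas the second needs two events (two independent losses, or a loss followed by a regain), which is the kind of homoplasy the text regards as particularly unlikely for a well-defined deletion.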
There can be little doubt that the accumulation of DNA sequence data may eventually settle any given phylogenetic issue. It would seem equally certain that in the interim, or for that matter at any time, the only reasonable approach to settling phylogenetic relationships is to use all available information--molecular, morphological, biogeographical, etc.--and to weigh it according to its value in a particular case.