Molecular mapping of deletion breakpoints on chromosome 4 of Drosophila melanogaster

As part of our effort to induce and identify mutations in all genes on chromosome 4 of Drosophila melanogaster, we have mapped the breakpoints of eight chromosome 4 deficiencies relative to the predicted genes along this chromosome. Although the approximate locations of Df(4)G, Df(4)C3, Df(4)M101-62f, Df(4)M101-63a, Df(4)J2, Df(4)O2, Df(4)C1-10AT, and Df(4)B2-2D are known (some from cytological observations and others predicted from P element locations), the extents of these deletions have not been mapped with respect to the predicted genes identified by the Drosophila Genome Project. Polymerase chain reaction primers were designed to amplify the predicted exons of all chromosome 4 genes, and embryos homozygous for each deficiency were identified and their DNA used to test for the presence or absence of these exons. By determining which exons along the length of the chromosome failed to amplify, we were able to establish which predicted genes are missing in each deficiency. Five deficiencies, Df(4)G, Df(4)C3, Df(4)C1-10AT, and Df(4)B2-2D (all terminal deletions), and Df(4)M101-62f (a proximal interstitial deletion), enabled us to partition the gene-containing right arm of chromosome 4 into five regions. Region A [uncovered by Df(4)M101-62f] contains the proximal-most 21 genes; region B [uncovered by Df(4)B2-2D] contains the next 12 genes; region C [uncovered by Df(4)B2-2D and Df(4)C1-10AT] contains the next 17 genes; region D [uncovered by Df(4)B2-2D, Df(4)C1-10AT, and Df(4)C3] contains the next 21 genes; and region E [uncovered by Df(4)B2-2D, Df(4)C1-10AT, Df(4)C3, and Df(4)G] contains the distal-most ten genes. By using Df(4)M101-62f, Df(4)B2-2D, Df(4)C1-10AT, Df(4)C3, and Df(4)G in complementation tests, we can assign newly induced recessive lethal mutations to one of the five regions on chromosome 4. This will substantially reduce the amount of DHPLC analysis required to match each mutation to a predicted transcript on chromosome 4.


Introduction
The fourth chromosome of Drosophila melanogaster is the smallest of the chromosomes. It is about 5 Mb in length, and approximately 4 Mb of that comprises satellite repeats and does not appear to contain any genes (Locke and McDermid 1993). The gene-containing region is approximately 1.2 Mb in length, and forms a polytenized region with cytological band designations 101E-102F8. Chromosome 4 has several unusual features that set it apart from the other autosomes and point to its heterochromatic nature. Unlike the other chromosomes, there is no crossing over in females during normal meiosis (Bridges 1935). The fourth chromosome often shows a poorly banded appearance in salivary gland polytene chromosome spreads similar to regions of β-heterochromatin. Chromosome 4 contains low and middle copy repetitive elements that are typically found in β-heterochromatin (Miklos et al. 1988; Carmena and Gonzalez 1995; Pimpinelli et al. 1995). The recent annotation of the Drosophila genome shows that the density of transposable elements on chromosome 4 is significantly higher than that of the other arms. The average density of transposable elements on the X, 2L, 2R, 3L and 3R arms is only 10-15 elements per megabase (million bases), while the density on chromosome 4 is 82 elements per megabase. The difference between the densities is made up almost exclusively of LINE-like and TIR elements, and these elements are distributed all along the chromosome, both between and within genes (Kaminker et al. 2002). Chromosome 4 is also highly enriched for the repetitive element DINE-1, which has been localized almost exclusively to chromosome 4 and centric heterochromatic regions (Locke et al. 1999). Heterochromatic protein 1 (HP1), an important constituent of heterochromatin, has been shown to bind extensively to chromosome 4 (James et al. 1989). A chromosome binding protein, Painting of Fourth (POF), has been shown to localize exclusively to chromosome 4, and it has been speculated that this protein may be involved in the regulation of genes in a heterochromatinized environment (Larsson et al. 2001). In addition, P element transgenes inserted into chromosome 4 frequently show the variegated expression (position-effect variegation) typical of genes inserted into heterochromatic boundaries (Wallrath and Elgin 1995; Wallrath et al. 1996). Mapping studies of variegating and non-variegating transgene inserts into chromosome 4 suggest that the polytenized regions of this chromosome comprise interspersed euchromatic and heterochromatic domains (Sun et al. 2000).
The lack of crossing over during meiosis makes the construction of a conventional genetic map of chromosome 4 impossible. Hochman (1976) undertook an extensive mutational screen on chromosome 4, in which he was able to identify 37 vital loci and 6 visible mutations. From a statistical analysis he also estimated the presence of another ∼30 loci, bringing the total gene number on chromosome 4 to approximately 70. He was, however, only able to map his mutations based on complementation to two chromosome 4 deletions: Df(4)M101-1 (101EF;102B6-7) and Df(4)G (102E2;102F8). He assumed the majority of the mutants (26) fell into the large region between these two deletions.
There are relatively few chromosome 4 deficiencies available from the stock center. Most of these deficiencies were created by X-ray mutagenesis or, more recently, FLPase-mediated events (Ahmad and Golic 1998; Kramps et al. 2002). Two of the deficiencies we discuss here, Df(4)B2-2D and Df(4)C1-10AT, were created in a novel P element screen (Sousa-Neves et al., unpublished). These deficiencies have been mapped approximately on the basis of cytology and using only a few chromosome 4 genes in complementation analysis. Release 3.1 of the Drosophila genome sequence (Adams et al. 2000; Celniker et al. 2002; Misra et al. 2002) has identified 82 putative loci on chromosome 4. We are currently undertaking a large mutational screen, using ethyl methanesulfonate (EMS), in which we hope to create mutations in every locus on chromosome 4. We then propose to match each mutation to a chromosome 4 transcript using DHPLC analysis and confirm the locations by sequencing. To streamline our identification and matching process, we have used polymerase chain reaction (PCR) amplification to map the putative transcripts molecularly to the available chromosome 4 deficiencies. We have been able to map the extents of these deletions, and hence divide the chromosome into five regions based on the molecular mapping of five chromosome 4 deficiencies. Complementation analysis can thus be used to map each of our new EMS-induced mutations to one of the five groups, which is the first step in the localization of the mutations to their respective genes. This set of molecularly defined deletions will also be of value to other Drosophila researchers who wish to map chromosome 4 mutations.

Fig. 1 An example of a polymerase chain reaction (PCR) experiment used to determine molecular breakpoints of a deficiency. DNA from homozygous Df(4)G embryos was tested with PCR primers from four genes: CG2999 (unc-13), CG11081 (plexA), CG11062 (activin-beta), and CG11027 (Arf102F). Oregon R DNA was used as a positive control for PCR amplification, and a PCR negative control (no DNA template) was also used to ensure that the amplified bands were dependent upon the addition of template DNA. The PCR products from unc-13 and plexA were successfully amplified in the Df(4)G DNA lanes, indicating that these two regions are present in Df(4)G DNA. The products from the activin-beta and Arf102F primers were not amplified in the Df(4)G lanes, indicating that these two regions are deleted in the Df(4)G DNA.

Crosses and identification of homozygous embryos
Deficiency stocks were mated to a ci 1 ey R stock. Virgin Df(4)/ci 1 ey R females were mated to Df(4)/ci 1 ey R males and allowed to lay eggs on agar plates. After approximately 24 h, single embryos were picked from the plates into tubes, and the DNA was extracted from each embryo by squashing in 10 μl of 10 mM Tris-Cl, pH 8, 1 mM EDTA, pH 8, 200 μg/ml Proteinase K solution. The tubes were incubated at 37°C for 45 min, and heated at 99°C for 10 min. Each embryo preparation was tested first for the presence of the ci 1 chromosome using primers JS1 (ATACATATGTTTCATTACGG) and JS3 (TACTCAGTTCAAATCTTGTG), and second for DNA quality using a PCR primer pair whose target is known to be present on the deficient chromosome. Embryo preparations that produce a band in the DNA quality test but fail to produce a band in the ci 1 test must therefore be homozygous for the deficiency.
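The genotype call described above amounts to a two-test decision rule, which can be sketched in Python as follows. This is an illustration only: the class name, field names, and category labels are hypothetical and are not part of the original protocol, and the handling of preparations that fail both tests (treated here as uninformative) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EmbryoPCR:
    """PCR results for one single-embryo DNA preparation (hypothetical record)."""
    embryo_id: str
    ci1_band: bool      # band with primers JS1/JS3, marking the ci 1 ey R chromosome
    quality_band: bool  # band with a primer pair retained on the deficiency chromosome


def classify(embryo: EmbryoPCR) -> str:
    """Apply the two-test rule: quality band present and ci 1 band absent => Df/Df."""
    if embryo.ci1_band:
        # At least one ci 1 ey R chromosome is present, so not a Df homozygote.
        return "carries ci1 chromosome"
    if embryo.quality_band:
        # Template DNA is usable and the ci 1 chromosome is absent.
        return "Df homozygote"
    # Neither test gave a band: assumed to be a failed preparation.
    return "uninformative"


embryos = [
    EmbryoPCR("e1", ci1_band=True, quality_band=True),
    EmbryoPCR("e2", ci1_band=False, quality_band=True),
    EmbryoPCR("e3", ci1_band=False, quality_band=False),
]
calls = {e.embryo_id: classify(e) for e in embryos}
```

Only preparations called "Df homozygote" would be carried forward as template for the breakpoint mapping.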

Molecular mapping
Homozygous deficient embryo DNA was used to test for the presence or absence of sequences amplified by primers specific to genes located on chromosome 4. Primer pairs for nearly all the putative or known gene exons on chromosome 4 (based on Release 3.1) were synthesized, and PCR amplification was performed using a selection of primer pairs and homozygous deficient embryo DNA as a template. A PCR product of the expected band size indicated that the primer sites (and hence that exon) were still present in the deficiency DNA, while the lack of a band indicated that the gene was completely or partially deleted. An example of this method showing two genes present and two genes absent in Df(4)G DNA is shown in Fig. 1. Once an approximate molecular location for the deletion had been established, primer pairs specific to each exon of the genes near the breakpoints were used for PCR amplification to map the location of the deletion breakpoint in finer detail. A list of the relevant primers is shown in Table 1. Interstitial breakpoints were mapped to a location within a particular exon of a gene, or to the region between two genes. If the most distal exon of the last gene on chromosome 4, CG18026 (caps), was shown to be absent, the deficiency was deemed a terminal deletion.
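The inference from presence/absence calls to a deleted interval can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name and marker names are hypothetical, the markers are assumed to be listed in proximal-to-distal order with the distal-most exon of CG18026 (caps) last, and a single contiguous deletion is assumed.

```python
from typing import Optional


def deleted_interval(calls: list[tuple[str, bool]]) -> Optional[dict]:
    """Summarize a deletion from ordered (marker, band_present) PCR calls.

    `calls` is assumed to run proximal to distal, ending with the distal-most
    marker on the chromosome. Returns None if every marker amplified.
    """
    absent = [i for i, (_, present) in enumerate(calls) if not present]
    if not absent:
        return None
    first, last = absent[0], absent[-1]
    if absent != list(range(first, last + 1)):
        # A present marker flanked by absent ones suggests a failed PCR
        # or a more complex rearrangement; flag it rather than guess.
        raise ValueError("absent markers are not contiguous")
    is_terminal = last == len(calls) - 1
    return {
        "kind": "terminal" if is_terminal else "interstitial",
        # Breakpoints lie between a flanking present marker and the
        # nearest absent marker.
        "proximal_flank": calls[first - 1][0] if first > 0 else None,
        "first_absent": calls[first][0],
        "last_absent": calls[last][0],
        "distal_flank": None if is_terminal else calls[last + 1][0],
    }


# Hypothetical marker names, proximal to distal.
terminal = deleted_interval(
    [("m1", True), ("m2", True), ("m3", False), ("m4", False)]
)
interstitial = deleted_interval(
    [("m1", True), ("m2", False), ("m3", False), ("m4", True)]
)
```

Each reported flank pair brackets one breakpoint; repeating the test with exon-level primers between the flanking and first absent markers narrows the interval, as described in the text.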

Cytology
Polytene chromosome squashes were made from dissected salivary glands of third instar larvae of the eight Df(4)-containing stocks (Fig. 2a-h). The glands were stained in lacto-aceto-orcein before squashing to enhance the visibility of the bands. Photographs of the slides were taken under phase contrast microscopy. Most of the Df(4)s were stocked over a cytologically normal fourth chromosome, with the exception of Df(4)B2-2D and Df(4)C1-10AT, which were kept most easily as fourth chromosome trisomics. Df(4)B2-2D is stocked with two ci D chromosomes and Df(4)C1-10AT is stocked with two ci D spa pol chromosomes.

Results
The mapping of each Df(4) is described below.

Df(4)M101-62f
This deficiency is the proximal-most interstitial deletion of chromosome 4. Using methodology like that shown in Fig. 1, we mapped the distal breakpoint of this deletion to a 946 bp region of DNA between exons 4 and 5 in gene CG1674. Cytological observations place this breakpoint just proximal to band 102B1 (Fig. 2a). The proximal breakpoint location is not clear from our cytological observations. Our molecular breakpoint analysis has placed it somewhere proximal to the first exon of CG17923, the second most-proximal predicted gene on chromosome 4.
The most proximal predicted gene on chromosome 4, CG32013, appears to be present in DNA from homozygous Df(4)M101-62f embryos; however, all attempts to amplify a PCR product that spans the expected deletion (CG32013 to the CG1674 breakpoint) from homozygous deficient embryo DNA were unsuccessful. In addition, PCR amplifications using CG32013-specific primer pairs produced bands of the expected sizes with DNA from embryos that lack chromosome 4 entirely [nullo-4s derived from C(4)RM/0 parents]. This paradoxical result can be explained in several ways.
First, gene CG32013 may be repetitive in nature, so that the primer pairs for CG32013 are able to amplify regions on chromosomes other than chromosome 4. BLASTN comparison (Altschul et al. 1990) of the CG32013 sequence with the genome showed significant homology of some regions of the CG32013 gene sequence to centromeric regions of other chromosomes. In addition, hybridization of CG32013 probes to genomic DNA blots resulted in multiple bands (data not shown), indicating that there are repetitive sequences in CG32013. However, we could not find any additional whole copies of the CG32013 gene using a BLASTN search. We sequenced the nullo-4 PCR products amplified with CG32013 primer pairs and found the products to be identical to the released genome sequence predicted for chromosome 4. Therefore, multiple copies of CG32013, if they exist, must be highly conserved but are not represented in the current genome sequence.
Another explanation for the ability to amplify regions of CG32013 from nullo-4 embryo DNA is that gene CG32013 is not actually located on chromosome 4 as shown in the Release 3.1 sequence. We attempted to verify the Release 3.1 location of gene CG32013 by PCR amplifying the region between CG32013 and the adjacent gene CG17923, which we know to be on chromosome 4 because of its absence in Df(4)M101-62f and nullo-4s. As PCR templates, we used genomic DNA isolated from five different D. melanogaster stocks, including the same stock used to produce the RPCI-98 BAC library (Hoskins et al. 2002). In addition to the genomic DNA templates, DNA from clone BACR05L22, the proximal-most BAC on chromosome 4 based on the physical map of Locke et al. (2000), was also used as a PCR template. The proximal end of this clone corresponds to the proximal end of the Release 3.1 sequence. The PCRs using the genomic DNA all failed to produce the band predicted from the Release 3.1 sequence, but the BAC clone DNA did produce the predicted band. We also performed Southern blot analysis of this region using the five genomic DNAs and BACR05L22 DNA. A hybridization probe was produced by PCR amplification of Oregon R DNA using a primer pair specific to gene CG17923. The probe was hybridized to genomic DNA digested with EcoRI or HindIII. The Release 3.1 sequence predicts both digests should result in a single fragment (2.8 kb for EcoRI, and 9.0 kb for HindIII) spanning the CG32013-CG17923 genes. The BAC clone produced the predicted bands in both the EcoRI and HindIII digest lanes; however, the genomic DNAs from the five Drosophila stocks all produced a common, single band of a different size (1.4 kb for EcoRI, 4.7 kb for HindIII). No hybridization was seen at the sizes predicted from the Release 3.1 sequence. A second probing was done, using a hybridization probe from gene CG32013. This probe should produce identical bands to the first probing if the two genes are indeed adjacent. The BAC digests showed hybridization to the same, predicted, single bands for the EcoRI and HindIII digests, whereas in the genomic DNA of all five stocks this probe hybridized to 8-12 bands in the EcoRI and HindIII lanes, indicating that it contained repetitive sequences. None of these bands was the predicted size, and none was the same size as the band produced from the CG17923 probe, indicating that the two genes are not located within the same restriction fragment as predicted in Release 3.1. Both the PCR and Southern data indicate these two genes cannot be adjacent on chromosome 4. Since the restriction digest pattern of the BAC clone differs from the digest pattern of genomic DNA in this region, we must also conclude that neither clone BACR05L22 nor the sequence data from Release 3.1 is representative of the Drosophila genome in the region proximal to gene CG17923. The Southern blot results, combined with the PCR results from nullo-4 embryos, indicate that gene CG32013 is likely not located on chromosome 4 and that BACR05L22 is chimeric, with the CG32013-containing sequence ligated adjacent to CG17923 on the end of a chromosome 4-bearing fragment.

Df(4)O2
This deficiency is an interstitial deletion located at 102D1-D4 (Fig. 2g). We have mapped the proximal breakpoint to a region within the distal half of the gene CG32019 (bt). The distal breakpoint is located within a 12.6 kb region between the distal exon of CG11091 (sphinx) and the proximal exon of CG11186 (toy). The genetic data indicate that Df(4)O2 is deficient for CG11186 (toy), so it is likely that the regulatory region of toy is deleted (Kammermeier et al. 2001).

Df(4)M101-63a
This chromosomal aberration genetically fails to complement only ci recessive alleles and RpS3A [Minute(4)101 or CG2168]. One breakpoint has been localized to a 263 bp region within the RpS3A gene, which accounts for the lack of complementation to Minute(4)101 mutations. The sequences on either side of the breakpoint are still present in DNA from homozygous Df(4)M101-63a embryos, indicating that this aberration is not a deletion. Thus the lesion in RpS3A is due to either an insertion into the RpS3A gene, or an inversion with one breakpoint located in the RpS3A gene. We attempted to obtain sequence past the breakpoint by single-primer PCR and inverse PCR, but these attempts were unsuccessful. Cytological observations do not show any aberration in the banding pattern (Fig. 2h), indicating that the inverted or inserted region must be relatively small in cytological terms.

Discussion
The current annotation (Release 3.1) of the Drosophila genome indicates that chromosome 4 contains 82 putative gene transcripts, interspersed with repeat elements. We have used these sequences to design PCR primers for exons of these genes. Amplification by PCR from homozygous deficient embryo DNA has allowed us to map molecularly the breakpoints of eight chromosome 4 deficiencies. The putative genes have been partitioned into the five regions of the chromosome specified by the breakpoint locations of five of the eight deficiencies.
Chromosome 4 deficiencies B2-2D, C1-10AT, C3, J2, and G are all terminal deletions of chromosome 4. Df(4)M101-62f is an interstitial deletion of the proximal region of chromosome 4, and Df(4)O2 is an interstitial deletion of chromosome 4 in the 102D region. Df(4)M101-63a is either an inversion of the chromosome with one breakpoint in RpS3A, or an insertion into this same region. One gene, CG1748 (RhoGAP102A), could not be placed within any of the five groups. It lies within the approximately 35 kb region between the distal breakpoint of Df(4)M101-62f and the proximal breakpoint of Df(4)B2-2D.
We have demonstrated that there is an error in the Release 3.1 genome sequence in the region proximal to gene CG17923. This error was likely due to the sequencing of the BAC clone BACR05L22, which we have shown to have different restriction fragments than the same region of genomic DNA extracted from five strains, including the isogenized stock used to build the BAC library. The ability to amplify PCR products using primers specific to gene CG32013 on DNA from nullo-4 embryos indicates that this putative gene is, in fact, located on a chromosome other than the fourth. Because of this error, there may be genes located in the region proximal to CG17923 that have not been included in this study owing to their absence from the Release 3.1 sequence.
The molecular mapping of these deficiencies will be a valuable resource for Drosophila researchers working on chromosome 4 genes. The inability to make a conventional genetic map, owing to the lack of recombination on the fourth chromosome, has hindered the investigation of chromosome 4 genes. Complementation mapping to the deficiencies studied here will allow researchers to delimit a mutation easily to one specific block of genes. The chromosome 4 deficiency map will also be a very important resource for our own chromosome 4 mutational screen. The ability to locate the mutations to specific blocks of genes by complementation mapping will significantly reduce the number of DHPLC runs needed to find base substitutions, and will reduce the number of false identifications of mutants due to second-site nonlethal base pair changes.
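The complementation logic for assigning a new recessive lethal to one of the five regions can be sketched as a lookup on the set of deficiencies that fail to complement it. The following Python sketch is illustrative only: the names `REGION_BY_PATTERN`, `GENES_PER_REGION`, and `assign_region` are hypothetical, the region definitions and gene counts are those stated in the abstract, and any pattern outside the five expected ones is simply reported as unassigned.

```python
from typing import Iterable, Optional

# Deficiencies that uncover each region (as stated in the abstract).
REGION_BY_PATTERN = {
    frozenset({"M101-62f"}): "A",
    frozenset({"B2-2D"}): "B",
    frozenset({"B2-2D", "C1-10AT"}): "C",
    frozenset({"B2-2D", "C1-10AT", "C3"}): "D",
    frozenset({"B2-2D", "C1-10AT", "C3", "G"}): "E",
}

# Number of predicted genes per region (as stated in the abstract).
GENES_PER_REGION = {"A": 21, "B": 12, "C": 17, "D": 21, "E": 10}


def assign_region(fails_to_complement: Iterable[str]) -> Optional[str]:
    """Return region A-E for a lethal, given the deficiencies it fails to complement.

    Returns None for any other pattern, for example a mutation that
    complements all five deficiencies.
    """
    return REGION_BY_PATTERN.get(frozenset(fails_to_complement))


region = assign_region(["B2-2D", "C1-10AT"])
candidates = GENES_PER_REGION[region]
total_placed = sum(GENES_PER_REGION.values())
```

In this example a lethal that fails to complement Df(4)B2-2D and Df(4)C1-10AT but complements the other three falls in region C, so 17 candidate genes would need to be screened rather than all predicted loci. The region counts sum to 81 genes; a mutation in CG1748, which lies between Df(4)M101-62f and Df(4)B2-2D, would be expected to complement all five deficiencies and return no region.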

Fig. 2a-h Polytene chromosome squashes from the salivary glands of chromosome 4 deficiency chromosomes heterozygous with cytologically normal chromosomes. a The cytological location of the Df(4)M101-62f breakpoints. This chromosome is deleted for the region including 101E-102A6. b The breakpoint locations of Df(4)B2-2D. The deleted chromosome points to the left and the cytologically normal ci D chromosome points to the right. The chromosome is deleted for bands 102B2 to the telomere. c The Df(4)C1-10AT chromosome lying adjacent to two ci D spa pol chromosomes. The deletion removes the region from 102C2 to the telomere. d The Df(4)J2 deletion, which includes the terminal portion of the chromosome starting at 102D2. e The Df(4)C3 deletion, which is cytologically identical to Df(4)J2, and removes the region from 102D2 to the telomere. f The deleted region of Df(4)G, which removes the region from 102E2 to the telomere. g The interstitial deletion of Df(4)O2, which removes the region from 102D1 to 102D4. h The Df(4)M101-63a deficiency, which is cytologically normal since no band aberration can be seen.

Fig. 3 A summary map of chromosome 4 deletions. The chromosome has been divided into five regions designated as A, B, C, D, and E based on the breakpoint locations of five chromosome 4 deficiencies [Df(4)M101-62f, Df(4)B2-2D, Df(4)C1-10AT, Df(4)C3, and Df(4)G]. The breakpoints of two other deficiencies [Df(4)J2 and Df(4)O2] are also shown, but they have not been used to partition chromosome 4. The cytological locations of the breakpoints are indicated. The inset boxes show the locations of the deleted regions in more detail. The CG numbers refer to the gene identifiers from the annotated sequence Release 3.1 of the Drosophila genome (Adams et al. 2000; Celniker et al. 2002; Misra et al. 2002).