A Spitzer survey of Deep Drilling Fields to be targeted by the Vera C. Rubin Observatory Legacy Survey of Space and Time

The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) will observe several Deep Drilling Fields (DDFs) to a greater depth and with a more rapid cadence than the main survey. In this paper, we describe the ``DeepDrill'' survey, which used the Spitzer Space Telescope Infrared Array Camera (IRAC) to observe three of the four currently defined DDFs in two bands, centered on 3.6 $\mu$m and 4.5 $\mu$m. These observations expand the area which was covered by an earlier set of observations in these three fields by the Spitzer Extragalactic Representative Volume Survey (SERVS). The combined DeepDrill and SERVS data cover the footprints of the LSST DDFs in the Extended Chandra Deep Field-South field (ECDFS), the ELAIS-S1 field (ES1), and the XMM Large-Scale Structure Survey field (XMM-LSS). The observations reach an approximate $5\sigma$ point-source depth of 2 $\mu$Jy (corresponding to an AB magnitude of 23.1; sufficient to detect a 10$^{11} M_{\odot}$ galaxy out to $z\approx 5$) in each of the two bands over a total area of $\approx 29\,$deg$^2$. The dual-band catalogues contain a total of 2.35 million sources. In this paper we describe the observations and data products from the survey, and an overview of the properties of galaxies in the survey. We compare the source counts to predictions from the SHARK semi-analytic model of galaxy formation. We also identify a population of sources with extremely red ([3.6]$-$[4.5] $>1.2$) colours which we show mostly consists of highly-obscured active galactic nuclei.


INTRODUCTION
Surveys by the Spitzer Space Telescope have proved extremely valuable for finding and characterizing distant galaxies. The redshifting of the peak of stellar emission at 1.6 m into the Spitzer bands makes them especially sensitive to high-redshift galaxies (e.g. Berta et al. 2007;Stefanon et al. 2015;Cecchi et al. 2019). Spitzer data thus provide a very useful complement to deep surveys in the optical, where the surface density of galaxies is higher, but intrinsically luminous, high-redshift galaxies that are either quiescent or dust-reddened can be outnumbered by lower-redshift, lower luminosity blue galaxies. The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST; Ivezić et al. 2019) will observe the Southern sky in six optical bands ( , , , , and ) in about 800 passes (summed over all bands) over ten years, to a co-added 5 depth of ≈ 24.4 − 27.1, depending on band. Within the survey area, there will be several Deep Drilling Fields (DDFs) where observations are repeated more frequently, resulting in both a better sampled cadence, and a deeper coadded final image ( ≈ 26.2 − 28.7, depending on band; Brandt et al. 2018;Scolnic et al. 2018). The DDFs will thus become important reference fields for both time domain and ultra-deep imaging studies.
We therefore proposed to observe the DDFs that had already been defined by the LSST team in the near-infrared with the Spitzer Space Telescope during its post-cryogenic mission (after the liquid helium cryogen supply for the telescope was exhausted in May 2009). Although only the two shortest wavelength bands of the Infrared Array Camera (IRAC; Fazio et al. 2004a;Carey et al. 2010), at 3.6 and 4.5 m, continued in operation following the exhaustion of the cryogenic coolant in Spitzer in 2009, their sensitivity was almost unchanged, as was the optical behaviour of the telescope and instrument.
The observations described in this paper supplement an earlier set of observations over smaller areas in these three fields by the Spitzer Extragalactic Representative Volume Survey (SERVS; Mauduit et al. 2012), for which images and catalogs are available from the Infrared Science Archive (IRSA) (a second data release of SERVS, including the data fusion of Vaccari (2015) and deeper Spitzer catalogs is planned). The DeepDrill images are of similar depth to those from SERVS (a 5 depth of ≈ 2 Jy in both bands), but cover more than twice the area (≈ 27 deg 2 compared to 12 deg 2 in these fields covered by SERVS (see Table 1), though SERVS also includes a further 6 deg 2 in the Lockman and ELAIS-N1 fields). We also note that deeper warm Spitzer data in the ECDFS field were taken recently as part of the "Cosmic Dawn Survey" (principal investigator P. Capak). Figure 1 shows the SERVS-DeepDrill survey in the context of other surveys at ≈3.6 m. The IRAC image and catalog data on the three DDFs described in this paper will be made available though the Infrared Science Archive (IRSA).
The scientific motivation for this survey closely followed that for SERVS, namely the study of galaxy evolution as a function of environment from ∼ 5 to the present, but with the additional feature of the deep and multi-epoch LSST DDF data. The DDFs are expected to be observed throughout the ten year duration of the LSST survey, with a cadence as frequent as once every two nights at times of year when the fields are available for observation (Brandt et al. 2018;Scolnic et al. 2018). The time domain adds the ability to detect and obtain light curves of supernovae in distant galaxies (of which we expect ∼ 10 4 in the DDFs; LSST Science Collaboration et al. 2009), and to allow the study of AGN flares, tidal disruption events, and other variable phenomena. The Spitzer data provide information on the properties of host galaxies of supernovae and other transients and help to identify and classify AGN, and, indeed, SERVS has already proved useful for these types of investigations (Lunnan et al. 2014;Falocco et al. 2015;Chen et al. 2018). In conjunction with other data at optical and shorter near-infrared wavelengths, the Spitzer survey in these fields will enhance the study of the host galaxies of supernovae and AGN through improved estimates of stellar mass, star formation history and reddening (Pforr et al. 2013).
Dusty star-forming galaxies detected in the mm/submm with positions from ALMA can often be identified with faint IRAC sources, allowing better understanding of their stellar masses and extinctions (Simpson et al. 2014;Gómez-Guĳarro et al. 2019;Leung et al. 2019;Patil et al. 2019;Dudzevičiūtė et al. 2020;Ocran et al. 2020a,b). Other uses include obtaining constraints on stellar masses and ages of galaxies in overlapping deep spectroscopic surveys (Calabrò et al. 2017;Thomas et al. 2017;Khusanova et al. 2019), studying cosmic background radiation (Mitchell-Wynne et al. 2016) and exploiting fields suitable for deep multi-conjugate adaptive optics observations of distant galaxies (Lacy et al. 2018).
The DDFs have garnered significant observational resources from other telescopes, from the radio and far infrared through to the X-ray. A list of the large area (> 1 deg 2 ) surveys in the DDFs may be found in Table 2 (see also Table 1 of Chen et al. 2018), and their coverages are illustrated in Figures 2, 3 and 4. In the X-ray, the XMM-SERVS survey (Chen et al. 2018) is covering the original SERVS areas in ES1, XMM-LSS and ECDFS. The optical data are less homogeneous, including data from Hyper Suprime-Cam (HSC) (Aihara et al. 2018;Ni et al. 2019), the Dark Energy Survey (DES; Abbott et al. 2018), and the ESO ESIS and VOICE surveys (Berta et al. 2006;Vaccari et al. 2016), however, as all three fields will be targeted for deep LSST observations this is not a major concern. In the near-infrared, the VISTA VIDEO survey (Jarvis et al. 2013) covers the whole SERVS area, and is supplemented by VEILS (Hönig et al. 2017) which covers the DES fields that are repeatedly observed to find supernovae and other time-domain phenomena (hereafter the DES DDFs). The fields are covered by the SWIRE survey in the mid-infrared (Lonsdale et al. 2003), and the HerMES survey in the far-infrared (Oliver et al. 2012). In the radio, existing deep surveys from the ATCA (ATLAS) (Franzen et al. 2015), GMRT (Smolčić et al. 2018) and LoFAR (Hale et al. 2019) cover a significant fraction of the fields. The MIGHTEE survey with MeerKAT, currently underway, will image the inner regions of all three fields even more deeply at 0.9-1.7 GHz (Jarvis et al. 2016). Vaccari (2015) combined SERVS data with catalogues of optical and near-infrared photometry that were available at the time in all five SERVS fields, and Pforr et al. (2019) used these catalogues to derive photometric redshifts for ≈ 4 million galaxies. Furthermore, the Herschel Extragalactic Legacy Project has incorporated SERVS data within their workflows to produce multi-wavelength catalogues and extract more accurate FIR/SMM fluxes to study the dust properties of infrared galaxies over cosmic time (Vaccari 2016;Hurley et al. 2017;Małek et al. 2018;Shirley et al. 2019). For very challenging applications, such as identifying rare sources, and obtaining photometric redshifts accurate enough to study environments, more accurate photometry that allows for the difference between the relatively large Spitzer point spread function and overlapping ground-based surveys in the near-infrared or optical is needed. This more refined photometry requires the application of forced photometry techniques such as The (Lang et al. 2016), and has been successfully used in the XMM-LSS field (Nyland et al. 2017), with the remaining SERVS/DeepDrill fields to follow. The improved photometry and photometric redshifts from it enable the accurate estimation of environmental parameters for galaxies out to at least ∼ 1.5 (Krefting et al. 2020). The photometry in XMM-LSS was also used to obtain photometric redshifts for X-ray AGN in the XMM-SERVS survey (Chen et al. 2018).
This paper is structured as follows: Section 2 describes the observations, Section 3 the processing of the image data and tests to assess the quality of the astrometric and photometric calibration. Section 4 describes the image and catalogue data products to be included in the release. In Section 5, we present an overview of the galaxy population in DeepDrill, including colours and source counts, and also highlight sources with very red [3.6]−[4.5] colours found in the survey. Section 6 contains a short summary.

OBSERVATIONS
We were awarded time to perform a survey of three of the four LSST DDFs that have been defined at the time of writing: 1 the ELAIS-S1 field (ES1), the XMM-Large-Scale Structure Survey field (XMM-LSS) and the Extended Chandra Deep Field-South field (ECDFS). The fourth DDF identified by LSST, the COSMOS field, has deep coverage (to 5 depth of ≈ 0.3 Jy in both bands) in the inner 2 deg 2 from several Spitzer surveys (Sanders et al. 2007;Steinhardt et al. 2014;Ashby et al. 2018). There is also a wider survey (SHIRAZ; Annunziatella et al. 2020 in preparation), to a similar depth as SERVS/DeepDrill that covers an additional ≈ 2 deg 2 outside of the central area to overlap with the Hyper Suprime-Cam Deep Survey (Aihara et al. 2018).
The central areas of all three of our fields were observed as part of SERVS (Mauduit et al. 2012) during the early months of the post-cryogenic Spitzer mission (2009-07-28 to 2011-03-06). The DeepDrill Survey (Program ID 11086, P.I. Lacy) was observed between 2015-05-04 and 2016-12-26 (Table 1). The DeepDrill observations followed the SERVS Astronomical Observation Request (AOR) construction, with each AOR making up a square tile of nine pointings, each pointing consisting of six repeats of 100s frames dithered using the IRAC small cycling dither pattern. 2 The use of Fowler sampling in the IRAC detectors (Fazio et al. 2004b) means that the 100s frame time corresponds to a little less than 100s integration on sky: 93.6s at 3.6 m and 96.8s at 4.5 m. The fields were imaged in two epochs to facilitate rejection of asteroids, with a targeted depth of 12 frames. Due to scheduling constraints, the time separation of the two epochs was non-uniform, ranging from a few weeks to ∼ 1 year. The spatial coverage is also non-uniform. Areas around the edges of the SERVS fields in particular received additional coverage, and some outlying regions did not receive the full coverage. Figure 5 shows the distribution of coverage in each field. The area in each band with a coverage of 9 or more 100-second frames (i.e. with >87% of the sensitivity of the nominal coverage of 12 frames), and the area with coverage of 9 or more in both bands at the same position are listed in Table 1 (the area with both bands is slightly smaller as the two IRAC detectors are offset). We encourage users with a need for uniformity in depth to make use of the supplied coverage maps.
The survey was designed such that source confusion only becomes significant near the nominal flux density limit of the survey, where there are about 30 beams per source, the typical value at * * A smaller area survey (4 deg 2 ) will also be carried out at 2-4 GHz in ECDFS. § The XMM-LSS field of the HSC survey contains one ultradeep pointing and three deep ones, so the depth varies with position. ‡ ‡ 10 magnitude limits from Abbott et al. (2018) +0.75 to convert to 5 ; note that there is significant overlap between DeepDrill and the DES Deep Drilling fields (see Figures 2-4, which, when the data are co-added, will be significantly deeper than the main survey).
In Appendix A, we show how the confusion noise is expected to vary with depth of coverage in the survey, including both confusion from randomly-distributed sources and an additional term due to galaxy clustering. To more accurately extract faint source parameters from the deepest parts of the survey we recommend PRF fitting of sources and their near-neighbors, which can be further improved by using a prior from a higher resolution survey of similar or greater depth.

Image processing
Data processing of the DeepDrill data was similar to that carried out for the SERVS dataset (Mauduit et al. 2012), using a data cleaning pipeline derived from processing SWIRE and COSMOS data (Lonsdale et al. 2003;Sanders et al. 2007). The processing began with the Corrected Basic Data (CBCD) frames from the Spitzer Science Center (basic calibrated data frames with corrections for common artifacts, see Lowrance et al. 2016). A refined dark frame for each AOR was constructed after identifying and masking astronomical sources in the data and subtracted from each individual frame in the AOR. Hot pixels in the 4.5 m data were masked, and the col-umn pulldown correction provided in the CBCD was improved (see Lowrance et al. 2012). The images were then rectified to a common background level corresponding to the mean background during the observations. Further corrections were made for latent images on a frame-by-frame basis, as these are particularly prevalent in warm mission data. The data were then mosaicked using (Makovoz et al. 2006) (see table 3 of Mauduit et al. 2012, for the parameters used).
A pointing issue was found and corrected in the ES1 field, where the pointing refinement task in the Spitzer data processing pipeline failed for four of the AORs in the South of the field. We also found that we needed to correct the photometric calibration of the SERVS data in ES1, which was taken early in the post-cryogenic mission, while the instrument performance was still being characterized and before the final array temperatures had been set. This was done by comparing the fluxes of sources in the overlap between the SERVS and DeepDrill datasets, and applying the measured offsets to the SERVS data (1.04 at [3.6] and 0.98 at [4.5]). The ES1 data are thus all on the final warm mission calibration. Following Mauduit et al. (2012), the ≈ 1 deg 2 in the Southwestern part of the XMM-LSS field that was taken during the cryogenic mission as part of the SpUDS programme (Kim et al. 2011) was included in the final images. The calibrations of these data are the same as those of the DeepDrill data to within 1%, but the SpUDS data are significantly deeper. Similarly, the central ≈ 0.5 deg 2 of the ECDFS field contains data from the much deeper cryogenic SIMPLE (Damen et al. 2011) and GOODS (Dickinson et al. 2003) programmes. However, in this case we used only a depth of ≈ 12 frames per sky position to obtain an approximately uniform depth for the survey of that field.

Astrometric accuracy
We matched the DeepDrill catalogs to Gaia Data Release 2 (DR2; Lindegren et al. 2018). The IRAC pointing is calibrated using 2MASS (Skrutskie et al. 2006), and the pointing refinement pipeline now includes proper motion information to provide positions for epoch J2000 (Lowrance et al. 2016). We therefore used the proper motion information in Gaia to derive positions appropriate for the year 2000 to match to the Spitzer positions. We matched the dualband DeepDrill catalogues to Gaia using a 1. 0 match radius. 3% of sources in the DeepDrill survey have counterparts in Gaia DR2.
The results are shown in Table 3, where we list the mean systematic offset between Spitzer and Gaia DR2 positions Δ(R.A.) and Δ(Decl.), along with the scatter (R.A.) and (Decl.), representing the positional accuracy of a typical source in the survey. This scatter is independent of source flux, and is thus probably dominated by the scatter in the pointing refinements of individual frames, which is ≈0. 3 on an individual CBCD frame (Lowrance et al. 2016). All systematic offsets when averaged over a mosaic are < 0.1 arcsec.

Photometric accuracy
The photometric calibration history of IRAC is given in Carey et al. (2012). The post-cryogenic mission calibration factors were obtained by matching the fluxes of the standard stars to the cryogenic observations, and have an absolute accuracy ≈ 3%. Some scatter is to be expected, as the ≈ 1. 8 IRAC PSF is undersampled by the 1. 2 pixels, and the sensitivity across each pixel varies as a function of position within the pixel. As most of the sources in DeepDrill are slightly extended, and because any given point in the sky is covered by a large number of observations, this intrapixel sensitivity variation is assumed to average out. Aperture corrections were applied as described in Mauduit et al. (2012), using the values in Table 2 of that paper. As noted in Section 3, some SERVS fields (including ES1) were taken early in the Spitzer warm mission, before the calibration factors were finalized. In SERVS, this issue was dealt with by forcing the flux densities to the same scale as SWIRE. The SWIRE flux densities were based on an early calibration of IRAC, and thus there are significant differences between the calibration of the DeepDrill data (which are based on the final Spitzer post-cryogenic mission calibration) and the original SERVS data in the same fields. Table 4 gives the ratio of the calibration factors derived from comparing the DeepDrill to the SERVS Aperture 2 (3. 9 diameter) flux densities for sources > 10 Jy in the same field. The more accurate DeepDrill calibration is preferred.

DATA PRODUCTS
This section briefly describes the data products from DeepDrill that are available in the data release. These consist of images and two sets of catalogues, single-band catalogues and dual-band catalogues.

Images
Images were made at a pixel scale of 0. 60 per pixel (oversampling the PSF width of 1. 8), and are calibrated in MJy sr −1 . This results in a conversion factor from pixel values to Jy of 8.46. Coverage images were made, in units of 100-second IRAC frames, along with uncertainty images. Finally, mask images, showing the location of bright stars in the fields are included, made using the methods described in Mauduit et al. (2012).

Catalogues
Single-band (3.6 m and 4.5 m) catalogues were produced using SEx (Bertin & Arnouts 1996). Parameters used are shown in Table 5, note some of these differ from Mauduit et al. (2012), principally to improve background filtering: the background mesh size was changed from 32 to 16 pixels, and the filtersize from five to three. These changes improved the background estimates in regions of scattered light around bright objects. The default convolution filter (2-pixel [1. 2] FWHM) was used for source detection, along with a weight map (the depth of coverage map was used). Photometric apertures labelled 1-5 corresponded to the SWIRE (Lonsdale et al. 2003) standard apertures with radii 1. 4, 1. 9, 2. 9, 4. 1 and 5. 8, respectively, with aperture corrections applied per Mauduit et al. (2012). Uncertainties in the flux densities are from SEx , adjusted to allow for the effects of pixel resampling and detector gain. An additional 3% error is added in quadrature to account for the systematic error in the IRAC flux density scale. The raw output from SEx was filtered to output sources with a signal-tonoise ratio > 5 in the SWIRE aperture 2 (3. 9 diameter). The flag column in the catalogue is a bitwise flag that takes the first and second bit of the SEx flag (bit 1: photometry may be affected by neighbours or bad pixels, and bit 2: the source was blended with a neighboring object), and adds a further flag bit (3) to indicate that the source is either a bright ( < 12) star, or within the region affected by the halo of a bright star according to the rules in Mauduit et al. (2012). The catalog columns are listed in Table 6. The star masks used to create this flag are included in the data delivery to the Infrared Science Archive (IRSA). The 3.6 m and 4.5 m catalogues contain 2.7 and 2.5 million sources, respectively, summed over all three fields.
Dual-band catalogues were created by matching the two singleband catalogues produced by SEx with a 0. 6 matching radius (before applying the 5 cut) and then applying a 3 cut for the signal-to-noise ratio of the detection in a 3. 9 diameter at both 3.6 m and 4.5 m. (We considered using the dual-image capability Figure 2. The footprints of multi-wavelength surveys on the ELAIS-S1 field (see Table 2 for survey references), superimposed on a greyscale of the IRAC 4.5 m coverage. Upper left: optical surveys (the DES DDFs are shown in cyan with rectangles indicating the individual chips, the ESIS survey in mauve with cross-hatches, and the LSST footprint is shown as a black circle), upper right, near-infrared surveys (VIDEO in red, left hatched and VEILS in orange, right-hatched), lower left, 24 m coverage from SWIRE in magenta and far-infrared coverage in HerMES in green (the L4 data is deeper than the hatched L6 data), and lower right the X-ray XMM-SERVS coverage in yellow, cross-hatched and the ATLAS radio survey in red, with circular hatches.
of SE , but we found that the approach of performing two independent source detection rounds on the individual channels and then merging the results in catalogue space resulted in a more reliable catalogue.) There will thus be objects present in the merged catalogue that are not present in the single-band catalogues (and vice-versa). The 3.6 m positions are given in the catalog as these correspond to the smallest PSF. Columns are listed in Table 7. The dual-band catalogues in each field contain approximately 800,000 sources, giving a total of 2.4 million sources.
Multiwavelength catalogues in the center 3-5 deg 2 of each of the DeepDrill fields are in the process of construction (Nyland et al., in preparation, Nyland et al. 2017). These employ forced  Table 2  photometry with the (Lang et al. 2016), using the groundbased near-infrared VIDEO data as a prior to overcome source blending issues. Recovered IRAC magnitudes from this technique are a good match to the aperture magnitudes in SERVS/DeepDrill (Nyland et al. 2017). These catalogues are not part of the initial DeepDrill data release, but will form part of a subsequent data release.  Table 2 for survey references), superimposed on a greyscale of the IRAC 4.5 m coverage. Upper left: optical surveys (the LSST footprint is shown as a black circle, the DES DDFs are in cyan with rectangles indicating the individual chips, the HSC-deep survey in is green with left hatches, and the VOICE survey is in blue with circular hatches). Upper right: near-infrared surveys (VIDEO in red, left hatched and VEILS in orange, right-hatched), lower left, 24 m coverage from SWIRE in magenta and far-infrared coverage in HerMES in green (the L2 data is deeper than the hatched L6 data), and lower right the X-ray XMM-SERVS coverage in yellow, cross-hatched and the ATLAS radio survey in red, with circular hatches.

[3.6]-[4.5] colour as a function of redshift
To show the variation of galaxy colour with redshift in DeepDrill we use the photometric redshifts in XMM-LSS from Krefting et al.
(2020). These use -based forced photometry (Nyland et al. 2020, in preparation, see also Nyland et al. 2017;Cotton et al. 2018) to obtain redshift estimates for 690,000 sources in the overlap between SERVS and DeepDrill. These photometric redshifts have an uncertainty in /(1 + ) ≈ 0.03 and outlier fraction 1.5% in the redshift range 0 < < 1.5. To supplement these photometric   (Papovich 2008). Figure 6 shows this colour as a function of redshift. The models (based on Maraston 2005) show that the dip in the [3.6]−[4.5] colour at ≈ 0.6-0.8 is a strong function of the shape of the spectral energy distribution in the near-infrared, which is itself a strong function of the age and nature of the stellar population. Nevertheless, the spectroscopic and photometric redshifts show that most galaxies follow a fairly tight trend in [3.6]−[4.5] colour as a function of redshift. Galaxies with blue colours (bluer than the blue dotted line at [3.6]−[4.5]= −0.3 in Figure 6) are at ≈ 0.5-0.9, and galaxies with red colours (redder than the red dotted line corresponding to [3.6]−[4.5]=0.1) are at > ∼ 1.3. Also shown in Figure 6 is the boundary color commonly used to select AGN candidates from WISE data (Stern et al. 2012 In Figure 7 we plot the [4.5] magnitude against redshift for the sample matched to OzDES, the A2SUDS sample and the galaxies with photometric redshifts (note that the "banding" in the distribution of photometric redshifts is an artifact of the photometric redshift algorithm). Most of the galaxies in the survey have stellar masses between 10 10 and 10 11 , and galaxies with masses of 10 11 can be detected out to ≈ 5. As expected, the AGN are red and bright, though it should be noted that around the AllWISE limit of ≈ 19.5 the counts of red galaxies begin to rapidly exceed those of AGN, making selection of AGN based on [3.6]−[4.5] colour alone highly contaminated at faint magnitudes.  Mauduit et al. (2012) were used to correct the counts at AB magnitudes < 21.25; at fainter magnitudes we ran a new set of simulations (6000 simulated point sources in each half-magnitude bin) on the DeepDrill ES1 field to improve our estimates of completeness (Figure 8 (c)). These simulations involved scaling point sources extracted from the mosaics to a known flux density corresponding to the mid-point of the magnitude bin, adding a grid of 6000 of these sources at known positions to the mosaic, and rerunning SExtractor. The resulting catalogues were then matched to the known positions within 1. 2, and the fraction of recovered sources noted as the completeness value for that bin. These extractions also allowed us to examine possible biases in the recovered flux densities near the flux density limit from both Eddington bias (Eddington 1940) and biases from source Aperture corrected flux densities in apertures 1-5 Jy 9

Source counts from
Isophotal flux density Jy 10 SEx Auto flux density Jy 11-17 Uncertainties on columns 4-10 Jy 18 Kron radius 0. 6 pixels 19 Signal-to-noise ratio -20 Local RMS noise Jy 21 Coverage 100s frames 22 Flag (see text) - The DeepDrill counts agree well with the deeper counts from the literature, starting to diverge slightly at ≈ 22, probably due to differences in the photometric algorithms used when the significance of the DeepDrill detections drops below ≈ 10 , and the source confusion limit is approached. We also constructed counts from the deeper data in the ≈ 0.45 deg 2 of the XMM-LSS field that uses images from the SpUDS survey, verifying that the counts in that area have the same shape as the overall counts to ≈ 22.5, beyond which they are more complete, as expected.
To measure the counts in DeepDrill, we used the flux densities within the 3. 8 diameter aperture (with an aperture correction of 0.736 at [3.6] and 0.716 at [4.5]). This will slightly underestimate the fluxes (and hence the counts) at bright magnitudes ( ∼ 16 − 18), where the galaxies may be resolved on scales larger than the aperture (at brighter magnitudes the counts are dominated by stars). A comparison to the counts made with the 8. 2 diameter aperture (with aperture corrections of 0.920 and 0.905 at [3.6] and [4.5], respectively) shows that the counts in this range are about ≈ 0.1 dex higher in both bands. Over other magnitude ranges the differences are less than 0.1 dex (except at > ∼ 21, when the higher noise results in more incompleteness in the larger aperture). It is also the case that the aperture corrections are derived from point sources, so both the fluxes and the completeness corrections are strictly incorrect for marginally-resolved galaxies. However, the good agreement between the counts from the two apertures with very different aperture corrections argues against this being a significant problem. The counts of Ashby et al. (2015) used a point-source fitting method and agree with ours to within 0.1 dex over the whole magnitude range. The point-source fitting method works better on blended objects, but also can be affected by flux boosting from very low level cosmic rays, whereas our use of SE will probably lead to slight undercounting at faint magnitudes due to blending near the confusion limit.
We also compare our source counts to two semi-analytic simulations, GALFORM and S , and to the Evolution and Assembly of GaLaxies and their Environments (EAGLE) hydrodynamic simulations. Semi-analytic simulations can generate large samples for easy comparison to surveys like DeepDrill, but require some assumptions to be made regarding the effects of dust attenuation in relation to the geometry of disks and bulges to compute observed fluxes, compared to hydrodynamical simulations, which use 3D radiative transfer to calculate these effects directly.   Stern et al. (2012) ("AGN cut"; cyan dashed line), above which candidate AGN dominate in the much shallower WISE survey. Three models based on Maraston (2005) evolutionary synthesis models (see Guarnieri et al. 2019) are shown, a 3 Gyr old dusty star-forming galaxy, a 3 Gyr passive galaxy and a 100 Myr galaxy. (Note that the 3 Gyr models do not extend beyond = 2.17, where the age of the Universe is 3 Gyr.) We also plot the composite SED of the A2SUDS submm galaxies from Dudzevičiūtė et al. (2020). variant using the techniques described in Merson et al. (2013). 3 The lightcone covers the redshift range 0 < < 6, has a sky coverage of 18.09 deg 2 and contains model galaxies with apparent, dust attenuated magnitudes in the [3.6] band down to 2 Jy (see also Krefting et al. 2020). The S simulated lightcone was built from the S semi-analytic model of galaxy formation and evolution (Lagos et al. 2018(Lagos et al. , 2019 to compare with the DeepDrill survey. This lightcone has an area of ≈ 107.9 deg 2 , and a flux selection at the [3.6] band > 0.575 Jy (equivalent to an AB magnitude of 24.5), with the same redshift range of 0 ≤ ≤ 6.
We computed the predicted number counts from the EAGLE hydrodynamical simulations (Schaye et al. 2015;Crain et al. 2015) by extracting all galaxies at 0 < < 8 from their public database (McAlpine et al. 2016), and particularly their redshift, and apparent [3.6] and [4.5] flux densities (see Camps et al. (2018) for details of how these were computed). To calculate the number counts at a given band, we first calculated the area spanned by the box size of 100 Mpc projected in the sky (we assume that the width of the box is negligible with respect to the luminosity distance and hence we can assume all galaxies in the box to be at the same redshift). We then calculate the number of galaxies per unit area at each redshift and apparent magnitude bin. We then integrate the latter over redshift to obtain a total number per unit area in each magnitude bin, which we then divide by the width of the magnitude bin to compare to our other measurements. We do this at 3.6 m, 4.5 m and for subsamples of red and blue IRAC galaxies.
Overall, there is a fair agreement between the models and theobservations in the regime where galaxies dominate the observed counts ( > ∼ 18). At intermediate magnitudes, ≈ 18-21, the semianalytic models underpredict the number of galaxies compared to both our counts and those in the literature (by up to a factor of 1.8 in the [3.6] band and 2.2 in the [4.5] band for S ), but the EAGLE hydrodynamic simulation does better. At the faint end ( ≈ 22) the GALFORM model does best, with both S and EAGLE underpredicting the counts. All lightcones contain only the stellar emission of galaxies in the IRAC bands, neglecting warm dust  Figure 6). Also plotted are two instances of the 3 Gyr old dusty star-forming galaxy model in Figure 6 with stellar masses of 10 10 and 10 11 , and a 1 Gyr dusty galaxy model with 10 11 from the same library, along with the composite submm galaxy SED from A2SUDS. Counts and errors are expressed as log 10 ( ), where is the count per square degree per magnitude. The errors are based on Poisson statistics, combined with the uncertainty in the completeness estimates (which dominate at fainter magnitudes).  (Ashby et al. 2015). The magenta dotted line shows star counts from the UDS field (which is within the DeepDrill XMM-LSS field) from Ashby et al. (2015), using the model from Wainscoat et al. (1992) as refined by Cohen (1993Cohen ( , 1994Cohen ( , 1995 and Arendt et al. (1998). (As all three fields have Galactic latitudes of −60 • ± 15 • these will be representative of the survey as a whole.) The dashed black and green lines show the galaxy counts from the S (Lagos et al. 2018(Lagos et al. , 2019 and GALFORM semi-analytic simulations, respectively, and the dotted red line those from the EAGLE hydrodynamic simulations.  Counts and errors are expressed as log 10 ( ), where is the count per square degree per magnitude. The errors are based on Poisson statistics, combined with the uncertainty in the completeness estimates (which dominate at fainter magnitudes). emission from AGNs. Including this may improve the agreement between some of the models and the data. We also examine the source counts as a function of colour, using the red and blue colour cuts described above to isolate intermediate-( ≈ 0.7) and high-redshift ( > ∼ 1.3) galaxies. Figure  8 (d) shows these counts, again compared to the model galaxies. Both the blue and red counts start well above all the model counts (in the case of the blue counts at < 18 this is due to stars), with EAGLE producing the largest fraction of blue galaxies, and S the smallest. Both sets of observed counts converge to nearagreement with the S simulation at the faint end, but the other models do less well, with both EAGLE and GALFORM overpredicting the blue counts, and GALFORM also overpredicting the red counts. This suggests that the redshift distribution of the sources is quite different in the three simulations. With increasing number of photometric and spectroscopic redshift for SERVS and DeepDrill, we will be able to compare the redshift distributions of the data and the simulations, which will add another dimension to the tests shown here. A detailed discussion of the galaxy luminosity function and stellar mass function using DeepDrill data will also be presented in future papers.

Very red objects in [3.6]-[4.5]
In this section, we investigate the population of objects in DeepDrill at extreme red [3.6]−[4.5] colours. Such objects are of interest for both Galactic astronomy, where late-type dwarf stars with strong molecular bands can have extreme [3.6]−[4.5] colours (Leggett et al. 2010), and for extragalactic astronomy, where such [3.6]−[4.5] colours are sometimes found for highly obscured, highly luminous AGNs (Tsai et al. 2015). In Table 10 we list the 19 objects detected in both IRAC bands with a signal-to-noise ratio > 3 ([3.6]−[4.5]>1.2) that also have 24 m coverage in SWIRE (3 depth ≈ 100 Jy; total area of overlap ≈ 30 deg 2 ). 16 of the objects have detections at 24 m, implying their SEDs continue to rise rapidly in the mid-infrared. These are good candidates for dusty high-AGN. In Figure 9 we plot the logarithm of the flux density ratio between 3.6 and 5.8' m, 5.8 / 3.6 , versus the logarithm of the flux density ratio between 4.5 and 8.0' m, 8.0 / 4.5 for the 15 of these objects that are covered by the SWIRE IRAC 5.8 m and 8.0 m observations, and compare to the colours of spectroscopically-confirmed AGNs from Lacy et al. (2013). Our very red IRAC objects are redder than nearly all of the confirmed AGNs, but follow the extrapolation of the colour trend of AGN to the red (though fall a little below the selection area of Donley et al. 2012). A few of the very red objects have faint detections in VIDEO (Table 10), for these objects the −[4.5] colours of the sources detected at 24 m are also consistent with AGNs (Messias et al. 2012).
The very red AGNs that are detected at 24 m have some similarities to objects selected on the basis of extremely red − 24 m colours in surveys with Spitzer (dust-obscured galaxies or "DOGs"; Dey et al. 2008), very red − 22 m colours between the Sloan Digital Sky Survey and WISE (Ross et al. 2015) and objects with very red colours between the WISE bands themselves ("Hot DOGs"; Tsai et al. 2015). These sets of objects are also dominated by AGN emission in the mid-infrared. However, we emphasize that the [3.6]−[4.5] colours of the very red objects presented here are typically more extreme. None of the DOGs, only 11/77 of the red AGN in Ross et al. (2015) and 5/20 of the Hot DOGs have [3.6]−[4.5] colours as red as the very red objects described here.
Three sources are not detected at 24 m, and are thus less likely to be AGN. One of these (DD J033258.19−274143.8) is in the much deeper coverage of S-CANDELS, and has faint detections at 5.8 and 8.0 m (Figure 9). This object was also detected in multiple epochs of archival Hubble Space Telescope (HST) coverage (2004-08-13 using the ACS instrument through the F850LP filter for proposal ID 9500 and 2011-12-30 through F814W for the CANDELS observations, proposal ID 12060). We downloaded the relevant data from the HST archive, and noted that the object has a significant proper motion, moving approximately 0.7 arcsec in R.A. and 0.5 arcsec in Decl. between the two epochs. We therefore tentatively identify this object as a brown dwarf star. With a [3.5]−[4.6] colour of 1.6 in AB magnitudes (2.2 in Vega magnitudes), this places the star in a spectral class approximately at the transition between T and Y-dwarfs (Leggett et al. 2017). We speculate that the remaining two objects that lack a 24 m detection may also be brown dwarfs.
To further investigate the possibility of an AGN origin for the majority of the very red sources we searched for serendipitously available mid-infrared spectroscopy as well as deep radio continuum data in the DeepDrill fields. For one of the very red sources, DD J022050.38−053714.1 (5MUSES-033), the Spitzer IRS spectrum from the CASSIS database (Lebouteiller et al. 2011) shows a strong mid-infrared continuum and spectral features consistent with an AGN at its photometric redshift (derived from SWIRE data) of 2.02 (Rowan-Robinson et al. 2013). This source is also detected in the VLA Sky Survey (VLASS; Lacy et al. 2020) with a 3 GHz flux density 3GHz = 2.0 ± 0.2 mJy. The correspondingly high radio luminosity of 3GHz = 5×10 25 W Hz −1 is much more consistent with the radio emission being powered by an AGN rather than by star formation (e.g. Kimball et al. 2011). One further source, DD J022008.87-041819.0 is also detected in VLASS with 3GHz = 0.54 ± 0.2 mJy. Its photometric redshift is 2.01, which implies 3GHz = 1.3×10 25 W Hz −1 , also more likely to be powered by an AGN than by star formation. In addition to VLASS, we also searched for counterparts in the deep radio surveys with published source catalogs listed in Table 4 (LOFAR, GMRT, and ATLAS). None of the very red sources is detected in these radio surveys. We also searched for counterparts in deep (sub-mJy beam −1 rms sensitivity) commensal 340 MHz data from the VLA Low-band Ionosphere and Transient Experiment (VLITE 4 ; Clarke et al. 2016;Polisensky et al. 2016) that were observed during an ultra-deep, single-pointing VLA imaging campaign centered on the Hubble Ultra Deep Field (HUDF) within CDFS at 3 GHz (Abrahams et al. 2020, submitted). Of the 12 very red sources in CDFS that fall within the currently imaged portion (spanning a width of 2 deg) of the deep VLITE data, one source (DD J033401.66−265017.0) has a compact counterpart in VLITE with a primary-beam-corrected flux of 1.0±0.3 mJy at a SNR of 4.8. The lack of detection in VLASS implies a moderately-steep spectral index, 0.34 GHz 3.0 GHz < −0.9. Unfortunately, the source currently lacks a photometric redshift, though assuming it is at > 1 it is also more likely to be powered by an AGN. Full details on the deep VLITE imaging centered on the HUDF will be presented in a forthcoming study (Polisensky et al., in preparation).

SUMMARY
We describe the DeepDrill survey, which images ≈ 27 deg 2 in three of the four pre-defined LSST Deep Drilling fields to a depth of ≈ 2 Jy. Accuracy in photometric and astrometric calibration is described. We illustrate the use of the [3.6]−[4.5] colour to divide objects into high ( > 1.3) and low ( ∼ 0.7) redshift bins. This property of the [3.6]−[4.5] colour will be particularly valuable for breaking degeneracies in photometric redshift estimates obtained from LSST optical data alone in regions of the DDFs not covered by other deep near-infrared data (e.g. Pforr et al. 2013).
We show that source count data at [3.6] and [4.5] alone can provide useful comparisons to models of galaxy formation. The model 4 VLITE provides data at 340 MHz from a subset of the VLA antennas during regular VLA observations above 1 GHz over a field-of-view (measured at the full-width at half power of the primary beam) of 5.5 deg 2 and a maximum resolution of ∼ 5 in the VLA A configuration. and observed counts generally agree well, but a small discrepancy observed at intermediate magnitudes warrants further investigation. These comparisons will be further enhanced when the Spitzer data are fully combined with the other multiwavelength data in the LSST Deep Drilling fields. We are currently working on completing multiband 0.4-4.5 m catalogs using forced photometry in the centers of the fields (defined by the overlap between the DeepDrill and VIDEO surveys). We show that most objects with extremely red [3.6]−[4.5] colours are mostly identifiable as dusty AGNs at > 1, but we also find one likely T or Y brown dwarf star and two further brown dwarf candidates. Future papers will discuss the stellar masses of galaxies in DeepDrill (Maraston et al. 2020, in preparation) and their clustering (van Kampen et al. 2020 in preparation).
The 3-5 m region in the near-infrared is uniquely useful for capturing light from the peak in the stellar SED at rest frame ≈ 1 m at ∼ 1 − 4. Although new surveys with Spitzer are no longer possible, we expect that the legacy value of the existing survey datasets will last well into the future. When combined with the existing and planned multiwavlength datasets in these fields, the DeepDrill data will be able to help answer fundamental questions about the nature of AGN and galaxy evolution, and its dependence on environment from before Cosmic Noon to the present.

DATA AVAILABILITY
The data products from the post-cryogenic Spitzer surveys of the three LSST DDFs described here (images, coverage maps, uncertainty images, bright star masks and single and dual-band catalogues) are available from IRSA (https://irsa.ipac.caltech. edu/data/SPITZER/DeepDrill). These include mosaic images, coverage maps, uncertainty images and bright star masks. Each field has two single-band catalogues cut at 5 , and a dual-band catalogue requiring a detection at > 3 at both 3.6 and 4.5 m. The simulated lightcone catalogue from SHARK is also included in the release, its columns are described in Table 11.

ACKNOWLEDGEMENTS
We would like to thank the referee for a thorough review that significantly improved the paper. We also thank the contributors to the original DeepDrill proposal not listed as coauthors on this paper, including M. Dickinson, I. Prandoni and L. Spitler, and especially I. Smail for helpful comments and suggestions. This work is based on obser-vations made with the Spitzer Space Telescope, which was operated by the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA. Support for this work was provided by NASA through an award issued by JPL/Caltech. The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc. WNB acknowledges support from NASA grant 80NSSC19K0961. GW acknowledges support from the National Science Foundation through grant AST-1517863, by HST program number GO-15294, and by grant number 80NSSC17K0019 issued through the NASA Astrophysics Data Analysis Program (ADAP). Support for program number GO-15294 was provided by NASA through a grant from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Incorporated, under NASA contract NAS5-26555. IRS acknowledges support from the United Kingdom Science and Technology Funding Council (ST/P000541/1). Basic research in radio astronomy at the Naval Research Laboratory is funded through 6.1 Base funding. This research has made use of the NASA/IPAC Extragalactic Database (NED), which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration. This work made extensive use of (Taylor 2011) for catalog matching and analysis.
shows that, as the depth of the survey is increased beyond the nominal depth of 12 frames, T decreases, but by factors less that the square root of the number of frames, and the contribution of confusion noise becomes dominant. The contribution from clustering is small, but almost constant with depth as it is dominated by bright sources well above the survey limit, so becomes more significant in deeper surveys. Source confusion in very deep IRAC images can be effectively mitigated by PRF subtraction or forced photometry techniques (e.g. Ashby et al. 2015;Nyland et al. 2017) that remove the PRF wings from the images, reducing Ω e and Ω b .
Positional accuracy is also affected by source confusion (Hogg 2001). In the case of DeepDrill, however, the low value of results in the effect of centroid errors due to source blending at the measured ≈ 30 beams per source being < ∼ 0. 1 (Hogg 2001, their Figure 2).