Detecting Dark Matter Cores in Galaxy Clusters with Strong Lensing

We test the ability of strong lensing data to constrain the size of a central core in the dark matter halos of galaxy clusters, using Abell 611 as a prototype. Using simulated data, we show that modeling a cluster halo with ellipticity in the gravitational potential can bias the inferred mass and concentration, which may bias the inferred central density when weak lensing or X-ray data are added. We also the highlight the possibility for spurious constraints on the core size if the radial density profile is different from the assumed model. These systematics can be ameliorated if central images are present in the data. Applying our methodology to Abell 611 and imposing a reasonable prior on the stellar mass-to-light ratio restricts the core size to be less than about 4 kpc, with a minimum reduced $\chi^2$ of 0.28 for 0."2 positional errors. Such small cores imply a constraint on the dark matter self-interaction cross section of the order of $0.1\ \mathrm{cm^2/g}$ at relative velocities of about 1500 km/s.


INTRODUCTION
Galaxy clusters provide a critical test of dark matter theories if their inner dark matter density profile can be measured. Hierarchical structure formation models make concrete predictions about cluster density profiles. For example, in the cold dark matter (CDM) paradigm, dark-matter-only simulations show that hierarchical structure formation leads to cuspy dark matter halos with a Navarro-Frenk-White (NFW) density profile (Navarro et al. 1996(Navarro et al. , 2010Gao et al. 2012), with the 3-D density profile ρ ∝ r −1 in the inner region. However, self-interactions or the feedback effects from baryons can potentially modify the inner slope. For example, Active Galactic Nuclei (AGN) may potentially cause flattening of central cusps of cluster mass halos (Peirani et al. 2008;Martizzi et al. 2012;Mead et al. 2010). Self-interacting dark matter (SIDM) models (Spergel & Steinhardt 2000) and simulations thereof (Sokolenko et al. 2018;Elbert et al. 2015;Rocha et al. 2013) predict shallower slopes for radial dark matter density profiles.
The presence or absence of dark matter cores in clusters is an open question. Sand et al. (2008) found an inner logarithmic slope of ∼ −0.5 in two relaxed clusters using a E-mail: kandrad1@uci.edu pear near the Einstein radius of the lens, which is typically tens of kpc for galaxy cluster lenses, well within the cluster scale radius, and also where the effects of baryons and dark matter self-interactions are strongest. Separating the two effects is critical.
In this work we constrain the dark matter distribution using strong lensing alone. This allows us to characterize the strengths and limitations of strong lensing separately from other techniques. Our primary goal is to determine how well one can constrain the size of a central core in a lensing cluster, i.e., determine the radius (if any) below which the density profile becomes relatively flat.
We adopt a flat cosmology with Ω Λ = 0.7, Ω m = 0.3, and H 0 = 70 km s −1 M pc −1 . At the adopted lens redshift of 0.288, the distance to the lens is 893.8 Mpc, and 1 is equal to 4.329 kpc. We define halo mass as M 200 , the mass enclosed by a sphere of radius R 200 , which we define in turn as the radius at which the halo density is 200 times the critical density of the universe at the redshift of the halo. This paper is organized as follows. In Section 2 we discuss the new lensing code and lens mass profiles. In Section 3 we discuss the mock data sets and lens models used in our analysis, and the results of those models. In Section 4 we describe the data and lens model for Abell 611, and discuss the results of the analysis for that cluster. Our conclusions are summarized in Section 5.

HALO MODELS AND LENSING SOFTWARE
We choose a flexible mass model for our tests which has a core, and for which the mass distribution approaches NFW as the core radius goes to zero. We require a fast method to calculate the magnification and deflection at each point. These requirements are in a new lensing software, QLens, which allows for both pixel image modeling (using pixelated source reconstruction) and point image modeling (with option to include fluxes, time delays, and multiple sources at different redshifts). QLens includes 14 different analytic lens models to use for model fitting, including 10 different density profiles where ellipticity can be introduced into either the projected density or the lensing potential. In addition, a built-in nested sampler is included, along with an adaptive MCMC algorithm called T-Walk; however the code can also be compiled with MultiNest or PolyChord. The QLens package is now available on GitHub by request and includes a student-friendly tutorial to get users started with point image modeling. Here, we describe the novel features implemented in QLens that are critical to the results of this paper.
We consider two types of cored halo models for mod-eling cluster halos. The first model is a cored NFW profile (cNFW), for which the density profile is defined as This will be the primary lens model we use in this work for the cluster halo. Note that as the core radius r c → 0, the density profile reduces to the standard NFW form. This profile has been used in lens modeling in Newman et al. (2013a,b), and was found by Peñarrubia et al. (2012) to provide a reasonably good fit to cored DM halos found in hydrodynamical simulations of Governato et al. (2012). Analytic formulas for the kappa profile and deflection of the corresponding spherical model are given in Appendix A.
The second cored halo model, which we call the Corecusp model, is defined as ρ = ρ 0 r n s r 2 + r 2 c γ/2 This is an extension of the "cuspy halo model" of Muñoz et al. (2001). This model also allows for a core in addition to a scale radius, where r c < r s . In the limit of large r, the logslope is given by n, whereas in the limit of small r and zero core, the log-slope is given by γ. Note that in this model, both the core and scale radius are added in quadrature to r, resulting in a more rapid turnover compared to the cNFW model at the scale and core radii. As a result, the profile does not reduce exactly to NFW in the limit r c → 0. On the other hand, the greater flexibility afforded by the variable inner and outer log-slopes may become useful when combining strong lensing with data that probe the density profile on larger scales, e.g. weak lensing or X-ray data. For the purposes of this paper, however, this profile provides a comparison of how sensitive the core constraints are to the exact nature of the turnover behavior of the density profile near r c . As we are primarily interested in the behavior in the region interior to r s , we fix n to 3 in this work, to match that of an NFW profile. The relevant lensing formulas are given in Appendix A.
For each of the above lens models, we add ellipticity by making the replacement R 2 → qx 2 + y 2 /q in the projected density profile. The deflection and Hessian of the lens mapping must be calculated by numerical integration (see Schramm 1990;Keeton 2001a), which is computationally expensive. While the integrals can be done relatively quickly using Gaussian quadrature, it is not known a priori how many points will be required for the integral to converge beyond a specified tolerance. This can be solved by an adaptive quadrature scheme where the integration is done at successively higher orders and an error estimate is obtained after each iteration, then stopping when the error falls below a specified tolerance. To implement this, we employ a modification of Gaussian quadrature known as Gauss-Patterson quadrature, which consists of nested quadrature rules whereby a given order of integration retains the function evaluations from the lower orders, thus ensuring they are not wasted (at the cost of allowing up to a maximum order of 511 points). 1 For lensing calculations, we find this adaptive quadrature scheme requires nearly an order of magnitude fewer function evaluations compared to Romberg integration for a tolerance ∼ 10 −3 . This reduces the expense of lensing calculations enormously for elliptical projected density profiles.
The mass distribution of the dark matter halo in strong lenses has been shown to be consistent with elliptical isodensity contours in several studies (Yoo et al. 2005(Yoo et al. , 2006Kochanek & Dalal 2004). Hence, when modeling the cluster halo, there is strong motivation for introducing ellipticity into the projected density profile as described above. However, because of the computational burden of performing the integrations for elliptical density profiles, it is common to instead use the "pseudo-elliptical" model in which the halo ellipticity is incorporated into the gravitational potential rather than the projected density (Golse & Kneib 2002). Here, we consider both approaches, and will compare the pseudo-elliptical approximation to the full elliptical density approach.

MOCK DATA MODELING
We are interested in determining the capabilities of strong lens modeling for inferring the cluster dark matter halo properties. The following questions guided our choice of mock data sets.
(i) Is it possible to distinguish between a core and a cusp with strong lensing alone?
(ii) To what extent can central images help in determining the inner density profile?
(iii) How do inaccuracies in the outer density profile affect the result? Do they lead to a spurious detection of a core or cusp?
(iv) Does the use of an elliptical potential rather than using a elliptical density profile lead to inaccurate results?
Modeling the mock data allows us to test the power of models to constrain relevant parameters in a controlled way.

Mock Data Preparation
To test the ability of the lens models and software to constrain the relevant halo and BCG parameters six sets of mock data were created, as follows: • Cuspy, no central images • Cored, no central images • Cuspy, with central images • Cored, with central images • Cored, no central images, highly elliptical halo (axis ratio = 0.5) • Cuspy, no central images, highly elliptical halo (axis ratio = 0.5) 1 The algorithm described above is quite similar to adaptive Clenshaw-Curtis quadrature except it is an open interval quadrature rule, thus dodging the issue of having to evaluate the projected density or its derivative at r = 0.
The mock data sets were constructed to be similar in nature to Abell 611 in most respects, including mass, redshift, position angle, offset and ellipticity. To examine the usefulness of central images in constraining system parameters, two image sets were generated for each of the cored and cuspy cases; one image set included positive parity central images and one did not. The input parameters of the mock data objects are summarized in Table 1, and calculated parameters are shown in Table 2.
The mock data sets consisted of a dark matter halo and a bright central galaxy, offset by ∼ 1 , each at a redshift of 0.288, matching the redshift and inferred offset of Abell 611 (D11). The dark matter halo was modeled by a cNFW profile. The scale radius ("r s ") was chosen to be 50 and the halo mass to be 1.1 × 10 15 M , similar in magnitude to Abell 611 and other galaxy clusters. The dark matter halo is oriented 132.5°counterclockwise from the x-axis. In the cored cases only, a uniform density core is modeled with a transition radius of 10 .
We chose an axis ratio of 0.8 for primary mock data sets, as that is similar to that of typical clusters. But since galaxy clusters can sometimes have highly elliptical structure (see Richard et al. (2010), Table 7), we constructed two separate mock data sets with a highly elliptical dark matter halo (axis ratio = 0.5) to investigate the effects of more severe ellipticity on the model inferences, which we discuss in Section 3.4.
The BCG was modeled with a dual pseudo-isothermal ellipsoid (dPIE) profile (defined in Appendix A2) that allows separate specification of the tidal break radius and core radius. The QLens mass parameter for dPIE profiles,"b", can be expressed as where σ 0 is the central velocity dispersion, G is the gravitational constant, Σ crit is the critical surface density of the lens at the relevant redshift, r cut is the tidal cutoff radius and r c is the core radius. Therefore, the mass parameter "b" roughly corresponds to the Einstein radius (and reduces exactly to the Einstein radius as r c → 0), and b ∝ σ 2 0 . The BCG was given a stellar mass of 1.34 × 10 12 M , which corresponds to b = 0.60, and a small core of uniform density, with a radius of 0.05 , similar to that noted in Abell 611 (D11).
The tidal break radius of the BCG in Abell 611 found in recent literature is between 10 and 20 (Newman et al. (2009) and (D11)); a value of 15 was chosen for this mock data analysis. The axis ratio was chosen at 0.75, oriented at an angle of 72.5°, which is 60°clockwise from the halo orientation. The BGC was positioned at (0. 5, −0. 9) in the image plane. This corresponds to an offset from the dark matter halo by ∼ 4 kpc, consistent with the offset found for Abell 611 in D11 and Newman et al. (2009), and similar to that of other clusters (Newman et al. 2013a).
Sources for the mock data were chosen in four redshift groups: 0.908, 2.00, 2.06 and 2.59, respectively, similar to source redshifts Abell 611 (D11). For each redshift, one or two compact sources were created within an area of approximately 1 , with two to three source points each. Simulated position errors with a standard deviation of 0. 2 were incorporated into the image positions. Figure 1 shows the source  plane and image plane representations for the cored and cuspy data sets.

Mock Data Lens Model
A lens system model was constructed with two lens objects, one to represent the halo and one to represent the BCG. As in the mock data preparation, the halo lens was modeled with a cNFW profile and the BCG lens with a dPIE profile. The systems were analyzed using a nested sampling algorithm with 1,000 live points. There are 8 free parameters (7 for the dark matter halo and one for the BCG). Source plane χ 2 evaluations were used in the beginning of each run, with a switch to image plane evaluations occurring mid-run. The image plane χ 2 is calculated as follows: where i is the image index, σ i are the image position uncertainties, x obs,i is the observed image position and x mod,i is the modeled image position. A similar χ 2 can be calculated in the source plane, as described in Keeton (2001b). We used uniform priors for the parameters as shown in Table 1. All of the models showed a close match between the data images and the model images. An example final fit image is shown in Figure 2.

Data without Central Images
We first consider as our baseline a dataset with no central images included, as might be expected for a cluster system with a bright object near the center that would obscure such central images. Figure 3 shows the posterior probability profiles for the cNFW "r c " parameter (core break radius) for the six cases. For the cored data set with no central images (dotted curve), the median value of core radius (true value of 10. 0) is 7. 0, with a 1-σ lower bound of 4. 8. For the cuspy data set (true core radius of zero), the median fit value is 0. 3, with a 1-σ upper bound of 0. 6. The model is able to accurately distinguish between the cored and cuspy cases.
Triangle plots showing the posterior distributions of the parameters can be found in Figures C1 and C2 in Appendix C. The parameters are successfully recovered, with all of the true values of the parameters within ∼ 1 standard deviation of the best-fit posterior value.

Data with Central Images
It is interesting to look at the same analysis with the central images included to see how useful the central images are in constraining parameter values. The solid curves of Figure 3 provide an illustration of this. For the cored data set, the best fit value of core radius is 8. 5, with a 1-σ lower bound of 6. 5. For the cuspy data set, the best fit value is 0. 05, with a 1-σ upper bound of 0. 12. Clearly, the central images greatly enhance the ability of the model to accurately constrain the core radius. The other parameters follow a similar pattern, with the mass and scale radius parameters of the dark matter halo determined with some uncertainty (but with more certainty than in the cases without central images), while the axis ratio, position angle and centroid coordinates are determined with high certainty.

Constraints on Surface Density
One measure of the utility of the model is the ability to accurately reproduce the (2-dimensional) surface density of the cluster. Here we examine scaled surface density, κ(r) ≡ where Σ(r) is the surface density and Σ crit is the critical surface density for the pertinent lens and source redshifts. Figure 4 shows the circularly-averaged κ versus radius for the cored and cuspy data sets, both with and without central images. Surface density is very accurately determined in the region where images are present, and the presence of central images enhances the accuracy of the predictions in the inner regions. Only in the radial regions far from the images does the predicted κ deviate significantly from its true value.

Pseudo-Elliptical Approximation
As discussed in Section 2, an often-employed approximation is to use an elliptical form for the gravitational potential of the object rather than the density itself (Golse & Kneib 2002). The limits of validity for that approximation is given in Golse & Kneib (2002) to be for the range of ellipticities 0.25, which corresponds to an axis ratio q 0.75. Here we compare the results of such an approximation to that of using the true elliptical density. The dot-dashed lines in Figure 3 show the cored and cuspy cases (without central images) but using the pseudo-elliptical approximation, for the mock data set with axis ratio q = 0.8. The pseudo-elliptical model has somewhat less power to resolve the different cases than the model that uses the full elliptical density (solid lines in the figure). Specifically, in the cored case, where the true value of the core radius is 43.3 kpc, the core radius posterior for pseudo-elliptical model peaks at ∼20 kpc, while the true elliptical model shows a peak at ∼30 kpc. In the cuspy case, where the true core radius is zero, the pseudo-elliptical model produces a peak posterior at ∼3 kpc, whereas the true elliptical model has a peak at ∼1 kpc.
Turning now to the mock data sets with high ellipticity (i.e., axis ratio q = 0.5), Figure 5 illustrates the posterior distribution results for the mass, concentration and core radius Figure 2. Example of a final fit image for the mock data. This fit is for the cuspy halo, without central images. The data points are shown in red, and the modeled images in cyan. The points appear as purple when the best fit model and data images overlay. The unmatched model images near (0, 0) are positive-parity "central" images, which are typically unobservable due to low magnification and/or obscuration by bright objects in the center of the cluster.
parameters. For this very elliptical halo, the models using the full elliptical density profile recover the parameters well, with the true value of all parameters located within the 1-σ posterior contours. In contrast, the pseudo-elliptical approximation does not accurately recover the input parameters. In the cuspy case, the true value for the halo mass is outside the 2-σ contour of the posterior. In the cored case, the true values for halo mass and concentration are both well outside the 2-σ contours of the posteriors. As an example of how this could bias inference of core size, if weak lensing or X-ray constraints were used that constrain halo mass and concentration to be close to their proper values, this will in turn cause r c to be biased low. The lower left posterior in Figure 5 demonstrates that if the halo mass were fixed to its (correct) value of 1.1 × 10 15 M , the value of r c would be inferred at approximately 5 , half as large as the true value (10 ). This illustrates the dangers of combining different probes to obtain core constraints if systematic errors are present in the lens model.

Model Dark Matter Halo Radial Density Profile
In order to explore the importance of the shape of the assumed halo profile in detecting cores, an alternative set of mock data was constructed using a Corecusp profile for the dark matter halo rather than a cNFW profile. The Corecusp profile is very similar to cNFW but has a faster transition between the inner and outer slopes. (Refer to Section 2 for definitions of the various density profiles.) The Corecusp parameters for scale radius and core radius were the same as those used in the cNFW mock data, i.e., 63. 7 and 10 , respectively. The slope parameters for inner and outer log- arithmic slope were set to -1 and -3, respectively, matching those of a cNFW profile. Figure 6 shows the results of fitting a cNFW dark matter halo lens to mock data constructed with a Corecusp dark matter halo, with no central images. The fit is very good in the region of radius values of 10 to 50 , close to where the images are located. However, in attempting to fit that area as well as possible, the model predicts a cored halo for the region interior to approximately 1 , when in fact the halo is cuspy. In the case of the cored halo, the model does predict a core but overestimates the surface density in the core by approximately 40%. Without data in the inner regions to guide it, predictions become unreliable there.

Summary of Mock Data Results
By modeling these mock data, it is clear that the surface density of the cluster, and thus cored or cuspy characteristics, can be well-predicted in the radial regions having image data points. Central images provide data in the inner regions (1 to 8 in our model) and thus enhance accuracy there. Using a profile shape that sufficiently approximates . Plots of scaled surface density (κ) versus radius for the mock data models. The use of the central images in the fitting enables a tighter constraint to the mass density in the inner region. The median (50th percentile) posterior value of the parameter set is shown as a solid red line, and the 16-to 84-percentile band is shown in gray. The true parameter value is shown as a dashed blue line. The radii bands in which images are located are highlighted in red. A reference redshift of z = 1.49 is used in the calculation of κ. The plots are slices that are averaged over 360°. The BCG can be observed as the bump at a radius of approximately 1 .
[From top]: Cored without central images, cupsy without central images, cored with central images, cuspy with central images.
the true halo profile is important, as mismatches can lead to inaccurate predictions in regions devoid of image data. The use of the pseudo-elliptical approximation can lead to inaccuracies in parameter recovery exceeding 2-σ for highly elliptical halos.

MODELING OF ABELL 611
Having tested the sensitivity of our model to varying mass profiles using strong lensing alone, we now turn to the real cluster data. We test for the presence for a core by fitting the data to two different profiles: cNFW and Corecusp.

Abell 611 Data
The data for the modeling of Abell 611 were taken from sources A and B in Table A.1 of D11. Their originally reported redshifts were 0.908 and 2.06, respectively, however subsequent analysis (see Newman et al. 2013a;Belli et al. 2013) indicate that the correct redshift for source A is 1.49. We have adopted that value. Source C consists of two points at a reported redshift of 2.59, however the inclusion of this source in our model led to a large shift of the centroid of the dark matter halo, which was inconsistent with the findings from D11 and Newman et al. (2009). This source also has the weakest photometry in the data set, with HST F606W magnitude fainter than 27.0 (Richard et al. 2010). Therefore, we decided not to include Source C in the model. Source D is a 4-image source with no confirmed spectroscopic redshift. D11 elected not to include this source in their models, and we also exclude it. We did test the inclusion of sources C and D, and found a substantial increase in the resulting χ 2 of the model. We set the origin to be the coordinates of the BCG as given in Table 4 of D11 (J2000: 120.236 78°, 36.056 572°).
The resulting data set contained 25 images in set A and 24 in set B, for a total of 49 images of 13 source points. We adopted 0. 2 as the position error value, as did D11 (but see Section 4.5.1 for a discussion of the importance of that assumption). The images are located in a range of 7 to 28 from the BCG center.

Abell 611 Lens Model
Following D11, we constructed a lens model with a dark matter halo, BCG, and seven other perturbing lenses. The model parameters are described in Table 3. Since we are using various density profiles for the dark matter halo, we need a consistent way to compare concentration and core radius, and have therefore adopted the following definitions. We define "core radius" as the radius at which the logarithmic slope of the density is -1, i.e., We define the concentration aŝ where r 200 is the radius at which the density is 200 times the critical density of the universe at the redshift of the lens, and r −2 is the radius at which the logarithmic slope of the density profile is -2. Note that for the cNFW profile, in the limit r c → 0 where the profile reduces to NFW, we have r −2 = r s . The Abell 611 system was analyzed using the MultiNest sampling algorithm (Feroz et al. 2009) with 4,000 live points. There were 14, 15 and 13 free parameters for the cored NFW, Corecusp and NFW profiles, respectively. Considering the 49 image points being generated from 13 sources, we then have

Cluster Halo and BCG Models
We studied three mass profiles for the cluster halo: cNFW, Corecusp and NFW. Uniform priors were used on all free halo parameters except the core scale parameter r c,k pc , for which a log prior was used. The BCG was modeled as a dPIE profile with mass parameter b left free, and other parameters fixed at the values given by Newman et al. (2013a). Table 3 summarizes the parameter values and ranges.

Cluster Member Models
The seven perturbing lens elements were modeled with dPIE profiles, allowing for separate specification of their mass, core size, cutoff radius, axis ratio, orientation angle and centroid. Perturbers 1 and 2 are quite close to image groups B.4 and B.5, allowing a stronger constraint on their Ein- stein radii. As such, the mass and cutoff radius parameters for those perturbers were left free. To avoid a proliferation of parameters, the mass and cutoff parameters for perturbers 3 through 7 were anchored together, allowing two parameters to specify the mass and scale for that group. The "b" parameter of the dPIE lens is proportional to the lens mass and varies as the square of velocity dispersion (see Equation 3). Faber & Jackson (1976) show that velocity dispersion scales as L 1/4 , so the relevant scaling relation is The mass parameters for perturbers 3 through 7 are anchored together according to this relation. The luminosities are shown in Table B1. Similarly, the cutoff radii and core radii were scaled using and The exponent 1/2 in the two equations above correspond to a constant mass-to-light ratio among perturbers 3 through  Table B1 for these values.
7. The cutoff radius normalization for those perturbers was a free parameter. As the core radii are difficult to constrain without visible images near the core of the cluster member, they were fixed according to a normalized core radius of 0. 035 for an ST magnitude of 18.0, matching the assumption of D11. For the the centroid locations, axis ratios, and orientations of the perturbers, D11 used GALFIT to determine those values, and we adopt them. They are shown in Appendix Table B1.

Abell 611 Results
The resulting fits were very good for all the profiles modeled. The reduced χ 2 for the fits range from 0.28 and 0.30. The cNFW is our baseline model and is marginally favored by the Bayesian evidence (ln(Z) = -44.8) over the Corecusp model (ln(Z) = -46.6). The resulting best-fit images for the cored NFW profile model are shown in Figure 7, and the key best-fit parameters for all models are shown in Table 4. The posterior distributions for the cored NFW and Corecusp models both exhibit a bimodal solution for the lens model parameters. There is a clear "small-core" mode, with a core radius < 1 , and a "large-core" mode, with core radius ∼ 15 . Two-dimensional posterior distribution plots for selected parameters for the cNFW and Corecusp cases are included in Appendix Figure C3 and Figure C4, respectively. The χ 2 for the best fit points of each of the two modes are very similar: 16.5 for the small-core cNFW mode and 17.1 for the large-core cNFW mode. The small-core mode is associated with a higher halo mass and a lower BCG mass, while the large-core mode is reversed in that regard.
There is clearly a degeneracy between the halo core and BCG mass, as the halo and BCG are nearly co-centered (approximately 1 apart), and it is the combination of their masses that determines the surface density and hence deflection angles. In evaluating these two modes, we can ask whether the resulting BCG mass is consistent with prior constraints on early-type galaxies. The luminosity of the BCG was found to be 5.47 × 10 11 M in V-band (Newman et al. 2013a), which, given the median BCG masses in Table 4 would imply best-fit stellar mass-to-light ratios of 9 and 13 for the small-core and large-core modes, respectively. At the low-mass end, we infer the small-core BCG mass 3.0 × 10 12 M at 95% CL, equating to a minimum stellar mass-to-light ratio of 5.5; for comparison, the large-core mode requires a BCG mass 5.6 × 10 12 M at 95% CL, equating to a stellar mass-to-light ratio of 10.3. Such high mass-to-light ratios imply that both solutions require a steep stellar initial mass function (IMF). However, as we will show in Section 4.3.1, the IMF slope required by the small-core mode is consistent with recent constraints from high-mass early-type galaxies, whereas the large-core mode is inconsistent with these constraints.
As might be expected, the NFW halo model produces nearly identical posteriors to the small-core mode cNFW solution, albeit with a slightly higher χ 2 (17.0 versus 16.3 for cNFW). The concentrationĉ 200 is 4.1 for the small core solution, consistent with previously studied mass-concentration relations (Neto et al. 2007), although it should be noted that those relations were created for systems with NFW profiles, and may not be easily compared to other forms of profiles that have cores. Interestingly, the small-core mode of the Corecusp model prefers an inner slope of 1.05, very similar to an NFW inner slope. However, the Corecusp solution is more concentrated (ĉ 200 = 6.2) and has a much smaller halo mass, as can be seen in Table 4.
The resulting posterior distributions for many parameters are similar between the cored NFW profile and the Corecusp profile, and are unimodal. These include orientation angle (θ), centroid location (x c , y c ), axis ratio (q) and scaled surface density (κ). In particular, κ tot at 5 is very well constrained and is remarkably consistent between the models, varying between 1.28 and 1.32. A plot of κ versus radius is shown in Figure 8, with cNFW and Corecusp posteriors separated into their large-core and small-core components. Their median values are in close agreement in the range of radii between 1 and 30 .

Implications for the Stellar Initial Mass Function of the BCG
The high stellar mass-to-light ratio we have inferred for the BCG would suggest a fairly steep stellar initial mass function. Given the high central dispersion of the BCG, this is not suprising: many authors in recent years have inferred a bottom-heavy IMF in early-type galaxies, using either spectral lines (La Barbera et al. 2013;Cappellari et al. 2013;Conroy et al. 2017;Lyubenova et al. 2016;Conroy & van Dokkum 2012) or strong lensing (Leier et al. 2016; New- man et al. 2013a). In La Barbera et al. (2013) and Cappellari et al. (2013), spectra from a large sample of early-type galaxies (SPIDER and ATLAS-3D, respectively) were analyzed, revealing a trend in the IMF log-slope: galaxies with low central dispersions show a shallow slope consistent with a Chabrier/Kroupa IMF, whereas galaxies with higher dispersions show a steeper slope, comparable to or even steeper than that of a Salpeter IMF. This begs the question, are either our small-or largecore solutions compatible with constraints on the IMF in early-type galaxies? These solutions require stellar mass-tolight ratios of at least M * /L V 5.5 and 10.3, respectively (at 95% CL). Since M * /L V depends on other factors besides the IMF (e.g. metallicity, stellar ages), one way to compare IMF constraints is to define the "IMF mismatch" parameter α salp = (M * /L V )/(M * /L V ) salp , where (M * /L V ) salp is the mass-to-light ratio generated by a Salpeter IMF. Positive α salp values then would imply an IMF that is more bottomheavy compared to Salpeter. In Newman et al. (2013a), the stellar mass-to-light ratio M * /L V for Abell 611 was estimated using the BCG colors and a stellar population synthesis model. Under the assumption of a Salpeter IMF, they infer (M * /L V ) salp = 3.98. Using this, our small-and largecore solutions require α salp values of at least 1.38 and 2.59, respectively, at 95% CL.
In Cappellari et al. (2013) a trend line is fit to log α salp as a function of dispersion (see their Figure 13), with corresponding lines for 2.6σ scatter. Using the fact that the luminosity-weighted dispersion of the BCG within its halflight radius is 306 km/s, we estimate the median value for α salp ≈ 1.0 with the ±2.6σ bounds at 0.5 and 2.0. Lyubenova et al. (2016) find a similar range using galaxies in the CAL-IFA survey, for which α salp lies in the approximate range 0.6-1.5, while Leier et al. (2016) find a similar range 0.5-1.5 in the SLACS lens sample. Given these constraints, it is evident that our small-core solution (α salp 1.4) is compatible with current constraints, whereas the α salp 2.6 required by the large-core solution lies beyond the upper bound for all the surveys mentioned here. Next, we go further and estimate the constraint on the slope of the IMF of the BCG in Abell 611 from our lensing analysis. We will model the IMF using a double power-law model, ξ(M) ∝ M 1−Γ where Γ = 1.35 (Salpeter) for M > M while Γ for M < M will be freely varied. This is identical to one of the models used in Leier et al. (2016) and one of the parametric models used in Conroy et al. (2017). To relate the mass-to-light ratio to the IMF slope Γ, we have where L V (M) is the V-band stellar luminosity-mass relation.
For the lower mass cutoff we adopt the usual convention M low = 0.1M , and the high mass cutoff M high will be determined by the particular isochrone used. For the massluminosity relation we use the Padova isochrones (Girardi et al. 2004) assuming metallicity Z = Z (the same choice was adopted in Newman et al. (2013a)), and consider a few different stellar ages. To account for the fact that the stellar ages inferred for Abell 611 may differ from our choices here (along with possible slight differences in the SPS model used), we will write L V (M) = λL V,0 (M) where L V,0 (M) is generated from the isochrone, and λ is a correction factor which we expect to be close to 1 if the correct median stellar age is assumed. For a given assumed stellar age, we perform the integration in Eq. 10 by interpolating over a table of values in L V,0 (M) generated from the isochrone, then solve for λ using the Newman et al. (2013a) values (M * /L V = 3.98, Γ = 1.35).
With λ in hand, we can then use Eq. 10 to plot the IMF slope Γ needed to produce a given stellar mass-to-light ratio. In practice, we find that the results are nearly identical regardless of stellar age (we tried ages in the range of 6-10 Gyr), since the luminosity is sensitive to the IMF slope at high stellar mass which is still Salpeter in our model; since λ ≈ 1 for 8 Gyr, we adopt this stellar age in the following. In Figure 9 we plot α salp as a function of IMF slope Γ, while on the right side is plotted the posteriors of the small-and large-core solutions in α salp . Note that the curve equals 1 at Γ = 1.35, since the above procedure enforces consistency with the Newman et al. (2013a) results. From this figure we see that an IMF slope Γ ≈ 1.5 is required to produce a α salp ≈ 1.4, which is close to the 95% CL lower limit required by our small-core solution. By contrast, a slope Γ 2 is required to be consistent with the large-core solution. Among the SLACS lenses, Leier et al. (2016) found that for the double power law IMF, a slope of 1.7 implies a dark matter fraction of zero, and hence steeper slopes are ruled out. Indeed, Chabrier et al. (2014) argue that in the most extreme starburst conditions, the IMF "saturates" at a slope Γ ≈ 1.7. Recently Conroy et al. (2017) investigated a galaxy with a similar dispersion to ours (NGC 1407) and found Γ = 1.7, possibly reaching the saturation limit. While the small-core solution is consistent with these constraints, the large-core solution clearly is not, painting a consistent picture with the above constraints on α salp .
Our inference of a stellar IMF slope Γ 1.5 carries some important caveats. First, any inference about the IMF slope depends on the form of the IMF used. A popular variant is the "bimodal" IMF (Vazdekis et al. 1996), favored in several recent studies (Lyubenova et al. 2016;Leier et al. 2016;La Barbera et al. 2016), which uses a variable slope Γ b at M > M whereas the slope tapers to zero at low masses. As an additional check, we repeated the above analysis using the bimodal IMF and found that a slope Γ b 3 is required for the small-core mode. This is near the upper limit of the ranges observed in surveys (Lyubenova et al. 2016 find a maximum Γ b ≈ 3.1, while La Barbera et al. 2013 infer Γ b ∼ 3.0 for ∼ 300 km/s dispersions); again, the large-core mode requires a much higher slope and hence is likely ruled out.
Another caveat is that the IMF slope may vary with radius, as recent studies have suggested (La Zieleniewski et al. 2017;van Dokkum et al. 2017). This is important because the presence of the BCG in the total projected density is only distinct out to ∼ 0.3 times the effective radius, as can be seen from the size of the "bump" in Figure 8 (note that the effective radius is ∼ 10 arcsec). Thus, we may only be sensitive to the stellar mass in the inner regions, where the mass-to-light ratio is high. If the IMF indeed becomes shallower further out, then the total M * /L V may potentially be lower than we infer from the strong lens modeling. It would also imply that the stellar mass profile becomes steeper than the light profile at larger radii, which could be an important systematic when inferring stellar masses from lensing. Allowing for a possible steepening of the stellar mass profile relative to the light profile when doing the lens modeling (to account for this systematic) is left to future work.

Performance of the Pseudo-Elliptical Model
The pseudo-elliptical cNFW model yielded a best-fit χ 2 that was slightly higher than the corresponding elliptical cNFW fit, and the resulting posteriors were generally similar for most parameters. Interestingly, the median posterior value for the BCG mass was approximately 50% higher when using the pseudo-elliptical approximation. This is remarkable given that the ellipticity is not extreme: the inferred ellipticity of the potential contours is ≈ 0.19 (where is defined the same as in Golse & Kneib 2002), which is low enough that it might appear "safe" to use the pseudo-elliptical approximation. By contrast, the cNFW fit inferred an axis ratio q ≈ 0.67 for the density contours, markedly lower than one might have naively expected from the pseudo-elliptical fit. (We note that in our mock data runs, there did not appear to be systematic difference in the BCG mass when using the pseudo-elliptical approximation, even in the case of high ellipticities.) The inferred BCG mass using the pseudo-elliptical model makes the best-fit stellar mass-to-light ratio even higher (≈ 13 for the small-core mode), making it much harder to reconcile with IMF constraints for early-type galaxies. We conclude that the pseudo-elliptical model can bias the results significantly even if the inferred ellipticity is not extreme, and hence modeling lenses with true elliptical density contours is strongly preferred. Newman et al. (2013a,b) used long-slit spectroscopic observations of velocity dispersion in the BCG and spherical Jeans equation analysis to find a χ 2 for those stellar kinematic data, which is then incorporated into their overall fit. The attraction to this approach is that it incorporates constraints from the inner region of the cluster, where there are no strong lensing images due to the bright BCG image. They assume that the BCG is centered at the same location as the dark matter halo and that the system is spherical. For their fiducial case they assume an isotropic system, i.e., β aniso = 1 − (σ 2 θ /σ 2 r ) = 0, but they also ran models for β aniso values between -0.2 and +0.2, with constant values of β in all cases. We adopt their dispersion observations and error values. However, we excluded the innermost point from the analysis, as that point is subject to systematic error from slit and seeing effects that are greater than the observational error (A. B. Newman, personal communication, December 7, 2018). We then apply the spherical Jeans analysis.

Consideration of Stellar Kinematic Data
As a starting point, we used a cNFW model similar to our baseline but adopt a fixed BCG mass of 1.5 × 10 12 M sun , which is similar to the mass found in Newman et al. (2013a) and D11. We then produced a "large core" chain, with r core > 10 , and a "small core" chain, with r cor e < 3 . We analyzed an isotropic case with β = 0 as well as mildly radially and tangentially biased cases with β = ±0.2.
Following Cappellari (2008), the velocity dispersion over the line of sight can be found from where Σ BCG is the BCG surface density (derived in our case from the 3D dPIE profile), ρ BCG is the dPIE density profile, M(r) is the mass of all components generating the potential and F (r, R, β) is an analytic function derived in Cappellari (2008). The velocity dispersions assuming β = 0 at all radii is shown in Figure 10. The small core and large core cases both provide plausible fits to the stellar kinematic data. As β is varied over a modest range of -0.2 to +0.2, the fits change from favoring a small core to favoring a large core. These represent a constant value of β at all radii; if β were allowed to vary with radius, even within these modest bounds, a wide variety of solutions could be accommodated. In Schaller et al. (2015), they examined six simulated clusters similar in size and character to Abell 611 and found that β did indeed vary beyond the range of -0.2 to +0.2, and could vary significantly over the radius of a cluster. We conclude that the velocity dispersion data does not provide a meaningful constraint for our purpose of discerning core size, because a wide range of core sizes can be fit by the data with only minor variations in anisotropy, and assuming β = 0 fits both large and small cores equally well. We note that the data in this case extends only to ∼ 5 , whereas the half-light radius of the BCG is 10. 7, limiting the influence of the data. However, as data becomes available at larger radii and with less noise, it will offer more constraining power. Also, with more data, it may be possible to use two-dimensional kinematic analysis and/or higher order moments of the velocity dispersion to constrain the anisotropy.

Potential Systematic Errors
Variations on the baseline models were created in order to examine the possible effects of systematic errors. One such possible source is external shear from perturbing galaxies that are close in projection to the line of sight to the lens. We found that including external shear had the effect of changing the posteriors for the halo orientation angle θ by as much as ∼ 10°, as well as the centroid coordinates x c and y c , in some cases by more than 1 . However, we did not observe significant effects on the posteriors of other parameters, and including external shear did not significantly improve the fit.
The type of prior distribution can also impact the modeling results. As our baseline, we used uniform priors with wide ranges (see Table 3), except for the core radius parameter, for which we used a log prior. We tested the impact of using log priors for the mass parameters of the BCG and perturbers. These did not have a significant impact on the resulting model posteriors or fit metrics.
Another source of systematic error is the triaxiality of the cluster, since lensing is only sensitive to the projected mass. Depending on the projection and axes ratios, the projected ellipticity could vary significantly. We do not yet have the models in QLens to take this complication into account. This issue also becomes important when comparing to other probes such as weak lensing (sensitive to projected outer halo shape) and velocity dispersion measurements (sensitive to the 3d mass profile in the inner region) (Newman et al. 2011).

Comparison to Other Works
Abell 611 has been studied by numerous other groups, utilizing a variety of techniques, including strong lensing, weak lensing, X-rays and stellar kinematics. Our emphasis is strong lensing, so here we focus our comparison with strong lensing results of others where possible. The work of D11, Newman et al. (2009);Newman et al. (2013a,b); Monna et al. (2017) are particularly relevant.
The predicted value for core size varies significantly in the literature. Monna et al. (2017) use velocity dispersion measurements of 17 cluster members in their strong lensing analysis, and infer a core size of 5.8 +2.0 −1.6 , although they assume a dPIE profile for their halo. Their result has a reduced χ 2 of 0.7 and they assume position errors of 1 . D11 uses an NFW halo and so does not examine core size. Newman et al. (2013b) find a core size of ∼ 0.7 in their cNFW model, in which they have a reduced χ 2 of ∼ 1 and they assumed position errors of 0.5. . Our preferred solution (i.e., cNFW, small-core mode) is for a core size of <1 , with a reduced χ 2 of 0.28 and assumed position errors of 0. 2. In our preferred model (cNFW, small-core), the median posterior value of the dark matter halo mass M 200 is 1.2 × 10 15 M . As several of the prior analyses (Donnarumma et al. 2011;Newman et al. 2009;Richard et al. 2010) used an incorrect value for the redshift of one of the sources, their strong lensing mass results are not directly comparable, so instead we compare to their weak lensing and X-ray results. The X-ray analysis of D11 found a mass of ∼ 1 × 10 15 M . Newman et al. (2013a), which did use the correct source redshifts, found a halo mass of 8.3 × 10 14 M in their combined analysis. Romano et al. (2010) used two weak lensing techniques and various model profiles, and found M 200 to be in the range of 5.3 × 10 14 M −5.9 × 10 14 M , with moderate uncertainties. In Richard et al. (2010), their X-ray analysis puts the 2-D projected mass within R < 250 kpc as 2.06 × 10 14 M , while the same statistic for our model is 2.12 × 10 14 M , similar to theirs.

The Importance of Position Errors
The magnitude of assumed positional errors σ pos of the observed image positions directly impacts the χ 2 of the strong lensing model, as σ 2 pos appears in the denominator of the equation for χ 2 . This in turn impacts comparisons with other modeling methods. When combining strong lensing analysis with other approaches such as weak lensing, stellar kinematics or X-ray analysis, authors often assume a strong lensing positional error that accommodates possible deficiencies in the lens models (Newman et al. 2013b;Zitrin et al. 2015). In our case, this is not a consideration since we are only employing one type of analysis. In addition, systematic errors (see discussion in Section 4.4) are often difficult to quantify, and an attempt is sometimes made to account for those errors by increasing the assumed positional error, sometimes dramatically. . We made a model run with the positional error as a free parameter, resulting in a best-fit value of 0. 18. We ultimately adopted a position error value of 0. 2. Nevertheless, the reduced χ 2 of our model is quite low at 0.28 (although we did exclude images that would have raised that, as discussed in Section 4.1). Had we used higher values of position error such as 0. 5 or 1. 4, the reduced χ 2 would have been 0.045 or 0.0057, respectively.

CONCLUSIONS
Our main aim in this paper was to put robust constraints on the dark matter densities in the central regions of clusters using strong lensing alone. Constraints on the central dark matter density of clusters is critical for constraining the particle physics of self-interacting dark matter models.
We used simulated cluster data to test whether strong lensing data could distinguish between cuspy and cored data sets (with a core radius of 10 ), both with and without central images present. The non-central images were in the 10" − 30" range, in agreement with observed images. Our main findings from the analysis of mock data are as follows.
• It is possible to distinguish between the cored and cuspy data sets, even in absence of central images, provided the density profile and shape of the density contours are accurately modeled. For the cored halo mock data with core radius of 10 , we infer a core radius greater than 3. 89 at 95% confidence level. For the cuspy data set, we infer a core radius less than 1. 01 at 95% confidence level.
• Approximating the potential with a pseudo-elliptical model rather than using a true elliptical density can degrade parameter recovery for strongly elliptical halos. In the case of a dark matter halo with axis ratio q = 0.5, the halo mass and concentration parameters were both outside their 2-σ contours. Although the inferred core size was not significantly biased in this case, combining these results with other probes such as weak lensing to better constrain the mass distribution would likely bias the inferred core size significantly, illustrating a specific danger of combining multiple probes when modeling systematics are present.
• The use of a radial density profile with a different shape than that of the mock data caused the inferred surface density (hence, core size) in the regions void of images (i.e., either near the center or on the outskirts of the cluster) to be biased. We find this effect can be severe enough to make a cored halo appear cuspy and vice versa, even though the profile remains well-fit in the range of radii where the images are located. This systematic can be alleviated if visible central images are present in the data.
With these lessons, we modeled Abell 611 with two halo profiles ("cNFW" and "Corecusp") that allow for a variable core size, a model for the BCG and seven cluster members (see Section 4.2). Our main findings are the following.
• Both the cNFW and Corecusp models found similar solutions. The cNFW model has the lower χ 2 and is the preferred model with higher Bayesian evidence. Reduced χ 2 values of 0.28 and 0.30 were obtained for the cNFW and Corecusp models, respectively, even with a small value of assumed position error of 0. 2.
• A bimodal solution was found for key parameters such as core size, halo mass and BCG mass. The large-core solution did not allow for reasonable values of BCG stellar massto-light ratios, with (M * /L V ) > 10.3 at 95% confidence level. For the small-core solution, we found (M * /L V ) > 5.5 at 95% confidence level. The required (M * /L V ) for the large core solution is not consistent with the measurement of stellar mass-to-light ratios in ATLAS3D early-type galaxies. The required slope of the IMF for the large-core solution is also inconsistent with various inferences Conroy et al. (2017); Leier et al. (2016), as summarized in Figure 9. This evidence points to the small core as the reasonable solution for Abell 611, consistent with the finding of Newman et al. (2013b).
• We infer a bottom-heavy IMF for the BCG, with IMF log-slope Γ 1.5 (per logarithmic interval) for stellar mass below M , at 95% C.L. Since the lensing data are most sensitive to the BCG mass within 0.2 times the half-light radius, this result is consistent with recent studies that find an extreme bottom-heavy IMF at the centers of massive elliptical galaxies.
• Fitting the pseudo-elliptical halo model to Abell 611 results in an inferred BCG mass that is 50% larger compared to using the true elliptical density. This inflates the stellar mass-to-light ratio significantly, despite yielding a low inferred ellipticity ( ≈ 0.19), and illustrates the danger of fitting the pseudo-elliptical model even in cases where the inferred ellipticity may not be extreme.
• The scaled surface density (κ) at 5 is found to be 1.32 ± 0.01, and is a particularly well-constrained parameter in all models. We expect this to be a key constraint on models of self-interacting dark matter. The inferred core density and core size (for the preferred small-core solution) are consistent with those found previously by Newman et al. (2013b), whose results were used by Kaplinghat et al. (2016) to argue that Abell 611 prefers a self-interaction cross section over mass of about 0.06 +0.07 −0.03 cm 2 /g at a relative velocity of about 1500 km/s. Our robust inference of the core size in Abell 611 underscores the promise of density profile measurements in galaxy clusters to measure the self-interaction cross section of dark matter with a precision of 0.1 cm 2 /g or better.
• The existing kinematic data for Abell 611 only go out to about half of the half-light radius and thus are highly sensitive to the unknown velocity dispersion anisotropy parameter β aniso . With more data that provide constraints on velocity dispersion to 2-3 half-light radii, we expect that the velocity dispersion constraints will play an important role in constraining the mass-to-light ratio of the BCG and hence the underlying dark matter halo profile.
We have shown how gravitational strong lensing can be used to put robust constraints on the dark matter halo core size and core density in galaxy clusters. Our results for Abell 611 prefer a high central density and small core size. The corresponding constraint on the self-interaction cross section at velocities of about 1500 km/s is expected to be at the 0.1 cm 2 /g level, which would be the tightest constraint on the dark matter self-interaction cross section. Table B1. The values of the fixed parameters for the BCG and seven perturbers in the Abell 611 Lens Model. The magnitude values are in the ST magnitude system. The magnitude of the object is used to determine its core radius, cutoff radius and mass parameters via the scaling relations described in the text.  Figure C1. Posterior distributions and two-dimensional correlations using mock data for the cored case without central images. True parameter values are indicated in orange. Figure C2. Posterior distributions and two-dimensional correlations using mock data for the cuspy case without central images. True parameter values are indicated in orange.