A publishing partnership

The following article is Open access

AMBER: A Semi-numerical Abundance Matching Box for the Epoch of Reionization

, , , , and

Published 2022 March 15 © 2022. The Author(s). Published by the American Astronomical Society.
, , Citation Hy Trac et al 2022 ApJ 927 186 DOI 10.3847/1538-4357/ac5116

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/927/2/186

Abstract

The Abundance Matching Box for the Epoch of Reionization (AMBER) is a semi-numerical code for modeling the cosmic dawn. The new algorithm is not based on the excursion set formalism for reionization, but takes the novel approach of calculating the reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ assuming that hydrogen gas encountering higher radiation intensity are photoionized earlier. Redshift values are assigned while matching the abundance of ionized mass according to a given mass-weighted ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$. The code has the unique advantage of allowing users to directly specify the reionization history through the redshift midpoint zmid, duration Δz, and asymmetry Az input parameters. The reionization process is further controlled through the minimum halo mass Mmin for galaxy formation and the radiation mean free path lmfp for radiative transfer. We implement improved methods for constructing density, velocity, halo, and radiation fields, which are essential components for modeling reionization observables. We compare AMBER with two other semi-numerical methods and find that our code more accurately reproduces the results from radiation-hydrodynamic simulations. The parallelized code is over four orders of magnitude faster than radiative transfer simulations and will efficiently enable large-volume models, full-sky mock observations, and parameter-space studies. AMBER will be made publicly available to facilitate and transform studies of the Epoch of Reionization.

Export citation and abstract BibTeX RIS

1. Introduction

Cosmic dawn is a fascinating period in the first billion years of the Universe and is a frontier topic for both theoretical and observational explorations and discoveries. The reionization of hydrogen by the first stars, galaxies, and quasars drastically converts the cold and neutral gas into a warm and highly ionized medium. Galaxies most likely provided the bulk of the ionizing photons (e.g., Bouwens et al. 2015a; Finkelstein et al. 2015), but there may have been early contributions from Population III stars (e.g., Wu et al. 2021) and late contributions from active galactic nuclei (e.g., Madau & Haardt 2015). On large scales, the higher-density regions near radiation sources are generally reionized earlier than the lower-density regions far from sources in this inhomogeneous process (e.g., Barkana & Loeb 2004; Trac et al. 2008).

Our current understanding of when the Epoch of Reionization (EoR) occurred primarily comes from "A Tale of Two Optical Depths." The Thomson optical depth τ (cosmic microwave background, or CMB, photons scattering with free electrons) and the Gunn-Peterson optical depth τGP (Lyα photons scattering with neutral hydrogen) have long provided two major observational constraints. However, the latest measurements tell a starkly different tale than the early observations (e.g., Fan et al. 2002; Kogut et al. 2003). Planck Collaboration et al. (2020) recently inferred τ = 0.054 ± 0.007 from measurements of the CMB temperature and polarization angular power spectra, implying a late reionization midpoint at redshift z ≈ 7.7 ± 0.6 (e.g., Glazer et al. 2018). There are now multiple evidence of dark Lyα troughs extending down to z ≈ 5.5 in the spectra of high-redshift quasars (e.g., Becker et al. 2015; Eilers et al. 2018; Bosman et al. 2018, 2021). This suggests that reionization could have ended at z < 6 (e.g., Keating et al. 2020; Nasir & D'Aloisio 2020), later than previously assumed. When and how the EoR occurred exactly (i.e., the reionization history and process) are of primary interest.

There is a renewed emphasis on using radiation-hydrodynamic simulations (e.g., Gnedin 2014; Norman et al. 2015; Ocvirk et al. 2016; Semelin et al. 2017; Finlator et al. 2018; Doussot et al. 2019; D'Aloisio et al. 2020; Katz et al. 2021) to model the complex astrophysics of sources and sinks on small scales and using semi-analytical/numerical methods (e.g., Furlanetto et al. 2004; Mesinger & Furlanetto 2007; Zahn et al. 2007; Alvarez et al. 2009; Choudhury et al. 2009; Santos et al. 2010; Mesinger et al. 2011; Battaglia et al. 2013b; Hassan et al. 2016; Xu et al. 2017; Hutter 2018) to make theoretical predictions and mock observations on large scales. Radiative transfer (RT) simulations are usually limited to small box sizes L ≲ 100 h−1Mpc and a single one typically requires hundreds of thousands to millions of CPU hours to run on a supercomputer (e.g., Trac & Gnedin 2011). Fast and accurate approaches with larger box sizes are essential for parameter-space studies. These two approaches in tandem provide our best option in making forward progress in synergy with observations.

A variety of independent approaches are crucial for comparison with and interpretation of observations of the 21 cm signal, CMB, Lyα forest, and intensity mapping. However, there has been a lack of development and diversity in semi-analytical/numerical methods. The majority of codes, like the often-used 21 cmFAST (Mesinger et al. 2011), are based on the excursion set formalism (ESF; e.g., Bond et al. 1991; Furlanetto et al. 2004). In Zahn et al. (2011), we compare two RT simulations (McQuinn et al. 2007; Trac & Cen 2007) with two ESF methods (Mesinger & Furlanetto 2007; Mesinger et al. 2011). While the simulations statistically agree within 10% at all redshifts, the models agree within 50% only when compared at the same ionization fraction. There are larger statistical differences when compared at the same redshifts because of the disagreement in reionization histories. Previous semi-numerical methods have input astrophysical parameters such as the star formation efficiency, photon production efficiency, and radiation escape fraction. The output reionization history is an end product after many complex and uncertain calculations, and is not easily controlled. There should be flexibility to directly choose the reionization history as this is the primary question for studies of the EoR.

In this paper, we present a new semi-numerical Abundance Matching Box for the Epoch of Reionization (AMBER) code for modeling the cosmic dawn on large scales. AMBER allows users to directly specify the reionization history through input parameters, a useful feature for theoretical and inference studies that is not currently in other semi-numerical methods. Section 2 summarizes the Simulations and Constructions of the Reionization of Cosmic Hydrogen (SCORCH) project that is used to motivate and calibrate AMBER. Section 3 describes the semi-numerical methods in AMBER, including the novel technique for matching the abundance of ionized mass or volume. Section 4 compares AMBER against two other semi-numerical methods, and Section 5 presents some basic results. Further applications will be presented in upcoming work. For example, Chen et al. (in prep) will model and study the patchy kinetic Sunyaev–Zel'dovich (KSZ) effect, which is a promising probe of the EoR. We conclude in Section 6 and add supplementary material in the Appendix.

2. SCORCH

We first summarize the SCORCH project (Trac et al. 2015; Doussot et al. 2019; Chen et al. 2020) that is used to motivate and calibrate AMBER. SCORCH is designed to provide cosmological simulations, theoretical predictions, and mock observations to facilitate more accurate comparisons with current and future observations. In this section, we describe the N-body simulations (Section 2.1), the radiation-hydrodynamic simulations (Section 2.2), and the reionization-redshift fields (Section 2.3). The simulations are based on the concordance cosmological parameters: Ωb = 0.045, Ωm = 0.3, ΩΛ = 0.7, h = 0.7, σ8 = 0.8, and ns = 0.96.

2.1.  N-body Simulations

In SCORCH I (Trac et al. 2015), we run 22 high-resolution N-body simulations to quantify the abundance of dark matter halos as a function mass M, accretion rate $\dot{M}$, and redshift z during the EoR. The N-body simulations are run using an updated particle-particle-mesh (P3M) code with a hybrid halo finder. Box sizes in the range 10 ≤ L/(h−1Mpc) ≤ 400 are chosen to focus on the atomic-cooling halos, and two realizations of each box size are run to reduce sample variance. Each simulation contains Ndm = 20483 dark matter particles and has a particle mass resolution ${m}_{{\rm{p}}}=8.72\times {10}^{6}{\left(L/100\right)}^{3}\ {h}^{-1}{M}_{\odot }$.

We quantify and fit the halo mass function dn/dM, mass accretion-rate relation $\dot{M}(M,z)$, and halo accretion-rate function ${dn}/d\dot{M}$ for the redshift range z ≥ 6. The new fit for the halo mass function is 20%–40% more accurate compared to Tinker et al. (2008) at the high-mass end hosting in currently observable galaxies. We also model and study the galaxy-halo connection by abundance matching observed high-redshift galaxies with simulated dark matter halos. See Trac et al. (2015) for more details.

In SCORCH II (Doussot et al. 2019), we also run a high-resolution P3M simulation with Ndm = 30723 dark matter particles in a comoving box of side length L = 50 h−1Mpc to generate halo and galaxy catalogs for the radiation-hydrodynamic simulations. A halo finder is run on the fly every 20 million cosmic years to locate dark matter halos and build merger trees. With a particle mass resolution of mp = 3.59 × 105 h−1 M, we can reliably measure halo quantities such as mass and accretion rate down to the atomic-cooling limit for galaxy formation (T ∼ 104 K, M ∼ 108 h−1 M).

2.2. Radiation-hydrodynamic Simulations

In SCORCH II (Doussot et al. 2019), we run three radiation-hydrodynamic simulations with the same cosmic initial conditions, same galaxy luminosity functions, but with different radiation escape fraction fesc(z) models. The simulations are designed to have the same Thomson optical depth τ ≈ 0.06, consistent with recent CMB observations (Planck Collaboration et al. 2020), and similar midpoints of reionization 7.5 ≲ z ≲ 8, but with a different evolution of the ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$.

The radiation-hydrodynamic simulations are run with the RadHydro code, which combines N-body and hydrodynamic algorithms (Trac & Pen 2004) with an adaptive raytracing algorithm (Trac & Cen 2007) to directly and simultaneously solve collisionless dark matter dynamics, collisional gas dynamics, and RT of ionizing photons. The raytracing algorithm has adaptive splitting and merging to improve resolution and scaling. Each simulation has Ndm = 20483 dark matter particles, Ngas = 20483 gas cells, Nrt = 5123 RT cells, and up to 12 billion adaptive rays in a comoving box of side length 50 h−1Mpc. RadHydro has been used to simulate both hydrogen and helium reionization (e.g., Trac et al. 2008; Battaglia et al. 2013b; La Plante et al. 2017; D'Aloisio et al. 2020).

We model high-redshift galaxies using an updated approach that allows us to systematically control the galaxy distributions in the simulations while matching the observed luminosity functions from the Hubble Space Telescope (e.g., Bouwens et al. 2015b; Finkelstein et al. 2015). We populate dark matter halos with galaxies by abundance matching the number densities,

Equation (1)

where LUV is the galaxy UV luminosity and $\dot{M}$ is the halo mass accretion rate. Connecting the mass accretion rate to the star formation rate allows us to model and study the episodic nature of high-redshift galaxy formation.

The radiation escape fraction is allowed to vary with redshift in a power-law relation,

Equation (2)

where f8 is the average escape fraction at z = 8, and a8 is the power-law slope. Sim (a8 = ) 0 has constant fesc, and reionization starts latest but ends earliest out of the three models. Sim 1 has fesc(z) varying linearly with 1 + z and is an intermediate model. Sim 2 has fesc(z) varying quadratically, and reionization starts earliest but ends latest.

Table 1 summarizes the parameters for the three RadHydro simulations. Some additional parameters will be presented later in the paper: the reionization history parameters zmid, Δz, and Az in Section 3.1.1; the Weibull coefficients aw, bw, and cw in Section 3.1.2; and the radiation mean free path lmfp in Section 3.4.1. Each simulation takes approximately half a million CPU hours to run on a supercomputer, and they are infeasible for parameter-space studies. See SCORCH I and II (Trac et al. 2015; Doussot et al. 2019) for more details.

Table 1. RadHydro and AMBER Parameters

Model L (h−1Mpc) Ndm Ngas Nrt f8 a8 τ zmid Δz Az aw bw cw lmfp
Sim 05020483 20483 5123 0.1500.0607.954.682.906.621.851.143.0
Sim 1    0.1310.0607.915.452.696.242.261.203.0
Sim 2    0.1120.0607.836.542.335.543.011.333.0

Download table as:  ASCIITypeset image

2.3. Reionization-redshift Fields

The reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ quantifies the timing of reionization as a function of space and has been measured in radiation-hydrodynamic simulations (e.g., Trac et al. 2008; Battaglia et al. 2013b; Kaurov 2016; Doussot et al. 2019) and semi-analytical/numerical models (e.g., Barkana & Loeb 2004; Alvarez et al. 2009; Majumdar et al. 2014). Combined with the density, velocity, and temperature fields, it has been used to model the imprints of reionization on the CMB (e.g., Battaglia et al. 2013a; Natarajan et al. 2013), 21 cm (e.g., La Plante et al. 2014; La Plante & Ntampaka 2019), and the Lyα forest (e.g., D'Aloisio et al. 2015, 2019).

We construct the reionization-redshift field for a given RadHydro simulation while it is running. For each gas cell, we save the redshift at which the majority of the mass is ionized in a three-dimensional array. Previously in Battaglia et al. (2013b), we studied both 50% and 90% ionization thresholds and find no significant differences at the resolutions of interest. Statistically, the auto-correlations and cross correlations of the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields for these two cases are almost identical. For the vast majority of gas cells, once they start to ionize, they quickly become almost fully ionized, but not entirely because of recombinations. Thus, we adopt a nominal 50% ionization value for the RadHydro simulations in SCORCH II (Doussot et al. 2019) and throughout this paper.

Figure 1 is a visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from the three RadHydro simulations. They show remarkable similarities even with episodic star formation ${\dot{\rho }}_{\star }({\boldsymbol{x}},z)$ and varying radiation escape fractions fesc(z). The ${z}_{\mathrm{re}}({\boldsymbol{x}})$ field is highly correlated with the matter density field ρ( x ) on large scales since the higher-density regions near sources are generally reionized earlier than the lower-density regions far from sources (e.g., Barkana & Loeb 2004; Trac et al. 2008). In Battaglia et al. (2013a), we find that these two fields are correlated down to Mpc scales and can be related using a first-order, scale-dependent bias in Fourier space.

Figure 1.

Figure 1. Visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from three RadHydro simulations (SCORCH II; Doussot et al. 2019). Each image is 512 × 512 pixels from a slice that is 50 × 50 h−1Mpc with a thickness of ∼100 h−1kpc. The three simulations show remarkable similarities even with episodic star formation and varying radiation escape fractions. On large scales, higher-density regions near sources are generally reionized earlier than lower-density regions far from sources.

Standard image High-resolution image

2.3.1. Cross Correlations

The reionization-redshift fields are quantitatively very similar when we account for the different reionization histories. To show this, we take the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields from Sims 0 and 2 and individually scale the redshift values for each grid cell without changing their spatial rank order such that they have the same ionization fraction ${\bar{x}}_{\mathrm{0,2}}(z)={\bar{x}}_{1}(z)$ as Sim 1. We are matching the abundance of ionized mass or volume at all redshifts, which we call abundance matching the reionization history (Section 3.4.2 for more details). Alvarez & Abel (2012) use a similar technique to rescale the reionization-redshift field in their semi-numerical method. Here we consider both mass-weighted ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ and volume-weighted ${\bar{x}}_{{\rm{i}},{\rm{V}}}(z)$ ionization fractions.

To compare the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields cell by cell, we define the fractional difference field,

Equation (3)

for Sim 0 or 2 relative to Sim 1 and then calculate the probability distribution function (PDF) of epsilonz. Figure 2 shows the distribution of fractional differences for the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields at the RT resolution lrt = 98 h−1kpc, which is approximately ten times smaller than the typical mesh cell size used in semi-numerical models. The difference distributions for the original Sims 0 and 2 are skewed because the reionization histories differ from that of Sim 1. After abundance matching, the distributions are approximately Gaussian with zero mean and small root-mean-square differences ${\sigma }_{{\rm{z}}}=\langle {\left({\epsilon }_{{\rm{z}}}({\boldsymbol{x}})-{\bar{\epsilon }}_{{\rm{z}}}\right)}^{2}{\rangle }^{1/2}\lesssim 0.01$. We find very similar results when matching either the mass-weighted or the volume-weighted ionization fractions.

Figure 2.

Figure 2. Distribution of fractional differences epsilonz in the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ for the RadHydro Sims 0 and 2 relative to Sim 1. Prior to abundance matching (solid), the distributions are skewed because of the different reionization histories. After abundance matching to have the same mass-weighted (M) or volume-weighted (V) ionization fractions, the distributions are approximately Gaussian with zero mean and small widths.

Standard image High-resolution image

To correlate the fields, we define the normalized redshift field,

Equation (4)

where ${\bar{z}}_{\mathrm{re}}$ is the volume-weighted average value, and Fourier transform to get δz( k ) for all three simulations. Generally, the two-point power spectrum and its dimensionless version are defined as

Equation (5)

Equation (6)

where i = j for auto-correlations and ij for cross correlations. For comparing two different fields, we quantify with the bias and cross correlation,

Equation (7)

Equation (8)

The field δi is said to be biased, unbiased, or underbiased relative to δj for bij > 1, = 1, and < 1, respectively. Similarly, the fields and their fluctuations are said to be correlated, uncorrelated, and anticorrelated for rij > 0, = 0, and <0, with perfect correlation or anticorrelation given by the limits rij = 1 and = −1, respectively.

Figure 3 shows the auto-power spectra for the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields prior to abundance matching. On larger scales, the dimensionless spectra similarly peak at k ≈ 0.3–0.4 h Mpc−1. As this is only a factor of 3 from the RadHydro simulation box limit ${k}_{\min }=2\pi /L$, the peak scale may possibly be underestimated. Nonetheless, this suggests that reionization simulations and semi-numerical models must have box lengths >20 h−1Mpc in order to capture the relevant large-scale fluctuations and correlations in the reionization-redshift field. The spectra similarly decline in power-law form on intermediate scales until k ≈ 10 h Mpc−1, followed by steeper declines on smaller scales down to the RT Nyquist frequency.

Figure 3.

Figure 3. Top: auto-power spectra of the RadHydro reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ prior to abundance matching. All of the spectra have a characteristic peak on larger scales and a power-law break on smaller scales. Center: the redshift bias bzz is nonunity when cross-correlating the original ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields, but is much closer to unity after abundance matching to have the same mass-weighted (M) or volume-weighted (V) ionization fractions. Bottom: the cross correlation rzz shows that the original fields are already very highly correlated on scales k ≲ 10 h Mpc−1, and abundance matching still preserves this correlation.

Standard image High-resolution image

We also cross-correlate Sims 0 and 2 relative to Sim 1 and plot the bias bzz and normalized cross correlation rzz in Figure 3. Using the normalized fields δz( x ) partially adjusts for the different redshift midpoints, but the shorter and longer durations result in lower and higher bias for the original Sims 0 and 2, respectively. After abundance matching, the bzz(k) curves are unity at the box scale and deviate by only ≲10% at the RT-resolution scale. The cross correlations show that the original fields are already very highly correlated, allowing abundance matching to be applied correctly. The rzz(k) curves are exactly unity at the largest scale and remain within a few percent on scales k ≲ 10 h Mpc−1.

All of these results lead to new ideas for AMBER: there is a spatial order to the reionization process, and abundance matching can be applied to a correlated field to accurately predict the reionization-redshift field.

3. AMBER

The AMBER code provides a novel semi-numerical method for modeling the cosmic dawn on large scales. The new algorithm is not based on the ESF for directly predicting the ionization fraction field (e.g., Furlanetto et al. 2004), but takes the novel approach of calculating the reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ assuming that the hydrogen gas encountering higher radiation intensity are photoionized earlier. Redshift values are assigned while matching the abundance of ionized mass according to a given mass-weighted ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$. The code has the unique advantage of allowing users to directly specify the reionization history through input parameters, a useful feature for theoretical and inference studies. In this section, we describe the major model components: reionization history (Section 3.1), Lagrangian perturbation theory (LPT) for the large-scale structure (Section 3.2), ESF for the halo mass density field (Section 3.3), and abundance matching for the reionization-redshift field (Section 3.4).

3.1. Reionization History

The reionization history ${\bar{x}}_{{\rm{i}}}(z)$ directly affects EoR observables. For example, the integrated Thomson optical depth and the evolution of the global 21 cm brightness temperature depend linearly on the ionized electron fraction ${\bar{x}}_{{\rm{e}}}(z)$ and neutral hydrogen fraction ${\bar{x}}_{\mathrm{HI}}(z)$, respectively. In Trac (2018), we show that the reionization histories of the RadHydro simulations can be effectively and accurately described by the redshift midpoint, duration, and asymmetry parameters. We will here generalize and improve the approach for AMBER.

The reionization history is quantified by the average ionized-hydrogen fraction, which can be mass-weighted or volume-weighted. We work with the mass-weighted version ${\bar{x}}_{{\rm{i}},{\rm{M}}}$ as the volume-averaged ionized-hydrogen number density is given by

Equation (9)

A common misconception is that the volume-averaged ionized density is instead proportional to the volume-weighted ionization fraction. See Chen et al. (2020) for clarification. We will typically work with mass-weighted fractions and volume-averaged densities and drop the subscripts to simplify the notation.

3.1.1. Midpoint, Duration, and Asymmetry

The redshift midpoint zmid, duration Δz, and asymmetry Az are input parameters, which in turn give three ionization points. For the early, middle, and late stages of reionization, let the redshifts zear > zmid > zlat correspond to the ionization fractions xear < xmid < xlat. We take zmid as the redshift midpoint and define the duration,

Equation (10)

and asymmetry,

Equation (11)

The other two redshifts are then given by

Equation (12)

Equation (13)

Instantaneous reionization models would correspond to Δz = 0, but reionization is an extended process with finite Δz > 0. Symmetric reionization histories would correspond to Az = 1, but reionization simulations typically find that the early stage of reionization takes longer than the later stage such that Az > 1.

Table 1 lists the midpoint, duration, and asymmetry parameters for the three RadHydro simulations. In Doussot et al. (2019), we take ${\bar{x}}_{\mathrm{mid}}=0.50$ and present two practical choices for the other ionization points. In the first case, we choose quartile ionization fractions $({\bar{x}}_{\mathrm{ear}},{\bar{x}}_{\mathrm{lat}})=(0.25,0.75)$, and Δz is analogous to a full width half max. In the second case, we take $({\bar{x}}_{\mathrm{ear}},{\bar{x}}_{\mathrm{lat}})=(0.05,0.95)$, and Δz more effectively quantifies the whole EoR. We will adopt the latter as the default convention throughout this paper. AMBER users can specify other values, but extreme choices (e.g., ${\bar{x}}_{\mathrm{ear}}\lesssim 0.01$, ${\bar{x}}_{\mathrm{lat}}\gtrsim 0.99$) are not recommended because the start and end of the EoR are not well defined and are difficult to determine precisely.

3.1.2. Analytical Interpolating Function

Once the midpoint, duration, and asymmetry parameters are specified and used to calculate three ionization points, we use an analytical interpolating function to calculate the reionization history. In Trac (2018), we use Lagrange interpolating functions to construct an analytical function for the ionization fraction xi(z) that exactly fits the ionization points. For three points, the interpolating polynomial is a quadratic, which has the advantage of being invertible but has the disadvantage of being nonmonotonic. A log transformation of the variables improves monotonicity over the relevant redshift range, but it is not an ideal solution.

In AMBER, we interpolate the three ionization points with a modified Weibull function (Weibull 1951),

Equation (14)

where the coefficients aw, bw, cw are all positive values. Note that aw corresponds to zend when xend = 1 and the max function ensures full ionization at lower redshifts. In Appendix A, we show that the coefficients can be easily determined by first solving a nonlinear equation for cw and then substituting its value into algebraic equations for the other two coefficients. We find that solutions exist for the asymmetry range Az ≲ 15, which is more than sufficient for parameter-space studies. We also experimented with a generalized logistic function (Richards 1959), but find that it is limited to Az ≲ 5. The Weibull function can be analytically inverted,

Equation (15)

which is a useful feature for our abundance matching technique.

Table 1 lists the Weibull coefficients for the three RadHydro simulations. The offset aw determines when the function reaches unity, and it decreases when reionization ends later. The divisor bw controls the stretch in redshift, and it increases with the duration Δz. The power cw controls the steepness of the function, and it is inversely related to the asymmetry Az. In general, changing one of the reionization shape parameters while holding the other two fixed affects all three Weibull coefficients.

Figure 4 compares the evolution of the mass-weighted ionization fraction xi(z) from the RadHydro simulations with the Weibull functions. The analytical parameterizations are excellent matches to the simulated reionization histories. The typical differences are only $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.01$, while the maximum differences of $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.02$ are found near the end of the EoR, which is not accurately captured in reionization simulations and semi-analytical models. We find that the Weibull function gives slightly better fits than Lagrange interpolation and is valid for a larger range of parameter space.

Figure 4.

Figure 4. Top: the evolution of the mass-weighted ionization fraction with redshift from the RadHydro simulations (solid) and the Weibull functions (dashed). The analytical parameterizations accurately capture the redshift-asymmetric form of the simulated reionization histories. Bottom: the typical differences in ionization fractions are $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.01$, while the maximum differences of $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.02$ are typically found near the end of the EoR, which are uncertain in the simulations.

Standard image High-resolution image

3.2. Lagrangian Perturbation Theory

LPT is used to generate initial conditions for cosmological simulations and semi-analytical methods. LPT is also used in semi-numerical models of reionization to efficiently model the evolved matter distribution in the moderately nonlinear regime, because particles can be simply advanced to any given redshift without having to iteratively perform expensive force calculations and time integration like in N-body and hydro simulations. Furthermore, on moderately nonlinear scales, the dark matter and gas distributions are highly correlated and assumed to exactly trace each other. We will here test these assumptions and techniques for evolving the large-scale structure and constructing the density and velocity fields in AMBER.

In first-order LPT, otherwise known as the Zel'dovich Approximation (Zel'Dovich 1970), each particle moves in a straight line with displacement and velocity,

Equation (16)

Equation (17)

where q is the Lagrangian coordinate, ϕ( q ) is the Lagrangian potential, D(z) is the density linear growth factor, f(z) is the velocity linear growth factor, and H(z) is the Hubble function. In second-order LPT (e.g., Bouchet et al. 1995; Scoccimarro 1998), each particle moves in a slightly curved trajectory with displacement and velocity,

Equation (18)

Equation (19)

where ϕi ( q ), Di (z), and fi (z) are the usual terms calculated at the ith-order. The spatial information is contained in the Lagrangian potentials, which only have to be computed once, while the time evolution is governed by the growth factors, which are more straightforward to calculate.

For modeling the EoR, we find that 2LPT is more accurate than 1LPT at length scales ≲1 Mpc and redshifts z ≳ 5. Third-order LPT (e.g., Bouchet et al. 1995; Scoccimarro 1998) and augmented LPT (e.g., Kitaura & Hess 2013; Neyrinck 2016) have been shown to offer only small improvements at lower redshifts far after the end of the EoR. Hence, we will generally use second-order theory and will simply refer to it as LPT throughout the remainder of this paper.

3.2.1. Density Fields

Density fields are constructed from the LPT particles using techniques from particle-mesh (PM) methods. In N-body simulations, particles are added to a Cartesian mesh to create a density field for solving Poisson's equation with Fast Fourier Transforms (FFT). In large-scale structure analyses, galaxies or halos are added to a grid to create a density field for calculating the power spectrum. Similar techniques are also used in other semi-numerical models of reionization, but we will describe an improved implementation for AMBER.

There is a hierarchy of particle assignment schemes, and the first three are called the nearest grid point (NGP), cloud-in-cell (CIC), and triangular-shaped cells (TSC). The mass assignment process interpolates the discrete distribution of particles and can be interpreted as convolving and sampling the underlying density field ρ( x ) to produce a discretized density field,

Equation (20)

where Wpm( x ) is the PM assignment window function. See Hockney & Eastwood (1988) for a seminal review.

The particle assignments introduce effects such as aliasing, smoothing, and shot noise that have to be accounted for (e.g., Jing 2005). While the latter two effects can have analytical corrections, the first is especially problematic. Hockney & Eastwood (1988) have suggested that interlacing two or more staggered meshes can reduce aliasing. Interlacing has been used to improve force calculations in N-body simulations (Couchman 1991), produce more accurate overdensity fields for power spectra estimation (Sefusatti et al. 2016), and recently implemented in the nbodykit package (Hand et al. 2018).

We use the interlacing technique to produce more accurate and robust density fields in AMBER. We construct the first density field ρ1( x ) as normal; while for the second density mesh ρ2( x ), the particle positions are shifted by half of a grid cell spacing in each direction by adding the vector Δ x = (lc/2, lc/2, lc/2), which is equivalent to shifting the mesh by the opposite vector. In Fourier space, the two transformed fields are combined into a single, effective field as

Equation (21)

The twiddle factor ei k ·Δ x multiplied to the second transformed field accounts for the shifted particle positions. We also deconvolve the effects of smoothing by dividing with the Fourier transform of the assignment window function,

Equation (22)

where p = 1, 2, and 3 for the NGP, CIC, and TSC schemes, respectively.

For comparison, we construct the AMBER and RadHydro density fields at two resolutions, using a 643 mesh with cell size lc = 0.8 h−1Mpc and a 5123 mesh with cell size lc = 0.1 h−1Mpc. The fiducial lower resolution is more typical of semi-numerical models, while the higher resolution allows us to test the limits of LPT. The AMBER density fields are efficiently made using equal numbers of LPT particles as mesh cells. The RadHydro density fields are made by simply binning the 20483 dark matter particles and the 20483 gas cells down to the appropriate size meshes.

Figure 5 is a visualization of the matter density fields ρ( x ) at redshift z = 8 from RadHydro and AMBER. The LPT particles are assigned using the TSC scheme with interlacing and deconvolution. Both images are shown with the higher-resolution pixels in order to see the large-scale structure more clearly, but have been averaged along the line of sight at the lower resolution. The RadHydro matter and gas (not shown) density fields are visually indistinguishable, while the AMBER density field also has close resemblance, but minor smoothing is visible in the small-scale, high-density regions. At the lower resolution, the AMBER and RadHydro fields are indistinguishable, though the images appear quite pixelated.

Figure 5.

Figure 5. Visualization of the matter density fields ρ( x )/〈ρ〉 at redshift z = 8 from RadHydro (left) and AMBER (right). Each image is 512 × 512 pixels from a slice that is 50 × 50 h−1Mpc with a thickness of ∼1 h−1Mpc. The RadHydro matter and gas (not shown) distributions are visually indistinguishable on the scales of interest. The AMBER density field, made by assigning the LPT particles using the TSC scheme with interlacing and deconvolution, also closely resembles the RadHydro results.

Standard image High-resolution image

Figure 6 compares the one-point PDF of the relative density ρ/〈ρ〉 = 1 + δ at redshift z = 8. The RadHydro matter and gas distributions are in very good agreement even at the higher resolution. For AMBER, the TSC assignment scheme with interlacing and deconvolution gives the best agreement, and we only show this case to avoid overcrowding the plot. The results are also in very good agreement at the fiducial resolution, but not at the higher resolution. LPT calculated at the second order or an even higher order cannot capture the nonlinear shell-crossing on smaller scales and therefore underresolves high-density, collapsed regions. The disagreement in underdense regions is not concerning, because it is due to the much smaller number of LPT particles and the different assignment and deconvolution process for AMBER. We would find the same effects for RadHydro if we use a lower-resolution simulation and had not done the simple binning.

Figure 6.

Figure 6. One-point probability distribution functions of the density fields at redshift z = 8 from RadHydro and AMBER. The RadHydro matter and gas distributions are identical at the fiducial lower resolution and in very good agreement at the higher resolution. For AMBER, using LPT with the TSC scheme also gives very good agreement at the fiducial lower resolution. The differences in the high-density tails of the PDFs are due to reduced gravitational collapse and imperfect deconvolution for LPT.

Standard image High-resolution image

Figure 7 shows the density auto-power spectra ${{\rm{\Delta }}}_{\mathrm{dd}}^{2}(k)$ at redshift z = 8 and cross correlations relative to the RadHydro matter overdensity δm. All of the dimensionless power spectra continue to increase in amplitude toward smaller scales, typical for cold dark-matter-dominated models. For the fiducial lower resolution, the power reaches a maximum value of only ${{\rm{\Delta }}}_{\mathrm{dd}}^{2}\approx 0.2$. As this is still in the quasi-linear regime, 2LPT is an accurate and efficient approach for modeling the mass distribution and density field.

Figure 7.

Figure 7. Top: auto-power spectra of the density fields at redshift z = 8 from RadHydro and AMBER. Center: for RadHydro, the density bias bdd of the gas relative to the matter is unity for k ≲ 10 h Mpc−1, and the suppression on smaller scales is due to Jeans smoothing from the gas pressure. For AMBER, the TSC assignment scheme with interlacing and deconvolution gives more correct power than the NGP and CIC schemes, which tend to overshoot near the Nyquist frequency because of aliasing. Bottom: the cross correlation rdd of the gas relative to the matter is unity nearly down to the smallest scale probed. Assigning LPT particles with the TSC scheme significantly improves the cross correlations.

Standard image High-resolution image

The RadHydro matter and gas perfectly trace each other with bias bdd = 1 and cross correlation rdd = 1 for k ≲ 10 h Mpc−1. The suppression in the gas bias on smaller scales is due to the Jeans smoothing from the gas pressure that opposes gravitational collapse. For AMBER, the LPT bias starts out at unity on the largest scales, is still within a few percent for k ≳ 1 h Mpc−1, but drops to ∼0.9 by k ∼ 10 h Mpc−1. However, the normalized cross correlation is still closer to unity down to smaller scales, showing that the density perturbations are highly in-phased even though the amplitudes may differ. We find that using the TSC assignment scheme with interlacing and deconvolution gives more accurate auto-power and cross-power spectra. The NGP and CIC results tend to be overbiased and undercorrelated near the Nyquist frequency kNy = πx because of aliasing. Our results are similar to the previous findings by Sefusatti et al. (2016), who present a thorough analysis of how interlacing can reduce aliasing and improve power spectrum estimation.

3.2.2. Velocity Fields

Velocity fields are constructed similarly to the density fields by assigning the LPT particles to interlaced meshes. For a given mesh cell, the velocity v is a weighted average of the individual velocities v i from all overlapping particles,

Equation (23)

where wi and wtot are the individual and total mass assignment weights, respectively. With the NGP assignment scheme, we find that some cells can have no overlapping particles and therefore undefined velocities. Both the CIC and TSC schemes work better in practice with no empty cells and missing velocities for our moderately clustered distribution of LPT particles.

Figure 8 is a visualization of the line-of-sight component of the velocity fields v( x ) at redshift z = 8 from RadHydro and AMBER. The velocity images are constructed using the same approach as for the density images (Figure 5). The shown images with the higher-resolution pixels are visually similar and have an even closer resemblance with lower-resolution pixels. The RadHydro and AMBER velocity fields show large coherence lengths that are actually underestimated because of the moderate simulation box length L = 50 h−1Mpc. We will discuss this interesting feature in more detail below.

Figure 8.

Figure 8. Visualization of the line-of-sight component of the velocity fields v ( x ) at redshift z = 8 from RadHydro (left) and AMBER (right). Each image is 512 × 512 pixels from a slice that is 50 × 50 h−1Mpc with a thickness of ∼1 h−1Mpc. The RadHydro and AMBER velocity fields both show large coherence lengths that are actually underestimated by the moderate simulation box length.

Standard image High-resolution image

Figure 9 compares the one-point PDF of the velocity components v ⊂ (vx, vy, vz), at redshift z = 8. The RadHydro matter and gas velocity distributions are in excellent agreement, even more so than their density distributions. All of the distributions are Gaussian, and the widths decrease slightly by a few percent for the fiducial lower resolution. For AMBER, the CIC and TSC assignment schemes with interlacing and deconvolution give similarly accurate results, and we only show the latter to avoid overcrowding the plot. The differences in the tails of the PDFs are due to the RadHydro simulations having slightly faster velocities because of virialization in rare, collapsed regions.

Figure 9.

Figure 9. One-point probability distribution functions of the velocity components at redshift z = 8 from RadHydro and AMBER. The RadHydro matter and gas distributions are nearly identical Gaussians, while the AMBER LPT results are also in very good agreement at both resolutions. The differences in the tails of the PDFs are due to nonpresent virialization and imperfect deconvolution for LPT.

Standard image High-resolution image

Figure 10 shows the velocity auto-power spectra Δvv(k) and the cross correlations relative to the RadHydro matter velocity vm. We individually compare and average over each velocity component v ⊂ (vx, vy, vz). The velocity power spectrum declines in amplitude toward smaller scales, unlike the density power spectrum (Figure 7). In linear theory, the transformed velocity field is related to the matter overdensity field as v ( k ) ∝ ( k /k2)δ( k ), and therefore the power is predominantly coming from larger scales. This explains the large coherence length seen in the velocity images (Figure 8). While radiation-hydrodynamic simulations are still computationally too expensive, semi-numerical methods are sufficiently efficient to afford large box sizes of ≳ 500 h−1Mpc, which is necessary to capture the large-scale power and accurately model the velocity fields.

Figure 10.

Figure 10. Top: auto-power spectra of the velocity fields at redshift z = 8 from RadHydro and AMBER. Center: for RadHydro, the velocity bias bvv of the gas relative to the matter is unity for k ≲ 20 h Mpc−1. For AMBER, the CIC and TSC assignment schemes with interlacing and deconvolution give similarly accurate bias, but the NGP scheme can produce undefined velocities that lead to incorrect bias. Bottom: the cross correlation rvv of the gas relative to the matter is unity nearly down to the smallest scale probed. Assigning LPT particles with the TSC scheme significantly improves the cross correlations.

Standard image High-resolution image

The RadHydro matter and gas velocities perfectly trace each other with bias bvv = 1 and cross correlation rvv = 1 for k ≲ 20 h Mpc−1, even to smaller scales compared to their densities (Figure 7). For AMBER, the NGP assignment scheme gives incorrect bias and poor stochasticity when we simply set the velocities to zero where ever it is undefined. The CIC and TSC assignment schemes have similarly accurate bias, but the latter has preferably better cross correlation. For the velocity bias, the lower-resolution results diverge from the higher-resolution ones at larger scales compared to the density bias. The velocity field is a weighted average of the particle velocities (Equation (23)), and a simple deconvolution using the mass assignment window function (Equation (22)) is only approximately correct.

In AMBER, we adopt the TSC assignment scheme with interlacing and deconvolution as the default method for constructing density and velocity fields from the LPT particles. This approach reduces aliasing and smoothing, and produces the most accurate one-point distributions and two-point correlations. Accurate density and velocity fields are essential components for modeling EoR observables. The 21 cm signal and Lyα forest both depend on the neutral hydrogen density nHI( x ) and line-of-sight velocity field vlos( x ). For the CMB, the patchy Thomson optical depth and patchy KSZ effect depend on the electron number density ne( x ) and line-of-sight momentum ne( x )vlos( x ), respectively.

3.3. Excursion Set Formalism

The ESF (Bond et al. 1991) is used to model the collapsed mass fraction and the halo mass function in the extended Press-Schecter theory (EPS). ESF has the advantage of being orders of magnitude faster than an N-body simulation and halo finding, but it is known to only produce approximately correct results. In the majority of semi-numerical methods for reionization, it is used to model both the collapsed mass and ionization fraction field (e.g., Furlanetto et al. 2004). For AMBER, we use ESF only for modeling the halo mass density field.

We will first summarize ESF and then test its accuracy for modeling the halo mass density field in AMBER. The linearly extrapolated overdensity field δ0 is filtered to produce

Equation (24)

Equation (25)

where the window function Wf is usually taken to be a sharp k-space filter,

Equation (26)

Equation (27)

or a spherical tophat filter,

Equation (28)

Equation (29)

The convolution is rapidly computed in Fourier space after transforming with highly optimized FFTs.

In a large-scale region with filtered overdensity δf, the collapsed fraction of matter in the halos above a minimum mass Mmin is given by the EPS result (Lacey & Cole 1993),

Equation (30)

The variance of the linear density fluctuations, smoothed on mass scale Ms, is calculated as

Equation (31)

where ${{\rm{\Delta }}}_{\mathrm{lin}}^{2}(k)$ is the dimensionless linear power spectrum, and Wth(k) is the spherical tophat filter with comoving radius ${R}_{{\rm{s}}}={\left[{M}_{{\rm{s}}}/(4\pi {\bar{\rho }}_{0}/3)\right]}^{1/3}$. The corresponding collapsed overdensity barrier is given by

Equation (32)

where δcrit ≈ 1.686 is the spherical collapse value, and D(z) is the linear growth factor. Note that the redshift dependence only explicitly shows up in Equation (32), while the other functions and fields have already been linearly extrapolated to z = 0.

In the original version of ESF (e.g., Bond et al. 1991; Lacey & Cole 1993), the filtering is done in Lagrangian ( q ) space on the initial overdensity field. In an alternative version that is used in some semi-numerical models of reionization (e.g., Mesinger et al. 2011; Zahn et al. 2011), the filtering is done in Eulerian ( x ) space on the evolved density field. We will refer to these versions as ESF-L and ESF-E, respectively.

When the filter window function Wf is chosen to be a spherical tophat function, it is natural to set the filter radius Rth to be equal to the comoving radius Rs in the variance. For the sharp k-space function, it is unclear how to choose the filter radius Rsk. Following Lacey & Cole (1993), we will simply choose ${R}_{\mathrm{sk}}={\left[{M}_{{\rm{s}}}/(6\pi {\bar{\rho }}_{0})\right]}^{1/3}={\left[2/(9\pi )\right]}^{1/3}{R}_{{\rm{s}}}$.

For ellipsoidal collapse, Sheth et al. (2001) propose a moving barrier δc(σ, z) with three coefficients that can be changed in order to match the halo mass functions from N-body simulations. Their moving barrier is used for determining when halos collapse following the peak-patch approach of Bond & Myers (1996). However, it is not self-consistent to simply use their δc(σ, z) in Equation (30) to calculate the collapsed fraction. Furthermore, the proposed functional form for δc(σ, z) may not be valid for all halo mass functions (e.g., Robertson et al. 2009). The Sheth & Tormen (1999) halo mass function corresponding to this moving barrier does not agree with the more recent halo mass functions for the EoR in SCORCH I (Trac et al. 2015). Therefore, we will use the spherical overdensity δc(z) rather than the ellipsoidal collapse barrier.

3.3.1. Halo Mass Density Fields

We construct halo mass density fields with ESF and compare them against results from a high-resolution N-body simulation. For reference, we use the halo catalog from SCORCH II (Section 2.1) and select all halos above a minimum mass ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$, which corresponds to the atomic-cooling limit for galaxy formation at redshift z ≈ 8. To assign the halo mass to interlaced meshes, we choose the CIC scheme as it gives both reduced aliasing and strong correlation with the matter density field. The NGP scheme has the most aliasing, while the TSC scheme has the most stochasticity as it spreads the small halos over too many larger grid cells.

For ESF-L, we calculate the collapsed fraction and mass in Lagrangian space,

Equation (33)

and for LPT particles of mass mp, move the particles to their comoving positions x (Equation (18)), and then assign the collapsed mass to the interlaced meshes (Equation (21)) to obtain the halo mass density field ρh( x ). For ESF-E, we calculate the collapsed fraction field in Eulerian space, and then the halo mass density field is calculated as

Equation (34)

where ${\bar{\rho }}_{{\rm{m}}}$ is the average matter density, and the growth factor D(z) reverses the linear extrapolation done earlier.

ESF requires that the mass smoothing scale Ms be greater than the minimum halo mass Mmin in Equation (30). The smoothing scale also cannot be lower than the particle mass resolution. For the fiducial lower resolution with 643 particles on a 643 mesh in the 50 h−1Mpc box, the particle mass is mp = 4.0 × 1010 h−1 M, and the cubical cell size is lc = 0.8 h−1Mpc. Therefore, we will choose and vary the smoothing scales such that Msmp and ${R}_{{\rm{s}}}={\left[{M}_{{\rm{s}}}/(4\pi {\bar{\rho }}_{0}/3)\right]}^{1/3}\gtrsim 0.5\ {h}^{-1}\mathrm{Mpc}$.

Figure 11 is a visualization of the halo mass density fields ρh( x )/〈ρh〉 for ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ at z = 8. Here, we normalize each density field using the same mean 〈ρh〉 taken from the N-body result to preserve the relative differences in ρh( x ). We also use the fiducial smoothing scale Ms = mp. The halo images appear more pixelated than the corresponding density and velocity images (Figures 5, 8) because of the lower resolution shown here. The ESF-L images with the sharp k-space and tophat filters are remarkably similar to the N-body result, but the ESF-E images are noticeably different. ESF-E produces relatively higher halo densities in prominent collapsed regions and lower values elsewhere, and this larger range of density contrast can be traced back to using the evolved density field rather than the initial density field for the filtering process.

Figure 11.

Figure 11. Visualization of the halo mass density fields ρh( x )/〈ρh〉 for ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ at z = 8 from an N-body simulation and with the Lagrangian (L) and Eulerian (E) implementations of the excursion set formalism (ESF). For ESF-L and ESF-E, we study both sharp k-space (sk) and tophat (th) filters. Each image is 64 × 64 pixels from a slice that is 50 × 50 h−1Mpc with a thickness of ∼1 h−1Mpc. The ESF-L images appear very similar to the N-body result, but the ESF-E images show higher halo mass densities in collapsed regions.

Standard image High-resolution image

Figure 12 shows the halo auto-power spectra ${{\rm{\Delta }}}_{\mathrm{hh}}^{2}(k)$ and cross correlations. The halo power spectrum is approximately power law with slope $n=d\mathrm{ln}P/d\mathrm{ln}k\approx -1.5$ at z = 8. In comparison with the matter power spectrum (Figure 7), we find the scale-dependent halo-matter bias bhm(k) with a large-scale value of ≈3.7. Similarly, the halo-matter cross correlation rhm(k) is close to unity on the largest scales, but drops below 0.9 for k ≳ 1 h Mpc−1. Therefore, we cannot accurately model the halo mass density field assuming linear bias, even if we allow for scale dependence.

Figure 12.

Figure 12. Top: auto-power spectra of the halo mass density fields for ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ at z = 8 from an N-body simulation and from ESF-L and ESF-E with sharp k-space (sk) and tophat (th) filters. Center: relative to the N-body halos, ESF-L has better halo bias bhh that is closer to unity than ESF-E. Bottom: ESF-L also has better halo cross correlation rhh that is closer to unity than ESF-E.

Standard image High-resolution image

In cross correlation with the N-body halo mass density field, the ESF-L with the sharp k-space filter gives the best agreement, closely followed by with the tophat filter. Both the bias bhh(k) and cross correlation rhh(k) differ from unity by ≲0.05 down to the smallest scale shown. The strong agreement is remarkable given that we have adopted nominal choices for the filter radius Rsk and collapse barrier δc without any fine-tuning. ESF-E with either the sharp k-space filter or the tophat filter gives poorer agreement with the N-body results. The larger amplitude in the power spectrum and bias is consistent with the larger range in the density contrast seen in the halo density images (Figure 11).

We now vary the minimum halo mass Mmin and smoothing mass Ms and quantify the mass-weighted average collapse fraction 〈fcoll〉 in Figure 13. As Mmin is increased above the atomic-cooling limit while fixing Ms = 4 × 1010 h−1 M and Rs = 0.5 h−1Mpc, we again find that ESF-L with the sharp k-space filter gives the best agreement, followed by with the tophat filter. However, ESF-E overpredicts the average collapse fraction by a factor of ≳3, with increasingly larger differences toward higher Mmin. Note that we should expect small differences in the average collapse fraction. In the N-body halo finder (Trac et al. 2015), the halo mass M200 is defined such that the average halo density ${\bar{\rho }}_{{\rm{h}}}$ is 200 times the average matter density ${\bar{\rho }}_{{\rm{m}}}(z)$, which is also equal the critical density ${\bar{\rho }}_{\mathrm{crit}}(z)$ at high redshifts. In ESF, the halo mass is not clearly defined; although in spherical collapse theory, the average halo density is usually 18π2 ≈ 180 times the critical density at high redshifts or in Einstein-de Sitter cosmologies.

Figure 13.

Figure 13. Left: the mass-weighted average collapse fraction of matter in halos above a minimum mass Mmin at fixed ESF smoothing radius Rs = 0.5 h−1Mpc. ESF-L with the sharp k-space filter gives the best agreement with the N-body halo results, while ESF-E significantly overpredicts the average collapse fraction. Right: the variation in the average collapse fraction with Rs at fixed ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$. The ESF-L values only vary by ≲10%, while the ESF-E results are highly sensitive to the choice of smoothing and filtering scales.

Standard image High-resolution image

As Ms and Rs are increased while fixing ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$, we find that ESF-L gives reliably accurate and stable values for the average collapsed fraction that are within 10% of the N-body halo value. However, ESF-E gives highly variable results that are only in agreement at large smoothing radius Rs ≳ 10 h−1Mpc. Similarly, we also find that the power and bias change as the smoothing is varied, which is not found for ESF-L.

The sensitivity of ESF-E to the smoothing and filtering scales is especially troublesome for the semi-numerical models of reionization that are based on this approach. For a given ionized region, the filtering scale is adaptively varied until the number of ionizing photons equals the number of hydrogen atoms. For the entire modeled volume, there is a distribution of filtering scales rather than just one value, and as a consequence, there is no one value for the average collapse fraction and large-scale halo bias. Furthermore, as one changes the reionization parameters, the distribution of filtering scales will change, and so too will the halo mass density field and power spectrum, both of which should be independent of the reionization history.

In AMBER, we adopt the ESF-L with the sharp k-space filter as the default choice for modeling the halo mass density field. In the next section, we will discuss that our abundance matching technique does not depend on the overall average density 〈ρh〉 nor the overall normalization of the halo bias bhm. In future work, we can calibrate the EPS collapsed fraction relation (Equation (30)) using the N-body simulations and halo catalogs from SCORCH I and II. Accurate halo abundance and density fields are required to properly model high-redshift galaxies and quasars as radiation sources.

3.4. Abundance Matching

The new idea for AMBER is that there is a spatial order to the reionization process, and abundance matching can be applied to a correlated field to accurately predict the reionization-redshift field (Section 2.3) such that the reionization history follows a given mass-weighted ionization fraction (Section 3.1). We emphasize that the abundance matching technique does not depend on the overall normalization of the correlated field of interest. In Trac et al. (2008), we find that the higher-density regions near sources are generally reionized earlier than the lower-density regions far from sources in our RadHydro simulations. Following up in Battaglia et al. (2013b), we find that the density and reionization-redshift fields are correlated down to Mpc scales. While two natural choices for abundance matching are the matter density field (Section 3.2.1) and the halo mass density field (Section 3.3.1), the radiation field is more directly relevant to the photoionization of hydrogen.

3.4.1. Radiation Field

The radiation field is specified by the hydrogen photoionization rate,

Equation (35)

where Jν is the specific intensity, σHI is the hydrogen photoionization cross-section, and νHI is the Lyman limit frequency. In our RadHydro simulations, the RT is calculated using an accurate but expensive raytracing algorithm (Trac & Cen 2007). However, this is an unnecessarily costly approach for calculating the radiation field just for abundance matching, which does not depend on the overall normalization. In AMBER, we model the specific intensity field as

Equation (36)

where Sν is the source function, the 1/(4π r2) term is the inverse-square law for flux, and the ${e}^{-r/{l}_{\mathrm{mfp}}}$ term is the transmitted fraction over the radiation mean free path lmfp.

The source field is related to the specific luminosity density and can be expressed as

Equation (37)

where ρsfr is the star formation rate density, epsilonν is the radiative energy per unit star formation rate per frequency, and fesc is the radiation escape fraction (e.g., Madau et al. 1999; Robertson et al. 2010). The star formation rate density field is often modeled using the halo mass density field,

Equation (38)

where fstar and τstar are the star formation efficiency and timescale, respectively (e.g., Cen & Ostriker 1992; Trac & Cen 2007). The source field is proportional to the halo mass density field, while the normalization depends on the product of several astrophysical parameters and functions (fesc, fstar, τstar, epsilonν ), which may have redshift dependence that further alters the reionization history. In semi-numerical methods based on the ESF, this combination is often represented by a single efficiency parameter ζ, which is generally assumed to be constant for simplicity. With our abundance matching technique, we do not need to specify the astrophysical quantities to calculate the normalization of the the source field, but instead get to cast them and their effects in terms of the the redshift midpoint, duration, and asymmetry parameters that directly set the reionization history (Section 3.1).

In AMBER, we adopt the common assumption Sν ( x ) ∝ ρhalo( x ) and ignore the overall normalization for the source field as it is not necessary for abundance matching. The radiation field is computed quickly using FFTs to perform the convolution in Equation (36). In practice, we can abundance match the specific intensity field instead of the photoionization rate field, as the integration over frequency only changes the overall normalization for the latter.

We use an effective mean free path to account for the attenuation of the radiation field. Our use of a mean free path parameter is more physical than the maximum bubble size imposed in some semi-numerical models based on the ESF. It is more self-consistent with RT theory (e.g., Madau et al. 1999) and in better agreement with observational measurements (e.g., Worseck et al. 2014) extrapolated to higher redshifts for the EoR. It also helps to mitigate uncertainties on radiation sinks. Recently, Davies & Furlanetto (2021) have also introduced an ionizing photon mean free path as an improved alternative to the maximum filtering scale in ESF methods like 21cmFAST. Their implementation is different in that lmfp ultimately shows up in their scale-dependent ionization criterion, whereas we use it to directly compute an attenuated radiation field.

3.4.2. Reionization-redshift Fields

The reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ is assumed to be correlated with a given field, either the matter density field, the halo mass density field, or the radiation field. A region with higher matter density, halo density, or radiation intensity is considered to be photoionized earlier and has a higher reionization redshift. The abundance matching technique assigns redshift values such that the reionization history follows a given mass-weighted ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$, specified with the redshift midpoint, duration, and asymmetry parameters and interpolated with a Weibull function (Section 3.1).

We perform the abundance matching on a correlated field at a single redshift for computational efficiency, but it can also be done tomographically using multiple redshift intervals. The correlated field is first constructed at the redshift midpoint zmid, and the data array is ranked in descending order using a parallel quicksort. For the nth rank order cell out of a total of N, all cells with indices mn are considered ionized. The reionization redshift zn is calculated by equating the cumulative mass fraction with the mass-weighted ionization fraction:

Equation (39)

where the matter overdensity for the mth cell is linearly extrapolated from the redshift midpoint,

Equation (40)

The linear growth is appropriate for the modest overdensities in lower-resolution semi-numerical models, and because we have chosen the redshift midpoint as the pivot.

We construct the abundance matching models using the matter density field (AM-M), halo mass density field (AM-H), and radiation intensity fields (AM-R), and compare them with the RadHydro reionization-redshift fields (Section 2.3). For the AM-M models, we use the TSC particle assignment scheme with interlacing and deconvolution to construct the matter density field (Section 3.2.1). For the AM-H models, we use the ESF-L with sharp k − space filter for constructing the halo mass density field and set the minimum halo mass to ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ (Section 3.3.1). While this corresponds to the threshold for atomic-cooling halos, the RadHydro simulations are more complex because of the nonmonotonic star formation efficiency and episodic star formation (Trac et al. 2015; Doussot et al. 2019). For the AM-R models, we vary the radiation mean free path to find the best agreement between the reionization-redshift fields.

Figure 14 is a visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from the RadHydro simulations and abundance matching models. While all of the models have the same reionization history for a given simulation, their reionization-redshift fields are visibly different. The AM-R images most closely resemble the simulation results and correctly show that large-scale regions near radiation sources are generally reionized earlier than large-scale regions far from sources. However, the AM-M and AM-H images appear too grainy. The low-density regions just outside of collapsed regions are assigned low redshifts and considered to be reionized late despite their close proximity to the radiation sources. We can possibly improve these results by smoothing the matter density and halo density fields prior to abundance matching.

Figure 14.

Figure 14. Visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from the three RadHydro simulations (left) and corresponding abundance matching models using matter density (AM-M), halo mass density (AM-H), and radiation intensity (AM-R) fields. Each image is 64 × 64 pixels from a slice that is 50 × 50 h−1Mpc with a thickness of ∼1 h−1Mpc. Abundance matching using the radiation field more correctly captures that large-scale regions near sources are generally reionized earlier than large-scale regions far from sources, and is the adopted approach in AMBER.

Standard image High-resolution image

Figure 15 shows the reionization-redshift auto-power spectra ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ and cross correlations relative to the RadHydro normalized redshift δz (Equation (4)). We only show Sims 0 and 2 for clarity. The AM-R power spectra are most similar to the simulation results, rising in power until a characteristic peak scale and then declining in amplitude with k. However, the AM-M and AM-H power spectra generally always increase with k, similar to the mass density (Figure 7) and halo density power spectra (Figure 12). The AM-R models are the least biased with bzz ≈ 1 for k ≲ 1 h Mpc−1, while the AM-M and AM-H biases significantly deviate from unity. Furthermore, the AM-R models have improved cross correlation rzz that approach unity on large scales. The bias and stochasticity on small scales can be attributed to three main factors. In the RadHydro simulations, the episodic star formation and spatially varying mean free path are not accounted for. Furthermore, the simulations and models have differences in smoothing near the grid scale.

Figure 15.

Figure 15. Top: auto-power spectra of the reionization-redshift fields from the RadHydro simulations and corresponding abundance matching models using the matter density (AM-M), halo mass density (AM-H), and radiation intensity (AM-R) fields. The AM-R spectra have a characteristic peak similar to the simulations. Center: the redshift bias bzz between the models and simulations for AM-R are much closer to unity than for AM-M and AM-H. Bottom: the AM-R models have improved cross correlation rzz that approach unity on large scales compared to AM-M and AM-H.

Standard image High-resolution image

In AMBER, we choose to abundance match using the radiation field evaluated at the redshift midpoint. In future work, we can improve the spatial accuracy by incorporating varying mean free paths (e.g., Cain et al. 2021; Davies & Furlanetto 2021). We can also perform the abundance matching tomographically using multiple redshifts.

4. Methods Comparison

We compare AMBER against two other semi-numerical methods using RadHydro Sim 1 as reference. We continue to use the fiducial resolution with 643 mesh with cell size lc = 0.8 h−1Mpc, which is comparable to the typical resolution for the semi-numerical methods. In this section, we first summarize the methods (Sections 4.1 and 4.2), and then compare their reionization histories (Section 4.3) and reionization-redshift fields (Section 4.4).

4.1. 21cmFAST

The 21cmFAST code (Mesinger et al. 2011), like the majority of semi-numerical methods, is based on the ESF for reionization (e.g., Furlanetto et al. 2004). It generates the density, ionization, velocity, and spin temperature fields to compute the 21 cm brightness temperature given the astrophysical parameters that control the ionizing, Lyα, and X-ray backgrounds. This method has been used widely to model and study the EoR through the 21 cm signal (e.g Greig & Mesinger 2017) and CMB (e.g., Mesinger et al. 2012).

We use the original version with three free parameters: ionizing efficiency ζ, minimum halo mass Mmin, and maximum filter scale Rmax, because this parameterization is common among ESF codes and has been extensively tested and applied. Park et al. (2019) have developed a new version with eight free parameters, allowing additional parameterization of high-redshift galaxy properties. Davies & Furlanetto (2021) have proposed improving Rmax with two physically motivated prescriptions of the mean free path of ionizing photons. To our knowledge, the newer versions have not yet been tested against RT simulations, and searching the high-dimensional parameter space is beyond the scope of our basic comparison.

By default, 21cmFAST uses 64 times as many 1LPT particles as mesh cells for constructing density and velocity fields. For this comparison, 2563 particles are assigned to a 643 mesh using the NGP scheme, but there is significant smoothing near the grid scale because deconvolution is not performed. To achieve the same effective resolution, 21cmFAST would require at least 8 times as many mesh cells, 512 times as many particles, and over 500 times as much memory as AMBER. It also uses the alternative ESF-E version for the collapsed fraction (Section 3.3.1), but it does not explicitly compute the halo mass density field.

4.2. Reionization on Large Scales

The Reionization on Large Scales (RLS; Battaglia et al. 2013b) method is not based on the ESF, but is a parametric approach motivated by and calibrated with previous RadHydro simulations. It provides a parametric approach to map higher-resolution, smaller-volume radiation-hydrodynamic simulations onto lower-resolution, larger-volume N-body simulations. This method has been applied to model and study the patchy Thomson optical depth (Natarajan et al. 2013), patchy KSZ effect (Battaglia et al. 2013a), and the 21 cm lightcone effect (La Plante et al. 2014).

The reionization-redshift field is found to be highly correlated with the matter density field down to Mpc scales and related using a first-order, scale-dependent bias. In Fourier space, the matter overdensity field δ( k ) at the mean redshift $\bar{z}$ is first smoothed with a spherical tophat filter Wth(k) with radius Rth = 1 h−1Mpc and then multiplied by a bias function bzm(k) to produce the normalized redshift field,

Equation (41)

The bias function is parameterized as

Equation (42)

where the normalization is fixed at bRLS = 1/δc = 0.593 based on analytical models (Barkana & Loeb 2004), while the scale kRLS and slope aRLS are allowed to vary. Finally, the reionization-redshift field is given by

Equation (43)

where the mean redshift $\bar{z}$ is the volume-averaged value of the reionization-redshift field, and it is not exactly equal to the redshift midpoint zmid of either the mass-weighted or volume-weighted ionization fractions. While the RLS parameters can be varied to change the reionization history and process, there is no direct connection between them and the redshift midpoint, duration, asymmetry, minimum halo mass, and radiation mean free path.

4.3. Reionization History

Figure 16 compares the mass-weighted ionization fractions ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ and volume-weighted ionization fractions ${\bar{x}}_{{\rm{i}},{\rm{V}}}(z)$ from the models with the simulation. For AMBER, we directly input the redshift midpoint zmid = 7.85, duration Δz = 4.73, and asymmetry Az = 2.35 parameter values. These are calculated for RadHydro Sim 1 at the fiducial resolution using Equations (10) and (11) on ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$. While AMBER uses the mass-weighted parameters by construction, it almost exactly reproduces both the mass-weighted ionization fraction with $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.01$, and volume-weighted ionization fraction with $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.03$.

Figure 16.

Figure 16. Left: the evolution of the mass-weighted ionization fractions ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ and differences ${\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}$ for AMBER, 21cmFAST, and RLS models compared to RadHydro Sim 1. AMBER has almost exact agreement, 21cmFAST-b and RLS-b have similarly good agreement, while 21cmFAST-a and RLS-a have the worst agreement. Right: the volume-weighted ionization fractions ${\bar{x}}_{{\rm{i}},{\rm{V}}}(z)$ are lower than ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ at any given redshift, as higher-density regions near sources are generally reionized earlier than lower-density regions on large scales.

Standard image High-resolution image

There are two models for 21cmFAST. In the first model (21cmFAST-a), we vary the ionization efficiency and find the best-fit ζ = 7.5 for matching zmid. The minimum halo mass is kept fixed at ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ to be consistent with the SCORCH galaxy models used in the RadHydro simulations. This model has a more extended reionization history, overestimates the duration by 35%, underestimates the asymmetry by 6%, and has $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.10$ and $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.17$. In the second model (21cmFAST-b), we find best-fit ζ = 22.9 and ${M}_{\min }=1.8\times {10}^{9}\ {h}^{-1}{M}_{\odot }$ for matching both zmid and Δz. This model underestimates the asymmetry by 14% and has $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.04$ and $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.06$. Note that the best-fit Mmin is now ≈20 times larger and inconsistent with that in the RadHydro simulations. This complicates the interpretation of the inferred minimum halo mass for galaxy formation when comparing the observations with the original version of 21cmFAST and other similar ESF models. The newer version (Park et al. 2019) with additional parameterization for high-redshift galaxy properties will likely provide better agreement, but it requires varying eight free parameters.

There are also two models for the RLS method. The first model (RLS-a) strictly follows Battaglia et al. (2013b). We fit the bias data from RadHydro Sim 1 and obtain kRLS = 2.09 h Mpc−1 and aRLS = 1.71 when bRLS is set to bB13 = 0.593. In addition, we choose $\bar{z}=7.84$ to be the same value from the simulation. The fitted function underestimates the large-scale bias, and the best-fit kRLS and aRLS are significantly different from the fiducial values of kB13 = 0.185 h Mpc−1 and aB13 = 0.564 in Battaglia et al. (2013b). This model increases zmid by 0.19, underestimates Δz by 17%, underestimates Az by 25%, and has $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.11$ and $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.09$. In the second model (RLS-b), we fit for all three bias parameters and obtain bRLS = 0.895, kRLS =0.325 h Mpc−1, and aRLS = 0.816. While kRLS and aRLS are now more similar to kB13 and aB13, the normalization bRLS is 51% larger than the previously adopted bB13. We also choose $\bar{z}=7.61$ in order to match zmid, but note the difference of 0.24 between them. This model underestimates Δz by 16%, underestimates Az by 30%, and has $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.08$ and $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.07$. Note that the RLS method tends to produce more symmetric reionization histories, and it is difficult to produce larger Az values just by varying the free parameters.

4.4. Reionization-redshift Field

Figure 17 is a visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$, and Figure 18 shows the reionization-redshift auto-power spectra ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ and cross correlations relative to RadHydro Sim 1. For AMBER, we vary the radiation mean free path and obtain lmfp = 3.2 h−1Mpc when matching the auto-power spectrum on the largest scales. This produces a reionization-redshift field that is visually and statistically in the strongest agreement out of all the models. The redshift bias bzz(k) and cross correlation rzz(k) show that AMBER is both the least biased and most correlated with the RadHydro simulation.

Figure 17.

Figure 17. Visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from RadHydro Sim 1, AMBER, 21cmFAST, and RLS. Each image is 64 × 64 pixels from a slice that is 50 × 50 h−1Mpc with a thickness of ∼1 h−1Mpc. The AMBER image most closely resembles RadHydro. The 21cmFAST-b image has more stochasticity, while RLS-b has more smoothing.

Standard image High-resolution image
Figure 18.

Figure 18. Top: auto-power spectra of the reionization-redshift fields from RadHydro Sim 1, AMBER, 21cmFAST, and RLS. All of the spectra peak at a characteristic scale that depends on the redshift midpoint zmid. Center: the redshift bias bzz between the models and simulation shows that AMBER and RLS-b are the least biased. Bottom: the cross correlations rzz show that AMBER is the most correlated, followed by RLS, while 21cmFAST is the most stochastic.

Standard image High-resolution image

For 21cmFAST, we can also vary the ESF maximum filter scale Rmax. It is recommended that ${R}_{\max }\geqslant 20\ {h}^{-1}\mathrm{Mpc}$, but we are limited to ${R}_{\max }\leqslant 25\ {h}^{-1}\mathrm{Mpc}$. The radius of the smoothing sphere cannot be larger than half of the periodic box length; otherwise it would wrap around itself. We find no improvement over this narrow range of filtering scales. 21cmFAST-b does better than 21cmFAST-a, because we are able to match both zmid and Δz. Even then it differs more from RadHydro than either AMBER or RLS. The reionization-redshift image shows noticeable differences near biased sources, and this coincides with the redshift bias bzz(k) being larger than unity by >0.2 on the largest scale. The redshift cross correlation rzz(k) drops to 0.5 by k ∼ 1 h Mpc−1 and to zero at the grid Nyquist frequency. The stochasticity may be due to using the NGP scheme for the matter density field, the ESF-E version for the collapsed fraction, and the semi-numerical approach for the ionization fraction field. Again, the newer version of 21cmFAST will likely provide better agreement, but it requires searching an eight-dimensional parameter space.

The RLS method is a parametric approach to constructing the reionization-redshift field, so it is not surprising to see that it is visually and statistically in good agreement. On large scales, RSL-b does better than RLS-a because the bias normalization is allowed to vary beyond the fixed value of bB13 = 0.593. On small scales, both models have significant smoothing because of the spherical tophat filter applied to the overdensity field. In Battaglia et al. (2013b), this implementation choice was made because the linear, deterministic biasing relation, δz( k ) ∝ δ( k ), breaks down on scales k ≳ 1 h Mpc−1. RLS has more stochasticity than AMBER, but less than 21cmFAST. The gradual deviation of the redshift cross correlation rzz(k) from unity reflects the decreasing correlation between the reionization-redshift and matter density fields toward smaller scales.

More studies and comparisons are required to understand the advantages, disadvantages, and appropriate utilization of the various semi-numerical methods. We emphasize that a variety of independent approaches are crucial for theoretical and inference studies of the EoR.

5. Results

We now present some basic results from AMBER and leave further applications for upcoming work. For example, Chen et al. (in prep) will model and study the patchy KSZ effect, which is a promising probe of the EoR. In this section, we study the parameter dependence of the reionization-redshift field (Section 5.1) by varying the redshift midpoint, duration, asymmetry, minimum halo mass, and radiation mean free path.

We run AMBER models, each with 20483 particles and 20483 cells in a comoving box of side length L = 2 h−1Gpc. For the fiducial model, we use zmid = 8.0, Δz = 4.0, Az = 3.0, ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$, and lmfp = 3.0 h−1Mpc. In addition, we include a lower and higher value when we vary each parameter one at a time. Each simulation takes approximately 20 CPU hours when run in parallel with 32 to 64 cores. We discuss the code scaling performance in Appendix B.

5.1. Reionization-redshift Fields

Figure 19 is a visualization of the reionization-redshift fields. Each image is 100 × 100 pixels from a subsection that is 100 × 100 h−1Mpc with a thickness of ∼1 h−1Mpc. We use a divergent color scheme centered at the fiducial redshift midpoint zmid = 8.0 to distinguish between regions that are ionized earlier and later. In general, the higher-density regions near radiation sources are reionized earlier than the lower-density regions farther away on large scales. As a result, a larger fraction of the volume is seen with ${z}_{\mathrm{re}}\lt {z}_{\mathrm{mid}}$, and the redshift midpoint of the volume-weighted ionization fraction comes later.

Figure 19.

Figure 19. Visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ when the redshift midpoint zmid, duration Δz, asymmetry Az, minimum halo mass Mmin, and radiation mean free path lmfp are varied. Each column shows when a given parameter is varied, while the middle row shows to the same fiducial model. The divergent color scheme centered at the fiducial zmid = 8.0 is used to distinguish between regions that are ionized earlier and later. Each image is 100 × 100 pixels from a subsection that is 100 × 100 h−1Mpc with a thickness of ∼1 h−1Mpc.

Standard image High-resolution image

Figure 20 shows the one-point PDF of the normalized redshift δz (Equation (4)), reionization-redshift auto-power spectra, and cross correlations relative to the matter density field at the redshift midpoint. In general, the PDF (δz) are skewed Gaussians except for the symmetric case with Az = 1.0. The reionization-redshift power spectrum Δzz(k) first rises in power until a characteristic peak scale and then declines in amplitude with k. The redshift-matter bias bzm(k) asymptotically approaches a constant value toward large scales and declines monotonically with k.

Figure 20.

Figure 20. Top: probability distribution functions of the normalized redshift δz when the redshift midpoint zmid, duration Δz, asymmetry Az, minimum halo mass Mmin, and radiation mean free path lmfp are varied. Middle: auto-power spectra of the reionization-redshift fields. Bottom: the redshift-matter bias bzm approaches a constant value on large scales and decreases in value on small scales.

Standard image High-resolution image

5.1.1. Redshift Midpoint

As the redshift midpoint decreases over the range 7.0 ≤ zmid ≤ 9.0, we see, overall, later reionization in Figure 19. The one-point PDF(δz) becomes wider, the two-point power spectrum ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ increases overall in amplitude, but the redshift-matter bias bzm(k) remains almost constant in Figure 20. In Equation (4) for the normalized redshift δz, the numerator (${z}_{\mathrm{re}}-{\bar{z}}_{\mathrm{re}}$) remains approximately constant, but the denominator ($1+{\bar{z}}_{\mathrm{re}}$) decreases. Therefore, δz grows by a factor of ${\bar{a}}_{\mathrm{re}}=1/(1+{\bar{z}}_{\mathrm{re}})$, whereas the matter overdensity δ grows by a factor D(zmid) ≈ 1/(1 + zmid). We find that the normalized redshift grows similarly to the matter overdensity as the bias is only ≈5% lower for zmid = 7.0 than for zmid = 9.0. Note that, while the radiation field also grows like the matter density, the abundance matching is insensitive to the overall amplitude.

5.1.2. Duration

As the duration increases over the range 2.0 ≤ Δz ≤ 6.0, we see a larger range of values in the reionization-redshift images. The PDF(δz) has larger variance, and the power spectrum increases overall in amplitude. Note that the variance is given by ${\sigma }_{{\rm{z}}}^{2}=\langle | {\delta }_{{\rm{z}}}({\boldsymbol{x}}){| }^{2}\rangle =\int {{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)d\mathrm{ln}k$. Therefore, we expect the large-scale bias to scale with σz, which in turn scales with Δz. We find that the large-scale bias changes by a factor ≈ 3.1 when the duration increases from Δz = 2.0 to Δz = 6.0.

The large-scale bias can differ significantly from the single value of bB13 = 1/δc = 0.593 adopted in the RLS method (Battaglia et al. 2013b). Barkana & Loeb (2004) derived this value assuming that fluctuations in the reionization-redshift field are solely determined by fluctuations in the collapsed mass density field from EPS. Relating the growth of ionized regions to the growth of dark matter halos would correspond to a particular duration value. However, the reionization history can depend on other nonconstant factors, and the large-scale bias can have a range of values in general.

5.1.3. Asymmetry

As the asymmetry increases over the range 1.0 ≤ Az ≤ 5.0, the images show a smaller range of redshift values for the regions reionized after the midpoint and a larger range of redshift values for the regions reionized earlier than the midpoint. The PDF(δz) becomes skewed, and the power spectrum changes shape. There is less power on large scales and more power on small scales as the characteristic peak shifts to smaller scales. For a larger Az at fixed duration, the first portion (zmid < z < zear) of the reionization history becomes longer, while the second portion (zlat < z < zmid) becomes shorter. We have just seen that increasing (decreasing) the duration increases (decreases) the overall power. We find that lengthening the first half of the portion slightly increases the power on small scales while shortening the second portion more significantly decreases the power on large scales. Correspondingly, the large-scale bias decreases with more asymmetry in the reionization history.

5.1.4. Minimum Halo Mass

Over the minimum halo mass range ${10}^{7}\leqslant {M}_{\min }/[{h}^{-1}{M}_{\odot }]\,\leqslant {10}^{10}$, there are only minor changes to the reionization-redshift field. The PDF(δz) only has minor changes since the redshift midpoint, duration, and asymmetry are kept fixed. There is minor suppression in ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ and bzm(k) on small scales. For larger Mmin, the collapsed fraction decreases and the halo bias increases. Correspondingly, this decreases the amplitude and increases the bias of the radiation intensity field (Equation (36)). However, the abundance matching technique is insensitive to the overall normalization of Jν and to the overall normalization of the fluctuating component δJ. The abundance matching only depends on the relative ranking of the field values.

On one hand, varying the minimum halo mass and halo density field will generally change the reionization history and process if astrophysical parameters such as the star formation efficiency, photon production efficiency, and radiation escape fraction are kept fixed. On the other hand, varying these astrophysical terms, which can be functions of mass and redshift, can offset the changes from varying Mmin and produce the same reionization history. With AMBER, we find that when the redshift midpoint, duration, asymmetry, and radiation mean free path are kept fixed, then the reionization-redshift field has only minor dependence on the minimum halo mass. For certain applications, it may then be possible to ignore Mmin and reduce the number of free parameters.

5.1.5. Radiation Mean Free Path

As the radiation mean free path increases over the range 1.0 ≤ lmfp/[h−1Mpc] ≤ 5.0, we see more spatial coherence in the reionization-redshift images. For a larger lmfp, there is a larger coherence length and less small-scale structure in the radiation intensity field used for abundance matching. The PDF(δz) only has minor changes since the redshift midpoint, duration, and asymmetry are kept fixed. The reionization-redshift power spectrum changes shape though, having more power on large scales and less on small scales. Increasing lmfp produces relatively larger ionized regions, resulting in the characteristic peak shifting to larger scales.

In the RLS method, Battaglia et al. (2013b) change the duration by varying aRLS and kRLS. But without varying bRLS, they also change the correlation length of ionized regions at the same time. More specifically, they obtain smaller correlation lengths for longer durations. With AMBER, we find longer Δz increases the bias, while shorter lmfp decreases the bias. This explains the inverse relationship between the duration and correlation length when bRLS is fixed in the RLS method. AMBER has the flexibility to change Δz and lmfp independently.

6. Conclusion

AMBER is a semi-numerical code for modeling the cosmic dawn and EoR. The new algorithm is not based on the ESF for directly predicting the ionization fraction field, but takes the novel approach of calculating the reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ assuming that the hydrogen gas encountering higher radiation intensity is photoionized earlier. Redshift values are assigned while matching the abundance of ionized mass according to a given mass-weighted ionization fraction xi(z). The code has the unique advantage of allowing users to directly specify the reionization history through the redshift midpoint zmid, duration Δz, and asymmetry Az input parameters (Trac 2018). The reionization process is further controlled through the minimum halo mass Mmin for galaxy formation and the radiation mean free path lmfp for RT.

We implement improved methods for modeling the large-scale structure and constructing density, velocity, halo, and radiation fields, which are essential components for modeling reionization observables. 2LPT (versus 1LPT) is used to efficiently evolve the matter distribution in the moderately nonlinear regime. The TSC (versus NGP or CIC) PM assignment scheme with interlacing and deconvolution produces density and velocity fields that are in excellent agreement with RadHydro simulations (Doussot et al. 2019). ESF-L (versus ESF-E) accurately and robustly produces halo mass density fields compared to N-body simulations (Trac et al. 2015). The radiation intensity field is computed quickly using FFTs to convolve the source field with the RT kernel. The interlacing and deconvolution reduce aliasing and smoothing, and thereby improve power spectrum estimation (e.g., Sefusatti et al. 2016). These methods can also be used to improve other semi-numerical and RT algorithms. The AMBER program modules can be adapted and plugged into other codes.

We compare AMBER with two other semi-numerical methods, 21cmFAST (Mesinger et al. 2011) and the RLS method (Battaglia et al. 2013b), and find that our code more accurately reproduces the results from RadHydro simulations. AMBER has the flexibility to directly match the simulated reionization history xi(z), while the other two methods require running multiple models before finding the best fit. It also produces reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ that are the least biased and most correlated with the simulations. The RLS method has more smoothing, while 21cmFAST has more stochasticity. More studies and comparisons are required to understand the advantages, disadvantages, and appropriate utilization of the various semi-numerical methods. We reiterate that a variety of independent approaches are crucial for theoretical and inference studies of the EoR.

We study the parameter dependence of the reionization-redshift field by varying the redshift midpoint, duration, asymmetry, minimum halo mass, and radiation mean free path. As zmid decreases, there is overall later reionization, the PDF(δz) becomes wider, the power spectrum ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ increases overall in amplitude, but the redshift-matter bias bzm(k) remains approximately constant. As Δz increases, there is more extended reionization, the redshift distribution becomes wider, and both the power spectrum and bias increase overall in amplitude. As Az increases, the reionization history becomes more asymmetric, the redshift distribution is more skewed, and the power spectrum changes shape. Interestingly, the reionization-redshift field is only weakly sensitive to Mmin when the other parameters are fixed. As lmfp increases, there is more spatial coherence and less small-scale structure in the radiation intensity and reionization-redshift fields, and the power spectrum changes shape due to the relatively larger ionized regions.

AMBER is initially designed for theoretical and inference studies in which the primary interest is controlling or constraining the reionization history and process. With the reionization-redshift field, the evolution of the neutral hydrogen and ionized electron densities can be readily computed. As a first application, Chen et al. (in prep) will model and study the imprint of reionization on the CMB through the patchy KSZ effect, which is a promising probe of the EoR. Complementary, the connection between radiation sources and sinks, and the reionization history can be understood using analytical models for the evolution of the mass-weighted ionization fraction (e.g., Chen et al. 2020) and radiation-hydrodynamic simulations that solve the complex physics.

In future work, we will include additional physics that will enable more realistic predictions of EoR observables. To improve the source models, we will calibrate the ESF halo-collapsed fraction against N-body simulations, and include high-redshift galaxies through abundance matching (e.g., Trac et al. 2015). To improve the radiation and reionization-redshift fields, we will incorporate spatially varying radiation mean free paths (e.g., Cain et al. 2021; Davies & Furlanetto 2021), and perform the ionization abundance matching tomographically using multiple redshifts. We will compute the Lyα and X-ray radiation fields for the 21 cm signal (e.g., Mesinger et al. 2011; Semelin et al. 2017) and calculate the thermal history for the Lyα forest (e.g., Upton Sanderbeck et al. 2016; D'Aloisio et al. 2019).

AMBER is currently parallelized to run on multicore, shared-memory nodes. It is over four orders of magnitude faster than radiation-hydrodynamic simulations and will efficiently enable large-volume models, full-sky mock observations, and parameter-space studies. The code will be made publicly available to facilitate and transform studies of the EoR.

We thank Nick Gnedin for initial discussions that motivated the development of AMBER. We thank Anson D'Aloisio, Matt McQuinn, and the anonymous referee for providing comments on the manuscript. We also thank Nick Battaglia, Sasha Kaurov, Emiliano Sefusatti, and Qirong Zhu for helpful discussions. The computational work was performed on the Bridges-2 system at the Pittsburgh Supercomputing Center (PSC) using allocations ast190014p and phy210042p. We thank PSC staff Ken Hackworth and T.J. Olesky for computing support. H.T. acknowledges support from the NSF AI Institute: Planning: Physics of the Future, NSF PHY2020295. R.C. acknowledges support from NSF AST2007390.

Appendix A: Weibull Function

The reionization history ${\bar{x}}_{{\rm{i}}}(z)$ can be more accurately fit with a flexible Weibull function (Equation (14)) than a symmetric tanh function (e.g., Lewis 2008). The Weibull coefficients can be determined by first solving a nonlinear equation for cw and then substituting its value into algebraic equations for aw and bw. There are several variations of the root-finding equation, for example:

Equation (A1)

where

Equation (A2)

which can be solved iteratively using Newton's method with finite differences instead of analytical derivatives (i.e., secant method). The other two coefficients have algebraic solutions, for example:

Equation (A3)

Equation (A4)

Note that the equations above can also be written in terms of the redshift midpoint, duration, and asymmetry parameters by replacing zear and zlat with Δz and Az using Equations (12) and (13).

Appendix B: Scaling Performance

AMBER is written in modern Fortran and parallelized using OpenMP to run on multicore, shared-memory nodes. We perform the following scaling tests on Bridges-2 at the Pittsburgh Supercomputing Center (PSC). An Extreme Memory (EM) node has 4 Intel Xeon Platinum 8260M "Cascade Lake" CPUs, 24 cores per CPU, 96 cores per node, and a total of 4 TB of memory.

Figure 21 shows the results of the scaling tests. For the strong scaling test, AMBER is run with Npart = 20483 particles and equal number of mesh cells in a comoving box of side length L = 2 h−1Gpc. We vary the number of cores Ncore from 1 to 64 and measure the wall time $t({N}_{\mathrm{core}})$. The speedup is defined as

Equation (B1)

and we obtain S = 5.4 (21.8) for ${N}_{\mathrm{core}}=8\ (64)$. The deviation from the ideal speedup is mostly due to our parallel quicksort, which is not fully optimized yet. The serial sorting takes about 20% of the total wall time, while the parallel sorting takes 29% (47%) for ${N}_{\mathrm{core}}=8\ (64)$. Nonetheless, a typical AMBER model can be quickly run in under an hour of wall time with 32 to 64 cores.

Figure 21.

Figure 21. Top: strong scaling test with a fixed number of Npart = 20483 particles for each AMBER run. The measured wall time t is compared against the ideal scaling. Bottom: weak scaling test with a particle/core ratio ${N}_{\mathrm{part}}/{N}_{\mathrm{core}}\approx {512}^{3}$.

Standard image High-resolution image

For the weak scaling test, we use a particle/core ratio of ${N}_{\mathrm{part}}/{N}_{\mathrm{core}}\approx {512}^{3}$. We vary Ncore from 1 to 64 and change Npart accordingly. The efficiency is defined as

Equation (B2)

and we obtain E = 61% (30%) for ${N}_{\mathrm{core}}=8\ (64)$. Again, the decrease in efficiency is mostly due to our nonoptimal quicksort. The serial sorting takes about 14% of the total wall time, while the parallel sorting takes 27% (47%) for ${N}_{\mathrm{core}}=8\ (64)$. We will improve the parallelization of the algorithm in future work.

Please wait… references are loading.
10.3847/1538-4357/ac5116