AMBER: A Semi-numerical Abundance Matching Box for the Epoch of Reionization

Hy Trac; Nianyi Chen; Ian Holst; Marcelo A. Alvarez; Renyue Cen

doi:10.3847/1538-4357/ac5116

1. Introduction

Cosmic dawn is a fascinating period in the first billion years of the Universe and is a frontier topic for both theoretical and observational explorations and discoveries. The reionization of hydrogen by the first stars, galaxies, and quasars drastically converts the cold and neutral gas into a warm and highly ionized medium. Galaxies most likely provided the bulk of the ionizing photons (e.g., Bouwens et al. 2015a; Finkelstein et al. 2015), but there may have been early contributions from Population III stars (e.g., Wu et al. 2021) and late contributions from active galactic nuclei (e.g., Madau & Haardt 2015). On large scales, the higher-density regions near radiation sources are generally reionized earlier than the lower-density regions far from sources in this inhomogeneous process (e.g., Barkana & Loeb 2004; Trac et al. 2008).

Our current understanding of when the Epoch of Reionization (EoR) occurred primarily comes from "A Tale of Two Optical Depths." The Thomson optical depth τ (cosmic microwave background, or CMB, photons scattering with free electrons) and the Gunn-Peterson optical depth τ_GP (Lyα photons scattering with neutral hydrogen) have long provided two major observational constraints. However, the latest measurements tell a starkly different tale than the early observations (e.g., Fan et al. 2002; Kogut et al. 2003). Planck Collaboration et al. (2020) recently inferred τ = 0.054 ± 0.007 from measurements of the CMB temperature and polarization angular power spectra, implying a late reionization midpoint at redshift z ≈ 7.7 ± 0.6 (e.g., Glazer et al. 2018). There are now multiple evidence of dark Lyα troughs extending down to z ≈ 5.5 in the spectra of high-redshift quasars (e.g., Becker et al. 2015; Eilers et al. 2018; Bosman et al. 2018, 2021). This suggests that reionization could have ended at z < 6 (e.g., Keating et al. 2020; Nasir & D'Aloisio 2020), later than previously assumed. When and how the EoR occurred exactly (i.e., the reionization history and process) are of primary interest.

There is a renewed emphasis on using radiation-hydrodynamic simulations (e.g., Gnedin 2014; Norman et al. 2015; Ocvirk et al. 2016; Semelin et al. 2017; Finlator et al. 2018; Doussot et al. 2019; D'Aloisio et al. 2020; Katz et al. 2021) to model the complex astrophysics of sources and sinks on small scales and using semi-analytical/numerical methods (e.g., Furlanetto et al. 2004; Mesinger & Furlanetto 2007; Zahn et al. 2007; Alvarez et al. 2009; Choudhury et al. 2009; Santos et al. 2010; Mesinger et al. 2011; Battaglia et al. 2013b; Hassan et al. 2016; Xu et al. 2017; Hutter 2018) to make theoretical predictions and mock observations on large scales. Radiative transfer (RT) simulations are usually limited to small box sizes L ≲ 100 h⁻¹Mpc and a single one typically requires hundreds of thousands to millions of CPU hours to run on a supercomputer (e.g., Trac & Gnedin 2011). Fast and accurate approaches with larger box sizes are essential for parameter-space studies. These two approaches in tandem provide our best option in making forward progress in synergy with observations.

A variety of independent approaches are crucial for comparison with and interpretation of observations of the 21 cm signal, CMB, Lyα forest, and intensity mapping. However, there has been a lack of development and diversity in semi-analytical/numerical methods. The majority of codes, like the often-used 21 cmFAST (Mesinger et al. 2011), are based on the excursion set formalism (ESF; e.g., Bond et al. 1991; Furlanetto et al. 2004). In Zahn et al. (2011), we compare two RT simulations (McQuinn et al. 2007; Trac & Cen 2007) with two ESF methods (Mesinger & Furlanetto 2007; Mesinger et al. 2011). While the simulations statistically agree within 10% at all redshifts, the models agree within 50% only when compared at the same ionization fraction. There are larger statistical differences when compared at the same redshifts because of the disagreement in reionization histories. Previous semi-numerical methods have input astrophysical parameters such as the star formation efficiency, photon production efficiency, and radiation escape fraction. The output reionization history is an end product after many complex and uncertain calculations, and is not easily controlled. There should be flexibility to directly choose the reionization history as this is the primary question for studies of the EoR.

In this paper, we present a new semi-numerical Abundance Matching Box for the Epoch of Reionization (AMBER) code for modeling the cosmic dawn on large scales. AMBER allows users to directly specify the reionization history through input parameters, a useful feature for theoretical and inference studies that is not currently in other semi-numerical methods. Section 2 summarizes the Simulations and Constructions of the Reionization of Cosmic Hydrogen (SCORCH) project that is used to motivate and calibrate AMBER. Section 3 describes the semi-numerical methods in AMBER, including the novel technique for matching the abundance of ionized mass or volume. Section 4 compares AMBER against two other semi-numerical methods, and Section 5 presents some basic results. Further applications will be presented in upcoming work. For example, Chen et al. (in prep) will model and study the patchy kinetic Sunyaev–Zel'dovich (KSZ) effect, which is a promising probe of the EoR. We conclude in Section 6 and add supplementary material in the Appendix.

2. SCORCH

We first summarize the SCORCH project (Trac et al. 2015; Doussot et al. 2019; Chen et al. 2020) that is used to motivate and calibrate AMBER. SCORCH is designed to provide cosmological simulations, theoretical predictions, and mock observations to facilitate more accurate comparisons with current and future observations. In this section, we describe the N-body simulations (Section 2.1), the radiation-hydrodynamic simulations (Section 2.2), and the reionization-redshift fields (Section 2.3). The simulations are based on the concordance cosmological parameters: Ω_b = 0.045, Ω_m = 0.3, Ω_Λ = 0.7, h = 0.7, σ₈ = 0.8, and n_s = 0.96.

2.1. N-body Simulations

In SCORCH I (Trac et al. 2015), we run 22 high-resolution N-body simulations to quantify the abundance of dark matter halos as a function mass M, accretion rate $\dot{M}$ , and redshift z during the EoR. The N-body simulations are run using an updated particle-particle-mesh (P³M) code with a hybrid halo finder. Box sizes in the range 10 ≤ L/(h⁻¹Mpc) ≤ 400 are chosen to focus on the atomic-cooling halos, and two realizations of each box size are run to reduce sample variance. Each simulation contains N_dm = 2048³ dark matter particles and has a particle mass resolution ${m}_{{\rm{p}}}=8.72\times {10}^{6}{\left(L/100\right)}^{3}\ {h}^{-1}{M}_{\odot }$ .

We quantify and fit the halo mass function dn/dM, mass accretion-rate relation $\dot{M}(M,z)$ , and halo accretion-rate function ${dn}/d\dot{M}$ for the redshift range z ≥ 6. The new fit for the halo mass function is 20%–40% more accurate compared to Tinker et al. (2008) at the high-mass end hosting in currently observable galaxies. We also model and study the galaxy-halo connection by abundance matching observed high-redshift galaxies with simulated dark matter halos. See Trac et al. (2015) for more details.

In SCORCH II (Doussot et al. 2019), we also run a high-resolution P³M simulation with N_dm = 3072³ dark matter particles in a comoving box of side length L = 50 h⁻¹Mpc to generate halo and galaxy catalogs for the radiation-hydrodynamic simulations. A halo finder is run on the fly every 20 million cosmic years to locate dark matter halos and build merger trees. With a particle mass resolution of m_p = 3.59 × 10⁵ h⁻¹ M_⊙, we can reliably measure halo quantities such as mass and accretion rate down to the atomic-cooling limit for galaxy formation (T ∼ 10⁴ K, M ∼ 10⁸ h⁻¹ M_⊙).

2.2. Radiation-hydrodynamic Simulations

In SCORCH II (Doussot et al. 2019), we run three radiation-hydrodynamic simulations with the same cosmic initial conditions, same galaxy luminosity functions, but with different radiation escape fraction f_esc(z) models. The simulations are designed to have the same Thomson optical depth τ ≈ 0.06, consistent with recent CMB observations (Planck Collaboration et al. 2020), and similar midpoints of reionization 7.5 ≲ z ≲ 8, but with a different evolution of the ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$ .

The radiation-hydrodynamic simulations are run with the RadHydro code, which combines N-body and hydrodynamic algorithms (Trac & Pen 2004) with an adaptive raytracing algorithm (Trac & Cen 2007) to directly and simultaneously solve collisionless dark matter dynamics, collisional gas dynamics, and RT of ionizing photons. The raytracing algorithm has adaptive splitting and merging to improve resolution and scaling. Each simulation has N_dm = 2048³ dark matter particles, N_gas = 2048³ gas cells, N_rt = 512³ RT cells, and up to 12 billion adaptive rays in a comoving box of side length 50 h⁻¹Mpc. RadHydro has been used to simulate both hydrogen and helium reionization (e.g., Trac et al. 2008; Battaglia et al. 2013b; La Plante et al. 2017; D'Aloisio et al. 2020).

We model high-redshift galaxies using an updated approach that allows us to systematically control the galaxy distributions in the simulations while matching the observed luminosity functions from the Hubble Space Telescope (e.g., Bouwens et al. 2015b; Finkelstein et al. 2015). We populate dark matter halos with galaxies by abundance matching the number densities,

$\begin{eqnarray}&&{n}_{\mathrm{galaxy}}(\gt {L}_{\mathrm{UV}},z)={n}_{\mathrm{halo}}(\gt \dot{M},z),\end{eqnarray} \tag{ 1 }$

where L_UV is the galaxy UV luminosity and $\dot{M}$ is the halo mass accretion rate. Connecting the mass accretion rate to the star formation rate allows us to model and study the episodic nature of high-redshift galaxy formation.

The radiation escape fraction is allowed to vary with redshift in a power-law relation,

$\begin{eqnarray}&&{f}_{\mathrm{esc}}(z)={f}_{8}{\left(\displaystyle \frac{1+z}{9}\right)}^{{a}_{8}},\end{eqnarray} \tag{ 2 }$

where f₈ is the average escape fraction at z = 8, and a₈ is the power-law slope. Sim (a₈ = ) 0 has constant f_esc, and reionization starts latest but ends earliest out of the three models. Sim 1 has f_esc(z) varying linearly with 1 + z and is an intermediate model. Sim 2 has f_esc(z) varying quadratically, and reionization starts earliest but ends latest.

Table 1 summarizes the parameters for the three RadHydro simulations. Some additional parameters will be presented later in the paper: the reionization history parameters z_mid, Δ_z, and A_z in Section 3.1.1; the Weibull coefficients a_w, b_w, and c_w in Section 3.1.2; and the radiation mean free path l_mfp in Section 3.4.1. Each simulation takes approximately half a million CPU hours to run on a supercomputer, and they are infeasible for parameter-space studies. See SCORCH I and II (Trac et al. 2015; Doussot et al. 2019) for more details.

Table 1. RadHydro and AMBER Parameters

Model	L (h⁻¹Mpc)	N_dm	N_gas	N_rt	f₈	a₈	τ	z_mid	Δ_z	A_z	a_w	b_w	c_w	l_mfp
Sim 0	50	2048³	2048³	512³	0.15	0	0.060	7.95	4.68	2.90	6.62	1.85	1.14	3.0
Sim 1					0.13	1	0.060	7.91	5.45	2.69	6.24	2.26	1.20	3.0
Sim 2					0.11	2	0.060	7.83	6.54	2.33	5.54	3.01	1.33	3.0

Download table as: ASCII Typeset image

2.3. Reionization-redshift Fields

The reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ quantifies the timing of reionization as a function of space and has been measured in radiation-hydrodynamic simulations (e.g., Trac et al. 2008; Battaglia et al. 2013b; Kaurov 2016; Doussot et al. 2019) and semi-analytical/numerical models (e.g., Barkana & Loeb 2004; Alvarez et al. 2009; Majumdar et al. 2014). Combined with the density, velocity, and temperature fields, it has been used to model the imprints of reionization on the CMB (e.g., Battaglia et al. 2013a; Natarajan et al. 2013), 21 cm (e.g., La Plante et al. 2014; La Plante & Ntampaka 2019), and the Lyα forest (e.g., D'Aloisio et al. 2015, 2019).

We construct the reionization-redshift field for a given RadHydro simulation while it is running. For each gas cell, we save the redshift at which the majority of the mass is ionized in a three-dimensional array. Previously in Battaglia et al. (2013b), we studied both 50% and 90% ionization thresholds and find no significant differences at the resolutions of interest. Statistically, the auto-correlations and cross correlations of the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields for these two cases are almost identical. For the vast majority of gas cells, once they start to ionize, they quickly become almost fully ionized, but not entirely because of recombinations. Thus, we adopt a nominal 50% ionization value for the RadHydro simulations in SCORCH II (Doussot et al. 2019) and throughout this paper.

Figure 1 is a visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from the three RadHydro simulations. They show remarkable similarities even with episodic star formation ${\dot{\rho }}_{\star }({\boldsymbol{x}},z)$ and varying radiation escape fractions f_esc(z). The ${z}_{\mathrm{re}}({\boldsymbol{x}})$ field is highly correlated with the matter density field ρ( x ) on large scales since the higher-density regions near sources are generally reionized earlier than the lower-density regions far from sources (e.g., Barkana & Loeb 2004; Trac et al. 2008). In Battaglia et al. (2013a), we find that these two fields are correlated down to Mpc scales and can be related using a first-order, scale-dependent bias in Fourier space.

2.3.1. Cross Correlations

The reionization-redshift fields are quantitatively very similar when we account for the different reionization histories. To show this, we take the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields from Sims 0 and 2 and individually scale the redshift values for each grid cell without changing their spatial rank order such that they have the same ionization fraction ${\bar{x}}_{\mathrm{0,2}}(z)={\bar{x}}_{1}(z)$ as Sim 1. We are matching the abundance of ionized mass or volume at all redshifts, which we call abundance matching the reionization history (Section 3.4.2 for more details). Alvarez & Abel (2012) use a similar technique to rescale the reionization-redshift field in their semi-numerical method. Here we consider both mass-weighted ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ and volume-weighted ${\bar{x}}_{{\rm{i}},{\rm{V}}}(z)$ ionization fractions.

To compare the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields cell by cell, we define the fractional difference field,

$\begin{eqnarray}&&{\epsilon }_{{\rm{z}}}({\boldsymbol{x}})\equiv \displaystyle \frac{[1+{z}_{\mathrm{re}}({\boldsymbol{x}})]-[1+{z}_{1}({\boldsymbol{x}})]}{1+{z}_{1}({\boldsymbol{x}})},\end{eqnarray} \tag{ 3 }$

for Sim 0 or 2 relative to Sim 1 and then calculate the probability distribution function (PDF) of _z. Figure 2 shows the distribution of fractional differences for the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields at the RT resolution l_rt = 98 h⁻¹kpc, which is approximately ten times smaller than the typical mesh cell size used in semi-numerical models. The difference distributions for the original Sims 0 and 2 are skewed because the reionization histories differ from that of Sim 1. After abundance matching, the distributions are approximately Gaussian with zero mean and small root-mean-square differences ${\sigma }_{{\rm{z}}}=\langle {\left({\epsilon }_{{\rm{z}}}({\boldsymbol{x}})-{\bar{\epsilon }}_{{\rm{z}}}\right)}^{2}{\rangle }^{1/2}\lesssim 0.01$ . We find very similar results when matching either the mass-weighted or the volume-weighted ionization fractions.

epsilon — **Figure 2.** Distribution of fractional differences _z in the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ for the RadHydro Sims 0 and 2 relative to Sim 1. Prior to abundance matching (solid), the distributions are skewed because of the different reionization histories. After abundance matching to have the same mass-weighted (M) or volume-weighted (V) ionization fractions, the distributions are approximately Gaussian with zero mean and small widths.
Download figure:
Standard image High-resolution image

To correlate the fields, we define the normalized redshift field,

$\begin{eqnarray}&&{\delta }_{{\rm{z}}}({\boldsymbol{x}})\equiv \displaystyle \frac{[1+{z}_{\mathrm{re}}({\boldsymbol{x}})]-[1+{\bar{z}}_{\mathrm{re}}]}{1+{\bar{z}}_{\mathrm{re}}},\end{eqnarray} \tag{ 4 }$

where ${\bar{z}}_{\mathrm{re}}$ is the volume-weighted average value, and Fourier transform to get δ_z( k ) for all three simulations. Generally, the two-point power spectrum and its dimensionless version are defined as

$\begin{eqnarray}&&{P}_{{ij}}(k)\equiv \langle {\delta }_{i}({\boldsymbol{k}}){\delta }_{j}({\boldsymbol{k}})\rangle ,\end{eqnarray} \tag{ 5 }$

$\begin{eqnarray}&&{{\rm{\Delta }}}_{{ij}}^{2}(k)\equiv \displaystyle \frac{{k}^{3}}{2{\pi }^{2}}{P}_{{ij}}(k),\end{eqnarray} \tag{ 6 }$

where i = j for auto-correlations and i ≠ j for cross correlations. For comparing two different fields, we quantify with the bias and cross correlation,

$\begin{eqnarray}&&{b}_{{ij}}(k)\equiv \sqrt{\displaystyle \frac{{P}_{{ii}}(k)}{{P}_{{jj}}(k)}},\end{eqnarray} \tag{ 7 }$

$\begin{eqnarray}&&{r}_{{ij}}(k)\equiv \displaystyle \frac{{P}_{{ij}}(k)}{\sqrt{{P}_{{ii}}(k){P}_{{jj}}(k)}}.\end{eqnarray} \tag{ 8 }$

The field δ_i is said to be biased, unbiased, or underbiased relative to δ_j for b_ij > 1, = 1, and < 1, respectively. Similarly, the fields and their fluctuations are said to be correlated, uncorrelated, and anticorrelated for r_ij > 0, = 0, and <0, with perfect correlation or anticorrelation given by the limits r_ij = 1 and = −1, respectively.

Figure 3 shows the auto-power spectra for the ${z}_{\mathrm{re}}({\boldsymbol{x}})$ fields prior to abundance matching. On larger scales, the dimensionless spectra similarly peak at k ≈ 0.3–0.4 h Mpc⁻¹. As this is only a factor of 3 from the RadHydro simulation box limit ${k}_{\min }=2\pi /L$ , the peak scale may possibly be underestimated. Nonetheless, this suggests that reionization simulations and semi-numerical models must have box lengths >20 h⁻¹Mpc in order to capture the relevant large-scale fluctuations and correlations in the reionization-redshift field. The spectra similarly decline in power-law form on intermediate scales until k ≈ 10 h Mpc⁻¹, followed by steeper declines on smaller scales down to the RT Nyquist frequency.

We also cross-correlate Sims 0 and 2 relative to Sim 1 and plot the bias b_zz and normalized cross correlation r_zz in Figure 3. Using the normalized fields δ_z( x ) partially adjusts for the different redshift midpoints, but the shorter and longer durations result in lower and higher bias for the original Sims 0 and 2, respectively. After abundance matching, the b_zz(k) curves are unity at the box scale and deviate by only ≲10% at the RT-resolution scale. The cross correlations show that the original fields are already very highly correlated, allowing abundance matching to be applied correctly. The r_zz(k) curves are exactly unity at the largest scale and remain within a few percent on scales k ≲ 10 h Mpc⁻¹.

All of these results lead to new ideas for AMBER: there is a spatial order to the reionization process, and abundance matching can be applied to a correlated field to accurately predict the reionization-redshift field.

3. AMBER

The AMBER code provides a novel semi-numerical method for modeling the cosmic dawn on large scales. The new algorithm is not based on the ESF for directly predicting the ionization fraction field (e.g., Furlanetto et al. 2004), but takes the novel approach of calculating the reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ assuming that the hydrogen gas encountering higher radiation intensity are photoionized earlier. Redshift values are assigned while matching the abundance of ionized mass according to a given mass-weighted ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$ . The code has the unique advantage of allowing users to directly specify the reionization history through input parameters, a useful feature for theoretical and inference studies. In this section, we describe the major model components: reionization history (Section 3.1), Lagrangian perturbation theory (LPT) for the large-scale structure (Section 3.2), ESF for the halo mass density field (Section 3.3), and abundance matching for the reionization-redshift field (Section 3.4).

3.1. Reionization History

The reionization history ${\bar{x}}_{{\rm{i}}}(z)$ directly affects EoR observables. For example, the integrated Thomson optical depth and the evolution of the global 21 cm brightness temperature depend linearly on the ionized electron fraction ${\bar{x}}_{{\rm{e}}}(z)$ and neutral hydrogen fraction ${\bar{x}}_{\mathrm{HI}}(z)$ , respectively. In Trac (2018), we show that the reionization histories of the RadHydro simulations can be effectively and accurately described by the redshift midpoint, duration, and asymmetry parameters. We will here generalize and improve the approach for AMBER.

The reionization history is quantified by the average ionized-hydrogen fraction, which can be mass-weighted or volume-weighted. We work with the mass-weighted version ${\bar{x}}_{{\rm{i}},{\rm{M}}}$ as the volume-averaged ionized-hydrogen number density is given by

$\begin{eqnarray}&&{\bar{n}}_{\mathrm{HII},{\rm{V}}}={\bar{x}}_{{\rm{i}},{\rm{M}}}{\bar{n}}_{{\rm{H}},{\rm{V}}}.\end{eqnarray} \tag{ 9 }$

A common misconception is that the volume-averaged ionized density is instead proportional to the volume-weighted ionization fraction. See Chen et al. (2020) for clarification. We will typically work with mass-weighted fractions and volume-averaged densities and drop the subscripts to simplify the notation.

3.1.1. Midpoint, Duration, and Asymmetry

The redshift midpoint z_mid, duration Δ_z, and asymmetry A_z are input parameters, which in turn give three ionization points. For the early, middle, and late stages of reionization, let the redshifts z_ear > z_mid > z_lat correspond to the ionization fractions x_ear < x_mid < x_lat. We take z_mid as the redshift midpoint and define the duration,

$\begin{eqnarray}&&{{\rm{\Delta }}}_{{\rm{z}}}\equiv {z}_{\mathrm{ear}}-{z}_{\mathrm{lat}},\end{eqnarray} \tag{ 10 }$

and asymmetry,

$\begin{eqnarray}&&{A}_{{\rm{z}}}\equiv \displaystyle \frac{{z}_{\mathrm{ear}}-{z}_{\mathrm{mid}}}{{z}_{\mathrm{mid}}-{z}_{\mathrm{lat}}}.\end{eqnarray} \tag{ 11 }$

The other two redshifts are then given by

$\begin{eqnarray}&&{z}_{\mathrm{ear}}={z}_{\mathrm{mid}}+\displaystyle \frac{{{\rm{\Delta }}}_{{\rm{z}}}{A}_{{\rm{z}}}}{1+{A}_{{\rm{z}}}},\end{eqnarray} \tag{ 12 }$

$\begin{eqnarray}&&{z}_{\mathrm{lat}}={z}_{\mathrm{mid}}-\displaystyle \frac{{{\rm{\Delta }}}_{{\rm{z}}}{A}_{{\rm{z}}}}{1+{A}_{{\rm{z}}}}={z}_{\mathrm{ear}}-{{\rm{\Delta }}}_{{\rm{z}}}.\end{eqnarray} \tag{ 13 }$

Instantaneous reionization models would correspond to Δ_z = 0, but reionization is an extended process with finite Δ_z > 0. Symmetric reionization histories would correspond to A_z = 1, but reionization simulations typically find that the early stage of reionization takes longer than the later stage such that A_z > 1.

Table 1 lists the midpoint, duration, and asymmetry parameters for the three RadHydro simulations. In Doussot et al. (2019), we take ${\bar{x}}_{\mathrm{mid}}=0.50$ and present two practical choices for the other ionization points. In the first case, we choose quartile ionization fractions $({\bar{x}}_{\mathrm{ear}},{\bar{x}}_{\mathrm{lat}})=(0.25,0.75)$ , and Δ_z is analogous to a full width half max. In the second case, we take $({\bar{x}}_{\mathrm{ear}},{\bar{x}}_{\mathrm{lat}})=(0.05,0.95)$ , and Δ_z more effectively quantifies the whole EoR. We will adopt the latter as the default convention throughout this paper. AMBER users can specify other values, but extreme choices (e.g., ${\bar{x}}_{\mathrm{ear}}\lesssim 0.01$ , ${\bar{x}}_{\mathrm{lat}}\gtrsim 0.99$ ) are not recommended because the start and end of the EoR are not well defined and are difficult to determine precisely.

3.1.2. Analytical Interpolating Function

Once the midpoint, duration, and asymmetry parameters are specified and used to calculate three ionization points, we use an analytical interpolating function to calculate the reionization history. In Trac (2018), we use Lagrange interpolating functions to construct an analytical function for the ionization fraction x_i(z) that exactly fits the ionization points. For three points, the interpolating polynomial is a quadratic, which has the advantage of being invertible but has the disadvantage of being nonmonotonic. A log transformation of the variables improves monotonicity over the relevant redshift range, but it is not an ideal solution.

In AMBER, we interpolate the three ionization points with a modified Weibull function (Weibull 1951),

$\begin{eqnarray}&&{\bar{x}}_{{\rm{i}}}(z)=\exp \left[-\max {\left(\displaystyle \frac{z-{a}_{{\rm{w}}}}{{b}_{{\rm{w}}}},0\right)}^{{c}_{{\rm{w}}}}\right],\end{eqnarray} \tag{ 14 }$

where the coefficients a_w, b_w, c_w are all positive values. Note that a_w corresponds to z_end when x_end = 1 and the max function ensures full ionization at lower redshifts. In Appendix A, we show that the coefficients can be easily determined by first solving a nonlinear equation for c_w and then substituting its value into algebraic equations for the other two coefficients. We find that solutions exist for the asymmetry range A_z ≲ 15, which is more than sufficient for parameter-space studies. We also experimented with a generalized logistic function (Richards 1959), but find that it is limited to A_z ≲ 5. The Weibull function can be analytically inverted,

$\begin{eqnarray}&&z({\bar{x}}_{{\rm{i}}})={a}_{{\rm{w}}}+{b}_{{\rm{w}}}{\left|\mathrm{ln}{\bar{x}}_{{\rm{i}}}\right|}^{1/{c}_{{\rm{w}}}},\end{eqnarray} \tag{ 15 }$

which is a useful feature for our abundance matching technique.

Table 1 lists the Weibull coefficients for the three RadHydro simulations. The offset a_w determines when the function reaches unity, and it decreases when reionization ends later. The divisor b_w controls the stretch in redshift, and it increases with the duration Δ_z. The power c_w controls the steepness of the function, and it is inversely related to the asymmetry A_z. In general, changing one of the reionization shape parameters while holding the other two fixed affects all three Weibull coefficients.

Figure 4 compares the evolution of the mass-weighted ionization fraction x_i(z) from the RadHydro simulations with the Weibull functions. The analytical parameterizations are excellent matches to the simulated reionization histories. The typical differences are only $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.01$ , while the maximum differences of $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.02$ are found near the end of the EoR, which is not accurately captured in reionization simulations and semi-analytical models. We find that the Weibull function gives slightly better fits than Lagrange interpolation and is valid for a larger range of parameter space.

**Figure 4.** Top: the evolution of the mass-weighted ionization fraction with redshift from the RadHydro simulations (solid) and the Weibull functions (dashed). The analytical parameterizations accurately capture the redshift-asymmetric form of the simulated reionization histories. Bottom: the typical differences in ionization fractions are $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.01$ , while the maximum differences of $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}}}| \lesssim 0.02$ are typically found near the end of the EoR, which are uncertain in the simulations.
Download figure:
Standard image High-resolution image

3.2. Lagrangian Perturbation Theory

LPT is used to generate initial conditions for cosmological simulations and semi-analytical methods. LPT is also used in semi-numerical models of reionization to efficiently model the evolved matter distribution in the moderately nonlinear regime, because particles can be simply advanced to any given redshift without having to iteratively perform expensive force calculations and time integration like in N-body and hydro simulations. Furthermore, on moderately nonlinear scales, the dark matter and gas distributions are highly correlated and assumed to exactly trace each other. We will here test these assumptions and techniques for evolving the large-scale structure and constructing the density and velocity fields in AMBER.

In first-order LPT, otherwise known as the Zel'dovich Approximation (Zel'Dovich 1970), each particle moves in a straight line with displacement and velocity,

$\begin{eqnarray}&&{\boldsymbol{x}}={\boldsymbol{q}}-D(z){\rm{\nabla }}\phi ({\boldsymbol{q}}),\end{eqnarray} \tag{ 16 }$

$\begin{eqnarray}&&{\boldsymbol{v}}=-D(z)f(z)H(z){\rm{\nabla }}\phi ({\boldsymbol{q}}),\end{eqnarray} \tag{ 17 }$

where q is the Lagrangian coordinate, ϕ( q ) is the Lagrangian potential, D(z) is the density linear growth factor, f(z) is the velocity linear growth factor, and H(z) is the Hubble function. In second-order LPT (e.g., Bouchet et al. 1995; Scoccimarro 1998), each particle moves in a slightly curved trajectory with displacement and velocity,

$\begin{eqnarray}&&{\boldsymbol{x}}={\boldsymbol{q}}-{D}_{1}{\rm{\nabla }}{\phi }_{1}+{D}_{2}{\rm{\nabla }}{\phi }_{2},\end{eqnarray} \tag{ 18 }$

$\begin{eqnarray}&&{\boldsymbol{v}}=-{D}_{1}{f}_{1}H{\rm{\nabla }}{\phi }_{1}+{D}_{2}{f}_{2}H{\rm{\nabla }}{\phi }_{2},\end{eqnarray} \tag{ 19 }$

where ϕ_i( q ), D_i(z), and f_i(z) are the usual terms calculated at the ith-order. The spatial information is contained in the Lagrangian potentials, which only have to be computed once, while the time evolution is governed by the growth factors, which are more straightforward to calculate.

For modeling the EoR, we find that 2LPT is more accurate than 1LPT at length scales ≲1 Mpc and redshifts z ≳ 5. Third-order LPT (e.g., Bouchet et al. 1995; Scoccimarro 1998) and augmented LPT (e.g., Kitaura & Hess 2013; Neyrinck 2016) have been shown to offer only small improvements at lower redshifts far after the end of the EoR. Hence, we will generally use second-order theory and will simply refer to it as LPT throughout the remainder of this paper.

3.2.1. Density Fields

Density fields are constructed from the LPT particles using techniques from particle-mesh (PM) methods. In N-body simulations, particles are added to a Cartesian mesh to create a density field for solving Poisson's equation with Fast Fourier Transforms (FFT). In large-scale structure analyses, galaxies or halos are added to a grid to create a density field for calculating the power spectrum. Similar techniques are also used in other semi-numerical models of reionization, but we will describe an improved implementation for AMBER.

There is a hierarchy of particle assignment schemes, and the first three are called the nearest grid point (NGP), cloud-in-cell (CIC), and triangular-shaped cells (TSC). The mass assignment process interpolates the discrete distribution of particles and can be interpreted as convolving and sampling the underlying density field ρ( x ) to produce a discretized density field,

$\begin{eqnarray}&&\rho ({{\boldsymbol{x}}}_{{\rm{n}}})=\int \rho ({\boldsymbol{x}}){W}_{\mathrm{pm}}({{\boldsymbol{x}}}_{{\rm{n}}}-{\boldsymbol{x}}){d}^{3}x,\end{eqnarray} \tag{ 20 }$

where W_pm( x ) is the PM assignment window function. See Hockney & Eastwood (1988) for a seminal review.

The particle assignments introduce effects such as aliasing, smoothing, and shot noise that have to be accounted for (e.g., Jing 2005). While the latter two effects can have analytical corrections, the first is especially problematic. Hockney & Eastwood (1988) have suggested that interlacing two or more staggered meshes can reduce aliasing. Interlacing has been used to improve force calculations in N-body simulations (Couchman 1991), produce more accurate overdensity fields for power spectra estimation (Sefusatti et al. 2016), and recently implemented in the nbodykit package (Hand et al. 2018).

We use the interlacing technique to produce more accurate and robust density fields in AMBER. We construct the first density field ρ₁( x ) as normal; while for the second density mesh ρ₂( x ), the particle positions are shifted by half of a grid cell spacing in each direction by adding the vector Δ x = (l_c/2, l_c/2, l_c/2), which is equivalent to shifting the mesh by the opposite vector. In Fourier space, the two transformed fields are combined into a single, effective field as

$\begin{eqnarray}&&\rho ({\boldsymbol{k}})=\displaystyle \frac{\left[{\rho }_{1}({\boldsymbol{k}})+{e}^{i{\boldsymbol{k}}\cdot {\boldsymbol{\Delta }}{\boldsymbol{x}}}{\rho }_{2}({\boldsymbol{k}})\right]/2}{{W}_{\mathrm{pm}}({\boldsymbol{k}})}.\end{eqnarray} \tag{ 21 }$

The twiddle factor e^{i

k
·Δ

x} multiplied to the second transformed field accounts for the shifted particle positions. We also deconvolve the effects of smoothing by dividing with the Fourier transform of the assignment window function,

$\begin{eqnarray}&&{W}_{\mathrm{pm}}({\boldsymbol{k}})={\left[\displaystyle \frac{\sin ({k}_{{\rm{x}}}{l}_{{\rm{c}}}/2)\sin ({k}_{{\rm{y}}}{l}_{{\rm{c}}}/2)\sin ({k}_{{\rm{z}}}{l}_{{\rm{c}}}/2)}{({k}_{{\rm{x}}}{l}_{{\rm{c}}}/2)({k}_{{\rm{y}}}{l}_{{\rm{c}}}/2)({k}_{{\rm{z}}}{l}_{{\rm{c}}}/2)}\right]}^{p},\end{eqnarray} \tag{ 22 }$

where p = 1, 2, and 3 for the NGP, CIC, and TSC schemes, respectively.

For comparison, we construct the AMBER and RadHydro density fields at two resolutions, using a 64³ mesh with cell size l_c = 0.8 h⁻¹Mpc and a 512³ mesh with cell size l_c = 0.1 h⁻¹Mpc. The fiducial lower resolution is more typical of semi-numerical models, while the higher resolution allows us to test the limits of LPT. The AMBER density fields are efficiently made using equal numbers of LPT particles as mesh cells. The RadHydro density fields are made by simply binning the 2048³ dark matter particles and the 2048³ gas cells down to the appropriate size meshes.

Figure 5 is a visualization of the matter density fields ρ( x ) at redshift z = 8 from RadHydro and AMBER. The LPT particles are assigned using the TSC scheme with interlacing and deconvolution. Both images are shown with the higher-resolution pixels in order to see the large-scale structure more clearly, but have been averaged along the line of sight at the lower resolution. The RadHydro matter and gas (not shown) density fields are visually indistinguishable, while the AMBER density field also has close resemblance, but minor smoothing is visible in the small-scale, high-density regions. At the lower resolution, the AMBER and RadHydro fields are indistinguishable, though the images appear quite pixelated.

Figure 6 compares the one-point PDF of the relative density ρ/〈ρ〉 = 1 + δ at redshift z = 8. The RadHydro matter and gas distributions are in very good agreement even at the higher resolution. For AMBER, the TSC assignment scheme with interlacing and deconvolution gives the best agreement, and we only show this case to avoid overcrowding the plot. The results are also in very good agreement at the fiducial resolution, but not at the higher resolution. LPT calculated at the second order or an even higher order cannot capture the nonlinear shell-crossing on smaller scales and therefore underresolves high-density, collapsed regions. The disagreement in underdense regions is not concerning, because it is due to the much smaller number of LPT particles and the different assignment and deconvolution process for AMBER. We would find the same effects for RadHydro if we use a lower-resolution simulation and had not done the simple binning.

Figure 7 shows the density auto-power spectra ${{\rm{\Delta }}}_{\mathrm{dd}}^{2}(k)$ at redshift z = 8 and cross correlations relative to the RadHydro matter overdensity δ_m. All of the dimensionless power spectra continue to increase in amplitude toward smaller scales, typical for cold dark-matter-dominated models. For the fiducial lower resolution, the power reaches a maximum value of only ${{\rm{\Delta }}}_{\mathrm{dd}}^{2}\approx 0.2$ . As this is still in the quasi-linear regime, 2LPT is an accurate and efficient approach for modeling the mass distribution and density field.

The RadHydro matter and gas perfectly trace each other with bias b_dd = 1 and cross correlation r_dd = 1 for k ≲ 10 h Mpc⁻¹. The suppression in the gas bias on smaller scales is due to the Jeans smoothing from the gas pressure that opposes gravitational collapse. For AMBER, the LPT bias starts out at unity on the largest scales, is still within a few percent for k ≳ 1 h Mpc⁻¹, but drops to ∼0.9 by k ∼ 10 h Mpc⁻¹. However, the normalized cross correlation is still closer to unity down to smaller scales, showing that the density perturbations are highly in-phased even though the amplitudes may differ. We find that using the TSC assignment scheme with interlacing and deconvolution gives more accurate auto-power and cross-power spectra. The NGP and CIC results tend to be overbiased and undercorrelated near the Nyquist frequency k_Ny = π/Δx because of aliasing. Our results are similar to the previous findings by Sefusatti et al. (2016), who present a thorough analysis of how interlacing can reduce aliasing and improve power spectrum estimation.

3.2.2. Velocity Fields

Velocity fields are constructed similarly to the density fields by assigning the LPT particles to interlaced meshes. For a given mesh cell, the velocity v is a weighted average of the individual velocities v _i from all overlapping particles,

$\begin{eqnarray}&&{\boldsymbol{v}}=\sum _{i}\displaystyle \frac{{w}_{i}}{{w}_{\mathrm{tot}}}{{\boldsymbol{v}}}_{i}=\displaystyle \frac{\sum {w}_{i}{{\boldsymbol{v}}}_{i}}{\sum {w}_{i}},\end{eqnarray} \tag{ 23 }$

where w_i and w_tot are the individual and total mass assignment weights, respectively. With the NGP assignment scheme, we find that some cells can have no overlapping particles and therefore undefined velocities. Both the CIC and TSC schemes work better in practice with no empty cells and missing velocities for our moderately clustered distribution of LPT particles.

Figure 8 is a visualization of the line-of-sight component of the velocity fields v( x ) at redshift z = 8 from RadHydro and AMBER. The velocity images are constructed using the same approach as for the density images (Figure 5). The shown images with the higher-resolution pixels are visually similar and have an even closer resemblance with lower-resolution pixels. The RadHydro and AMBER velocity fields show large coherence lengths that are actually underestimated because of the moderate simulation box length L = 50 h⁻¹Mpc. We will discuss this interesting feature in more detail below.

Figure 9 compares the one-point PDF of the velocity components v ⊂ (v_x, v_y, v_z), at redshift z = 8. The RadHydro matter and gas velocity distributions are in excellent agreement, even more so than their density distributions. All of the distributions are Gaussian, and the widths decrease slightly by a few percent for the fiducial lower resolution. For AMBER, the CIC and TSC assignment schemes with interlacing and deconvolution give similarly accurate results, and we only show the latter to avoid overcrowding the plot. The differences in the tails of the PDFs are due to the RadHydro simulations having slightly faster velocities because of virialization in rare, collapsed regions.

Figure 10 shows the velocity auto-power spectra Δ_vv(k) and the cross correlations relative to the RadHydro matter velocity v_m. We individually compare and average over each velocity component v ⊂ (v_x, v_y, v_z). The velocity power spectrum declines in amplitude toward smaller scales, unlike the density power spectrum (Figure 7). In linear theory, the transformed velocity field is related to the matter overdensity field as v ( k ) ∝ ( k /k²)δ( k ), and therefore the power is predominantly coming from larger scales. This explains the large coherence length seen in the velocity images (Figure 8). While radiation-hydrodynamic simulations are still computationally too expensive, semi-numerical methods are sufficiently efficient to afford large box sizes of ≳ 500 h⁻¹Mpc, which is necessary to capture the large-scale power and accurately model the velocity fields.

The RadHydro matter and gas velocities perfectly trace each other with bias b_vv = 1 and cross correlation r_vv = 1 for k ≲ 20 h Mpc⁻¹, even to smaller scales compared to their densities (Figure 7). For AMBER, the NGP assignment scheme gives incorrect bias and poor stochasticity when we simply set the velocities to zero where ever it is undefined. The CIC and TSC assignment schemes have similarly accurate bias, but the latter has preferably better cross correlation. For the velocity bias, the lower-resolution results diverge from the higher-resolution ones at larger scales compared to the density bias. The velocity field is a weighted average of the particle velocities (Equation (23)), and a simple deconvolution using the mass assignment window function (Equation (22)) is only approximately correct.

In AMBER, we adopt the TSC assignment scheme with interlacing and deconvolution as the default method for constructing density and velocity fields from the LPT particles. This approach reduces aliasing and smoothing, and produces the most accurate one-point distributions and two-point correlations. Accurate density and velocity fields are essential components for modeling EoR observables. The 21 cm signal and Lyα forest both depend on the neutral hydrogen density n_HI( x ) and line-of-sight velocity field v_los( x ). For the CMB, the patchy Thomson optical depth and patchy KSZ effect depend on the electron number density n_e( x ) and line-of-sight momentum n_e( x )v_los( x ), respectively.

3.3. Excursion Set Formalism

The ESF (Bond et al. 1991) is used to model the collapsed mass fraction and the halo mass function in the extended Press-Schecter theory (EPS). ESF has the advantage of being orders of magnitude faster than an N-body simulation and halo finding, but it is known to only produce approximately correct results. In the majority of semi-numerical methods for reionization, it is used to model both the collapsed mass and ionization fraction field (e.g., Furlanetto et al. 2004). For AMBER, we use ESF only for modeling the halo mass density field.

We will first summarize ESF and then test its accuracy for modeling the halo mass density field in AMBER. The linearly extrapolated overdensity field δ₀ is filtered to produce

$\begin{eqnarray}&&{\delta }_{{\rm{f}}}({\boldsymbol{x}})=\int {\delta }_{0}({{\boldsymbol{x}}}^{{\prime} }){W}_{{\rm{f}}}({\boldsymbol{x}}-{{\boldsymbol{x}}}^{{\prime} }){d}^{3}{x}^{{\prime} },\end{eqnarray} \tag{ 24 }$

$\begin{eqnarray}&&{\delta }_{{\rm{f}}}({\boldsymbol{k}})={\delta }_{0}({\boldsymbol{k}}){W}_{{\rm{f}}}({\boldsymbol{k}}),\end{eqnarray} \tag{ 25 }$

where the window function W_f is usually taken to be a sharp k-space filter,

$\begin{eqnarray}&&{W}_{\mathrm{sk}}(r)=\displaystyle \frac{1}{2{\pi }^{2}{r}^{2}}\left[\sin (r/{R}_{\mathrm{sk}})-(r/{R}_{\mathrm{sk}})\cos (r/{R}_{\mathrm{sk}})\right],\end{eqnarray} \tag{ 26 }$

$\begin{eqnarray}{W}_{\mathrm{sk}}(k)=\left\{\begin{array}{ll}1\quad & {{kR}}_{\mathrm{sk}}\leqslant 1\\ 0 & \mathrm{otherwise}\end{array}\right.,\end{eqnarray} \tag{ 27 }$

or a spherical tophat filter,

$\begin{eqnarray}{W}_{\mathrm{th}}(r)=\left\{\begin{array}{ll}3/(4\pi {R}_{\mathrm{th}}^{3})\quad & r\leqslant {R}_{\mathrm{th}}\\ 0 & \mathrm{otherwise}\end{array}\right.,\end{eqnarray} \tag{ 28 }$

$\begin{eqnarray}&&{W}_{\mathrm{th}}(k)=\displaystyle \frac{3}{{\left({{kR}}_{\mathrm{th}}\right)}^{3}}\left[\sin ({{kR}}_{\mathrm{th}})-({{kR}}_{\mathrm{th}})\cos ({{kR}}_{\mathrm{th}})\right].\end{eqnarray} \tag{ 29 }$

The convolution is rapidly computed in Fourier space after transforming with highly optimized FFTs.

In a large-scale region with filtered overdensity δ_f, the collapsed fraction of matter in the halos above a minimum mass M_min is given by the EPS result (Lacey & Cole 1993),

$\begin{eqnarray}&&{f}_{\mathrm{coll}}=\mathrm{erfc}\left[\displaystyle \frac{{\delta }_{{\rm{c}}}(z)-{\delta }_{{\rm{f}}}}{\sqrt{2[{\sigma }^{2}({M}_{\min })-{\sigma }^{2}({M}_{{\rm{s}}})]}}\right].\end{eqnarray} \tag{ 30 }$

The variance of the linear density fluctuations, smoothed on mass scale M_s, is calculated as

$\begin{eqnarray}&&{\sigma }^{2}({M}_{{\rm{s}}})=\int {{\rm{\Delta }}}_{\mathrm{lin}}^{2}(k)| {W}_{\mathrm{th}}(k){| }^{2}d\mathrm{ln}k,\end{eqnarray} \tag{ 31 }$

where ${{\rm{\Delta }}}_{\mathrm{lin}}^{2}(k)$ is the dimensionless linear power spectrum, and W_th(k) is the spherical tophat filter with comoving radius ${R}_{{\rm{s}}}={\left[{M}_{{\rm{s}}}/(4\pi {\bar{\rho }}_{0}/3)\right]}^{1/3}$ . The corresponding collapsed overdensity barrier is given by

$\begin{eqnarray}&&{\delta }_{{\rm{c}}}(z)=\displaystyle \frac{{\delta }_{\mathrm{crit}}}{D(z)},\end{eqnarray} \tag{ 32 }$

where δ_crit ≈ 1.686 is the spherical collapse value, and D(z) is the linear growth factor. Note that the redshift dependence only explicitly shows up in Equation (32), while the other functions and fields have already been linearly extrapolated to z = 0.

In the original version of ESF (e.g., Bond et al. 1991; Lacey & Cole 1993), the filtering is done in Lagrangian ( q ) space on the initial overdensity field. In an alternative version that is used in some semi-numerical models of reionization (e.g., Mesinger et al. 2011; Zahn et al. 2011), the filtering is done in Eulerian ( x ) space on the evolved density field. We will refer to these versions as ESF-L and ESF-E, respectively.

When the filter window function W_f is chosen to be a spherical tophat function, it is natural to set the filter radius R_th to be equal to the comoving radius R_s in the variance. For the sharp k-space function, it is unclear how to choose the filter radius R_sk. Following Lacey & Cole (1993), we will simply choose ${R}_{\mathrm{sk}}={\left[{M}_{{\rm{s}}}/(6\pi {\bar{\rho }}_{0})\right]}^{1/3}={\left[2/(9\pi )\right]}^{1/3}{R}_{{\rm{s}}}$ .

For ellipsoidal collapse, Sheth et al. (2001) propose a moving barrier δ_c(σ, z) with three coefficients that can be changed in order to match the halo mass functions from N-body simulations. Their moving barrier is used for determining when halos collapse following the peak-patch approach of Bond & Myers (1996). However, it is not self-consistent to simply use their δ_c(σ, z) in Equation (30) to calculate the collapsed fraction. Furthermore, the proposed functional form for δ_c(σ, z) may not be valid for all halo mass functions (e.g., Robertson et al. 2009). The Sheth & Tormen (1999) halo mass function corresponding to this moving barrier does not agree with the more recent halo mass functions for the EoR in SCORCH I (Trac et al. 2015). Therefore, we will use the spherical overdensity δ_c(z) rather than the ellipsoidal collapse barrier.

3.3.1. Halo Mass Density Fields

We construct halo mass density fields with ESF and compare them against results from a high-resolution N-body simulation. For reference, we use the halo catalog from SCORCH II (Section 2.1) and select all halos above a minimum mass ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ , which corresponds to the atomic-cooling limit for galaxy formation at redshift z ≈ 8. To assign the halo mass to interlaced meshes, we choose the CIC scheme as it gives both reduced aliasing and strong correlation with the matter density field. The NGP scheme has the most aliasing, while the TSC scheme has the most stochasticity as it spreads the small halos over too many larger grid cells.

For ESF-L, we calculate the collapsed fraction and mass in Lagrangian space,

$\begin{eqnarray}&&{m}_{\mathrm{coll}}({\boldsymbol{q}})={f}_{\mathrm{coll}}({\boldsymbol{q}}){m}_{{\rm{p}}},\end{eqnarray} \tag{ 33 }$

and for LPT particles of mass m_p, move the particles to their comoving positions x (Equation (18)), and then assign the collapsed mass to the interlaced meshes (Equation (21)) to obtain the halo mass density field ρ_h( x ). For ESF-E, we calculate the collapsed fraction field in Eulerian space, and then the halo mass density field is calculated as

$\begin{eqnarray}&&{\rho }_{{\rm{h}}}({\boldsymbol{x}})={f}_{\mathrm{coll}}({\boldsymbol{x}}){\bar{\rho }}_{{\rm{m}}}[1+D(z){\delta }_{{\rm{f}}}({\boldsymbol{x}})],\end{eqnarray} \tag{ 34 }$

where ${\bar{\rho }}_{{\rm{m}}}$ is the average matter density, and the growth factor D(z) reverses the linear extrapolation done earlier.

ESF requires that the mass smoothing scale M_s be greater than the minimum halo mass M_min in Equation (30). The smoothing scale also cannot be lower than the particle mass resolution. For the fiducial lower resolution with 64³ particles on a 64³ mesh in the 50 h⁻¹Mpc box, the particle mass is m_p = 4.0 × 10¹⁰ h⁻¹ M_⊙, and the cubical cell size is l_c = 0.8 h⁻¹Mpc. Therefore, we will choose and vary the smoothing scales such that M_s ≥ m_p and ${R}_{{\rm{s}}}={\left[{M}_{{\rm{s}}}/(4\pi {\bar{\rho }}_{0}/3)\right]}^{1/3}\gtrsim 0.5\ {h}^{-1}\mathrm{Mpc}$ .

Figure 11 is a visualization of the halo mass density fields ρ_h( x )/〈ρ_h〉 for ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ at z = 8. Here, we normalize each density field using the same mean 〈ρ_h〉 taken from the N-body result to preserve the relative differences in ρ_h( x ). We also use the fiducial smoothing scale M_s = m_p. The halo images appear more pixelated than the corresponding density and velocity images (Figures 5, 8) because of the lower resolution shown here. The ESF-L images with the sharp k-space and tophat filters are remarkably similar to the N-body result, but the ESF-E images are noticeably different. ESF-E produces relatively higher halo densities in prominent collapsed regions and lower values elsewhere, and this larger range of density contrast can be traced back to using the evolved density field rather than the initial density field for the filtering process.

Figure 12 shows the halo auto-power spectra ${{\rm{\Delta }}}_{\mathrm{hh}}^{2}(k)$ and cross correlations. The halo power spectrum is approximately power law with slope $n=d\mathrm{ln}P/d\mathrm{ln}k\approx -1.5$ at z = 8. In comparison with the matter power spectrum (Figure 7), we find the scale-dependent halo-matter bias b_hm(k) with a large-scale value of ≈3.7. Similarly, the halo-matter cross correlation r_hm(k) is close to unity on the largest scales, but drops below 0.9 for k ≳ 1 h Mpc⁻¹. Therefore, we cannot accurately model the halo mass density field assuming linear bias, even if we allow for scale dependence.

In cross correlation with the N-body halo mass density field, the ESF-L with the sharp k-space filter gives the best agreement, closely followed by with the tophat filter. Both the bias b_hh(k) and cross correlation r_hh(k) differ from unity by ≲0.05 down to the smallest scale shown. The strong agreement is remarkable given that we have adopted nominal choices for the filter radius R_sk and collapse barrier δ_c without any fine-tuning. ESF-E with either the sharp k-space filter or the tophat filter gives poorer agreement with the N-body results. The larger amplitude in the power spectrum and bias is consistent with the larger range in the density contrast seen in the halo density images (Figure 11).

We now vary the minimum halo mass M_min and smoothing mass M_s and quantify the mass-weighted average collapse fraction 〈f_coll〉 in Figure 13. As M_min is increased above the atomic-cooling limit while fixing M_s = 4 × 10¹⁰ h⁻¹ M_⊙ and R_s = 0.5 h⁻¹Mpc, we again find that ESF-L with the sharp k-space filter gives the best agreement, followed by with the tophat filter. However, ESF-E overpredicts the average collapse fraction by a factor of ≳3, with increasingly larger differences toward higher M_min. Note that we should expect small differences in the average collapse fraction. In the N-body halo finder (Trac et al. 2015), the halo mass M₂₀₀ is defined such that the average halo density ${\bar{\rho }}_{{\rm{h}}}$ is 200 times the average matter density ${\bar{\rho }}_{{\rm{m}}}(z)$ , which is also equal the critical density ${\bar{\rho }}_{\mathrm{crit}}(z)$ at high redshifts. In ESF, the halo mass is not clearly defined; although in spherical collapse theory, the average halo density is usually 18π² ≈ 180 times the critical density at high redshifts or in Einstein-de Sitter cosmologies.

As M_s and R_s are increased while fixing ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ , we find that ESF-L gives reliably accurate and stable values for the average collapsed fraction that are within 10% of the N-body halo value. However, ESF-E gives highly variable results that are only in agreement at large smoothing radius R_s ≳ 10 h⁻¹Mpc. Similarly, we also find that the power and bias change as the smoothing is varied, which is not found for ESF-L.

The sensitivity of ESF-E to the smoothing and filtering scales is especially troublesome for the semi-numerical models of reionization that are based on this approach. For a given ionized region, the filtering scale is adaptively varied until the number of ionizing photons equals the number of hydrogen atoms. For the entire modeled volume, there is a distribution of filtering scales rather than just one value, and as a consequence, there is no one value for the average collapse fraction and large-scale halo bias. Furthermore, as one changes the reionization parameters, the distribution of filtering scales will change, and so too will the halo mass density field and power spectrum, both of which should be independent of the reionization history.

In AMBER, we adopt the ESF-L with the sharp k-space filter as the default choice for modeling the halo mass density field. In the next section, we will discuss that our abundance matching technique does not depend on the overall average density 〈ρ_h〉 nor the overall normalization of the halo bias b_hm. In future work, we can calibrate the EPS collapsed fraction relation (Equation (30)) using the N-body simulations and halo catalogs from SCORCH I and II. Accurate halo abundance and density fields are required to properly model high-redshift galaxies and quasars as radiation sources.

3.4. Abundance Matching

The new idea for AMBER is that there is a spatial order to the reionization process, and abundance matching can be applied to a correlated field to accurately predict the reionization-redshift field (Section 2.3) such that the reionization history follows a given mass-weighted ionization fraction (Section 3.1). We emphasize that the abundance matching technique does not depend on the overall normalization of the correlated field of interest. In Trac et al. (2008), we find that the higher-density regions near sources are generally reionized earlier than the lower-density regions far from sources in our RadHydro simulations. Following up in Battaglia et al. (2013b), we find that the density and reionization-redshift fields are correlated down to Mpc scales. While two natural choices for abundance matching are the matter density field (Section 3.2.1) and the halo mass density field (Section 3.3.1), the radiation field is more directly relevant to the photoionization of hydrogen.

3.4.1. Radiation Field

The radiation field is specified by the hydrogen photoionization rate,

$\begin{eqnarray}&&{{\rm{\Gamma }}}_{\mathrm{HI}}({\boldsymbol{x}})=4\pi {\int }_{{\nu }_{\mathrm{HI}}}^{\infty }\displaystyle \frac{{J}_{\nu }(\nu ,{\boldsymbol{x}})}{h\nu }{\sigma }_{\mathrm{HI}}(\nu )d\nu ,\end{eqnarray} \tag{ 35 }$

where J_ν is the specific intensity, σ_HI is the hydrogen photoionization cross-section, and ν_HI is the Lyman limit frequency. In our RadHydro simulations, the RT is calculated using an accurate but expensive raytracing algorithm (Trac & Cen 2007). However, this is an unnecessarily costly approach for calculating the radiation field just for abundance matching, which does not depend on the overall normalization. In AMBER, we model the specific intensity field as

$\begin{eqnarray}&&{J}_{\nu }(\nu ,{\boldsymbol{x}})=\int \displaystyle \frac{{S}_{\nu }(\nu ,{{\boldsymbol{x}}}^{{\prime} })}{4\pi | {\boldsymbol{x}}-{{\boldsymbol{x}}}^{{\prime} }{| }^{2}}\exp \left(-\displaystyle \frac{| {\boldsymbol{x}}-{{\boldsymbol{x}}}^{{\prime} }| }{{l}_{\mathrm{mfp}}}\right){d}^{3}{x}^{{\prime} },\end{eqnarray} \tag{ 36 }$

where S_ν is the source function, the 1/(4π r²) term is the inverse-square law for flux, and the ${e}^{-r/{l}_{\mathrm{mfp}}}$ term is the transmitted fraction over the radiation mean free path l_mfp.

The source field is related to the specific luminosity density and can be expressed as

$\begin{eqnarray}&&{S}_{\nu }(\nu ,{\boldsymbol{x}})={f}_{\mathrm{esc}}{\epsilon }_{\nu }(\nu ){\rho }_{\mathrm{sfr}}({\boldsymbol{x}}),\end{eqnarray} \tag{ 37 }$

where ρ_sfr is the star formation rate density, _ν is the radiative energy per unit star formation rate per frequency, and f_esc is the radiation escape fraction (e.g., Madau et al. 1999; Robertson et al. 2010). The star formation rate density field is often modeled using the halo mass density field,

$\begin{eqnarray}&&{\rho }_{\mathrm{sfr}}({\boldsymbol{x}})=\displaystyle \frac{{f}_{\mathrm{star}}{\rho }_{\mathrm{halo}}({\boldsymbol{x}})}{{\tau }_{\mathrm{star}}},\end{eqnarray} \tag{ 38 }$

where f_star and τ_star are the star formation efficiency and timescale, respectively (e.g., Cen & Ostriker 1992; Trac & Cen 2007). The source field is proportional to the halo mass density field, while the normalization depends on the product of several astrophysical parameters and functions (f_esc, f_star, τ_star, _ν), which may have redshift dependence that further alters the reionization history. In semi-numerical methods based on the ESF, this combination is often represented by a single efficiency parameter ζ, which is generally assumed to be constant for simplicity. With our abundance matching technique, we do not need to specify the astrophysical quantities to calculate the normalization of the the source field, but instead get to cast them and their effects in terms of the the redshift midpoint, duration, and asymmetry parameters that directly set the reionization history (Section 3.1).

In AMBER, we adopt the common assumption S_ν( x ) ∝ ρ_halo( x ) and ignore the overall normalization for the source field as it is not necessary for abundance matching. The radiation field is computed quickly using FFTs to perform the convolution in Equation (36). In practice, we can abundance match the specific intensity field instead of the photoionization rate field, as the integration over frequency only changes the overall normalization for the latter.

We use an effective mean free path to account for the attenuation of the radiation field. Our use of a mean free path parameter is more physical than the maximum bubble size imposed in some semi-numerical models based on the ESF. It is more self-consistent with RT theory (e.g., Madau et al. 1999) and in better agreement with observational measurements (e.g., Worseck et al. 2014) extrapolated to higher redshifts for the EoR. It also helps to mitigate uncertainties on radiation sinks. Recently, Davies & Furlanetto (2021) have also introduced an ionizing photon mean free path as an improved alternative to the maximum filtering scale in ESF methods like 21cmFAST. Their implementation is different in that l_mfp ultimately shows up in their scale-dependent ionization criterion, whereas we use it to directly compute an attenuated radiation field.

3.4.2. Reionization-redshift Fields

The reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ is assumed to be correlated with a given field, either the matter density field, the halo mass density field, or the radiation field. A region with higher matter density, halo density, or radiation intensity is considered to be photoionized earlier and has a higher reionization redshift. The abundance matching technique assigns redshift values such that the reionization history follows a given mass-weighted ionization fraction ${\bar{x}}_{{\rm{i}}}(z)$ , specified with the redshift midpoint, duration, and asymmetry parameters and interpolated with a Weibull function (Section 3.1).

We perform the abundance matching on a correlated field at a single redshift for computational efficiency, but it can also be done tomographically using multiple redshift intervals. The correlated field is first constructed at the redshift midpoint z_mid, and the data array is ranked in descending order using a parallel quicksort. For the nth rank order cell out of a total of N, all cells with indices m ≤ n are considered ionized. The reionization redshift z_n is calculated by equating the cumulative mass fraction with the mass-weighted ionization fraction:

$\begin{eqnarray}&&\displaystyle \frac{\sum _{m=1}^{n}1+{\delta }_{m}({z}_{n})}{\sum _{m=1}^{N}1+{\delta }_{m}({z}_{n})}={\bar{x}}_{{\rm{i}}}({z}_{n}),\end{eqnarray} \tag{ 39 }$

where the matter overdensity for the mth cell is linearly extrapolated from the redshift midpoint,

$\begin{eqnarray}&&{\delta }_{m}({z}_{n})=\displaystyle \frac{D({z}_{n})}{D({z}_{\mathrm{mid}})}{\delta }_{m}({z}_{\mathrm{mid}}).\end{eqnarray} \tag{ 40 }$

The linear growth is appropriate for the modest overdensities in lower-resolution semi-numerical models, and because we have chosen the redshift midpoint as the pivot.

We construct the abundance matching models using the matter density field (AM-M), halo mass density field (AM-H), and radiation intensity fields (AM-R), and compare them with the RadHydro reionization-redshift fields (Section 2.3). For the AM-M models, we use the TSC particle assignment scheme with interlacing and deconvolution to construct the matter density field (Section 3.2.1). For the AM-H models, we use the ESF-L with sharp k − space filter for constructing the halo mass density field and set the minimum halo mass to ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ (Section 3.3.1). While this corresponds to the threshold for atomic-cooling halos, the RadHydro simulations are more complex because of the nonmonotonic star formation efficiency and episodic star formation (Trac et al. 2015; Doussot et al. 2019). For the AM-R models, we vary the radiation mean free path to find the best agreement between the reionization-redshift fields.

Figure 14 is a visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ from the RadHydro simulations and abundance matching models. While all of the models have the same reionization history for a given simulation, their reionization-redshift fields are visibly different. The AM-R images most closely resemble the simulation results and correctly show that large-scale regions near radiation sources are generally reionized earlier than large-scale regions far from sources. However, the AM-M and AM-H images appear too grainy. The low-density regions just outside of collapsed regions are assigned low redshifts and considered to be reionized late despite their close proximity to the radiation sources. We can possibly improve these results by smoothing the matter density and halo density fields prior to abundance matching.

Figure 15 shows the reionization-redshift auto-power spectra ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ and cross correlations relative to the RadHydro normalized redshift δ_z (Equation (4)). We only show Sims 0 and 2 for clarity. The AM-R power spectra are most similar to the simulation results, rising in power until a characteristic peak scale and then declining in amplitude with k. However, the AM-M and AM-H power spectra generally always increase with k, similar to the mass density (Figure 7) and halo density power spectra (Figure 12). The AM-R models are the least biased with b_zz ≈ 1 for k ≲ 1 h Mpc⁻¹, while the AM-M and AM-H biases significantly deviate from unity. Furthermore, the AM-R models have improved cross correlation r_zz that approach unity on large scales. The bias and stochasticity on small scales can be attributed to three main factors. In the RadHydro simulations, the episodic star formation and spatially varying mean free path are not accounted for. Furthermore, the simulations and models have differences in smoothing near the grid scale.

In AMBER, we choose to abundance match using the radiation field evaluated at the redshift midpoint. In future work, we can improve the spatial accuracy by incorporating varying mean free paths (e.g., Cain et al. 2021; Davies & Furlanetto 2021). We can also perform the abundance matching tomographically using multiple redshifts.

4. Methods Comparison

We compare AMBER against two other semi-numerical methods using RadHydro Sim 1 as reference. We continue to use the fiducial resolution with 64³ mesh with cell size l_c = 0.8 h⁻¹Mpc, which is comparable to the typical resolution for the semi-numerical methods. In this section, we first summarize the methods (Sections 4.1 and 4.2), and then compare their reionization histories (Section 4.3) and reionization-redshift fields (Section 4.4).

4.1. 21cmFAST

The 21cmFAST code (Mesinger et al. 2011), like the majority of semi-numerical methods, is based on the ESF for reionization (e.g., Furlanetto et al. 2004). It generates the density, ionization, velocity, and spin temperature fields to compute the 21 cm brightness temperature given the astrophysical parameters that control the ionizing, Lyα, and X-ray backgrounds. This method has been used widely to model and study the EoR through the 21 cm signal (e.g Greig & Mesinger 2017) and CMB (e.g., Mesinger et al. 2012).

We use the original version with three free parameters: ionizing efficiency ζ, minimum halo mass M_min, and maximum filter scale R_max, because this parameterization is common among ESF codes and has been extensively tested and applied. Park et al. (2019) have developed a new version with eight free parameters, allowing additional parameterization of high-redshift galaxy properties. Davies & Furlanetto (2021) have proposed improving R_max with two physically motivated prescriptions of the mean free path of ionizing photons. To our knowledge, the newer versions have not yet been tested against RT simulations, and searching the high-dimensional parameter space is beyond the scope of our basic comparison.

By default, 21cmFAST uses 64 times as many 1LPT particles as mesh cells for constructing density and velocity fields. For this comparison, 256³ particles are assigned to a 64³ mesh using the NGP scheme, but there is significant smoothing near the grid scale because deconvolution is not performed. To achieve the same effective resolution, 21cmFAST would require at least 8 times as many mesh cells, 512 times as many particles, and over 500 times as much memory as AMBER. It also uses the alternative ESF-E version for the collapsed fraction (Section 3.3.1), but it does not explicitly compute the halo mass density field.

4.2. Reionization on Large Scales

The Reionization on Large Scales (RLS; Battaglia et al. 2013b) method is not based on the ESF, but is a parametric approach motivated by and calibrated with previous RadHydro simulations. It provides a parametric approach to map higher-resolution, smaller-volume radiation-hydrodynamic simulations onto lower-resolution, larger-volume N-body simulations. This method has been applied to model and study the patchy Thomson optical depth (Natarajan et al. 2013), patchy KSZ effect (Battaglia et al. 2013a), and the 21 cm lightcone effect (La Plante et al. 2014).

The reionization-redshift field is found to be highly correlated with the matter density field down to Mpc scales and related using a first-order, scale-dependent bias. In Fourier space, the matter overdensity field δ( k ) at the mean redshift $\bar{z}$ is first smoothed with a spherical tophat filter W_th(k) with radius R_th = 1 h⁻¹Mpc and then multiplied by a bias function b_zm(k) to produce the normalized redshift field,

$\begin{eqnarray}&&{\delta }_{{\rm{z}}}({\boldsymbol{k}})={b}_{\mathrm{zm}}(k)\delta ({\boldsymbol{k}}){W}_{\mathrm{th}}(k).\end{eqnarray} \tag{ 41 }$

The bias function is parameterized as

$\begin{eqnarray}&&{b}_{\mathrm{zm}}(k)=\displaystyle \frac{{b}_{\mathrm{RLS}}}{{\left(1+k/{k}_{\mathrm{RLS}}\right)}^{{a}_{\mathrm{RLS}}}},\end{eqnarray} \tag{ 42 }$

where the normalization is fixed at b_RLS = 1/δ_c = 0.593 based on analytical models (Barkana & Loeb 2004), while the scale k_RLS and slope a_RLS are allowed to vary. Finally, the reionization-redshift field is given by

$\begin{eqnarray}&&{z}_{\mathrm{re}}({\boldsymbol{x}})=\bar{z}+(1+\bar{z}){\delta }_{{\rm{z}}}({\boldsymbol{x}}),\end{eqnarray} \tag{ 43 }$

where the mean redshift $\bar{z}$ is the volume-averaged value of the reionization-redshift field, and it is not exactly equal to the redshift midpoint z_mid of either the mass-weighted or volume-weighted ionization fractions. While the RLS parameters can be varied to change the reionization history and process, there is no direct connection between them and the redshift midpoint, duration, asymmetry, minimum halo mass, and radiation mean free path.

4.3. Reionization History

Figure 16 compares the mass-weighted ionization fractions ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ and volume-weighted ionization fractions ${\bar{x}}_{{\rm{i}},{\rm{V}}}(z)$ from the models with the simulation. For AMBER, we directly input the redshift midpoint z_mid = 7.85, duration Δ_z = 4.73, and asymmetry A_z = 2.35 parameter values. These are calculated for RadHydro Sim 1 at the fiducial resolution using Equations (10) and (11) on ${\bar{x}}_{{\rm{i}},{\rm{M}}}(z)$ . While AMBER uses the mass-weighted parameters by construction, it almost exactly reproduces both the mass-weighted ionization fraction with $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.01$ , and volume-weighted ionization fraction with $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.03$ .

There are two models for 21cmFAST. In the first model (21cmFAST-a), we vary the ionization efficiency and find the best-fit ζ = 7.5 for matching z_mid. The minimum halo mass is kept fixed at ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ to be consistent with the SCORCH galaxy models used in the RadHydro simulations. This model has a more extended reionization history, overestimates the duration by 35%, underestimates the asymmetry by 6%, and has $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.10$ and $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.17$ . In the second model (21cmFAST-b), we find best-fit ζ = 22.9 and ${M}_{\min }=1.8\times {10}^{9}\ {h}^{-1}{M}_{\odot }$ for matching both z_mid and Δ_z. This model underestimates the asymmetry by 14% and has $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.04$ and $| {\rm{\Delta }}{\bar{x}}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.06$ . Note that the best-fit M_min is now ≈20 times larger and inconsistent with that in the RadHydro simulations. This complicates the interpretation of the inferred minimum halo mass for galaxy formation when comparing the observations with the original version of 21cmFAST and other similar ESF models. The newer version (Park et al. 2019) with additional parameterization for high-redshift galaxy properties will likely provide better agreement, but it requires varying eight free parameters.

There are also two models for the RLS method. The first model (RLS-a) strictly follows Battaglia et al. (2013b). We fit the bias data from RadHydro Sim 1 and obtain k_RLS = 2.09 h Mpc⁻¹ and a_RLS = 1.71 when b_RLS is set to b_B13 = 0.593. In addition, we choose $\bar{z}=7.84$ to be the same value from the simulation. The fitted function underestimates the large-scale bias, and the best-fit k_RLS and a_RLS are significantly different from the fiducial values of k_B13 = 0.185 h Mpc⁻¹ and a_B13 = 0.564 in Battaglia et al. (2013b). This model increases z_mid by 0.19, underestimates Δ_z by 17%, underestimates A_z by 25%, and has $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.11$ and $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.09$ . In the second model (RLS-b), we fit for all three bias parameters and obtain b_RLS = 0.895, k_RLS =0.325 h Mpc⁻¹, and a_RLS = 0.816. While k_RLS and a_RLS are now more similar to k_B13 and a_B13, the normalization b_RLS is 51% larger than the previously adopted b_B13. We also choose $\bar{z}=7.61$ in order to match z_mid, but note the difference of 0.24 between them. This model underestimates Δ_z by 16%, underestimates A_z by 30%, and has $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{M}}}{| }_{\max }=0.08$ and $| {\rm{\Delta }}{x}_{{\rm{i}},{\rm{V}}}{| }_{\max }=0.07$ . Note that the RLS method tends to produce more symmetric reionization histories, and it is difficult to produce larger A_z values just by varying the free parameters.

4.4. Reionization-redshift Field

Figure 17 is a visualization of the reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ , and Figure 18 shows the reionization-redshift auto-power spectra ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ and cross correlations relative to RadHydro Sim 1. For AMBER, we vary the radiation mean free path and obtain l_mfp = 3.2 h⁻¹Mpc when matching the auto-power spectrum on the largest scales. This produces a reionization-redshift field that is visually and statistically in the strongest agreement out of all the models. The redshift bias b_zz(k) and cross correlation r_zz(k) show that AMBER is both the least biased and most correlated with the RadHydro simulation.

**Figure 18.** Top: auto-power spectra of the reionization-redshift fields from RadHydro Sim 1, AMBER, 21cmFAST, and RLS. All of the spectra peak at a characteristic scale that depends on the redshift midpoint z_mid. Center: the redshift bias b_zz between the models and simulation shows that AMBER and RLS-b are the least biased. Bottom: the cross correlations r_zz show that AMBER is the most correlated, followed by RLS, while 21cmFAST is the most stochastic.
Download figure:
Standard image High-resolution image

For 21cmFAST, we can also vary the ESF maximum filter scale R_max. It is recommended that ${R}_{\max }\geqslant 20\ {h}^{-1}\mathrm{Mpc}$ , but we are limited to ${R}_{\max }\leqslant 25\ {h}^{-1}\mathrm{Mpc}$ . The radius of the smoothing sphere cannot be larger than half of the periodic box length; otherwise it would wrap around itself. We find no improvement over this narrow range of filtering scales. 21cmFAST-b does better than 21cmFAST-a, because we are able to match both z_mid and Δ_z. Even then it differs more from RadHydro than either AMBER or RLS. The reionization-redshift image shows noticeable differences near biased sources, and this coincides with the redshift bias b_zz(k) being larger than unity by >0.2 on the largest scale. The redshift cross correlation r_zz(k) drops to 0.5 by k ∼ 1 h Mpc⁻¹ and to zero at the grid Nyquist frequency. The stochasticity may be due to using the NGP scheme for the matter density field, the ESF-E version for the collapsed fraction, and the semi-numerical approach for the ionization fraction field. Again, the newer version of 21cmFAST will likely provide better agreement, but it requires searching an eight-dimensional parameter space.

The RLS method is a parametric approach to constructing the reionization-redshift field, so it is not surprising to see that it is visually and statistically in good agreement. On large scales, RSL-b does better than RLS-a because the bias normalization is allowed to vary beyond the fixed value of b_B13 = 0.593. On small scales, both models have significant smoothing because of the spherical tophat filter applied to the overdensity field. In Battaglia et al. (2013b), this implementation choice was made because the linear, deterministic biasing relation, δ_z( k ) ∝ δ( k ), breaks down on scales k ≳ 1 h Mpc⁻¹. RLS has more stochasticity than AMBER, but less than 21cmFAST. The gradual deviation of the redshift cross correlation r_zz(k) from unity reflects the decreasing correlation between the reionization-redshift and matter density fields toward smaller scales.

More studies and comparisons are required to understand the advantages, disadvantages, and appropriate utilization of the various semi-numerical methods. We emphasize that a variety of independent approaches are crucial for theoretical and inference studies of the EoR.

5. Results

We now present some basic results from AMBER and leave further applications for upcoming work. For example, Chen et al. (in prep) will model and study the patchy KSZ effect, which is a promising probe of the EoR. In this section, we study the parameter dependence of the reionization-redshift field (Section 5.1) by varying the redshift midpoint, duration, asymmetry, minimum halo mass, and radiation mean free path.

We run AMBER models, each with 2048³ particles and 2048³ cells in a comoving box of side length L = 2 h⁻¹Gpc. For the fiducial model, we use z_mid = 8.0, Δ_z = 4.0, A_z = 3.0, ${M}_{\min }={10}^{8}\ {h}^{-1}{M}_{\odot }$ , and l_mfp = 3.0 h⁻¹Mpc. In addition, we include a lower and higher value when we vary each parameter one at a time. Each simulation takes approximately 20 CPU hours when run in parallel with 32 to 64 cores. We discuss the code scaling performance in Appendix B.

5.1. Reionization-redshift Fields

Figure 19 is a visualization of the reionization-redshift fields. Each image is 100 × 100 pixels from a subsection that is 100 × 100 h⁻¹Mpc with a thickness of ∼1 h⁻¹Mpc. We use a divergent color scheme centered at the fiducial redshift midpoint z_mid = 8.0 to distinguish between regions that are ionized earlier and later. In general, the higher-density regions near radiation sources are reionized earlier than the lower-density regions farther away on large scales. As a result, a larger fraction of the volume is seen with ${z}_{\mathrm{re}}\lt {z}_{\mathrm{mid}}$ , and the redshift midpoint of the volume-weighted ionization fraction comes later.

Figure 20 shows the one-point PDF of the normalized redshift δ_z (Equation (4)), reionization-redshift auto-power spectra, and cross correlations relative to the matter density field at the redshift midpoint. In general, the PDF (δ_z) are skewed Gaussians except for the symmetric case with A_z = 1.0. The reionization-redshift power spectrum Δ_zz(k) first rises in power until a characteristic peak scale and then declines in amplitude with k. The redshift-matter bias b_zm(k) asymptotically approaches a constant value toward large scales and declines monotonically with k.

5.1.1. Redshift Midpoint

As the redshift midpoint decreases over the range 7.0 ≤ z_mid ≤ 9.0, we see, overall, later reionization in Figure 19. The one-point PDF(δ_z) becomes wider, the two-point power spectrum ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ increases overall in amplitude, but the redshift-matter bias b_zm(k) remains almost constant in Figure 20. In Equation (4) for the normalized redshift δ_z, the numerator ( ${z}_{\mathrm{re}}-{\bar{z}}_{\mathrm{re}}$ ) remains approximately constant, but the denominator ( $1+{\bar{z}}_{\mathrm{re}}$ ) decreases. Therefore, δ_z grows by a factor of ${\bar{a}}_{\mathrm{re}}=1/(1+{\bar{z}}_{\mathrm{re}})$ , whereas the matter overdensity δ grows by a factor D(z_mid) ≈ 1/(1 + z_mid). We find that the normalized redshift grows similarly to the matter overdensity as the bias is only ≈5% lower for z_mid = 7.0 than for z_mid = 9.0. Note that, while the radiation field also grows like the matter density, the abundance matching is insensitive to the overall amplitude.

5.1.2. Duration

As the duration increases over the range 2.0 ≤ Δ_z ≤ 6.0, we see a larger range of values in the reionization-redshift images. The PDF(δ_z) has larger variance, and the power spectrum increases overall in amplitude. Note that the variance is given by ${\sigma }_{{\rm{z}}}^{2}=\langle | {\delta }_{{\rm{z}}}({\boldsymbol{x}}){| }^{2}\rangle =\int {{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)d\mathrm{ln}k$ . Therefore, we expect the large-scale bias to scale with σ_z, which in turn scales with Δ_z. We find that the large-scale bias changes by a factor ≈ 3.1 when the duration increases from Δ_z = 2.0 to Δ_z = 6.0.

The large-scale bias can differ significantly from the single value of b_B13 = 1/δ_c = 0.593 adopted in the RLS method (Battaglia et al. 2013b). Barkana & Loeb (2004) derived this value assuming that fluctuations in the reionization-redshift field are solely determined by fluctuations in the collapsed mass density field from EPS. Relating the growth of ionized regions to the growth of dark matter halos would correspond to a particular duration value. However, the reionization history can depend on other nonconstant factors, and the large-scale bias can have a range of values in general.

5.1.3. Asymmetry

As the asymmetry increases over the range 1.0 ≤ A_z ≤ 5.0, the images show a smaller range of redshift values for the regions reionized after the midpoint and a larger range of redshift values for the regions reionized earlier than the midpoint. The PDF(δ_z) becomes skewed, and the power spectrum changes shape. There is less power on large scales and more power on small scales as the characteristic peak shifts to smaller scales. For a larger A_z at fixed duration, the first portion (z_mid < z < z_ear) of the reionization history becomes longer, while the second portion (z_lat < z < z_mid) becomes shorter. We have just seen that increasing (decreasing) the duration increases (decreases) the overall power. We find that lengthening the first half of the portion slightly increases the power on small scales while shortening the second portion more significantly decreases the power on large scales. Correspondingly, the large-scale bias decreases with more asymmetry in the reionization history.

5.1.4. Minimum Halo Mass

Over the minimum halo mass range ${10}^{7}\leqslant {M}_{\min }/[{h}^{-1}{M}_{\odot }]\,\leqslant {10}^{10}$ , there are only minor changes to the reionization-redshift field. The PDF(δ_z) only has minor changes since the redshift midpoint, duration, and asymmetry are kept fixed. There is minor suppression in ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ and b_zm(k) on small scales. For larger M_min, the collapsed fraction decreases and the halo bias increases. Correspondingly, this decreases the amplitude and increases the bias of the radiation intensity field (Equation (36)). However, the abundance matching technique is insensitive to the overall normalization of J_ν and to the overall normalization of the fluctuating component δ_J. The abundance matching only depends on the relative ranking of the field values.

On one hand, varying the minimum halo mass and halo density field will generally change the reionization history and process if astrophysical parameters such as the star formation efficiency, photon production efficiency, and radiation escape fraction are kept fixed. On the other hand, varying these astrophysical terms, which can be functions of mass and redshift, can offset the changes from varying M_min and produce the same reionization history. With AMBER, we find that when the redshift midpoint, duration, asymmetry, and radiation mean free path are kept fixed, then the reionization-redshift field has only minor dependence on the minimum halo mass. For certain applications, it may then be possible to ignore M_min and reduce the number of free parameters.

5.1.5. Radiation Mean Free Path

As the radiation mean free path increases over the range 1.0 ≤ l_mfp/[h⁻¹Mpc] ≤ 5.0, we see more spatial coherence in the reionization-redshift images. For a larger l_mfp, there is a larger coherence length and less small-scale structure in the radiation intensity field used for abundance matching. The PDF(δ_z) only has minor changes since the redshift midpoint, duration, and asymmetry are kept fixed. The reionization-redshift power spectrum changes shape though, having more power on large scales and less on small scales. Increasing l_mfp produces relatively larger ionized regions, resulting in the characteristic peak shifting to larger scales.

In the RLS method, Battaglia et al. (2013b) change the duration by varying a_RLS and k_RLS. But without varying b_RLS, they also change the correlation length of ionized regions at the same time. More specifically, they obtain smaller correlation lengths for longer durations. With AMBER, we find longer Δ_z increases the bias, while shorter l_mfp decreases the bias. This explains the inverse relationship between the duration and correlation length when b_RLS is fixed in the RLS method. AMBER has the flexibility to change Δ_z and l_mfp independently.

6. Conclusion

AMBER is a semi-numerical code for modeling the cosmic dawn and EoR. The new algorithm is not based on the ESF for directly predicting the ionization fraction field, but takes the novel approach of calculating the reionization-redshift field ${z}_{\mathrm{re}}({\boldsymbol{x}})$ assuming that the hydrogen gas encountering higher radiation intensity is photoionized earlier. Redshift values are assigned while matching the abundance of ionized mass according to a given mass-weighted ionization fraction x_i(z). The code has the unique advantage of allowing users to directly specify the reionization history through the redshift midpoint z_mid, duration Δ_z, and asymmetry A_z input parameters (Trac 2018). The reionization process is further controlled through the minimum halo mass M_min for galaxy formation and the radiation mean free path l_mfp for RT.

We implement improved methods for modeling the large-scale structure and constructing density, velocity, halo, and radiation fields, which are essential components for modeling reionization observables. 2LPT (versus 1LPT) is used to efficiently evolve the matter distribution in the moderately nonlinear regime. The TSC (versus NGP or CIC) PM assignment scheme with interlacing and deconvolution produces density and velocity fields that are in excellent agreement with RadHydro simulations (Doussot et al. 2019). ESF-L (versus ESF-E) accurately and robustly produces halo mass density fields compared to N-body simulations (Trac et al. 2015). The radiation intensity field is computed quickly using FFTs to convolve the source field with the RT kernel. The interlacing and deconvolution reduce aliasing and smoothing, and thereby improve power spectrum estimation (e.g., Sefusatti et al. 2016). These methods can also be used to improve other semi-numerical and RT algorithms. The AMBER program modules can be adapted and plugged into other codes.

We compare AMBER with two other semi-numerical methods, 21cmFAST (Mesinger et al. 2011) and the RLS method (Battaglia et al. 2013b), and find that our code more accurately reproduces the results from RadHydro simulations. AMBER has the flexibility to directly match the simulated reionization history x_i(z), while the other two methods require running multiple models before finding the best fit. It also produces reionization-redshift fields ${z}_{\mathrm{re}}({\boldsymbol{x}})$ that are the least biased and most correlated with the simulations. The RLS method has more smoothing, while 21cmFAST has more stochasticity. More studies and comparisons are required to understand the advantages, disadvantages, and appropriate utilization of the various semi-numerical methods. We reiterate that a variety of independent approaches are crucial for theoretical and inference studies of the EoR.

We study the parameter dependence of the reionization-redshift field by varying the redshift midpoint, duration, asymmetry, minimum halo mass, and radiation mean free path. As z_mid decreases, there is overall later reionization, the PDF(δ_z) becomes wider, the power spectrum ${{\rm{\Delta }}}_{\mathrm{zz}}^{2}(k)$ increases overall in amplitude, but the redshift-matter bias b_zm(k) remains approximately constant. As Δ_z increases, there is more extended reionization, the redshift distribution becomes wider, and both the power spectrum and bias increase overall in amplitude. As A_z increases, the reionization history becomes more asymmetric, the redshift distribution is more skewed, and the power spectrum changes shape. Interestingly, the reionization-redshift field is only weakly sensitive to M_min when the other parameters are fixed. As l_mfp increases, there is more spatial coherence and less small-scale structure in the radiation intensity and reionization-redshift fields, and the power spectrum changes shape due to the relatively larger ionized regions.

AMBER is initially designed for theoretical and inference studies in which the primary interest is controlling or constraining the reionization history and process. With the reionization-redshift field, the evolution of the neutral hydrogen and ionized electron densities can be readily computed. As a first application, Chen et al. (in prep) will model and study the imprint of reionization on the CMB through the patchy KSZ effect, which is a promising probe of the EoR. Complementary, the connection between radiation sources and sinks, and the reionization history can be understood using analytical models for the evolution of the mass-weighted ionization fraction (e.g., Chen et al. 2020) and radiation-hydrodynamic simulations that solve the complex physics.

In future work, we will include additional physics that will enable more realistic predictions of EoR observables. To improve the source models, we will calibrate the ESF halo-collapsed fraction against N-body simulations, and include high-redshift galaxies through abundance matching (e.g., Trac et al. 2015). To improve the radiation and reionization-redshift fields, we will incorporate spatially varying radiation mean free paths (e.g., Cain et al. 2021; Davies & Furlanetto 2021), and perform the ionization abundance matching tomographically using multiple redshifts. We will compute the Lyα and X-ray radiation fields for the 21 cm signal (e.g., Mesinger et al. 2011; Semelin et al. 2017) and calculate the thermal history for the Lyα forest (e.g., Upton Sanderbeck et al. 2016; D'Aloisio et al. 2019).

AMBER is currently parallelized to run on multicore, shared-memory nodes. It is over four orders of magnitude faster than radiation-hydrodynamic simulations and will efficiently enable large-volume models, full-sky mock observations, and parameter-space studies. The code will be made publicly available to facilitate and transform studies of the EoR.

We thank Nick Gnedin for initial discussions that motivated the development of AMBER. We thank Anson D'Aloisio, Matt McQuinn, and the anonymous referee for providing comments on the manuscript. We also thank Nick Battaglia, Sasha Kaurov, Emiliano Sefusatti, and Qirong Zhu for helpful discussions. The computational work was performed on the Bridges-2 system at the Pittsburgh Supercomputing Center (PSC) using allocations ast190014p and phy210042p. We thank PSC staff Ken Hackworth and T.J. Olesky for computing support. H.T. acknowledges support from the NSF AI Institute: Planning: Physics of the Future, NSF PHY2020295. R.C. acknowledges support from NSF AST2007390.

Appendix A: Weibull Function

The reionization history ${\bar{x}}_{{\rm{i}}}(z)$ can be more accurately fit with a flexible Weibull function (Equation (14)) than a symmetric tanh function (e.g., Lewis 2008). The Weibull coefficients can be determined by first solving a nonlinear equation for c_w and then substituting its value into algebraic equations for a_w and b_w. There are several variations of the root-finding equation, for example:

$\begin{eqnarray}&&f({c}_{{\rm{w}}})={\left(\displaystyle \frac{\mathrm{ln}{\bar{x}}_{\mathrm{ear}}}{\mathrm{ln}{\bar{x}}_{\mathrm{mid}}}\right)}^{1/{c}_{{\rm{w}}}}-\left(\displaystyle \frac{{z}_{\mathrm{ear}}-a}{{z}_{\mathrm{mid}}-a}\right)=0,\end{eqnarray} \tag{ A1 }$

where

$\begin{eqnarray}&&a=\displaystyle \frac{{z}_{\mathrm{lat}}| \mathrm{ln}{x}_{\mathrm{mid}}{| }^{1/{c}_{{\rm{w}}}}-{z}_{\mathrm{mid}}| \mathrm{ln}{x}_{\mathrm{lat}}{| }^{1/{c}_{{\rm{w}}}}}{| \mathrm{ln}{x}_{\mathrm{mid}}{| }^{1/{c}_{{\rm{w}}}}-| \mathrm{ln}{x}_{\mathrm{lat}}{| }^{1/{c}_{{\rm{w}}}}},\end{eqnarray} \tag{ A2 }$

which can be solved iteratively using Newton's method with finite differences instead of analytical derivatives (i.e., secant method). The other two coefficients have algebraic solutions, for example:

$\begin{eqnarray}&&{a}_{{\rm{w}}}=\displaystyle \frac{{z}_{\mathrm{ear}}| \mathrm{ln}{x}_{\mathrm{lat}}{| }^{1/{c}_{{\rm{w}}}}-{z}_{\mathrm{lat}}| \mathrm{ln}{x}_{\mathrm{ear}}{| }^{1/{c}_{{\rm{w}}}}}{| \mathrm{ln}{x}_{\mathrm{lat}}{| }^{1/{c}_{{\rm{w}}}}-| \mathrm{ln}{x}_{\mathrm{ear}}{| }^{1/{c}_{{\rm{w}}}}},\end{eqnarray} \tag{ A3 }$

$\begin{eqnarray}&&{b}_{{\rm{w}}}=\displaystyle \frac{{z}_{\mathrm{mid}}-{a}_{{\rm{w}}}}{| \mathrm{ln}{x}_{\mathrm{mid}}{| }^{1/{c}_{{\rm{w}}}}}.\end{eqnarray} \tag{ A4 }$

Note that the equations above can also be written in terms of the redshift midpoint, duration, and asymmetry parameters by replacing z_ear and z_lat with Δ_z and A_z using Equations (12) and (13).

Appendix B: Scaling Performance

AMBER is written in modern Fortran and parallelized using OpenMP to run on multicore, shared-memory nodes. We perform the following scaling tests on Bridges-2 at the Pittsburgh Supercomputing Center (PSC). An Extreme Memory (EM) node has 4 Intel Xeon Platinum 8260M "Cascade Lake" CPUs, 24 cores per CPU, 96 cores per node, and a total of 4 TB of memory.

Figure 21 shows the results of the scaling tests. For the strong scaling test, AMBER is run with N_part = 2048³ particles and equal number of mesh cells in a comoving box of side length L = 2 h⁻¹Gpc. We vary the number of cores N_core from 1 to 64 and measure the wall time $t({N}_{\mathrm{core}})$ . The speedup is defined as

$\begin{eqnarray}&&S=\displaystyle \frac{t(1)}{t({N}_{\mathrm{core}})},\end{eqnarray} \tag{ B1 }$

and we obtain S = 5.4 (21.8) for ${N}_{\mathrm{core}}=8\ (64)$ . The deviation from the ideal speedup is mostly due to our parallel quicksort, which is not fully optimized yet. The serial sorting takes about 20% of the total wall time, while the parallel sorting takes 29% (47%) for ${N}_{\mathrm{core}}=8\ (64)$ . Nonetheless, a typical AMBER model can be quickly run in under an hour of wall time with 32 to 64 cores.

**Figure 21.** Top: strong scaling test with a fixed number of N_part = 2048³ particles for each AMBER run. The measured wall time t is compared against the ideal scaling. Bottom: weak scaling test with a particle/core ratio ${N}_{\mathrm{part}}/{N}_{\mathrm{core}}\approx {512}^{3}$ .
Download figure:
Standard image High-resolution image

For the weak scaling test, we use a particle/core ratio of ${N}_{\mathrm{part}}/{N}_{\mathrm{core}}\approx {512}^{3}$ . We vary N_core from 1 to 64 and change N_part accordingly. The efficiency is defined as

$\begin{eqnarray}&&E=\displaystyle \frac{t(1)}{t({N}_{\mathrm{core}})},\end{eqnarray} \tag{ B2 }$

and we obtain E = 61% (30%) for ${N}_{\mathrm{core}}=8\ (64)$ . Again, the decrease in efficiency is mostly due to our nonoptimal quicksort. The serial sorting takes about 14% of the total wall time, while the parallel sorting takes 27% (47%) for ${N}_{\mathrm{core}}=8\ (64)$ . We will improve the parallelization of the algorithm in future work.

AMBER: A Semi-numerical Abundance Matching Box for the Epoch of Reionization

Article metrics

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. SCORCH

2.1. N-body Simulations

2.2. Radiation-hydrodynamic Simulations

2.3. Reionization-redshift Fields

2.3.1. Cross Correlations

3. AMBER

3.1. Reionization History

3.1.1. Midpoint, Duration, and Asymmetry

3.1.2. Analytical Interpolating Function

3.2. Lagrangian Perturbation Theory

3.2.1. Density Fields

3.2.2. Velocity Fields

3.3. Excursion Set Formalism

3.3.1. Halo Mass Density Fields

3.4. Abundance Matching

3.4.1. Radiation Field

3.4.2. Reionization-redshift Fields

4. Methods Comparison

4.1. 21cmFAST

4.2. Reionization on Large Scales

4.3. Reionization History

4.4. Reionization-redshift Field

5. Results

5.1. Reionization-redshift Fields

5.1.1. Redshift Midpoint

5.1.2. Duration

5.1.3. Asymmetry

5.1.4. Minimum Halo Mass

5.1.5. Radiation Mean Free Path

6. Conclusion

Appendix A: Weibull Function

Appendix B: Scaling Performance

AMBER: A Semi-numerical Abundance Matching Box for the Epoch of Reionization

Article metrics

Share this article

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. SCORCH

2.1. N-body Simulations

2.2. Radiation-hydrodynamic Simulations

2.3. Reionization-redshift Fields

2.3.1. Cross Correlations

3. AMBER

3.1. Reionization History

3.1.1. Midpoint, Duration, and Asymmetry

3.1.2. Analytical Interpolating Function

3.2. Lagrangian Perturbation Theory

3.2.1. Density Fields

3.2.2. Velocity Fields

3.3. Excursion Set Formalism

3.3.1. Halo Mass Density Fields

3.4. Abundance Matching

3.4.1. Radiation Field

3.4.2. Reionization-redshift Fields

4. Methods Comparison

4.1. 21cmFAST

4.2. Reionization on Large Scales

4.3. Reionization History

4.4. Reionization-redshift Field

5. Results

5.1. Reionization-redshift Fields

5.1.1. Redshift Midpoint

5.1.2. Duration

5.1.3. Asymmetry

5.1.4. Minimum Halo Mass

5.1.5. Radiation Mean Free Path

6. Conclusion

Appendix A: Weibull Function

Appendix B: Scaling Performance