1. Introduction
Louisiana is one of the most flood-prone states in the United States, and tropical cyclones (TCs) have been major drivers of its floods. Over the past 40 years, dozens of TCs affected this area, including Katrina (2005) and Harvey (2017), the two costliest TCs in the United States ($180 and $143.8 billion, respectively) [1]. In 2020, eastern Louisiana experienced a series of landfalls by three TCs (i.e., Beta, Delta, and Zeta) in a two-month window, causing a total of 11 fatalities and around $7.5 billion in damage in the United States [2,3,4]. While TCs are often the major culprits for flooding, other mechanisms can also be responsible for this natural hazard, as exemplified by the catastrophic flood that occurred in 2016 around Baton Rouge and Lafayette in South Louisiana; this event caused severe economic damage and fatalities because of extreme rainfall produced by a stationary low-pressure system near Florida and Alabama [5]. Precipitation is one of the main drivers that directly affect the occurrence and magnitude of these natural disasters. Therefore, high-quality historical rainfall data are essential for historical analyses and for the calibration of hydrologic and real-time prediction models used to manage the risk associated with rainfall-induced disasters.
There have been many attempts to develop gridded precipitation datasets covering the contiguous United States (CONUS) with consistent spatial and temporal resolutions; these efforts generally involve the integration of multiple data sources, including rain gauge, radar, satellite, and reanalysis products. The Analysis of Period of Record for Calibration (AORC) dataset is a new product released in October 2021 and represents a source of meteorological variables for calibrating the National Water Model (NWM) [6]. The AORC precipitation is constructed using over a dozen individual rainfall datasets covering different time periods and areas with various spatial and temporal resolutions. By utilizing these datasets to complement each other (e.g., bias correction and downscaling over full coverage of the AORC domain), the AORC precipitation is available on a ~4 km grid at an hourly time scale covering the period from 1979 to the near present.
A limited number of studies have utilized the AORC dataset in the field of hydrology [7,8]. Lahmers et al. (2021) [7] used AORC atmospheric datasets, including precipitation, temperature, humidity, wind speed, surface pressure, and incoming short- and long-wave radiation, to calibrate the NOAA NWM (version 2.1) to 56 independent basins across the western CONUS. Hong et al. (2022) [8] evaluated four gridded precipitation datasets including the AORC for the North American Great Lakes basin by comparing the products with observations for the past decade. They illustrated that the AORC precipitation performs well over daily and monthly temporal scales, and that it captures the seasonal variations of the ratio between overland and overlake precipitation. Their results suggest that the AORC could be an appropriate forcing input to hydrological models for the entire Great Lakes watershed.
Despite the high temporal and spatial resolutions, the long record, and the good accuracy of the AORC precipitation, there is little research in terms of its evaluation, especially in an area that experiences heavy precipitation from TCs and non-TC storms. Here we evaluate AORC precipitation and compare it to other widely used gridded rainfall products for the state of Louisiana. To evaluate the rainfall products according to different weather conditions causing severe flooding, we stratify the analyses depending on whether precipitation is associated with a TC or not.
2. Data and Methodology
We focus on a domain centered on Louisiana (Figure 1) and a study period from 1981 to 2020, for which all rainfall products considered here are available. To identify the TCs that affected the study area, we use the HURDAT2 database by the National Hurricane Center [9]. It contains the location, maximum sustained surface wind, central pressure, and wind radii of each recorded storm every six hours during its lifetime. A total of 57 TCs passed through the study area from 1981 to 2020 (Figure 1). We then divide the whole 40-year period into TC and non-TC periods to stratify precipitation accordingly. For this purpose, we linearly interpolate the 6-hourly locations of the storm center to the hourly scale and define the period during which a storm center is located within the study area as a TC period.
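The interpolation and TC-period flagging described above can be sketched as follows. This is only an illustrative sketch: the rectangular bounding box, the function names, and the example fixes are hypothetical, and the actual study domain is the one shown in Figure 1.

```python
from datetime import datetime, timedelta

def hourly_track(fixes):
    """Linearly interpolate 6-hourly (time, lat, lon) fixes to hourly positions."""
    track = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        n_hours = int((t1 - t0).total_seconds() // 3600)
        for k in range(n_hours):
            w = k / n_hours  # fractional position between the two fixes
            track.append((t0 + timedelta(hours=k),
                          lat0 + w * (lat1 - lat0),
                          lon0 + w * (lon1 - lon0)))
    track.append(fixes[-1])  # keep the final fix itself
    return track

def tc_hours(fixes, box):
    """Return the hours at which the interpolated storm center lies inside box.

    box = (lat_min, lat_max, lon_min, lon_max); a rectangle is an assumption
    made here for simplicity, not the domain definition used in the study.
    """
    lat_min, lat_max, lon_min, lon_max = box
    return [t for t, lat, lon in hourly_track(fixes)
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max]

# Hypothetical storm moving due north along 93W (not a real HURDAT2 record).
example_fixes = [
    (datetime(2020, 10, 9, 12), 27.0, -93.0),
    (datetime(2020, 10, 9, 18), 30.0, -93.0),
    (datetime(2020, 10, 10, 0), 33.0, -93.0),
]
example_box = (28.5, 33.5, -95.0, -88.5)  # hypothetical domain limits
example_tc_hours = tc_hours(example_fixes, example_box)
```

Every hour returned by `tc_hours` would be labeled as a TC period, and all remaining hours of the record as non-TC.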
In terms of rainfall datasets, we consider four gridded rainfall products: (1) AORC, (2) North American Land Data Assimilation System-Phase 2 (NLDAS-2), (3) Parameter-Elevation Regression on Independent Slopes Model (PRISM) downscaled using NLDAS-2 (hereafter referred to as PRISM-NLDAS), and (4) PRISM-NLDAS with bias correction. We also consider the Integrated Surface Database (ISD) as reference data for the evaluation of the rainfall products. The AORC dataset consists of gridded near-surface meteorological variables, including one-hour accumulated precipitation, covering the CONUS, Alaska, and nearby hydrologically contributing areas [6]. The CONUS AORC product spans from 1979 to near-present and is produced on a 0.008 decimal degree (~1 km) grid mesh. The AORC hourly precipitation data are modeled based on NLDAS-2 and Stage IV daily precipitation. After a bias correction that retains long-term consistency with various gauge-based and PRISM climatological datasets, the daily precipitation is disaggregated to the hourly scale using the available hourly precipitation datasets, such as radar, satellite, reanalysis, and Global Data Assimilation System (GDAS) products. For spatial downscaling, coarser-resolution datasets (e.g., NLDAS-2) are disaggregated based on a finer dataset (e.g., PRISM). This approach preserves the characteristics of the coarser datasets while adopting the more realistic spatial variability of the finer dataset. The initial CONUS AORC product with ~1 km spatial resolution is separated into 12 hydrologic areas of ~4 km resolution and distributed through the ‘AORC 4-km Version 1.1’ publicly released by the National Weather Service (NWS). More detailed descriptions of the development process of AORC are provided by the NWS Office of Water Prediction (2021) [10].
The ISD is a global dataset of hourly and synoptic surface observations merged from over 35,000 stations into a common format and data model. The ISD-Lite hourly precipitation is a subset of the full ISD after removing sub-hourly and duplicate observations (see Smith et al. (2011) [11] for more details). Here we assume that the ISD-Lite hourly precipitation is independent from the gridded rainfall products and use it as reference because, to the best of our knowledge, it was not directly utilized to produce the hourly rainfall products. We only included rain gauges that had at least one year of hourly observations during the study period. Before using this dataset as a reference, we performed some basic quality control, including plotting the spatial correlation among all possible pairs of stations (Figure S1). In a plot of this kind, while we do not know what the true spatial dependence should look like, we focus on points that fall outside the general pattern. Figure S1a shows the distribution of spatial correlations for the 39 rain gauges with at least one year of hourly precipitation. Several pairs of stations show unusual results, such as negative correlations and very low correlations even at short distances. Therefore, as shown in Figure S1b, we retained as reference the 35 rain gauges that follow a consistent pattern (see Figure 1 for their location).
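A minimal sketch of the pairwise check behind a plot such as Figure S1 is given below. It assumes, for illustration only, that the station series are already aligned in time and free of gaps; the function names and the great-circle distance formula are choices made here, not details taken from the study.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equally long series (assumes nonzero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def pairwise_correlation(stations):
    """stations: dict name -> (lat, lon, series). Returns (a, b, distance_km, r) tuples."""
    pairs = []
    for (na, (lata, lona, sa)), (nb, (latb, lonb, sb)) in combinations(stations.items(), 2):
        pairs.append((na, nb, haversine_km(lata, lona, latb, lonb), pearson(sa, sb)))
    return pairs
```

Plotting the correlation against the distance for all pairs makes stations with negative or unusually low correlations at short distances stand out as candidates for removal.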
The North American Land Data Assimilation System (NLDAS) project is currently providing NLDAS-2 [12,13] quality-controlled datasets derived from the best available reanalyses and observations. The NLDAS-2 datasets cover the period from 1979 to the near present at hourly temporal resolution on a 0.125 decimal degree (~12 km) grid covering the CONUS and portions of Canada and Mexico. The NLDAS-2 hourly precipitation is a product of a temporal disaggregation of the Climate Prediction Center unified gauge-only analysis of daily precipitation. After the orographic adjustment of gauge-based daily precipitation using PRISM (see below for a brief description), the daily precipitation is disaggregated into hourly fields by deriving hourly disaggregation weights from various sources of hourly products, such as radar, satellite, and reanalysis precipitation.
The PRISM dataset, developed by the PRISM Climate Group at Oregon State University, is a widely used gridded climate product. This dataset is based on observations from more than 10,000 surface stations, which are spatially interpolated using a climate-elevation regression for digital elevation model grid cells (see Daly et al. (2002) and Daly et al. (2008) [14,15] for more details). The PRISM daily precipitation is provided at 2.5 arcmin (~4 km) resolution throughout the United States, covering the period from 1981. To disaggregate the daily precipitation to the hourly scale, we conduct a simple temporal downscaling using the NLDAS-2 hourly products. That is, the PRISM daily precipitation is split into hourly precipitation (i.e., PRISM-NLDAS) by using the 24 hourly precipitation values of the nearest NLDAS-2 cell as disaggregation weights.
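The temporal downscaling can be illustrated with a short sketch in which the daily total is distributed in proportion to the hourly values of the coarser product. The function name is hypothetical, and the uniform fallback for days on which the weighting product reports no rain is an assumption introduced here; the text does not state how such days are handled.

```python
def disaggregate_daily(daily_total, hourly_reference):
    """Split a daily total into hourly values using another product's hourly shape.

    daily_total      : daily precipitation (e.g., a PRISM grid cell), in mm
    hourly_reference : 24 hourly values from the nearest cell of the weighting
                       product (e.g., NLDAS-2), in mm
    """
    reference_sum = sum(hourly_reference)
    n = len(hourly_reference)
    if reference_sum <= 0.0:
        # Assumption for this sketch: spread the total evenly when the
        # reference product is dry on that day.
        return [daily_total / n] * n
    # Weights sum to one, so the daily total is conserved.
    return [daily_total * h / reference_sum for h in hourly_reference]
```

Because the weights sum to one, the hourly values always add up to the original daily total, so the daily product keeps its magnitude while borrowing the sub-daily timing.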
Moreover, we perform the daily-based spatial bias correction for PRISM-NLDAS data using the 35 selected ISD-Lite rain gauges. The procedure of the daily-based spatial bias correction is as follows:
- (1) Find the grid cells of PRISM-NLDAS containing a rain gauge.
- (2) For each grid cell (and its gauge), accumulate the PRISM-NLDAS and rain-gauge hourly precipitation within a seven-day window centered on the i-th day over the period from 1981 to 2020 (i.e., 40 years). To obtain reliable bias estimates, only use the grid cells where the record length is at least 10 years.
- (3) Calculate the relative bias by dividing the accumulated rain-gauge precipitation by the accumulated PRISM-NLDAS precipitation.
- (4) For each i-th day (from 1 January to 31 December), create a spatial bias field by interpolating the relative bias of all gauges using the inverse distance weighting method.
- (5) Obtain the bias-corrected PRISM-NLDAS by multiplying the PRISM-NLDAS by the spatial bias field.
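Steps (2) to (5) above can be sketched as follows. This is a simplified illustration rather than the implementation used in the study: the inputs are assumed to be multi-year sums per calendar day, the window wraps around the year boundary, the interpolation uses a power of 2 on plain coordinate distances, and a bias of 1 is returned when the modeled total is zero. All of these choices and all names are assumptions.

```python
import math

def window_bias(gauge_by_doy, model_by_doy, doy, half_window=3):
    """Relative bias (gauge / model) in a window centered on day-of-year doy.

    gauge_by_doy, model_by_doy: lists with one multi-year precipitation sum
    per calendar day; doy is 1-based. half_window=3 gives a seven-day window.
    """
    n = len(gauge_by_doy)
    idx = [(doy - 1 + k) % n for k in range(-half_window, half_window + 1)]
    gauge_sum = sum(gauge_by_doy[i] for i in idx)
    model_sum = sum(model_by_doy[i] for i in idx)
    # Assumption: leave the product uncorrected if it has no rain in the window.
    return gauge_sum / model_sum if model_sum > 0.0 else 1.0

def idw(points, target, power=2.0):
    """Inverse distance weighting. points: list of ((x, y), value)."""
    num = den = 0.0
    for (x, y), value in points:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a gauge
        w = d ** -power
        num += w * value
        den += w
    return num / den

def bias_correct(model_value, gauge_biases, cell_xy):
    """Multiply a modeled value by the interpolated bias at the cell location."""
    return model_value * idw(gauge_biases, cell_xy)
```

For a given calendar day, `window_bias` would be evaluated at every gauge, `idw` would spread those ratios over the grid, and `bias_correct` would apply the resulting field to each hourly value of that day.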
To evaluate the performance of gridded (i.e., modeled) rainfall products, we extract the time series of hourly precipitations from each grid cell where the reference rain gauge is contained and aggregate all grid cells. For all the pairs of modeled and rain-gauge-based hourly precipitations, we calculate the Pearson’s correlation coefficient and mean square error (MSE) skill score (\(SS_{MSE}\)). The \(SS_{MSE}\) can be decomposed into three components as follows [16,17]:
\[
SS_{MSE} = \rho^{2} - \left[\rho - \frac{\sigma_{m}}{\sigma_{o}}\right]^{2} - \left[\frac{\mu_{m} - \mu_{o}}{\sigma_{o}}\right]^{2} \tag{1}
\]
where \(\rho^{2}\) is the square of the correlation between modeled and observed hourly precipitations, \(\mu_{m}\) and \(\mu_{o}\) are the mean of modeled and observed hourly precipitations, respectively, and \(\sigma_{m}\) and \(\sigma_{o}\) are the standard deviation of modeled and observed hourly precipitations, respectively. The first term on the right side of Equation (1) indicates the potential skill (i.e., coefficient of determination), which could be obtained in the absence of biases. The second term refers to the conditional bias (i.e., slope reliability), which indicates the variability of the modeled values with respect to observations. The last term represents the unconditional bias (i.e., standardized mean error), which indicates the overall shift in the modeled precipitations compared to the observations. The use of such decomposition allows us to distinguish different types of biases inherent in the modeled precipitation.
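As an illustration, the three terms of Equation (1) can be computed as in the sketch below. It assumes the standard form of this decomposition, in which the skill score equals the potential skill minus the conditional and unconditional biases, and it uses population (1/n) moments so that the result matches one minus the ratio of the MSE to the observed variance. The function name and the returned dictionary keys are hypothetical.

```python
import math

def mse_skill_decomposition(model, obs):
    """Decompose the MSE skill score into potential skill and two bias terms.

    Assumes equally long series with nonzero variance. Population (1/n)
    moments are used so that skill_score == 1 - MSE / var(obs).
    """
    n = len(obs)
    mu_m = sum(model) / n
    mu_o = sum(obs) / n
    sd_m = math.sqrt(sum((m - mu_m) ** 2 for m in model) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((m - mu_m) * (o - mu_o) for m, o in zip(model, obs)) / n
    rho = cov / (sd_m * sd_o)

    potential_skill = rho ** 2                          # first term
    conditional_bias = (rho - sd_m / sd_o) ** 2         # second term
    unconditional_bias = ((mu_m - mu_o) / sd_o) ** 2    # third term
    return {
        "skill_score": potential_skill - conditional_bias - unconditional_bias,
        "potential_skill": potential_skill,
        "conditional_bias": conditional_bias,
        "unconditional_bias": unconditional_bias,
    }
```

A product can therefore have small bias terms and still score poorly if its potential skill is low, which is why the three terms are reported separately.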
Additionally, as summarized in Table 1, we consider the error metrics (mean error, ME; relative bias, RBIAS; root mean square error, RMSE) and the contingency metrics (probability of detection, POD; false alarm ratio, FAR; critical success index, CSI), which are commonly used to evaluate rainfall products (e.g., [18,19,20,21]). The ME and RBIAS measure the magnitude of over- or underestimation of modeled precipitation, while the RMSE measures the accuracy of modeled precipitation by considering the overall deviation between modeled and observed precipitation. The contingency metrics describe the ability to detect precipitation occurrence. POD is defined as the number of correct precipitation occurrences divided by the total number of observed precipitation occurrences. FAR is defined as the number of false precipitation occurrences divided by the total number of modeled precipitation occurrences. CSI is a function of POD and FAR that describes the overall performance of detecting precipitation occurrences.
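The metrics described above can be sketched as follows, using their commonly adopted definitions. The exact formulas in Table 1 are not reproduced here, so the RBIAS normalization (total error divided by total observed precipitation) and the occurrence threshold of 0 mm are assumptions of this sketch.

```python
import math

def error_metrics(model, obs):
    """Mean error, relative bias, and root mean square error."""
    n = len(obs)
    diffs = [m - o for m, o in zip(model, obs)]
    return {
        "ME": sum(diffs) / n,
        "RBIAS": sum(diffs) / sum(obs),  # assumes some observed precipitation
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
    }

def contingency_metrics(model, obs, threshold=0.0):
    """POD, FAR, and CSI for precipitation occurrence above a threshold."""
    hits = misses = false_alarms = 0
    for m, o in zip(model, obs):
        if o > threshold and m > threshold:
            hits += 1
        elif o > threshold:
            misses += 1
        elif m > threshold:
            false_alarms += 1
    # The ratios below assume at least one observed and one modeled event.
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "CSI": hits / (hits + misses + false_alarms),
    }
```

With these definitions, CSI can equivalently be written as 1 / (1/POD + 1/(1 - FAR) - 1), which is the sense in which it is a function of POD and FAR.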
We apply the abovementioned measures to the gridded rainfall products for the entire study period and, separately, for the TC and non-TC periods, so that performance can be evaluated under the different weather conditions that produce extreme rainfall.
3. Results and Discussion
First, we examine the correlation coefficient between gridded rainfall products and rain-gauge observations. Figure 2 shows the 2D histograms of rain-gauge and gridded hourly precipitation for the entire record, TC, and non-TC periods. For the entire period, the AORC performs the best among all rainfall products, with the highest correlation coefficient of 0.753. In addition, pairs of AORC and rain-gauge hourly precipitation follow the 1:1 line more closely than those of the other products. The performance of the AORC is followed by that of the PRISM-NLDAS and bias-corrected PRISM-NLDAS. The raw and bias-corrected PRISM-NLDAS have a correlation coefficient of around 0.67, and their results are dispersed more broadly than what is observed for the AORC. NLDAS-2 performs worse than the other products, with the lowest correlation coefficient of 0.657. When we focus on TC rainfall, all products show a better performance compared to the results for the whole period. The AORC has the highest correlation coefficient (0.781), while NLDAS-2 performs the worst (correlation coefficient of 0.694). The bias-corrected PRISM-NLDAS performs slightly better than the raw PRISM-NLDAS, with a performance in between AORC and NLDAS-2. For non-TC rainfall, the results are similar to those observed for the whole period.
When we aggregate hourly precipitation to the daily scale, all rainfall products show higher correlation coefficients compared to the results at the hourly scale (Figure S2). For the entire and non-TC periods, the AORC has the highest correlation coefficient (over 0.89), while NLDAS-2 has the lowest value of 0.848. For TC rainfall, the AORC and the raw and bias-corrected PRISM-NLDAS perform very well (correlation coefficient around 0.9).
To quantify the alignment of the points along the 1:1 line, we compute the MSE skill score and its components, including the potential skill and biases. Furthermore, we stratify the results across different months to examine whether there are seasonal dependences in the performance of these products. Figure 3 illustrates the MSE skill score and its components for the gridded hourly rainfall products. For the whole period (left column in Figure 3), the AORC shows the highest MSE skill score, followed by the bias-corrected PRISM-NLDAS, while the NLDAS-2 and the raw PRISM-NLDAS show lower values. All rainfall products have a lower MSE skill score in the warm season due to the reduction in potential skill (i.e., the squared correlation). The AORC has distinctly higher and more stable potential skill across all 12 months compared to the other rainfall products, which leads to its highest performance. In contrast, the NLDAS-2 has smaller biases than the other products but a worse MSE skill score due to its lower potential skill, indicating that the NLDAS-2 has overall the lowest association with observations. The raw and bias-corrected PRISM-NLDAS have very similar potential skill and unconditional bias. However, the conditional bias decreases notably after bias correction. This result indicates that our bias-correction method is effective in correcting the conditional bias related to slope reliability.
For the TC period (second column in Figure 3), the evaluation measures vary from month to month. The AORC shows the highest MSE skill score in the summer (June–August), while the bias-corrected PRISM-NLDAS has the highest value in the fall (September–November). The NLDAS-2 and the raw PRISM-NLDAS show lower MSE skill scores in most months. Among the decomposed components, the AORC has a larger conditional bias in the fall season. In November in particular, the AORC has large conditional and unconditional biases; only two TCs affected Louisiana in November, leading to a very small sample size (see Figure S3). Nevertheless, the AORC shows robust performance, with a more muted sensitivity to monthly changes and the highest MSE skill score for aggregated samples for all months. Although the bias-corrected PRISM-NLDAS shows the highest MSE skill score for the fall season, its overall MSE skill score for all months is lower because of its lower potential skill in the summer season. The results for the non-TC period show very similar patterns to those for the entire period; the AORC performs the best compared to the other products.
At the daily scale, as shown in Figure S4, the AORC shows a higher and more stable MSE skill score over all months for the whole and non-TC periods compared to the other products. For TC rainfall (second column in Figure S4), the bias-corrected PRISM-NLDAS has the highest MSE skill score for aggregated samples for all months. The AORC and the raw PRISM-NLDAS also show a good performance (MSE skill score over 0.8).
We extend our evaluation analyses to error metrics and contingency metrics, as shown in Figure 4. For error metrics (top three rows in Figure 4), the AORC and the raw PRISM-NLDAS have higher positive ME and RBIAS values than the NLDAS-2 and the bias-corrected PRISM-NLDAS. This result indicates that the AORC and the raw PRISM-NLDAS slightly overestimate with respect to the observations, and it is consistent with the result of the unconditional bias (see Figure 3). Nevertheless, the AORC shows the lowest RMSE for all months regardless of whether the precipitation was associated with a TC or not. For contingency metrics (bottom three rows in Figure 4), the AORC generally has the highest POD values for the whole and non-TC periods. Although the AORC shows a lower POD than the other products for the TC period, it still performs well, with POD over 0.9 for aggregated samples for all months. FAR values of the AORC and the raw and bias-corrected PRISM-NLDAS are very similar to each other, while the NLDAS-2 shows slightly higher (i.e., worse) values for all periods. Consequently, the AORC and the raw and bias-corrected PRISM-NLDAS show very similar CSI values. For the whole and non-TC periods, their CSI values vary from 0.5 (warm season) to 0.7 (cold season), but the AORC and the raw and bias-corrected PRISM-NLDAS show overall good performance, with CSI around 0.6 for aggregated samples for all months. For the TC period, all products perform well, with CSI over 0.7.
At the daily scale, although the AORC slightly underestimates precipitation for the TC period and slightly overestimates it for the non-TC period, it shows competitive performance with lower RMSE values (top three rows in Figure S5). For contingency metrics (bottom three rows in Figure S5), the AORC clearly shows the highest CSI values for all 12 months for the whole and non-TC periods. For the TC period, the AORC also performs the best for aggregated samples for all months.
All results stratified by TC and non-TC periods consistently indicate that the AORC data provide reliable rainfall product at an hourly scale as well as at a daily scale, suggesting that the AORC precipitation can represent a viable forcing input for hydrologic modeling and simulation of both TC and non-TC events.
4. Conclusions
The use of accurate and reliable long-term rainfall is crucial for hydrological modeling and simulation of weather-related disasters. In this study, we evaluated the AORC, a new gridded rainfall product, by comparing it to other gridded high-resolution products that are widely used for hydrologic modeling (i.e., NLDAS-2, PRISM-NLDAS, and bias-corrected PRISM-NLDAS) and rain gauges. For the 1981–2020 period, we identified 57 TCs that passed over Louisiana, and stratified the record depending on whether the rainfall was caused by a TC or not. Results showed that the AORC hourly precipitation had the highest correlation coefficients over 0.75 with respect to observations among all hourly rainfall products for both TC and non-TC periods. When we decomposed the skill score into the potential skill, conditional bias, and unconditional bias, we found that the AORC distinctly showed the highest potential skill regardless of months for the whole period. Compared to the potential skill, the AORC had relatively small conditional and unconditional biases, which led to the best skill score. The same held true for TCs, as the AORC exhibited a better and more robust performance compared to the other products. In addition, the overall performance of AORC for the TC period was better than that for the non-TC period. When we aggregated hourly precipitation to daily scale, the AORC also performed very well (correlation coefficient and skill score around 0.9 and 0.8, respectively, regardless of whether TC or not). Our results suggest that the AORC shows good potential to be a viable product for hydrologic modeling of TC- and non-TC-related events.