North American Summer PDSI Reconstructions, Version 2a
-----------------------------------------------------------------------
World Data Center for Paleoclimatology, Boulder
and
NOAA Paleoclimatology Program
-----------------------------------------------------------------------
NOTE: PLEASE CITE CONTRIBUTORS WHEN USING THIS DATA!!!!!

NAME OF DATA SET: North American Summer PDSI Reconstructions, Version 2a
LAST UPDATE: 5/2008 (Updated grid reconstructions thru 2006. The grid
spacing and series length are unchanged from the 2004 version, but more
trees were used to improve the estimates of PDSI)
CONTRIBUTORS: Ed Cook, Lamont-Doherty Earth Observatory
IGBP PAGES/WDCA CONTRIBUTION SERIES NUMBER: 2008-046

SUGGESTED DATA CITATION: Cook, E.R., et al. 2008. North American Summer
PDSI Reconstructions, Version 2a. IGBP PAGES/World Data Center for
Paleoclimatology Data Contribution Series # 2008-046. NOAA/NGDC
Paleoclimatology Program, Boulder CO, USA.

ORIGINAL REFERENCES:
Cook, E.R., D.M. Meko, D.W. Stahle, and M.K. Cleaveland. 1999. Drought
reconstructions for the continental United States. Journal of Climate,
12:1145-1162.

Cook, E.R., C.A. Woodhouse, C.M. Eakin, D.M. Meko and D.W. Stahle. 2004.
Long-Term Aridity Changes in the Western United States. Science,
Vol. 306, No. 5698, pp. 1015-1018, 5 November 2004.

FUNDING SOURCES:
The drought reconstructions were developed through support from the
jointly managed NSF/NOAA ESH program through NOAA Project Award
NA06GP0450, "Collaborative research: Reconstruction of drought and
streamflow over the coterminous United States from tree rings, with
extensions into Mexico and Canada" (Principal Investigators: E.R. Cook,
U. Lall, C. Woodhouse, D.M. Meko). Additional support was provided by
NSF, Division of Atmospheric Sciences, Paleoclimate Program through
ATM 03-22403, "Development of a North American Drought Atlas"
(Principal Investigator: E.R. Cook).

GEOGRAPHIC REGION: North America
PERIOD OF RECORD: 0 - 2006 AD

DESCRIPTION:
Text File Format of the Complete Grid-Point Data Matrix

Files in this directory:
NADAv2a.txt: Summer PDSI reconstructions in a text grid
NADAv2a.zip: Zipped version of summer PDSI reconstructions in a text grid

These complete files contain the information for all 286 grid points in
one large matrix. The beginning year is 0 and the last year is 2006. As
before, missing values are indicated by -99.999 flags. The columns
represent the grid points and the rows represent years. The first row of
each column contains the grid point number.

There is also a netCDF file, NADAv2-2008.nc, which contains all the
reconstructions in one file.

STATISTICAL DATA HAS NOT BEEN GENERATED FOR UPDATED DATA YET.
CONTACT ED COOK FOR MORE INFORMATION.

The four statistics that were used as measures of association between
the actual and estimated PDSI in the last version of the reconstructions
are described below.

1) Calibration R-SQuare (CRSQ).
This statistic measures the fraction of PDSI variance explained by the
tree-ring chronologies at each grid point over the 1928-1978 calibration
period, based on a regression modeling procedure described in Cook et
al. (1999). As defined here, CRSQ is equivalent to the "coefficient of
multiple determination" found in standard statistics texts. It ranges
from 0 (no calibrated variance) to 1.0 (perfect agreement between
instrumental PDSI and the tree-ring estimates). The former represents
complete failure to estimate PDSI from tree rings, and the latter is not
plausible unless the model is seriously over-fit. In our case, the
median (middle or 50th percentile value) CRSQ over all 286 grid points
is 0.514. This means that approximately half of the PDSI variance is
explained by tree-ring chronologies when viewed over the entire 286 grid
point domain. In dendroclimatology, this level of calibrated variance is
generally considered quite good.

2) Verification R-SQuare (VRSQ).
This statistic measures the fraction of PDSI variance in common between
actual and estimated PDSI in the 1900-1927 verification period. It is
calculated as the square of the Pearson correlation coefficient, which
is a well-known measure of association between two variables. VRSQ also
ranges from 0 to 1.0 (VRSQ is assigned a value of 0 if the correlation
is negative). Roughly speaking, VRSQ>0.11 is statistically significant
at the 1-tailed 95% level for our 28-year verification period. In our
case, the median VRSQ over all 286 grid points is 0.445. This drop from
0.514 in the calibration period is expected, and it is relatively
modest. Thus, the overall fidelity of the reconstructions is verified
well when compared with withheld PDSI data.

3) Verification reduction of error (RE).
This statistic was originally derived by Edward Lorenz as a test of
meteorological forecast skill. Unlike CRSQ and VRSQ, RE has a
theoretical range of -infinity to 1.0. Over the range 0-1.0, RE
expresses the degree to which the estimates over the verification period
are better than "climatology", i.e. the calibration period mean of the
actual data. So, a positive RE means that the PDSI estimates are better
than just using the calibration period mean as a reconstruction of past
PDSI behavior. A negative RE is generally interpreted as meaning that
the estimates are worse than the calibration period mean and, therefore,
have no skill. The use of the calibration period mean as the "yardstick"
for assessing reconstruction skill makes this statistic more difficult
to pass than VRSQ. However, it is also less robust, meaning that it is
very sensitive to even a few bad estimates in the verification period.
Therefore, RE>0 is interpreted as evidence for a reconstruction that
contains some skill over that of climatology. In our case, the median RE
over all 286 grid points is 0.419. This is strong evidence for
meaningful reconstruction skill over the PDSI grid.

4) Verification coefficient of efficiency (CE).
This statistic comes from the hydrology literature and is very similar
to the RE. It too has a theoretical range of -infinity to 1.0. The
crucial difference is that the CE uses the verification period mean of
the withheld actual data as the "yardstick" for assessing the skill of
the estimates. This seemingly minor difference is important because it
makes the CE even more difficult to pass (i.e., to obtain CE>0) than the
RE. Consequently, CE is never greater than RE (see Cook et al., 1999 for
why this is so) and is often significantly smaller. In our case, the
median CE over all 286 grid points is 0.357, strongly positive but
smaller than RE as expected. This result is again strong evidence for
meaningful reconstruction skill over the PDSI grid.

Overall, the North American PDSI grid is very well calibrated and
verified. However, the calibration/verification statistics typically
weaken back in time at each grid point. This occurs because
progressively smaller subsets of longer chronologies are used to extend
the reconstructions back as far as possible. The calibration and
verification statistics described thus far are for grid point models
that are based on the largest number of chronologies (and hence the
shortest common length). Each time a smaller subset of longer
chronologies was used to extend the PDSI reconstruction farther back in
time at each grid point, a new set of calibration and verification
statistics was produced. This resulted in a time-varying set of
statistics for each reconstruction.

This being the case, is there a hard-and-fast rule for determining when
the reliability of a PDSI reconstruction becomes too poor to use? The
answer is "No!" The chosen "rule" will depend on how much uncertainty is
acceptable to each individual who uses these reconstructions.
Regardless, one must always keep in mind the changing uncertainty in
each PDSI reconstruction when using it for any purpose.
This caveat is especially true for the weakly reconstructed areas of
Canada and Mexico.

Any questions concerning these PDSI reconstructions should be directed
to Ed Cook at drdendro@ldeo.columbia.edu.
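
ILLUSTRATIVE EXAMPLE (not part of the original data description):
The three verification statistics described above (VRSQ, RE, CE) can be
sketched in a few lines of Python. This is a minimal sketch, assuming
the standard forms of these statistics as implied by the verbal
definitions in this file; it is not the code used to produce the
reconstructions, and Cook et al. (1999) should be consulted for the
exact procedure. The names verification_stats, MISSING and
calibration_mean are hypothetical, chosen for illustration only, and the
sample values are made up.

```python
from math import sqrt

MISSING = -99.999  # missing-value flag used in the grid files


def verification_stats(actual, estimated, calibration_mean):
    """Return (VRSQ, RE, CE) for one grid point (illustrative sketch).

    actual, estimated: instrumental and tree-ring PDSI for the same
        verification-period years, in the same order.
    calibration_mean: mean of the instrumental PDSI over the
        calibration period (the RE "yardstick").
    Assumes at least two valid years and non-constant series.
    """
    # Drop any year in which either series carries the missing flag.
    pairs = [(a, e) for a, e in zip(actual, estimated)
             if a != MISSING and e != MISSING]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_e = sum(e for _, e in pairs) / n

    # Sums of squares and cross-products about the verification means.
    ss_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    ss_e = sum((e - mean_e) ** 2 for _, e in pairs)
    cross = sum((a - mean_a) * (e - mean_e) for a, e in pairs)

    # VRSQ: squared Pearson correlation, set to 0 if r is negative.
    r = cross / sqrt(ss_a * ss_e)
    vrsq = r * r if r > 0 else 0.0

    # Sum of squared errors of the estimates.
    sse = sum((a - e) ** 2 for a, e in pairs)

    # RE: errors measured against the calibration-period mean.
    ss_cal = sum((a - calibration_mean) ** 2 for a, _ in pairs)
    re_stat = 1.0 - sse / ss_cal

    # CE: errors measured against the verification-period mean.
    ce_stat = 1.0 - sse / ss_a
    return vrsq, re_stat, ce_stat


# Made-up values for illustration only (not taken from the data set).
sample_actual = [1.0, -1.0, MISSING, 2.0, -2.0]
sample_estimated = [0.5, -0.5, 0.3, 1.0, -1.0]
vrsq, re_stat, ce_stat = verification_stats(
    sample_actual, sample_estimated, calibration_mean=0.5)
print(vrsq, re_stat, ce_stat)
```

The sketch also shows why CE can never exceed RE: a sum of squared
deviations is smallest when taken about the sample's own mean, so the
CE denominator (about the verification period mean) is never larger
than the RE denominator (about the calibration period mean), while the
numerator is the same in both.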