International Tree-ring Data Bank: correlation-stats readme file
-----------------------------------------------------------------------
World Data Center for Paleoclimatology, Boulder and NOAA Paleoclimatology Program
-----------------------------------------------------------------------
NOTE: PLEASE CITE CONTRIBUTORS WHEN USING THIS DATA!!!!!

CONTRIBUTORS: Numerous ITRDB Members
NAME OF DATA SET: International Tree-ring Data Bank correlation statistics
GEOGRAPHIC REGION: Global
PERIOD OF RECORD: 1994 A.D. - 6000 B.C.
LIST OF FILES: Readme, sitecode.txt files (COFECHA output files).

DESCRIPTION:
This directory contains correlation statistics (Program COFECHA output) of the tree ring width or density raw measurements from the International Tree Ring Data Bank. Quality control on the ITRDB measurements archive was conducted in 1996 by Henri Grissino-Mayer using the COFECHA program from the ITRDB Program Library. The analysis was updated in 2005 by Jeff Lukas and Henry Adams for data contributed to the ITRDB from 1996 to 2005. Analysis for contributions received since 2005 has been conducted by Bruce Bauer.

See the ITRDB Program Library manual for information on program COFECHA and how to interpret the output.

COFECHA Output (crossdating quality analysis)

COFECHA is a quality-control program used to check the crossdating and overall quality of tree-ring chronologies. A modified form of COFECHA has been run on the chronologies in the ITRDB, during two quality-control projects completed in 1996 and 2005, respectively. The COFECHA output file for each chronology provides important information about chronology quality. The online user guide will aid in the interpretation of that information.

The name of each COFECHA output file is the sitecode of the associated chronology. For example, the Mohonk Lake chronology in New York is ny003 (ny003.crn, ny003.rwl), so the COFECHA output file is ny003.txt.

Guide to COFECHA Output

The purpose of this users' guide is to explain the COFECHA output so that data users can better assess the quality of the chronology of interest. There are three parts of the output:

1. the statistical measures and summary information for the chronology (Header)
2. the "correlation matrix" that shows the correlations of each series segment with the master chronology (Correlation of Series by Segments)
3. the summary statistics provided for each series in the chronology, and averaged over all series (Descriptive Statistics)

The sample output file below is followed by explanations of these sections and the statistics within them. The Quick Chronology Assessment lists the most important elements of the assessment.

Chronology file name:               XMPL001.CRN
Measurement file name:              XMPL001.RWL
Date checked:                       10MAR05
Checked by:                         H. ADAMS AND J. LUKAS
Beginning year:                     1545
Ending year:                        1997
Principal investigators:            T.R. Researcher
Site name:                          Example 1
Site location:                      Planet Earth
Species information:                UNSP Unknown Species
Latitude:                           4000
Longitude:                          -10515
Elevation:                          1610M
Series intercorrelation:            0.672
Avg mean sensitivity:               0.195
Avg standard deviation:             0.345
Avg autocorrelation:                0.799
Number dated series:                18
Segment length tested:              50
Number problem segments:            2
Pct problem segments:               0.96
Are there obvious misdated series*
Number possible misdated series*
Percent misdated series*
Do they affect chronology quality?  N/A*
Recommend withhold from ITRDB?      N/A*
Comments:*

*NOTE: not all COFECHA files include these entries

Quick Chronology Assessment

1. Look at Series Intercorrelation and compare it with other chronologies of the same species
2. Look at Pct problem segments and compare it with other chronologies of the same species
3. Read Comments (not all files have these)
4. Examine the correlation matrix

PART 5: CORRELATION OF SERIES BY SEGMENTS: xmpl001.rwl 11:03 Tue 21 Jun 2005 Page 5
------------------------------------------------------------------------------------------------------------------------
Correlations of 50-year dated segments, lagged 25 years
Flags: A = correlation under .3281 but highest as dated; B = correlation higher at other than dated position

Seq Series   Time_span 1550 1575 1600 1625 1650 1675 1700 1725 1750 1775 1800 1825 1850 1875 1900 1925 1950
                       1599 1624 1649 1674 1699 1724 1749 1774 1799 1824 1849 1874 1899 1924 1949 1974 1999
--- -------- --------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
  1 xmp151   1745 1997                                    .60  .39  .67  .79  .65  .42  .62  .81  .75  .73
  2 xmp032   1681 1997                          .77  .72  .69  .49  .41  .53  .70  .75  .74  .70  .70  .76
  3 xmp041   1629 1997                .69  .67  .57  .61  .79  .74  .73  .80  .79  .70  .65  .67  .74  .81
  4 xmp061   1604 1996           .47  .69  .64  .77  .66  .59  .57  .52  .64  .71  .70  .74  .50  .48  .71
  5 xmp021   1649 1997                .58  .59  .58  .78  .80  .69  .73  .66  .64  .65  .72  .81  .68  .74
  6 xmp220   1590 1997      .51  .52  .66  .68  .81  .75  .52  .63  .74  .73  .80  .84  .72  .68  .75  .77
  7 xmp210   1568 1641 .81  .82  .66
  8 xmp210   1662 1997                     .60  .66  .68  .49  .52  .71  .80  .85  .75  .68  .63  .64  .70
  9 xmp011   1545 1997 .74  .78  .77  .52  .51  .70  .80  .78  .71  .76  .73  .60  .69  .84  .68  .41  .57
 10 xmp081   1884 1997                                                                  .76  .74  .74  .67
 11 xmp092   1827 1996                                                        .63  .71  .62  .44  .55  .72
 12 xmp071   1788 1997                                              .79  .84  .74  .72  .74  .68  .64  .70
 13 xmp051   1728 1980                                    .29B .57  .48  .52  .75  .59  .60  .59  .53  .54
 14 xmp072   1718 1996                               .68  .65  .61  .65  .76  .75  .67  .62  .48  .45  .63
 15 xmp101   1715 1997                               .69  .74  .70  .73  .78  .79  .60  .63  .78  .83  .77
 16 xmp043   1627 1997                .70  .66  .73  .77  .77  .78  .88  .85  .75  .71  .68  .69  .79  .72
 17 xmp042   1622 1997           .76  .75  .71  .64  .66  .78  .77  .79  .77  .70  .69  .73  .74  .74  .76
 18 xmp012   1593 1969      .31A .47  .49  .56  .70  .75  .75  .77  .80  .79  .76  .71  .65  .58  .57
Av segment correlation .77  .61  .61  .64  .62  .69  .71  .67  .64  .69  .73  .73  .68  .69  .66  .65  .71

For more information on COFECHA and tree-ring statistics

Grissino-Mayer, H. D. 2001. Evaluating crossdating accuracy: A manual and tutorial for the computer program COFECHA. Tree-Ring Research 57(2): 205-221.
Holmes, R. L. 1983. Computer-assisted quality control in tree-ring dating and measurement. Tree-Ring Bulletin 43: 69-78.
Fritts, H. C. 1976. Tree Rings and Climate. Academic Press, New York, 567 pp.

To download COFECHA
http://www.ltrr.arizona.edu/pub/dpl/ - click on COFECHA.ZIP

Series Intercorrelation
  - Measures the strength of the common signal in the chronology
  - Varies by species and region
  - Is a measure of chronology reliability
  - Does not indicate a chronology’s utility for climate reconstruction
  - Is reduced by misdating errors

Series intercorrelation is a measure of the strength of the signal (typically the climate signal) common to all sampled trees at the site. It is the average correlation of each series with a master chronology derived from all other series. The highest values are around 0.900, for very drought-sensitive conifers, and the lowest values for trees that can still be reliably cross-dated are around 0.400. Most chronologies have values between 0.550 and 0.750.

Higher values of series intercorrelation are generally more desirable; however, trees of some species and in some regions will have higher values than others. For example, an interseries correlation of 0.600 would be considered high for a PIEN (Engelmann spruce) chronology in Canada, but the same value would be considered low for a PIED (pinyon pine) chronology from Arizona.

Interseries correlation is not a measure of a chronology’s utility for climate reconstruction.
It is possible that a chronology with a low series intercorrelation could contain a very useful climate signal, or that a chronology with a high series intercorrelation may not correlate usefully with any climate variable of interest.

Series intercorrelation is an important consideration in assessing chronology quality. A chronology with low interseries correlation requires a larger sample size to maintain reliability, compared to a chronology with a stronger common signal. Because sample depth often declines in the earliest parts of the chronology, reliability concerns will be greatest in those early periods.

Series intercorrelation is reduced by misdating errors and by tree-level influences on growth that are not common to other trees in the chronology. Often series intercorrelation can be improved by deleting problem segments.

Average Mean Sensitivity
  - Is the relative change in ring-width from one year to the next
  - Varies by species and region
  - Does not indicate a chronology’s utility for climate reconstruction

Mean sensitivity is a measure of the relative change in ring-width from one year to the next in a given series. Mean sensitivity is positively correlated with series intercorrelation; trees with stronger common signals tend to be more sensitive. Average mean sensitivity varies among species and regional climates from around 0.650 (for very drought-sensitive conifers) to 0.150 for the most complacent trees. While mean sensitivity is not a measure of the chronology’s utility for climate reconstruction, it is a good measure of the relative ease of cross-dating.

Average standard deviation

Because this value (in millimeters) represents the standard deviation of the series before detrending, it is difficult to compare average standard deviation between chronologies or to make any general statements about it.
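
This readme does not spell out the formula behind mean sensitivity. The sketch below uses the conventional definition from the dendrochronology literature (for example Fritts 1976): the absolute change between consecutive rings divided by their average, averaged over the series. The function name and the handling of zero-width rings are assumptions for illustration and may not match COFECHA's own implementation.

```python
def mean_sensitivity(widths):
    """Mean sensitivity of one ring-width series (one width per year, in order)."""
    if len(widths) < 2:
        raise ValueError("need at least two rings")
    total = 0.0
    for prev, curr in zip(widths, widths[1:]):
        pair_sum = prev + curr
        if pair_sum == 0:
            # Assumption: two consecutive zero-width rings contribute no change.
            continue
        # Relative change: difference divided by the mean of the two rings.
        total += abs(2.0 * (curr - prev) / pair_sum)
    return total / (len(widths) - 1)


# A constant series is completely "complacent" (sensitivity 0);
# larger year-to-year swings push the value up.
print(mean_sensitivity([1.0, 1.0, 1.0]))
print(mean_sensitivity([1.0, 1.2, 0.8]))
```

The Avg mean sensitivity entry in the header would then be the average of this value over all series in the chronology.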

Average autocorrelation
  - Is a measure of the previous year's influence on the current year's growth
  - Typically varies from 0.300 to 0.800
  - Is not a measure of the utility of a chronology for climate reconstruction

Autocorrelation is a measure of how much the ring width in year n is correlated with the width in year n-1. Tree-ring series have this "first-order" or "Lag-1" autocorrelation mainly because physiological processes within the tree create a lag in response to climate, and also because climate anomalies tend to persist from one year to the next. Higher values of series intercorrelation and mean sensitivity are often associated with lower autocorrelation. The lowest values, 0.300 to 0.500, are found in highly drought-sensitive conifers; the typical range is 0.600 to 0.800; and the highest values are over 0.900.

Although lower values are more desirable, average autocorrelation is not a measure of the utility of a chronology for climate reconstruction. Autocorrelation is often removed prior to use for climate reconstruction; a "residual" chronology (as opposed to the standard chronology) has had the first-order autocorrelation removed.

Number of dated series
  - Is the number of ring-width series in the chronology
  - When higher, increases the quality of the chronology
  - Is seldom constant throughout the length of the chronology

This is the number of series in the chronology. Often, two series are dated and measured from each tree, so the number of series is often double the number of sampled trees. A higher number of series is always more desirable, since it increases the robustness of the common signal. Usually, few series span the entire chronology, so at any point the sample depth is usually less than the number of dated series. The correlation matrix can be examined to see how sample depth changes over time. Most chronologies have between 15 and 60 series, but some have well over 100.
Chronologies with fewer than 10 series are at risk of being unreliable unless the interseries correlation is high (>0.800), to compensate for reduced sample depth.

Segment length tested: 50

This is the length, in years, of the segments used in all COFECHA runs; it is also the default value. For species or sites with particularly weak common signals, it can be helpful to re-run COFECHA using 100-year segments. This may make it easier to distinguish potential dating errors.

Number problem segments
  - Is the number of flagged segments
  - Includes A and B flags
  - Indicates lower signal strength in those segments
  - Most problem segments are not related to dating errors

This is the number of 50-year segments that were flagged by COFECHA. A problem segment does not necessarily reflect a dating error; it can be caused by periods in which a tree is less responsive to the common (often climatic) signal, or by growth anomalies. The general concern with a problem segment is that the correlation with the master chronology is particularly low for that series in that time span, which has the potential to degrade the common signal in the chronology. Fewer problem segments are more desirable in a chronology, but the number of problem segments is best interpreted as a percent of the total number of segments.

Percent problem segments
  - Is the number of problem segments divided by the total number of segments, times 100
  - Indicates the percent of segments with low signal strength
  - Most chronologies have fewer than 10% problem segments
  - The distribution of problem segments over time is important

This is the number of problem segments as a percentage of the total number of 50-year segments in the chronology. For example, a chronology that has 10 series and an average series length of 250 years will have about 100 50-year segments (overlapped by 25 years) that are independently tested by COFECHA. If 8 of these segments are flagged by COFECHA, then the Pct problem segments will be 8%.
About 25% of the chronologies in the ITRDB have no problem segments. The vast majority have fewer than 10% problem segments, and nearly all have fewer than 20%. A problem segment does not necessarily indicate misdating; in many cases, the low correlation is caused by tree-ring growth that does not match the common signal of the chronology.

The percentage of problem segments varies by species and region, along with series intercorrelation. As overall series intercorrelation declines, it is more likely that individual segments will fall below the level (0.328) that triggers "A" flags. For example, nearly all chronologies with series intercorrelation below 0.600 have at least some problem segments, while almost all chronologies with a series intercorrelation above 0.700 have fewer than 2% problem segments. In particularly insensitive species, problem segments related to the weak common signal may be numerous.

It is important to consider where problem segments occur in the chronology. If problem segments are distributed fairly evenly across the time span of the chronology, they are less of a threat to chronology reliability than if they are clustered in one or more time periods. Problem segments are often clustered at the earliest part of the chronology, where sample depth is almost always lower.

Are there obvious misdated series

This entry indicates the presence (YES) or absence (NO) of misdated series. A series was considered misdated if the majority of all 50-year segments tested dated better at positions other than the original dated positions, based on the correlation coefficients. Furthermore, these alternative positions must be systematic across all segments and appear reasonable (for example, adjustments of -2, -2, -2, -1, -1, -1 years rather than adjustments of -10, -5, +3, +10, and -4 years). This was determined using Part 6 of the full output from COFECHA. This entry does not appear in the COFECHA output files from the 2005 QC project.
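
The misdating criteria above were applied by a person reading Part 6 of the output, so any code version is only an approximation. The sketch below is one hypothetical way to encode them: a majority of segments must prefer a shifted position, and the preferred shifts must all point the same way and stay within a small spread. The function name, the `max_spread` tolerance, and the exact meaning given to "systematic" are assumptions, not rules taken from COFECHA or from the 1996 project.

```python
def looks_misdated(best_shifts, max_spread=2):
    """Rough screen for a misdated series.

    best_shifts: for each tested segment, the shift in years that gave the
    highest correlation with the master (0 means the as-dated position).
    """
    shifted = [s for s in best_shifts if s != 0]
    # Criterion 1: a majority of segments date better somewhere else.
    if len(shifted) * 2 <= len(best_shifts):
        return False
    # Criterion 2 (assumed reading of "systematic"): same direction,
    # and the suggested adjustments differ by only a year or two.
    same_direction = all(s > 0 for s in shifted) or all(s < 0 for s in shifted)
    small_spread = max(shifted) - min(shifted) <= max_spread
    return same_direction and small_spread


print(looks_misdated([-2, -2, -2, -1, -1, -1]))   # True
print(looks_misdated([-10, -5, 3, 10, -4]))       # False
```

The two calls use the adjustment patterns quoted in the text: the consistent one is reported as a likely misdating and the erratic one is not.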

Number possible misdated series

This is the number of series considered to be misdated. This entry does not appear in the COFECHA output files from the 2005 QC project.

Percent misdated series

This is the percent of total series that were considered to be misdated. This entry does not appear in the COFECHA output files from the 2005 QC project.

Do they affect chronology quality? / Recommend withhold from ITRDB?

These entries do not appear in most of the COFECHA output files. Where they do appear, they are always answered with "N/A".

Comments

For the 1996 quality control project (~1500 chronologies), Henri Grissino-Mayer made individual comments on each chronology. In addition to an overall assessment (usually "EXCELLENT HIGH QUALITY CHRONOLOGY"), the Comments often include specific changes that a user can make to the chronology to improve quality, generally by deleting problem segments and misdated series. These comments are a useful supplement to the data user's own judgment about the chronology. For the 2005 project (~1000 chronologies), no comments were made about the individual chronologies, and the Comments field does not appear in the output.

Part 5: Correlation of Series by Segments (The "Correlation Matrix")
  - Shows the correlation of every segment with the overall chronology
  - Provides several key indicators of chronology quality
  - Should be scrutinized closely

The Correlation Matrix shows the changes in the signal strength of the chronology over time, and how individual series and segments either enhance or reduce the signal strength. For each series, COFECHA generates a master chronology from every other series, and then calculates the correlation coefficient between every 50-year segment of that series and the master chronology. The correlation matrix can be examined to look at flagged (problem) segments, changes in sample depth over time, and changes in average correlation over time.
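
The leave-one-out procedure described above can be sketched as follows. This is a simplified illustration, not COFECHA's code: it assumes each series has already been detrended and had its autocorrelation removed (the filtering described under Part 7), it builds the master as a plain yearly mean of the other series, and it reuses the 50-year segment, 25-year lag, and 26-year minimum from this guide. All function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def master_without(series, left_out):
    """Yearly mean of every series except `left_out`.

    series: dict mapping series name -> dict mapping year -> index value.
    """
    sums, counts = {}, {}
    for name, values in series.items():
        if name == left_out:
            continue
        for year, value in values.items():
            sums[year] = sums.get(year, 0.0) + value
            counts[year] = counts.get(year, 0) + 1
    return {year: sums[year] / counts[year] for year in sums}


def segment_correlations(series, name, grid_start,
                         seg_len=50, lag=25, min_years=26):
    """Map segment start year -> correlation of `name` with its leave-one-out master."""
    master = master_without(series, name)
    own = series[name]
    result = {}
    start = grid_start
    while start <= max(own):
        # Years of this segment present in both the series and the master.
        years = [y for y in range(start, start + seg_len)
                 if y in own and y in master]
        if len(years) >= min_years:
            result[start] = pearson([own[y] for y in years],
                                    [master[y] for y in years])
        start += lag
    return result
```

Calling `segment_correlations` once per series would fill one row of a matrix like the sample above, with one correlation per tested segment.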
Overall chronology quality is a product of both sample depth and the strength of the common signal (e.g., average correlation).

Series - This column lists the series identifier provided by the investigator(s) in the RWL file. Most investigators use a consistent naming convention for tree-ring series: the first 3 characters are the site code, the next two characters are the tree identifier, and the final character identifies the series number from that tree. For example, "XMP122" would be the 2nd series from tree 12 at site XMP. Because 2 or even 3 series may come from the same tree, it can be helpful to look at the series identifiers when the sample depth is low. A portion of a chronology with a sample depth of 3 will be more reliable if the 3 series come from different trees than if they come from the same tree. Also, if a portion of a series cannot be dated or measured (e.g., if there is a gap in the core), often an investigator will "split" the series into two series of the same name (as with xmp210 in the sample matrix).

Time_span - This column lists the time span covered by each series. Note that few, if any, series will span the full length of the chronology.

Segment start and end dates (e.g., 1550 - 1599) - The two years at the top of a column indicate the start and end dates of a 50-year segment, corresponding to the segment correlations below. COFECHA will calculate the correlation if a series spans at least 26 years (just over half) of that segment.

Flagged (problem) segments
  - Do not match the other series in a chronology as dated
  - Can be either "A" or "B" flags
  - Are often clustered early in a chronology and reduce average segment correlation

In addition to calculating the correlation of each segment with the master chronology, COFECHA shifts each segment to all positions between -10 and +10 years of its as-dated position and correlates these shifted segments with the master chronology.
If COFECHA finds a correlation between a shifted segment and the master that is greater than the correlation between the as-dated position and the master, it will flag the segment in this section with a “B”. An “A” flag indicates that the segment has a correlation with the master of less than 0.328, the 99% significance level (one-tailed), but the segment does not correlate better with the master in any of the shifted positions. Typically, B flags are associated with lower correlations than A flags. Also, B flags may indicate dating errors. Consecutive B-flagged segments with very low correlations (<0.20) often indicate a misdated series.

Flagged segments are rarely distributed evenly throughout a chronology; often, they are clustered towards the beginning. Since flagged segments indicate particularly low correlations, these segments reduce the average segment correlation.

Changes in sample depth over time

The correlation matrix output also indicates the changes in the sample depth of the chronology over time. (In the sample correlation matrix, for example, 14 series cover the 1750-1799 segment.) It is important to consider how many series are being used to generate each part of the chronology, since the robustness of the signal increases as sample depth increases. Sample depth almost always declines in the early part of the chronology, and data users may want to truncate the portions of the chronology with a low sample depth (which often have lower correlations as well). It is also helpful to examine the series identifiers to see how many different trees are represented when the sample depth is low.

Changes in correlation over time

The correlation matrix also shows how the average correlation of the segments changes over time, that is, how the strength of the common signal varies over time. Typically, the average correlation is lowest in the early part of the chronology, but sometimes it is lowest towards the end of the chronology.
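
Sample depth, flag counts, and average correlation can all be read column by column from the correlation matrix. The sketch below does this for the first three columns (1550, 1575, 1600) of the sample matrix, using only the six series that reach those segments. The cell format (a two-decimal correlation with an optional trailing A or B) follows the sample; the function names are illustrative. Averages computed from the rounded two-decimal values can differ slightly from any averages COFECHA prints.

```python
def parse_cell(cell):
    """Split a matrix cell such as '.29B' into (correlation, flag).

    Blank cells (segment not tested for that series) return None.
    """
    cell = cell.strip()
    if not cell:
        return None
    flag = cell[-1] if cell[-1] in "AB" else ""
    value = float(cell[:-1] if flag else cell)
    return value, flag


def column_summary(rows):
    """For each column, return (sample depth, flagged segments, mean correlation)."""
    summary = []
    for col in range(len(rows[0])):
        tested = [c for c in (parse_cell(row[col]) for row in rows)
                  if c is not None]
        depth = len(tested)
        flagged = sum(1 for _, flag in tested if flag)
        average = sum(value for value, _ in tested) / depth if depth else None
        summary.append((depth, flagged, average))
    return summary


# Columns 1550, 1575, 1600 of the sample matrix.
rows = [
    ["",    "",     ".47"],  # xmp061
    ["",    ".51",  ".52"],  # xmp220
    [".81", ".82",  ".66"],  # xmp210 (1568-1641)
    [".74", ".78",  ".77"],  # xmp011
    ["",    "",     ".76"],  # xmp042
    ["",    ".31A", ".47"],  # xmp012
]
summary = column_summary(rows)
for start, (depth, flagged, average) in zip((1550, 1575, 1600), summary):
    print(start, depth, flagged, round(average, 3))
```

In this excerpt the sample depth grows from 2 to 4 to 6 series over the first three segments, which is the kind of early thinning the text warns about.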
The version of COFECHA used for the 2005 project calculates the average segment correlation and reports it at the bottom of each column in the correlation matrix. The COFECHA output from the 1996 project does not report average segment correlation, but the data user can look at the individual segment correlations in the different time periods to assess how the overall chronology varies over time.

Part 7: Descriptive Statistics
  - Are the statistics for each individual series in a chronology
  - Are displayed for filtered (after detrending) and unfiltered data

This section of the output shows statistics for each series. These data can be used to examine the quality of the chronology on a series-by-series basis.

  - Interval refers to the time span of that series.
  - No. of years is the total number of measurements in that series.
  - No. of segments refers to the number of segments in that series.
  - No. of flags is the number of flagged segments in that series.
  - Corr with Master is the correlation coefficient between each series and the master chronology.

The rest of the statistics fall under two categories, unfiltered and filtered. Statistics of unfiltered data are based on the raw measurements of the chronology. The statistics under filtered are based on the data after COFECHA has used a 32-year spline function to detrend each series and autoregressive (AR) modeling to remove autocorrelation from each series.

  - Mean msmt and Max msmt are the average and maximum ring-width measurements (in millimeters) for that series.
  - Std dev is the standard deviation of the ring-width measurements (in millimeters) for that series.
  - Auto corr is the autocorrelation of that series.
  - Mean sens is the mean sensitivity of that series.
  - Max value is the maximum computed ring-width index (mean = 1.0) for that series.
  - AR refers to the order of the autoregressive modeling that was used to remove autocorrelation from the series during detrending.
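
Several of the unfiltered statistics can be approximated directly from the raw measurements. The sketch below computes the mean, maximum, standard deviation, and first-order (lag-1) autocorrelation for one series. It is an illustration under stated assumptions: the lag-1 value is taken as the Pearson correlation between each year's width and the previous year's width, and the standard deviation uses the sample (n - 1) form. COFECHA's exact estimators are not documented in this readme and may differ; the function and key names are hypothetical.

```python
from math import sqrt


def lag1_autocorrelation(widths):
    """Correlation between the width in year n and the width in year n-1."""
    x, y = widths[1:], widths[:-1]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three rings")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def describe_series(widths):
    """Unfiltered descriptive statistics for one series of ring widths (mm)."""
    n = len(widths)
    mean = sum(widths) / n
    # Assumption: sample standard deviation (divide by n - 1).
    std_dev = sqrt(sum((w - mean) ** 2 for w in widths) / (n - 1))
    return {
        "no_of_years": n,
        "mean_msmt": mean,
        "max_msmt": max(widths),
        "std_dev": std_dev,
        "auto_corr": lag1_autocorrelation(widths),
    }


print(describe_series([1.0, 2.0, 3.0, 4.0, 5.0]))
```

A steadily increasing series like the one in the example has a lag-1 autocorrelation near 1, while a series that alternates between wide and narrow rings has a value near -1.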