The Global Historical Climatology Network monthly (GHCNm) dataset provides monthly climate summaries from thousands of weather stations around the world. The initial version was developed in the early 1990s, and subsequent iterations were released in 1997, 2011, and most recently in 2018. The period of record for each summary varies by station, with the earliest observations dating to the 18th century. Some station records are purely historical and are no longer updated, but many others are still operational and provide updates with a short time delay that are useful for climate monitoring. The current version (GHCNm v4) consists of mean monthly temperature data, as well as a beta release of monthly precipitation data.
NCEI uses GHCN monthly to monitor long-term trends in temperature and precipitation. It has also been employed in several international climate assessments, including the Intergovernmental Panel on Climate Change 4th Assessment Report, the Arctic Climate Impact Assessment, and the "State of the Climate" report published annually by the Bulletin of the American Meteorological Society.
- Expanded set of station temperature records
- More comprehensive uncertainties for calculating station and regional temperature trends
Data Utilities and Inventory Files
- Directions on Decompressing and Extracting Files
- Includes description of inventory file and format of data files [measurement, quality, and source flags]
- Station-level Metadata
- Station-level information on the data source, monthly mean temperatures, and trends for unadjusted and adjusted data.
- Metadata Read-me
Version 4 combines data from a variety of sources for a total of 26,000 monthly temperature stations, compared to 7,200 in v2 and v3.
- Global Historical Climatology Network–daily dataset (GHCNd; Menne et al. 2012)
- This primary source consists of temperature observations that have been combined with the original monthly sources used in previous versions of GHCNm.
- International Surface Temperature Initiative (ISTI; Rennie et al. 2013)
- Station data collected and merged under the auspices of the ISTI.
Menne, M. J., C. N. Williams, B. E. Gleason, J. J. Rennie, and J. H. Lawrimore, 2018: The Global Historical Climatology Network Monthly Temperature Dataset, Version 4. J. Climate, in press. doi:10.1175/JCLI-D-18-0094.1.
GHCNm v4 uses the same set of quality control (QC) algorithms applied to v3 with some additions. The checks and associated flags are shown in Table 1. Further details regarding the quality control checks are available in the version 4 Algorithm Theoretical Basis Document.
|Table 1. Quality Assurance Checks Applied to GHCNm Version 4 Temperatures|
|Data Problem||Description of Check|
|Inter-Station Duplicate Check (E Flag)||Identifies a station’s monthly values when they are duplicated in any year of another station’s data (annual data must have at least 3 years of data and at least 12 values within 0.015 deg C)|
|Consecutive Month Duplicate Check (W Flag)||Used to identify duplicate retransmission and mislabeling of previous month's temperature for current month. Occurs in GTS transmitted CLIMAT bulletins|
|Series Duplication (D Flag)||Identifies duplication of data between years within the same station record|
|World Record Extremes Check (R Flag)||Identifies temperatures that fall outside the range of the highest and lowest monthly mean maximum and minimum temperature values|
|Isolated Value(s) (L Flag)||Identifies single data months or small clusters of data that are isolated in time. A single datum, or a cluster, of consecutively spaced data (up to three consecutive months) is examined to see if the time period before and after the data (or cluster) contain 18 consecutive months of missing data, or more, both before and after the datum (or cluster)|
|Streak Check (K Flag)||Identifies runs of the same value (non-missing) in five or more consecutive months|
|Climatological Outlier (O Flag)||Identifies temperatures that exceed their respective climatological bi-weight means for the corresponding station and calendar month by at least five bi-weight standard deviations|
|Spatial Inconsistency 1 (S Flag)||Any value found to be between 2.5 and 5.0 bi-weight standard deviations from the bi-weight mean is more closely scrutinized by examining the 5 closest neighbors (not to exceed 500.0 km) and determining their associated distribution of respective z-scores. At least one of the neighbor stations must have a z-score with the same sign as the target, and its z-score must be greater than or equal to the z-score listed in column B (below), where column B is expressed as a function of the target z-score ranges (column A)|
|Spatial Inconsistency 2 (T Flag)||This check uses a weighted average of neighboring stations to identify extreme temperatures that are likely erroneous. Z–scores for the month of interest for the target station are compared with other station z–scores within 500 km using an inverse distance weighting function. If the absolute difference >= 3.0, then it is flagged|
|Erroneous value not detected through automated quality control checks (Z Flag)||Datum (or data) are flagged after manual investigation determines value(s) to be erroneous|
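Two of the checks in Table 1 are simple enough to illustrate in a few lines. The sketch below is not the operational NCEI code; it only shows the logic of the Streak Check (K Flag) and of a Spatial Inconsistency 2 style comparison (T Flag). The function names, the use of `None` for missing months, and the `1/distance` weighting are assumptions made for illustration, since the table states only that "an inverse distance weighting function" is used.

```python
def flag_streaks(values, min_run=5):
    """Mark values that belong to a run of min_run or more identical,
    non-missing, consecutive monthly values (Streak Check logic).
    Missing months are represented as None (an assumption of this sketch)."""
    flags = [False] * len(values)
    n = len(values)
    start = 0
    while start < n:
        end = start + 1
        if values[start] is not None:
            # Extend the run while the next month repeats the same value.
            while end < n and values[end] is not None and values[end] == values[start]:
                end += 1
            if end - start >= min_run:
                for i in range(start, end):
                    flags[i] = True
        start = end
    return flags


def spatial_flag(target_z, neighbors, max_km=500.0, threshold=3.0):
    """Compare a target z-score with an inverse-distance-weighted average
    of neighbor z-scores. `neighbors` is a list of (distance_km, z_score).
    The 1/distance weight is a hypothetical choice, not the documented one."""
    usable = [(d, z) for d, z in neighbors if 0 < d <= max_km]
    if not usable:
        return False  # no neighbors within range, so nothing to compare against
    weights = [1.0 / d for d, _ in usable]
    weighted_z = sum(w * z for w, (_, z) in zip(weights, usable)) / sum(weights)
    return abs(target_z - weighted_z) >= threshold


monthly = [1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, None, None, None, None, None]
streak_flags = flag_streaks(monthly)

nearby = [(100.0, 0.5), (200.0, 1.0), (600.0, 4.0)]  # third station is beyond 500 km
extreme = spatial_flag(4.0, nearby)
ordinary = spatial_flag(2.0, nearby)
```

In this example the five consecutive 2.0 values are flagged while the run of missing months is ignored, and only the target z-score of 4.0 differs from the weighted neighbor average by 3.0 or more.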
Temperature and Data Homogenization
Nearly all weather stations undergo changes to data measurement processes and infrastructure at some point in their history. Thermometers, for example, require periodic replacement or recalibration, and measurement technology has evolved over time. Temperature recording protocols have also changed at many locations from recording temperatures at fixed hours during the day to once-per-day readings of the 24-hour maximum and minimum. “Fixed” land stations are sometimes relocated, and even minor temperature equipment moves can change the microclimate exposure of the instruments. In other cases, the land use or land cover in the vicinity of an observing site can change over time, which can impact the local environment that instruments are sampling even when measurement practice is stable.
All of these modifications can cause systematic shifts in temperature readings that are unrelated to any real variation in local weather and climate. These shifts (or “inhomogeneities”) can be large relative to true climate variability, and can cause large systematic errors when calculating climate trends and variability for a single station as well as for the average of multiple stations.
For this reason, detecting and accounting for artifacts associated with changes in observing practice is an important and necessary part of building climate datasets. In GHCNm v4, shifts in monthly temperature series are detected through automated pairwise comparisons of the station series using the algorithm described in Menne and Williams (2009). This procedure, known as the Pairwise Homogenization Algorithm (PHA), systematically evaluates each time series of monthly average surface air temperature to identify cases with abrupt shifts in one station’s temperature series (the “target” series) relative to many other correlated series from other stations in the region (the “reference” series). The algorithm seeks to resolve the timing of shifts for all station series before computing an adjustment factor to compensate for any one particular shift. These adjustment factors are based on the average change in the magnitude of monthly temperature differences between the target station series with the apparent shift and the reference series with no apparent concurrent shifts.
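The core idea of comparing a target series against reference series can be sketched in a few lines. This is a deliberately simplified illustration, not the Pairwise Homogenization Algorithm itself: it assumes the timing of the shift (`break_index`) is already known, whereas the real algorithm resolves shift timing statistically across the whole network, and all function names here are hypothetical.

```python
from statistics import mean


def difference_series(target, reference):
    """Month-by-month difference between a target and a reference series."""
    return [t - r for t, r in zip(target, reference)]


def shift_estimate(diff, break_index):
    """Change in the mean of a difference series across an assumed break."""
    return mean(diff[break_index:]) - mean(diff[:break_index])


def adjustment_factor(target, references, break_index):
    """Average the shift estimates from each target-minus-reference series.
    References are assumed to have no shift of their own near break_index."""
    estimates = [
        shift_estimate(difference_series(target, ref), break_index)
        for ref in references
    ]
    return mean(estimates)


# Toy data: the target warms by 1.0 deg C after month 3; the references do not.
target = [10.0, 10.0, 10.0, 11.0, 11.0, 11.0]
references = [
    [9.0, 9.0, 9.0, 9.0, 9.0, 9.0],
    [12.0, 12.0, 12.0, 12.0, 12.0, 12.0],
]
factor = adjustment_factor(target, references, break_index=3)
```

Because regional climate variations are shared by the target and its references, they largely cancel in the difference series, which is why a station-specific shift stands out more clearly there than in the raw target series.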
(Right) Total uncertainty for GHCNm v4 mean annual Global Land Surface Air Temperature anomalies. Darker grays show homogenization uncertainties (parametric and missed breaks) and the lighter grays show anomaly and spatial coverage uncertainties. The uncertainties are displayed as cumulative, so the uncertainty bounds depicted in each lighter shade include the uncertainty of the darker shades (see Menne et al. 2018 for details).
- GHCNm v4
- Menne, M. J., C. N. Williams, B. E. Gleason, J. J. Rennie, and J. H. Lawrimore, 2018: The Global Historical Climatology Network Monthly Temperature Dataset, Version 4. J. Climate, in press. doi:10.1175/JCLI-D-18-0094.1.
- GHCNm v3
- Lawrimore, J. H., M. J. Menne, B. E. Gleason, C. N. Williams, D. B. Wuertz, R. S. Vose, and J. Rennie, 2011: An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3, J. Geophys. Res., 116, D19121, doi:10.1029/2011JD016187.
- GHCNm v2
- Peterson, T. C., and R. S. Vose, 1997: An overview of the Global Historical Climatology Network temperature database. Bull. Amer. Meteor. Soc.,78, 2837–2849.
- GHCNm v1
- Vose, R. S., R. L. Schmoyer, P. M. Steurer, T. C. Peterson, R. Heim, T. R. Karl, and J. Eischeid, 1992: The Global Historical Climatology Network: Long‐term monthly temperature, precipitation, sea level pressure, and station pressure data, ORNL/CDIAC‐53, 325 pp., Carbon Dioxide Inf. Anal. Cent., Oak Ridge, Tenn.
- Menne, M. J., I. Durre, B. E. Gleason, T. Houston, and R. S. Vose, 2012: An overview of the Global Historical Climatology Network Daily dataset. J. Atmos. Oceanic Technol., 29, 897–910, doi:10.1175/JTECH-D-11-00103.1.
- Pairwise Homogenization Algorithm (PHA)
- Menne, M. J., and C. N. Williams, 2009: Homogenization of temperature series via pairwise comparisons, J. Climate, 22, 1700–1717, doi:10.1175/2008JCLI2263.1.
- Williams, C. N., M. J. Menne, and P. W. Thorne, 2012: Benchmarking the performance of pairwise homogenization of surface temperatures in the United States, J. Geophys. Res., 117, D05116, doi:10.1029/2011JD016761.
- ISTI Databank
- Rennie, J. J., Lawrimore, J. H., Gleason, B. E., Thorne, P. W., Morice, C. P., Menne, M. J., Williams, C. N., de Almeida, W. G., Christy, J. R., Flannery, M., Ishihara, M., Kamiguchi, K., Klein-Tank, A. M. G., Mhanda, A., Lister, D. H., Razuvaev, V., Renom, M., Rusticucci, M., Tandy, J., Worley, S. J., Venema, V., Angel, W., Brunet, M., Dattore, B., Diamond, H., Lazzara, M. A., Le Blancq, F., Luterbacher, J., Mächel, H., Revadekar, J., Vose, R. S., and Yin, X. (2014), The international surface temperature initiative global land surface databank: monthly temperature data release description and methods. Geoscience Data Journal, 1, 75–102. doi: 10.1002/gdj3.8.
Unadjusted and Adjusted GHCN monthly Data
- Data and inventory files
- How to decompress and extract files (includes description of inventory file and format of data files [measurement, quality, and source flags])
Station vs Neighbor Comparison Graphs
- Graphs of station versus neighbor comparisons (with additional information on station-level bias adjustments)
- Explanation of Graphs
With the development of GHCNm v3, new quality control (QC) procedures were instituted using methods established as part of other dataset development efforts during the past five years. Scientists created the QC algorithms based on methods used for the GHCN daily temperature dataset and subsequently applied to USHCN-Monthly Version 2 data (Menne et al. 2009). The QC process used in GHCNm v3 consists of the seven checks listed in Table 1. We perform the QC algorithms in the order listed in the table. Scientists selected the thresholds shown in column two based on their performance evaluated using the method outlined in Durre et al. (2008).
The GHCN–M version 3 temperature data make use of processing improvements that included a new method for the homogenization of temperature data. Automated pairwise comparisons of mean monthly temperature series (Menne and Williams 2009) form the basis for adjustments to the apparent impacts of documented and undocumented inhomogeneities.
In this approach, numerous combinations of temperature series in a region are compared to identify cases of abrupt shifts in one station series relative to many others. The algorithm starts by forming a large number of pairwise difference series between serial monthly temperature values from a region.
In an automated and reproducible way, each difference series undergoes statistical evaluation for abrupt shifts. After the algorithm has identified all of the shifts and attributed each to the appropriate station within the network, adjustments are applied to each target shift. Adjustments are determined by estimating the magnitude of change in pairwise difference series between the target series and highly correlated neighboring series that have no apparent shifts at the same time.
- Durre, I., M.J. Menne, and R.S. Vose, 2008: Strategies for evaluating quality assurance procedures. Journal of Applied Meteorology and Climatology, 47, 1785–1791, doi:10.1175/2007JAMC1706.1.
- Easterling, D.R., and T.C. Peterson, 1995: A new method for detecting undocumented discontinuities in climatological time series. International Journal of Climatology, 15, 369–377. doi:10.1002/joc.3370150403.
- Lawrimore, J.H., M.J. Menne, B.E. Gleason, C.N. Williams, D.B. Wuertz, R.S. Vose, and J. Rennie, 2011: An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3. Journal of Geophysical Research, 116, D19121, doi:10.1029/2011JD016187.
- Menne, M.J., and C.N. Williams Jr., 2009: Homogenization of temperature series via pairwise comparisons. Journal of Climate, 22, 1700–1717, doi:10.1175/2008JCLI2263.1.
- Peterson, T.C., and D.R. Easterling, 1994: Creation of homogeneous composite climatological reference series. International Journal of Climatology, 14, 671–679, doi:10.1002/joc.3370140606.
- Peterson, T.C., and R.S. Vose, 1997: An overview of the Global Historical Climatology Network temperature database. Bulletin of the American Meteorological Society, 78, 2837–2849, doi:10.1175/1520-0477(1997)078%3C2837:AOOTGH%3E2.0.CO;2.
The following journal articles describe the methods used in developing the GHCN monthly Temperature dataset:
- Peterson, T. C., and R. S. Vose, 1997: An overview of the Global Historical Climatology Network temperature database. Bulletin of the American Meteorological Society, 78, 2837-2849, doi:10.1175/1520-0477(1997)078%3C2837:AOOTGH%3E2.0.CO;2.
- Peterson, T.C., R. Vose, R. Schmoyer, and V. Razuvaev, 1998: Global Historical Climatology Network (GHCN) quality control of monthly temperature data. International Journal of Climatology, 18, 1169-1179, doi:10.1002/(SICI)1097-0088(199809)18:11<1169::AID-JOC309>3.0.CO;2-U
Version 2 Bias Correction Software
The automated bias correction software (Peterson and Easterling, 1994; Easterling and Peterson, 1995) was used to detect and adjust for documented and undocumented inhomogeneities in the GHCN monthly version 2 temperature dataset. Please refer to the readme file in this directory for information on this software.
- Easterling, D. R., and T. C. Peterson, 1995: A new method for detecting undocumented discontinuities in climatological time series. International Journal of Climatology, 15, 369-377, doi:10.1002/joc.3370150403.
- Peterson, T. C., and D. R. Easterling, 1994: Creation of homogeneous composite climatological reference series. International Journal of Climatology, 14, 671-679, doi:10.1002/joc.3370140606.
The Global Historical Climatology Network (GHCN) Monthly Precipitation, Version 4 is a collection of worldwide monthly precipitation values offering significant enhancement over the previous version 2. It contains more values both historically and for the most recent months. Its methods for merging records and quality control have been modernized.
The data set is updated typically in the first week of each month and available through the links below.
Precipitation Data (Beta)
Utilities and Inventory Files
- Station Inventory: Location information for each record.
- Read-me: Describes file formats and defines various data flags.
Version 4 combines data from a variety of sources for a total of over 118,000 monthly precipitation stations, compared to 20,590 in v2.
Scientists frequently obtain a precipitation time series for a given station from more than one source. For example, rainfall data for Beijing were available in three different source datasets. In brief, duplicate stations were identified by comparing each station with all the other stations in all source datasets. The similarity between stations was described using several statistics, including the number of identical months of data, the length of the longest run of identical months, and the number of identical values that were zero. These diagnostic statistics, in conjunction with station metadata, were used to determine subjectively whether stations were duplicates. In most cases, the decision was relatively straightforward, although a few degenerate time series proved more challenging.
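The three similarity statistics named above can be computed in a single pass over two aligned monthly series. The following is a minimal sketch under stated assumptions: the two series cover the same months in the same order, missing months are represented as `None`, and the function and key names are hypothetical rather than taken from the GHCN processing code.

```python
def similarity_stats(series_a, series_b):
    """Count identical months, the longest run of identical months,
    and identical months where the shared value is zero."""
    identical = 0
    identical_zeros = 0
    longest_run = 0
    current_run = 0
    for a, b in zip(series_a, series_b):
        if a is not None and b is not None and a == b:
            identical += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
            if a == 0:
                identical_zeros += 1
        else:
            # A differing or missing month ends the current run.
            current_run = 0
    return {
        "identical_months": identical,
        "longest_identical_run": longest_run,
        "identical_zeros": identical_zeros,
    }


source_1 = [0.0, 12.5, 30.1, None, 0.0, 5.0]
source_2 = [0.0, 12.5, 29.9, 4.0, 0.0, 5.0]
stats = similarity_stats(source_1, source_2)
```

Counting identical zeros separately is useful because dry months match trivially between unrelated stations in arid regions, so a high count of identical zeros is weaker evidence of duplication than a long run of identical nonzero totals.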
Staff used a variety of tests to assess data quality. The first step involved comparing stations with a gridded climatology and plotting the stations for visual inspection. Both of these processes uncovered mislocated stations as well as stations that had been digitized 6 months out of phase. Additionally, each time series was tested for significant discontinuities using the Cumulative Sum test (which looks for changes in the mean) and an analogous test that looks for changes in the variance or scale. Each time series was also evaluated for runs of three or more months of the same nonzero value. Finally, scientists evaluated each individual precipitation total to determine if it was an outlier in space and/or time using a variety of nonparametric statistics.
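The idea behind a cumulative sum test can be shown with a short sketch. It accumulates deviations from the series mean and takes the point where the cumulative sum is farthest from zero as a candidate change in the mean. This illustrates the general technique only: the significance testing used in the actual quality control is omitted, and the function names are hypothetical.

```python
from statistics import mean


def cusum(values):
    """Cumulative sum of deviations from the series mean."""
    m = mean(values)
    total = 0.0
    sums = []
    for v in values:
        total += v - m
        sums.append(total)
    return sums


def candidate_change_point(values):
    """Index of the first value after the most likely shift in the mean,
    taken where the cumulative sum is farthest from zero."""
    sums = cusum(values)
    peak = max(range(len(sums)), key=lambda i: abs(sums[i]))
    return peak + 1


# Toy series: the mean doubles after the fourth value.
totals = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0]
sums = cusum(totals)
change_at = candidate_change_point(totals)
```

For a series with a stable mean the cumulative sum wanders near zero, whereas a sustained shift produces a steady drift away from zero that reverses at the shift, which is what makes the extreme of the cumulative sum a natural candidate location.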
For questions specific to GHCNM, please email NCDC.GHCNM@noaa.gov.
Citing and Metadata
Information on how to cite the dataset and view the Metadata.
Scientists first developed the Global Historical Climatology Network, Monthly (GHCNm) temperature dataset in the early 1990s (Vose et al. 1992). Version 2 was released in 1997 following extensive efforts to increase the number of stations and length of the data record (Peterson and Vose, 1997). This update also introduced quality assurance techniques to remove inhomogeneities from the data record associated with non-climatic influences such as changes in instrumentation, station environment, and observing practices that occur over time (Peterson and Easterling, 1994; Easterling and Peterson 1995). Subsequent updates have continued to improve dataset development methods, including new quality control processes and advanced techniques for removing data inhomogeneities (Menne and Williams, 2009). GHCNm version 3 replaced GHCNm version 2 as the operational dataset for climate monitoring activities on May 2, 2011.