README FILE FOR HOURLY PRECIPITATION DATA (HPD) NETWORK Version 1.0 Beta

This is the beta release of NCEI’s Hourly Precipitation Dataset (HPD), known
historically as DSI-3240 (NCDC 2003). The new dataset combines the legacy data
from DSI-3240 with a new source of data collected from the same Fischer-Porter
gauges, which were upgraded to digital recording beginning in the mid-2000s.
As with the legacy 3240 dataset, this new dataset provides observations of
hourly precipitation amounts from 1948 to the present. Details are provided in
the Algorithm Theoretical Basis Document (HPD-Auto-v1-ATBD-20170201.pdf)
available on this ftp site.

--------------------------------------------------------------------------------
How to cite:

To acknowledge the specific version of the dataset used, please cite:

Hourly Precipitation Data (HPD) Network, Version 1. [indicate subset used
following decimal, e.g. Version 1.0] beta, NOAA National Centers for
Environmental Information. [access date].

--------------------------------------------------------------------------------
NOTE:

The engineered accuracy of the Fischer-Porter network (F&P) gauges is one
tenth of an inch. However, the stations in the F&P network measure
precipitation to one hundredth of an inch. The weighing gauge sensors are
susceptible to noise at levels below one tenth of an inch, but NCEI believes
a true precipitation signal can be identified at lighter amounts. Although it
is not always possible to distinguish noise from the true precipitation
signal, NCEI believes that in most cases it can determine an accurate
precipitation amount at totals as low as one hundredth of an inch. Users are
nevertheless cautioned that there is less confidence in hourly precipitation
amounts below one tenth of an inch.

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

I. DOWNLOAD QUICK START

Start by downloading "hpd-stations.txt," which has metadata for all stations.

Then download the following TAR file:

  - "hpd_all.tar.gz"

Then uncompress and untar the contents of the tar file,
e.g., by using the following Linux command:

tar xzvf hpd_all.tar.gz

The files will be extracted into a subdirectory under the directory where the
command is issued.

ALTERNATIVELY, if you only need data for one station:

  - Find the station's name in "hpd-stations.txt" and note its station
    identification code (e.g., FLAGSTAFF, AZ is "USC00023009"); and
  - Download the data file (i.e., ".hly" file) that corresponds to this code
    (e.g., "USC00023009.hly" has the data for FLAGSTAFF).
    Note that the ".hly" file is located in the "all" subdirectory.

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

II. CONTENTS OF ftp://ftp.ncdc.noaa.gov/pub/data/hpd/auto/v1/beta

all:                  Directory with ".hly" files for all of HPD
hpd_all.tar.gz:       GZIP-compressed TAR file of the files in the "all"
                      directory
hpd-stations.txt:     List of stations and their metadata (e.g., coordinates)
hpd-states.txt:       List of U.S. state codes used in hpd-stations.txt
hpd-version.txt:      File that specifies the current version of HPD
readme.txt:           This file
status.txt:           Notes on the current status of HPD

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

III. FORMAT OF DATA FILES (".hly" FILES)

Each ".hly" file contains data for one station. The name of the file
corresponds to a station's identification code. For example, "USC00313017.hly"
contains the data for the station with the identification code USC00313017.

Each record in a file contains one day of hourly data.
The variables on each line include the following:

------------------------------
Variable   Columns   Type
------------------------------
ID            1-11   Character
YEAR         12-15   Integer
MONTH        16-17   Integer
DAY          18-19   Integer
ELEMENT      20-23   Character
VALUE1       24-28   Integer
MFLAG1       29-29   Character
QFLAG1       30-30   Character
SFLAG1       31-31   Character
S2FLAG1      32-32   Character
VALUE2       33-37   Integer
MFLAG2       38-38   Character
QFLAG2       39-39   Character
SFLAG2       40-40   Character
S2FLAG2      41-41   Character
  .           .          .
  .           .          .
  .           .          .
VALUE24    231-235   Integer
MFLAG24    236-236   Character
QFLAG24    237-237   Character
SFLAG24    238-238   Character
S2FLAG24   239-239   Character
------------------------------

These variables have the following definitions:

ID         is the station identification code. Please see "hpd-stations.txt"
           for a complete list of stations and their metadata.

YEAR       is the year of the record.

MONTH      is the month of the record.

DAY        is the day of the record (i.e., the day of month).

ELEMENT    is the element type. Currently there is only one element type:

           HPCP = Hourly Precipitation total (hundredths of an inch)

VALUE1     is the value for the first hour of the day (i.e., the precipitation
           total during the time of day 00:00-01:00; missing = -9999). The
           units are hundredths of an inch.

MFLAG1     is the measurement flag for the first hour of the day. The possible
           values are:

           Blank = no measurement information applicable
           g     = a carry-over measurement flag from the DSI-3240 dataset,
                   which was used only on the very first hour of the month if
                   there was zero precipitation during that hour. The purpose
                   of this flag was mainly to indicate that the station was
                   functional and reporting during the month. Normally in
                   DSI-3240, zero precipitation amounts were not included in
                   the data file in order to save space. This HPD dataset does
                   include zero precipitation totals, both those assumed from
                   the DSI-3240 dataset and those determined from the digital
                   recordings of bucket level data.
           Z     = represents an "assumed" zero precipitation total.
                   Usually these are values from the DSI-3240 dataset. The
                   rule in that dataset was to "assume" a zero total for any
                   hour where nothing else was reported or indicated for that
                   hour, as long as the very first hour of the month had a
                   non-zero amount or a zero amount with the "g" measurement
                   flag. Zero amounts were omitted from the DSI-3240 dataset
                   in order to save disk space, which is no longer a concern.
           a     = represents the beginning hour of an accumulation period.
                   Multi-hour accumulations were sometimes reported in
                   DSI-3240, so they were carried over for the period of
                   record covered by DSI-3240 data when that was the best
                   information available. The data value for the beginning of
                   an accumulation period is set to missing (-9999).
           .     = represents an hour during an accumulation period, between
                   the beginning and ending hours of the accumulation period.
                   The data value in the midst of an accumulation period is
                   set to missing (-9999).
           A     = designates the end of an accumulation period. The
                   accumulation total for the period is given as the data
                   value.
           T     = trace of precipitation

QFLAG1     is the quality flag for the first hour of the day. The possible
           values are:

           Blank = did not fail any quality assurance check
           X     = failed global extreme exceedance check
           N     = failed negative precipitation check
           Y     = failed state extreme exceedance check (performed on daily
                   totals)
           K     = failed streak/frequent-value check
           G     = failed gap check
           O     = failed climatological outlier check
           Z     = flagged as a result of an official Datzilla investigation
           A     = the value is not an hourly precipitation total but rather
                   an accumulation total for a period greater than an hour in
                   duration and lasting through the end of this hour. (See
                   the measurement flag for the beginning time of the
                   accumulation period.)
           M     = indicates that the associated value at this observation
                   time is missing in the DSI-3240 dataset and no alternate
                   data source is available.
                   This is a carry-over indicator from DSI-3240 that allows
                   the user to distinguish between missing and deleted data
                   in that older system. (See the "D" quality flag.) However,
                   the most consistent way to identify hours of missing data
                   across the entire dataset is to test whether the
                   precipitation value is equal to the special missing value
                   of -9999.
           D     = indicates that the associated value at this time was
                   deleted by the DSI-3240 processing system. Usually this
                   was done manually by a trained meteorological technician
                   who made the decision using ancillary information and
                   experience.

SFLAG1     is the source flag for the first hour of the day. The possible
           values are:

           Blank = No source (i.e., data value missing)
           4     = DSI-3240
           6     = DSI-3260 (not used in current version)
           H     = derived from digital data from the NWS HPD network

S2FLAG1    is the secondary source flag for the first hour of the day. The
           possible values are:

           Blank = No source (i.e., no secondary source code applies)
           C     = hourly total is computed from high-temporal resolution
                   totals (e.g., computed from 15-min precip totals)

           When data are available for the same time from more than one
           source, the highest priority source is chosen according to the
           following priority order (from highest to lowest):
           H,4,6

VALUE2     is the value for the second hour of the day (i.e., time
           01:00-02:00)

MFLAG2     is the measurement flag for the second hour of the day.

QFLAG2     is the quality flag for the second hour of the day.

SFLAG2     is the source flag for the second hour of the day.

S2FLAG2    is the secondary source flag for the second hour of the day.

... and so on through the 24th hour of the day.

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

IV. FORMAT OF "hpd-stations.txt"

------------------------------------------------
Variable                    Columns   Type
------------------------------------------------
ID                             1-11   Character
LATITUDE                      13-20   Real
LONGITUDE                     22-30   Real
ELEVATION                     32-37   Real
STATE                         39-40   Character
NAME                         42-122   Character
WMO ID                      124-128   Character
NOMINAL SAMPLING INTERVAL   130-133   Character
N HOURS OFFSET FROM GMT     135-139   Character
------------------------------------------------

These variables have the following definitions:

ID         is the station identification code. Note that the first two
           characters denote the FIPS country code, the third character
           is a network code that identifies the station numbering system
           used, and the remaining eight characters contain the actual
           station ID.

           See "hpd-states.txt" for a list of state/territory codes.

           The network code has the following values:

           C = U.S. Cooperative Network identification number (last six
               characters of the GHCN-Daily ID)
           W = WBAN identification number (last five characters of the
               GHCN-Daily ID)

LATITUDE   is the latitude of the station (in decimal degrees).

LONGITUDE  is the longitude of the station (in decimal degrees).

ELEVATION  is the elevation of the station (in meters, missing = -999.9).

STATE      is the U.S. postal code for the state (for U.S. stations only).

NAME       is the name of the station.

WMO ID     is the World Meteorological Organization (WMO) number for the
           station. If the station has no WMO number (or one has not yet
           been matched to this station), then the field is blank.

NOMINAL SAMPLING INTERVAL
           is the typical time, in minutes, between samplings of the level
           of water in the gauge.

N HOURS OFFSET FROM GMT
           is the number of hours by which the station's local time is
           offset from GMT. Negative values indicate local time earlier
           than GMT.

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

V. FORMAT OF "hpd-states.txt"

------------------------------
Variable   Columns   Type
------------------------------
CODE          1-2    Character
NAME          4-50   Character
------------------------------

These variables have the following definitions:

CODE       is the POSTAL code of the U.S. state/territory where the station
           is located

NAME       is the name of the state or territory.

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

VI. REFERENCES

For additional information, please send an e-mail to hpd.ncdc@noaa.gov.
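
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

VII. EXAMPLE: READING THE FIXED-WIDTH FILES (ILLUSTRATIVE SKETCH)

The column tables in sections III and IV map directly onto string slices. The
Python sketch below shows one possible way to read a ".hly" record and a line
of "hpd-stations.txt". It is not part of the HPD distribution. The function
names, dictionary keys, the HourlyValue type, and the station ID "USC00999999"
are hypothetical, and the sample record is synthetic rather than real data.
The sketch assumes that blank flags may have been trimmed from the end of a
line, so it pads each line to full width before slicing.

```python
from typing import NamedTuple, Optional

MISSING_VALUE = -9999        # missing hourly value (section III)
MISSING_ELEVATION = -999.9   # missing elevation (section IV)


class HourlyValue(NamedTuple):
    hour: int                # 1-24; hour 1 covers 00:00-01:00
    value: Optional[int]     # hundredths of an inch, or None if missing
    mflag: str               # measurement flag ("" when blank)
    qflag: str               # quality flag ("" when blank)
    sflag: str               # source flag ("" when blank)
    s2flag: str              # secondary source flag ("" when blank)


def parse_hly_record(line: str) -> dict:
    """Split one fixed-width ".hly" record using the section III columns."""
    # Assumption: pad to 239 columns in case trailing blanks were trimmed.
    line = line.rstrip("\r\n").ljust(239)
    hours = []
    for n in range(24):
        start = 23 + 9 * n   # 0-based offset of VALUE(n+1), i.e. column 24 + 9n
        raw = int(line[start:start + 5])
        hours.append(HourlyValue(
            hour=n + 1,
            value=None if raw == MISSING_VALUE else raw,
            mflag=line[start + 5].strip(),
            qflag=line[start + 6].strip(),
            sflag=line[start + 7].strip(),
            s2flag=line[start + 8].strip(),
        ))
    return {
        "id": line[0:11],
        "year": int(line[11:15]),
        "month": int(line[15:17]),
        "day": int(line[17:19]),
        "element": line[19:23],
        "hours": hours,
    }


def parse_station_line(line: str) -> dict:
    """Split one "hpd-stations.txt" line using the section IV columns."""
    line = line.rstrip("\r\n").ljust(139)
    elevation = float(line[31:37])
    return {
        "id": line[0:11],
        "latitude": float(line[12:20]),
        "longitude": float(line[21:30]),
        "elevation_m": None if elevation == MISSING_ELEVATION else elevation,
        "state": line[38:40].strip(),
        "name": line[41:122].strip(),
        "wmo_id": line[123:128].strip(),
        # Typed as Character in section IV, so kept as stripped strings.
        "sampling_interval_min": line[129:133].strip(),
        "gmt_offset_hours": line[134:139].strip(),
    }


if __name__ == "__main__":
    # Synthetic record for a hypothetical station; not real observations.
    sample = ("USC00999999" "2016" "07" "15" "HPCP"
              "   12  H " "-9999    " + "    0  H " * 22)
    record = parse_hly_record(sample)
    for h in record["hours"][:2]:
        inches = None if h.value is None else h.value / 100.0
        print(record["id"], h.hour, inches, repr(h.sflag))
```

To convert a non-missing value to inches, divide it by 100. Values carrying
the measurement flag "A" are multi-hour accumulation totals and should not be
treated as hourly amounts.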