
Global Temperature and Salinity Profile Programme

The Global Temperature and Salinity Profile Programme (GTSPP) is an international cooperative developed by a group of marine and oceanic science organizations to provide researchers and marine operations managers with accurate, up-to-date temperature and salinity data. The World Meteorological Organization (WMO) and the Intergovernmental Oceanographic Commission (IOC) jointly manage the program’s network of capture, archive, and dissemination systems to ensure sustained quality control, storage, and access. NCEI maintains the program’s long-term archive by providing storage and quality control services to ensure that best copy versions of GTSPP data are properly preserved and available to the public.

GTSPP Interface

Use the GTSPP Interface to search the entire database by latitude and longitude, date range (1990-Present), season, data summary, and format. 


Real-Time Datasets

Weekly and monthly real-time temperature and salinity data received and processed by Integrated Science Data Management (ISDM) of Canada.

Data

HTTP | FTP | THREDDS

Temporal Resolutions
Weekly (NetCDF): 
gtspp4_rt_nc_yyyymmdd-YYYYMMDD.tgz

Monthly (NetCDF): 
gtspp4_rt_nc_yyyymm-YYYYMM.eds.gz

Best Copy Datasets

NCEI operates the GTSPP Long-Term Archive, providing stewardship and quality control services for data collected by ISDM and replacing low-resolution, real-time data with high-resolution, best copy equivalents when they become available.

Best copy duplicates are assembled from delayed mode data, including:

  • Full resolution shipboard XBT or CTD data, or
  • Fully processed and quality controlled data from the organizations that submitted the original real-time versions
Data

HTTP | FTP | THREDDS

Note: An outline of the best copy data hierarchy is available on the HTTP server.

Citation

Sun, C. and Co-Authors (2010). "The Data Management System for the Global Temperature and Salinity Profile Programme" in Proceedings of OceanObs'09: Sustained Ocean Observations and Information for Society (Vol. 2), Venice, Italy, 21-25 September 2009, Hall, J., Harrison, D.E., & Stammer, D., Eds., ESA Publication WPP-306, doi:10.5270/OceanObs09.cwp.86

Data Access Software

Ocean Data View

Ocean Data View (ODV) is a software package for the interactive exploration, analysis and visualization of oceanographic and other geo-referenced profile, time-series, trajectory or sequence data. ODV can display original data points or gridded fields based on the original data. It has two fast weighted-averaging gridding algorithms, as well as advanced DIVA gridding software. Gridded fields can be color-shaded and/or contoured. ODV supports five different map projections and can be used to produce high quality cruise maps.

ODV also supports the netCDF format and lets you explore and visualize CF, COARDS, GDT and CDC compliant netCDF datasets. This works with netCDF files on your local machine as well as with remote netCDF files served by an OPeNDAP server.


ncBrowse

ncBrowse is a free, easy-to-use Java application that provides flexible, interactive graphical displays of data and attributes from a wide range of netCDF data file conventions. It was written by Donald W. Denbo at PMEL, NOAA.

Features
  • Color lines by another variable
  • 3D visualizations and LAS data access
  • OPeNDAP (formerly known as DODS) support (test version)
  • MacOS X support (test version)
  • Variable mapping and animation (test version; please report any problems)
  • Directly accesses remote netCDF files using the HTTPClient library for connectivity
  • Designed to work with arbitrary netCDF files
  • Browses files using the EPIC and COARDS conventions
  • Provides a "tree" view of the netCDF file
  • Handles character variables
  • Handles dimensions without an associated variable
  • Uses sgt graphics to perform one- and two-dimensional cuts through data
  • Saves a complete variable, or a subset of a single variable, to a "cdl" text file
  • Saves a time-depth slice to a file in UNH format

Background

GTSPP is a cooperative international program that was developed to address the need for up-to-date, high quality ocean temperature and salinity data in the ocean science and marine operational communities researching sustainable development, climate change, and human and environmental safety.

These research requirements have led to increasingly complex, high-volume data collection programs with correspondingly significant quality control needs. At the same time, high-speed data communications and the World Wide Web have changed international data management: data and improved data and information products need to be available more quickly than in the past. GTSPP was created to provide this infrastructure.

Timeline

1990: 
Initiated jointly by IODE and IGOSS as a pilot project (through Recommendation IODE-XIII.4)

1996: 
Became a permanent program

2001: 
Jointly sponsored by JCOMM and IODE.

Infrastructure

Global Telecommunication System (GTS): 
Carries real-time data from ships and buoys to the IOC/WMO Integrated Global Ocean Services System (IGOSS).

IODE Data Centers: 
Contribute data, monitor the project, and distribute products.

Long-Term Archive Center (LTAC): 
Maintains the up-to-date global temperature-salinity data, replaces near real-time records with higher quality delayed-mode records as they are received, and creates and distributes data copies.

Data Product Center (DPC): 
Performs analysis of all the GTSPP data in the region of interest to assess its data quality consistency, provide feedback to data collectors about the results of the analysis, and prepare and distribute data products on a regular basis.

Goals

  1. Provide a timely and complete data and information base of ocean temperature and salinity profile data
  2. Implement data flow monitoring systems to improve the capture and timeliness of real-time and delayed-mode data
  3. Improve and implement agreed and uniform quality control and duplicates management systems
  4. Facilitate the development and provision of a wide variety of useful data analyses, data and information products, and datasets

Data Access Characteristics 

GTSPP consolidates ocean profile data into a single format with consistent quality control and duplicates processing. This allows ready comparison of CTD and Argo data, XBT with CTD, CTDs from instrumented animals with Argo, and so on. GTSPP also sends additional operational data from non-GTS sources to numerical weather forecasting services, and marine operations receive data in operational time frames for activities such as ship routing and fishing strategies.

The program also provides higher quality, timely datasets of temperature and salinity observations, which are used for seasonal to inter-annual forecasting. Science and engineering users receive higher quality and more timely datasets for strategic studies and design. GTSPP users include, but are not limited to, the Australian Bureau of Meteorology (BOM), the National Centers for Environmental Prediction, the Geophysical Fluid Dynamics Laboratory, the Southwest Regional Office of the U.S. National Marine Fisheries Service, the European Centre for Medium-Range Weather Forecasts, the National Marine Data and Information Service (NMDIS) of China, and the Japan Meteorological Agency (JMA).

Steering Group Membership

  • One representative from each core participating country (IOC Member States and WMO Members actively engaged in data and information exchanges with GTSPP).
  • Experts from one or more Members/Member States of other programs/projects relevant to GTSPP may accompany representatives.
  • Representatives invited by the SG from Member States of the IODE and JCOMM and representatives of relevant oceanographic projects.
  • The Chair is selected by the Steering Group and will be reviewed every two sessions.
  • Funding for participants and sessions of the SG will be provided by Members/Member States.

GTSPP Steering Group

Chair: 
Peter Chu

Long-Term Archive Center
  • National Centers for Environmental Information (NCEI), USA. Contact: Christopher Paver

Data Assembly Centers
  • Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia. Contact: Rebecca Crowley
  • Marine Environmental Data Services (MEDS), Canada. Contact: Mathieu Ouellet
  • Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER), France. Contact: Thierry Carval
  • Atlantic Oceanographic and Meteorological Laboratory (AOML), USA. Contact: Gustavo Goni

Data Product Center
  • Japan Meteorological Agency (JMA), Japan. Contact: Masaya Konishi
Advisory Committee
  • Molly Baringer
  • Guilherme Castelao
  • Lijing Cheng
  • Mauro Cirano
  • Yulong Liu
  • Franco Reseghetti
  • Reiner Schlitzer
  • Janet Sprintall

Terms of Reference

Revised Terms of Reference and Composition for the Steering Group on the Global Temperature and Salinity Profile Programme (GTSPP)

The Steering Group shall conduct the program for the collection and management of temperature and salinity data sets to support IODE (International Oceanographic Data and Information Exchange) and JCOMM (Joint Technical Commission for Oceanography and Marine Meteorology) requirements with the following Terms of Reference and general membership.

  1. Provide scientific and technical guidance for the program in the implementation and enhancement of the GTSPP including:
    • Near real time data (observations within 30 days) acquisition
    • Non real time data (observations older than 30 days or data never circulated on the Global Telecommunication System) acquisition
    • Communications infrastructures
    • Quality control and analysis procedures
    • Continuously managed database
    • Ocean data and metadata standards
    • Data and information products
  2. In conjunction with user groups and data collectors, design and implement data flow monitoring systems to ensure that the data are collected, processed and distributed according to agreed schedules and responsibilities.
  3. Collaborate with GCOS (Global Climate Observing System) and GOOS (Global Ocean Observing System) to assemble, process, and disseminate data managed by GTSPP.
  4. Actively promote the GTSPP and provide information to the users of GTSPP services, such as the planners of international science programs.
  5. Provide GTSPP status reports and other requested material to the IODE committee, to JCOMM ETDMP, and to international programs in which GTSPP is a participant.

Acknowledgments

We are very grateful to the following for contributions to the compilation of this scientific quality data set: Rick Bailey (CSIRO/BMRC JAFOOS), Lisa Cowen (CSIRO/BMRC JAFOOS), Yeun-Ho Chong Daneshzadeh (AOML), Ann Gronell (CSIRO/JAFOOS), Norman Hall (US NODC), Bob Keeley (MEDS), Melanie Hamilton (US NODC), Roger Menard (MEDS), Marguerette Schultz (SIO), Bob Molinari (AOML), Michael Simmons (US NODC), Don Spear (MEDS), Charles Sun (US NODC), Loic Petit de la Villeon (IFREMER), and Warren White (SIO).

Special thanks are extended to Thierry Carval (IFREMER), Steve Diggs (SIO), Peter Jackson (CSIRO), Doug Hamilton (US NODC), Gary Meyers (CSIRO), Helen Phillips (CSIRO), Yvette Raguenes (IFREMER), Jean-Paul Rebert (IRD), Neil Smith (Australia BMRC), Neville Smith (Australia BMRC), Edwina Tanner (Australia NODC) and Ron Wilson (MEDS) for their contributions.

Many agencies have played important roles in the development of the GTSPP system. The most important contributors are the collectors of the original data. Without their efforts, this compilation of data and information would not have been possible. Each participating agency carries out a number of functions in handling the data for the GTSPP.

Thanks are also due to other agencies who have provided data feeds, organization, and ideas in the development and running of the system. These include the Japanese Oceanographic Data Centre, the Japanese Meteorological Agency, the Bundesamt für Seeschiffahrt und Hydrographie, the Australian Oceanographic Data Centre, the Surface and Subsurface Data Centre in Brest, and the U.S. Fleet Numerical Oceanographic Center.

XBT Probe Type Codes

DPC$: 
Codes describing the XBT depth correction status

FRA$: 
Fall rate (correction factor for XBT probes)

FRE$: 
Code for fall rate equation used (WMO code 1770)

PEQ$: 
XBT fall rate equation (WMO code 1770)

PFR$: 
XBT probe type, fall rate equation and recorder type (WMO code 1770 and 4770)

PRT$: 
XBT probe type (WMO code 1770)

XEQ$: 
XBT fall rate equation used (WMO code 1770)

Codes are stored in the SURF_CODES structure of the ASCII and netCDF format files. Look for an SRFC_Code set to any of the above codes; the corresponding code values are stored in SRFC_Parm.

DPC$ indicates the status of depth correction, and its value will be one of the following states:

  • 01 = Known probe type, needs correction
  • 02 = Known probe type, no need to correct
  • 03 = Unknown probe type, not enough information to make changes
  • 04 = Known XBT probe type, correction was made
  • 05 = Unknown probe type, correction was made

If the code "PFR$" is present, look at the first 3 characters of the value in " SRFC_Parm" as these encode the probe type and the fall rate equation used. Compare these to WMO code table 1770 to determine which equation was used to calculate depth.

For example, the value of the PFR$ code could be 04205 where 042 means a Sippican T-7 probe (table 1770) and 05 means a MK12 recorder (table 4770). Note that 041 is also a Sippican T-7 but with different (older) fall rate equation coefficients.

Unless information is specifically present, you should assume the old fall rate equations have been used. XBTs that have used the new fall rate equations always have information about the probe, recorder, and the equations.
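
This decoding is mechanical enough to script. Below is a minimal sketch (not GTSPP-supplied code) that splits a PFR$ value into its probe and recorder codes; the lookup tables contain only the entries mentioned above, whereas a full decoder would load the complete WMO code tables 1770 and 4770.

# Minimal sketch: split a PFR$ value from SRFC_Parm into probe and recorder codes.
# Only the code table entries mentioned in the text are included here.

WMO_1770 = {  # probe type and fall rate equation
    "041": "Sippican T-7 (older fall rate coefficients)",
    "042": "Sippican T-7 (current fall rate coefficients)",
}
WMO_4770 = {  # recorder type
    "05": "MK12 recorder",
}

def decode_pfr(srfc_parm):
    """First 3 characters index WMO table 1770; the next 2 index table 4770."""
    probe = WMO_1770.get(srfc_parm[:3], "unknown probe code " + srfc_parm[:3])
    recorder = WMO_4770.get(srfc_parm[3:5], "unknown recorder code " + srfc_parm[3:5])
    return probe, recorder

print(decode_pfr("04205"))
# ('Sippican T-7 (current fall rate coefficients)', 'MK12 recorder')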

ASCII Format Description

The ASCII (character) format is moderately complex, because it incorporates metadata along with actual observations. It uses a number of international codes to describe data and collection circumstances. International vessel call sign tables and ship platform names are also provided.

Station Record Components

First Component 

Fixed number of fields, always present. Includes:

  • Location
  • Time of the station
  • Receipt
  • Number of repeats of other components found in the 'Station' record
Second Component
  • Number of station profiles
  • Duplicate flags
  • Variable accuracy and precision
  • Segment profiles for depth
Third Component

Carries information about other variables measured at the station, such as winds, air temperature, etc. These measurements are expressed as numeric values. A code table indicates the variable measured. This component can be repeated as many times as necessary to capture all numeric variables present.

Fourth Component

Carries information about other variables measured at the station which are recorded as alphanumerics, including Beaufort winds, QC tests executed, etc. This component can be repeated as many times as necessary to capture all alphanumeric variables present.

Fifth Component

Used to record the processing history of the station. This component includes documentation of any changes to a variable, including the date of the change, the person who made it, and the rationale behind it. If values are changed, the original value will also be stored. This component can be repeated as necessary.

Sample Layout with Specifications

Temperature and salinity profiles were collected at a station, with observations every meter to 3500 m depth. Wind speed and direction were measured, the Beaufort wind speed was recorded, and the station had 5 different actions taken against it.

  Component Contents
1 Station location, time and other information.
2 Two repeats of profile information, one for temperature and one for salinity
3 Two repeats, one for each of wind speed and direction
4 One repeat for the Beaufort code
5 Five repeats of the history information, one to describe each action taken against the record

Interpreting GTSPP ASCII Files

Linux/Unix Instructions:

Enter "gtspp2txt.pl -h" on the command line to open the help menu as shown below:

gtspp2txt.pl - Converts MEDS-ASCII to Text or CSV
Version 1.3 May 25, 2011
----------
Usage: gtspp2txt.pl [options]
 Options flags may be in any order, flags may be upper or lower case
 Space between flag and parameter is optional
 Input filename may be provided without the -i flag
 If gzip is installed, will read compressed input filename ending in .gz
  -i filename = Input MEDS-ASCII file (may be gzipped)
  -o filename = Output file
  -unw        = Text output with QC Surface Codes 'unwound' (Default)
  -txt        = Text output with no interpretation of surface codes
  -csv        = Comma Separated Values output
  -doc        = prints description of QC group expansion
  -h          = Help (Prints this message)
Examples:
  gtspp2txt.pl -i input.meds -o output.txt -unw
    (writes output to file as text with QC interpretation)
  gtspp2txt.pl -i input.meds -o output.txt -txt
    (writes output as text with no QC interpretation)
  gtspp2txt.pl -i input.meds -o output.txt -csv
    (writes output as comma separated values)
  gtspp2txt.pl input.meds.gz
    (reads gzipped file, writes text output to screen with QC interpretation)

Profile Records

There may be one or more 'Profile' records associated with each station record. The associated profile records always follow immediately after the station record to which they are linked.

First Component
  • Always present
  • Has a fixed number of fields
  • Repeats the station location and time
  • Identifies the profile type and segment of the profile
  • Indicates depths or pressures (if recorded) and the number of depth-variable pairs
Second Component
  • Records the depth and measured variable
  • Quality control flags that have been applied at each depth
  • This component can be repeated as often as necessary

Sample Layout with Specifications

  Component Contents
1 Station location, time, profile, and segment identifiers
2 Up to 1500 repeats of depth-variable information and associated quality control flags

NetCDF Description

All GTSPP data are located and represented by three spatial axes (longitude, latitude, and depth) and one temporal (time) axis. Information not included as an axis must be stored elsewhere within the data file if the file is to be self-describing.

Axes

Geographic axes in a GTSPP netCDF file are currently described by a numeric variable code included as an attribute in the data file. The disk file "epic.key" contains all EPIC variable codes with other related information. The numeric variable code is a unique identifier for the variable or axis, and is described below under Variables.

Each axis needs to be defined with a numeric code for EPS library V2.1 and earlier, and for PPLUS V1.2c and earlier. In future releases, axis variable codes may be replaced with units from UDUNITS.

Axis Conventions

Use the listed conventions to represent each axis in netCDF format.

Longitude: East Convention

Eastern longitudes: 
positive numbers (e.g., 170E is +170.0)

Western longitudes: 
negative numbers (e.g., 170W is -170.0)

Axis unit: 
degree_east

Latitude: North Convention

Northern latitudes: 
positive numbers (e.g., 10N is +10.0)

Southern latitudes: 
negative numbers (e.g., 10S is -10.0)

Axis unit: 
degree_north

Depth: Oceanographic Convention

Depth: 
positive numbers, increasing downward from the surface of the water towards the bottom of the ocean

Axis unit: 
dbar (pressure axis) or meters (depth axis)

Time Axis

WOCE standard, double real numeric array: 
"days since 1900-01-01 00:00:00 UTC"

Other supported time representation, used in the Argo netCDF convention: 
"days since 1950-01-01 00:00:00 UTC"

The EPIC system library uses a two-integer array to return the time axis from a data file:

Integer 1: 
True Julian Day Number with units of days

Integer 2: 
The number of milliseconds since 0000 GMT of the True Julian Day

Note: Oceanographers and meteorologists frequently confuse the True Julian Day (e.g., May 23, 1968 is 2,440,000), used by astronomers, with the "year-day" (e.g., Feb 2 is year-day 33).

Our double-dimensioned integer time word (word1=True Julian Day, word2=milliseconds since 0000 GMT of the True Julian Day) allows millisecond accuracy for time periods extending over centuries. There is a complete set of EPS routines for manipulation, calculation, and character string representation of this standard representation of time. Time axes can be written or read in either real or integer format.
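
For illustration, here is a minimal sketch (not part of the EPS library) that converts the two-integer time word to a calendar date, using the fact that the civil date of True Julian Day 2,440,588 is 1970-01-01:

from datetime import datetime, timedelta, timezone

# Minimal sketch: convert the EPIC two-integer time word to a calendar date.
# word1 = True Julian Day Number; word2 = milliseconds since 0000 GMT of that day.

JD_UNIX_EPOCH = 2440588  # True Julian Day whose civil date is 1970-01-01

def epic_time_to_datetime(jday, msec):
    return (datetime(1970, 1, 1, tzinfo=timezone.utc)
            + timedelta(days=jday - JD_UNIX_EPOCH, milliseconds=msec))

# The note above: True Julian Day 2,440,000 is May 23, 1968.
print(epic_time_to_datetime(2440000, 0))  # 1968-05-23 00:00:00+00:00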

Variables

Global Attribute, Variable Metadata, and Data Metadata Table

Quality Test Stages

Quality test review stages are conducted in ascending order of complexity, and the data are retested if a value or variable is corrected.

Test stage objectives
  1. Review profile position, time, and identification
  2. Resolve impossible variable values
  3. Examine the consistency of the incoming data with respect to references such as climatologies
  4. Examine adjacent profiles in an incoming file for similarities
  5. Manual review of the entire submission

In Stage 1, tests are applied sequentially. Tests within the other stages can be applied in any order, but the stages themselves should be performed in sequence. Each Quality Control Manual test is assigned an index number that is a power of two, so each test corresponds to a single binary digit (0 or 1).

  Stage 1: Location and Identification Tests
1.1 Platform Identification (1)
1.2 Impossible Date/Time (2)
1.3 Impossible Location (4)
1.4 Position on Land (8)
1.5 Impossible Speed (16)
1.6 Impossible Sounding (32)
  Stage 2: Profile Tests
2.1 Global Impossible Parameter Values (64)
2.2 Regional Impossible Parameter Values (128)
2.3 Increasing Depth (256)
2.4 Profile Envelop (512)
2.5 Constant Profile (1024)
2.6 Freezing Point (2048)
2.7 Spike (4096)
2.8 Top and Bottom Spike (8192)
2.9 Gradient (16384)
2.10 Density Inversion (32768)
2.11 Bottom (8388608)
2.12 Temperature Inversion (16777216)
  Stage 3: Climatology Tests
3.1 Levitus Seasonal Statistics (65536)
3.2 Emery and Dewar Climatology (131072)
3.3 Asheville Climatology (262144)
3.4 Levitus Monthly Climatology (524288)
3.5 Levitus Annual Climatology (33554432)
  Stage 4: Profile Consistency Tests
4.1 Waterfall (1048576)
  Stage 5: Visual Inspection
5.1 Cruise Track (2097152)
5.2 Profiles (4194304)

Example: If there are 10 tests and all are employed, the test number is 000003FF (the sum of the index numbers 1 through 512). Every profile includes a hexadecimal number that indicates which tests have been employed; it is the sum of the index numbers of the tests used.

Hexadecimal (also base 16, or hex) is a positional numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0–9 to represent values zero to nine, and A, B, C, D, E, F (or alternatively a–f) to represent values ten to fifteen (A=10 through F=15).

Interpreting QC Test Codes and Results

Quality Control (QC) test codes and results are stored in an eight-digit hexadecimal number, which can represent up to 32 different tests and results. Each hexadecimal digit encodes four (4) bits, and each bit represents 1 (one: test performed or failed) or 0 (zero: test not performed or passed).

GTSPP QC Test Codes and Results Interpretation Steps
  1. Convert the QC test codes (or results) from a hexadecimal to a decimal number
  2. Convert the decimal number to a binary number
  3. Reverse the binary number in accordance with the order of the GTSPP QC test numbers
  4. Map the reversed bits to the corresponding QC test numbers in ascending order and identify the test names and results accordingly

Example: The following table is a sample list of data quality control test codes (QCP$=41E1FFD) and results (QCF$=0000004) performed by Integrated Science Data Management in Canada.

  1. The first test, "Platform Identification", was performed (1=Y) and the test result showed "Passed" (0=P)
  2. The second test, "Impossible Date/Time", was not performed (0=N)
  3. The third test, "Impossible Location", was performed (1=Y) and passed (0=P)
  4. and so on...
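
These steps can be expressed compactly in code. Below is a minimal sketch (an assumed helper, not GTSPP software): int(..., 16) covers steps 1 and 2, and testing bit i (least significant bit first) replaces the explicit string reversal of steps 3 and 4, since bit i corresponds to the test whose index number is 2**i.

# Minimal sketch: decode a GTSPP QC mask (QCP$ or QCF$) into per-test values.
# Bit i (least significant first) corresponds to the test with index number 2**i.

GTSPP_TESTS = [            # ascending index-number order (Stage 1 subset)
    "1.1 Platform Identification",
    "1.2 Impossible Date/Time",
    "1.3 Impossible Location",
    "1.4 Position on Land",
    "1.5 Impossible Speed",
    "1.6 Impossible Sounding",
]

def decode_mask(hex_mask):
    value = int(hex_mask, 16)                 # steps 1-2
    return [(name, bool(value >> i & 1))      # steps 3-4
            for i, name in enumerate(GTSPP_TESTS)]

for name, done in decode_mask("41E1FFD"):     # QCP$: which tests were performed
    print(name, "performed" if done else "not performed")
# A QCF$ value such as "0000004" decodes the same way, with 1 meaning the
# test failed and 0 meaning it passed (or was never performed).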

QC Codes and Results Example

Quality Control Documents and Tables

GTSPP Quality Flags

Flag 0 (No QC): No quality control has been performed on this element. Flag 0 data are the level at which all data enter the working archive; they have not yet been quality controlled.

Flag 1 (Good data): The element appears to be correct. Flag 1 data are top quality data in which no malfunctions have been identified and all real features have been verified during quality control.

Flag 2 (Probably good data): The element appears to be probably good. Flag 2 data are good data in which some features (probably real) are present but unconfirmed, or in which minor malfunctions may be present but are small and/or can be successfully corrected without seriously affecting the overall quality of the data.

Flag 3 (Probably bad data): The element appears doubtful. Flag 3 data are suspect data in which unusual, and probably erroneous, features are observed.

Flag 4 (Bad data): The element appears erroneous. Flag 4 data are those in which obviously erroneous values are observed.

Flag 5 (Changed): The element has been changed. Flag 5 data have been altered by a QC Centre, with the original values (before the change) preserved in the history record of the profile.

Flags 6-8 (Reserved): Reserved for future use.

Flag 9 (Element missing): The element is missing. Flag 9 data indicate that the element is missing.

Delayed-Mode Data Duplicates Identification

Exact Duplicates Check

When possible, real-time data are replaced with high-resolution, delayed-mode equivalents. Each new file undergoes duplicate testing on:

  • Date and time (year, month, day, hour, minute)
  • Latitude and longitude (degrees, minutes, seconds, hemisphere)
  • Data type

Additionally, each record of the new file is compared to data in the GTSPP database to identify exact duplicate records. A database update file is then created from the input file, excluding all duplicates (whether within the file or between the file and the database).

Inexact Duplicates Check

Periodically, the GTSPP database is checked for inexact or near duplicate records in which two or more observations are:

  • Of the same data type
  • Within 15 minutes of each other in time
  • Within 5 kilometers of each other in distance

The following information from "near-duplicate" records is displayed on the screen for review:

  • NODC accession number (identifies the source dataset)
  • program identifier (software that performed the most recent operation on this record)
  • database load date
  • number of profiles (1 = temperature; 2 = temperature and salinity)
  • number of depth-data pairs
  • platform code
  • call sign
  • latitude
  • longitude
  • observation date and time
  • data type
  • GTSPP database unique station identifier

Real Time Duplicates Identification

GTSPP uses the Real Time Assembly and QC Centre from Integrated Science Data Management (ISDM) in Canada to identify duplicate reports in the BATHY (temperature only) and TESAC (temperature and salinity) data streams coming from the Global Telecommunications System (GTS).

The long term archive receives data from multiple sources to ensure that datasets are as comprehensive as possible. This process is thorough, but can create data duplicates that take up space and corrupt archival functions.

There are two tests designed to catch duplicates; to perform them, the system needs simultaneous access to identification, date-time, and position information. To process potential duplicates without sifting through thousands of observations, the GTSPP Real Time Assembly and QC Centre uses a series of tests to isolate and evaluate potential duplicates.

Testing

GTSPP Potential Duplicate Tests

  1. Check for matching platform identification (in this case, the ship call sign), date, and time. If these fields are the same, the reports are potential duplicates.
  2. Check for matching observation times and geographical position. If the times are within 15 minutes and the positions are within 5 km, the reports are potential duplicates.
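
Expressed in code, the two tests might look like the following minimal sketch (Python, with hypothetical report records; the operational system is the FORTRAN program described below):

from datetime import datetime
from math import radians, sin, cos, asin, sqrt

# Minimal sketch of the two potential-duplicate tests. The report dictionaries
# are hypothetical stand-ins for the fields the real system abstracts.

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat, dlon = p2 - p1, radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def test1_platform(a, b):
    """Test 1: matching call sign, date, and time."""
    return (a["call"], a["when"]) == (b["call"], b["when"])

def test2_fuzzy(a, b):
    """Test 2: times within 15 minutes and positions within 5 km."""
    return (abs((a["when"] - b["when"]).total_seconds()) <= 15 * 60
            and haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= 5.0)

a = {"call": "JCCX", "when": datetime(1990, 11, 15, 2, 28), "lat": 25.50, "lon": -123.27}
b = {"call": "JCC",  "when": datetime(1990, 11, 15, 2, 28), "lat": 25.50, "lon": -123.27}
print(test1_platform(a, b) or test2_fuzzy(a, b))  # True: caught by the fuzzy test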

Organize Data by Date-Time Sort

A duplicate testing FORTRAN program reads the first station in the input file, and all the subsequent stations that have an observation time within 15 minutes of the first station. As each station is read, several fields are abstracted and stored in a series of ring buffer arrays in named common storage.

Abstracted Fields

C     Common storage area and declarations for ring buffer used in the
C     duplicates identification
      COMMON/RING/KMSG_NO(500),KCALL(500),KDATE(500),
     +            KTIME(500),KLAT(500),KLONG(500)
      CHARACTER*10 KCALL,KDATE*8,KTIME*4
      REAL*4 KLAT,KLONG

KMSG_NO: 
A variable that counts up from 1 from the first message in the input file. Its use will be discussed below.

KCALL: 
Platform Identification

KDATE: 
Observation Date

KTIME: 
Observation Time

KLAT: 
Latitude

KLONG: 
Longitude

Ring buffer arrays are named for their storage pattern. Information from the first station is stored in array location 1, and each subsequent station's information is stored in the corresponding subsequent location.

Identify Potential Duplicates

Once every record within the 15 minute window is read, the first location in the array becomes the target observation and processing begins. Next, each subsequent observation is compared with the target using the two tests described above. If either test detects a potential duplicate, that report's KMSG_NO value is recorded on the potential duplicates list.
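
In outline, the windowed pass might look like this sketch (an assumed Python rendering, not the FORTRAN original), where is_potential_duplicate applies the two tests described earlier:

# Minimal sketch of the windowed comparison. Reports are assumed sorted by
# date-time; each target is compared only against later reports within the
# 15-minute window, and hits are recorded by message number (the KMSG_NO role).

def find_potential_duplicates(reports, is_potential_duplicate):
    hits = []
    for i, target in enumerate(reports):
        for j in range(i + 1, len(reports)):
            if (reports[j]["when"] - target["when"]).total_seconds() > 15 * 60:
                break  # sorted input: later reports are outside the window
            if is_potential_duplicate(target, reports[j]):
                hits.append((i + 1, j + 1))  # message numbers count from 1
    return hits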

Duplicate Review

Once potential duplicates are identified, the list is reviewed to identify true duplicates.

Tests

Compare Subsurface Information 

Designed to test pairs of observations. If the target and potential duplicate are within the selected location/time window but contain completely different subsurface information, the observation is not a duplicate.

Review Position Data 

If a potential duplicate initially identified by overlapping platform-date-time information falls outside the 5 km window, it is not a duplicate.

Example: data collected at the same time from different ships that are both tagged with the SHIP call sign.

The duplicates identification system uses a random access input file, which allows testers to read its contents non-sequentially using the Indexed Sequential Access Method (ISAM) supported by the DEC Alpha OpenVMS operating system and based on relatively simple access keys.

This method can be limiting because users can only access one key at a time, but this has not been an issue when using ISAM files to manage ocean data at MEDS.

The variable KMSG_NO is carried in the ring buffer and in the potential duplicates list, and serves as the ISAM key for retrieving a pair of observations, with their subsurface profiles, for comparison.

Considerations

Once observations and their corresponding subsurface profiles are retrieved, there are several potential outcomes to consider:

  • The algorithm identifies duplicates between the real time and delayed mode versions of an observation. Depth observations are usually different, and temperature and salinity measurements will probably differ in at least the first or second decimal place. This means that a straight comparison of depths and variable values at those depths cannot confirm that observations are not duplicates; however, potential duplicate profiles that have matching depths and variables at each depth can be confirmed as duplicates.
  • Each observation has a stream identification variable to distinguish between real time and delayed mode data, allowing the computer to determine whether the subsurface information from two observations should be the same or not.

If the messages come from different streams, one would not expect the subsurface information to be the same. If they come from the same stream, it should be the same.

Conclusions

Duplicates

Both observations come from the same stream type and the subsurface values for depth, temperature, and salinity are similar enough to confirm duplication, meaning:

  1. Identical values for depths and the variables observed at each depth, or
  2. The same type and number of variables observed as a function of depth (i.e., both have temperature only, or both have temperature and salinity)

Not Duplicates 

Both observations come from the same stream type, but the subsurface values for depth, temperature, and salinity are different enough to remove the pair from consideration as potential duplicates, meaning:

  1. The observation depth ranges don't overlap by at least 99%,
  2. The temperature and/or salinity observation depths are different, or
  3. More than 80% of the reported subsurface levels have a different depth, temperature, or salinity

Undetermined

The observations don't align with either conclusion, and the decision must be referred to an operator. This occurs when the two observations come from different stream types, or when neither of the cases above is satisfied. The operator receives a printed copy of the potential duplicates list, reviews it, and makes the appropriate decision.

The duplicates identification system runs as a batch job. Once the technician has reviewed the list, he or she applies the final decisions to the ISAM file used as input to the original batch run.
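
For illustration, the decision rules above might be sketched as follows; the profiles are hypothetical (depth, value) lists sorted by depth, and the tolerance for calling two levels "the same" is an assumption, since the text does not state one:

# Minimal sketch of the duplicate / not duplicate / undetermined decision.
# a and b are lists of (depth, value) pairs sorted by increasing depth.

def classify_pair(a, b, same_stream, tol=0.0):
    if not same_stream:
        return "undetermined"  # different streams are referred to an operator
    if len(a) == len(b) and all(abs(da - db) <= tol and abs(va - vb) <= tol
                                for (da, va), (db, vb) in zip(a, b)):
        return "duplicate"     # identical depths and values at each depth
    lo, hi = max(a[0][0], b[0][0]), min(a[-1][0], b[-1][0])
    span = max(a[-1][0] - a[0][0], b[-1][0] - b[0][0])
    if span > 0 and (hi - lo) / span < 0.99:
        return "not duplicate"  # depth ranges overlap by less than 99%
    differing = sum(1 for (da, va), (db, vb) in zip(a, b)
                    if abs(da - db) > tol or abs(va - vb) > tol)
    if differing > 0.8 * max(len(a), len(b)):
        return "not duplicate"  # more than 80% of levels differ
    return "undetermined"       # referred to the operator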

Potential Duplicates Example

 

MESSAGE GROUP NUMBER 4**************************************

 

********** NUMBER 1, ********** UNIQUE IDENT IS 0

Ident: JCC     90 Date/Time: 19901115/0228 QDT/QP/QR: 111    
Latitude: 25.50 Longitude: -123.27 Header:    
Profile: TEMP Segment: 01 No. Depths: 10 Deepest Depth: 450.0 Dup: D
0.0 0   25.70 0       42.0 0   25.70 0       60.0 0   23.70 0       79.0 0   23.20 0
100.0 0   21.70 0       128.0 0   19.20 0       178.0 0   16.80 0       200.0 0   16.50 0
400.0 0   10.00 0       450.0 0   9.60 0
Stream: FNBA Source: I Data type: BA Hist flag:  Update flag: S

 

********** NUMBER 2, ********** UNIQUE IDENT IS 0

Ident: JCCX    90 Date/Time: 19901115/0228 QDT/QP/QR: 111    
Latitude: 25.50 Longitude: -123.27 Header: SOVX01 RJtd    
Profile: TEMP Segment: 0 No. Depths: 10 Deepest Depth: 450.0 Dup: N
0.0 0   25.70 0       42.0 0   25.80 0       60.0 0   23.70 0       79.0 0   23.30 0
100.0 0   21.70 0       128.0 0   19.20 0       178.0 0   16.90 0       200.0 0   16.50 0
400.0 0   10.40 0       450.0 0   9.60 0
Stream: MEBA Source: I Data type: BA Hist flag:  Update flag: U

 

********** NUMBER 3, ********** UNIQUE IDENT IS 0

Ident: JCCX    90 Date/Time: 19901115/0228 QDT/QP/QR: 111
Latitude: 25.50 Longitude: -123.27 Header: SOVX01 RJtd
Profile: TEMP Segment: 01 No. Depths: 10 Deepest Depth: 450.0 Dup: D
0.0 0   25.70 0       42.0 0   25.80 0       60.0 0   23.70 0       79.0 0   23.30 0
100.0 0   21.70 0       128.0 0   19.20 0       178.0 0   16.90 0       200.0 0   16.50 0
400.0 0   10.40 0       450.0 0   9.60 0
Stream: MEBA Source: I Data type: BA Hist flag:  Update flag: S

 

********** NUMBER 4, ********** UNIQUE IDENT IS 0

Ident: JCCX    90 Date/Time: 19901115/0228 QDT/QP/QR: 111
Latitude: 25.50 Longitude: -123.27 Header:
Profile: TEMP Segment: 01 No. Depths: 10 Deepest Depth: 450.0 Dup: D
0.0 0   25.70 0       42.0 0   25.70 0       60.0 0   23.70 0       79.0 0   23.20 0
100.0 0   21.70 0       128.0 0   19.20 0       178.0 0   16.80 0       200.0 0   16.50 0
400.0 0   10.30 0       450.0 0   9.60 0
Stream: FNBA Source: I Data type: BA Hist flag:  Update flag: S

 

********** NUMBER 5, ********** UNIQUE IDENT IS 0

Ident: JCCX    90 Date/Time: 19901115/0228 QDT/QP/QR: 111
Latitude: 25.50 Longitude: -123.26 Header: SOVX01 RJtd
Profile: TEMP Segment: 01 No. Depths: 10 Deepest Depth: 450.0 Dup: D
0.0 1   25.70 1       42.0 1   25.80 1       60.0 1   23.70 1       79.0 1   23.30 1
100.0 1   21.70 1       128.0 1   19.20 1       178.0 1   16.90 1       200.0 1   16.50 1
400.0 1   10.00 1       450.0 1   9.60 1
Stream: NWBA Source: D Data type: BA Hist flag:  Update flag: D

 

In the above example, four potential duplicates of the target observation (the first BATHY) were identified. Each potential duplicate had the same:

  • Stream type (BATHY observation) and depth ranges
  • Date-time
  • Lat-long

This group of observations was referred to the technician because some of them have temperature differences at several depths. For example, at the 79 meter depth, the temperature is 23.20 in the fourth version of the BATHY and 23.30 in the fifth version of the BATHY.

Technicians decide which version makes the final dataset based on guidelines and prioritizations, such as selecting the version that covers the greatest range of depths or has the correct call sign. Note that the first message in the group has an incomplete call sign and was included in the group through the fuzzy time-fuzzy area test.

QC Software

interpretQC.pl


QCed

The Data Quality Cruise Editor (qced) software was designed for the Global Temperature and Salinity Profile Programme (GTSPP). It is written in IDL (Interactive Data Language) and allows an operator to view and edit temperature and salinity data from files in the GTSPP MEDS-ASCII format.


QCed Features
  • Map of ship position for visual inspection of the cruise.
  • Bar graph of the ship speeds between stations in the cruise.
  • Waterfall plot of neighboring profiles.
  • Profile plot overlaid on the World Ocean Atlas 2005 climatology and ETOPO5 Bathymetry plots.
  • Temp/Salinity plot when both are available.
  • Formatted text display of all fields from the data file.
  • Key metadata displayed in a scrolling list.
  • Performs a suite of automated data quality tests and displays "trouble lights" to draw operator attention to questionable data.
  • Operator may edit a) Time and Position and/or b) QC flags for temperature or salinity values.
  • In the output file, generates history records to document changes.
  • Science QC generates customized action codes in the history.

Help

How do I select stations with a quality flag of 'good' for the position and date/time?

The position and date/time quality variables are both single-character fields; good values are always flagged with a 1.

Position quality flag variables:

  • position_quality_flag in netCDF
  • Q_POS in the ASCII

Date and time quality flag variables:

  • time_quality_flag in netCDF
  • Q_Date_Time in ASCII
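
For netCDF files, a minimal sketch (assuming the netCDF4-python package, a hypothetical file name, and character-typed flags as described above) might look like:

import numpy as np
from netCDF4 import Dataset  # assumes the netCDF4-python package

# Minimal sketch: keep a station only when both single-character flags are '1'.

def station_is_good(path):
    with Dataset(path) as nc:
        pos = np.asarray(nc.variables["position_quality_flag"][...]).tobytes().decode()
        tim = np.asarray(nc.variables["time_quality_flag"][...]).tobytes().decode()
        return pos.strip() == "1" and tim.strip() == "1"

print(station_is_good("gtspp_station.nc"))  # hypothetical file name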

How do I select only stations with temperature and salinity profiles?

Look at the contents of the PROF structure of the station record and check for both temperature and salinity profiles. The variable PROF_TYPE is a four-character field that describes the profile: check for 'TEMP' for temperature and 'PSAL' for salinity.
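
As a minimal sketch, once the PROF_TYPE values have been read from a station's PROF structure (however the file is parsed), the check reduces to:

# Minimal sketch: a station qualifies only if its PROF_TYPE values include
# both a temperature ('TEMP') and a salinity ('PSAL') profile.

def has_temp_and_salinity(prof_types):
    types = {t.strip() for t in prof_types}
    return "TEMP" in types and "PSAL" in types

print(has_temp_and_salinity(["TEMP", "PSAL"]))  # True
print(has_temp_and_salinity(["TEMP"]))          # False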