Order Guide

HAS uses subdirectories to organize the data ordered from the NCEI Archive. Each order directory contains a text file listing all ordered files, plus subdirectories holding up to 100 files each. The list is named ‘fileList.txt’ and can be used in download scripts to automate retrieval of the files in each subdirectory.
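As a rough illustration of the layout described above, the following sketch estimates how many subdirectories an order contains from a local copy of fileList.txt. It assumes the list holds one file per line, which this guide does not state explicitly, and the function name count_subdirs is hypothetical.

```shell
# Sketch only: estimate the number of 100-file subdirectories in an order.
# Assumption: fileList.txt contains one ordered file per line.
count_subdirs() {
    # $1: path to a local copy of fileList.txt
    n=$(grep -c . "$1" || true)   # count non-empty lines (grep exits 1 when there are none)
    echo $(( (n + 99) / 100 ))    # round up to whole directories of 100 files
}

# Example usage:
#   count_subdirs fileList.txt
```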

1) User Interface examples

    1.1) Cyberduck
    1.2) Windows Explorer

2) Script examples

    2.1) wget mirror
    2.2) wget individual files
    2.3) Traditional FTP command-line
    2.4) lftp FTP utility


1.1) Cyberduck

Cyberduck is a free, open-source tool for transferring files over FTP. To download an entire order, open a connection to ‘ftp.ncdc.noaa.gov’ and check the ‘anonymous login’ box. Navigate to your order directory ‘/pub/has/HASxxxxxxxxx’ (where xxxxxxxxx is your order number). Select all subdirectories and the fileList.txt file by holding the Shift or Control key while clicking. Right-click and select ‘Download’ or ‘Download To’ to start the download.

1.2) Windows Explorer

Windows Explorer includes an FTP client. To download an order, enter the FTP URL for the order folder in the navigation bar. Then press the ‘Organize’ button and choose ‘Select All’. The selected files and directories can then be copied and pasted or dragged to a destination location.

2.1) wget mirror

To mirror the entire order directory using the wget command-line utility, use the following command:


wget -erobots=off -nv -m -np -nH --cut-dirs=2 --reject "index.html*" http://www1.ncdc.noaa.gov/pub/has/HAS010577159/


-erobots=off : ignore the server’s robots.txt exclusions
-nv : non-verbose output
-m : mirror (recursive download)
-np : don’t ascend to the parent directory when following links
-nH : don’t create the host directory
--cut-dirs=2 : remove the /pub/has directories in the local file structure
--reject "index.html*" : don’t download the index.html files, which are automatically generated by the web server

This method recreates the order directory and its subdirectories locally.
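Because the order number changes with every order, the mirror command above can be wrapped in a small helper. The sketch below is one way to do that; the function names order_url and mirror_order are hypothetical, and the check for "HAS" followed by nine digits is an assumption based on the HASxxxxxxxxx pattern and the examples in this guide.

```shell
# Sketch only: build the order URL from an order number and mirror it.
# The URL layout and wget flags are copied from the command above.
order_url() {
    # Assumption: an order directory is "HAS" followed by exactly nine digits.
    case "$1" in
        HAS[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "invalid order number: $1" >&2; return 1 ;;
    esac
    printf 'http://www1.ncdc.noaa.gov/pub/has/%s/\n' "$1"
}

mirror_order() {
    # Stop early if the order number does not look valid.
    url=$(order_url "$1") || return 1
    wget -e robots=off -nv -m -np -nH --cut-dirs=2 --reject "index.html*" "$url"
}

# Example usage:
#   mirror_order HAS010577159
```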

2.2) wget individual files

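To download every file in the order directory tree into a single local directory using the wget command-line utility, use the following command. Setting --cut-dirs to a value larger than the depth of the tree (100 here) strips all directory components from the saved paths: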

wget -erobots=off -nv -m -np -nH --cut-dirs=100 --reject "index.html*,fileList.txt" http://www1.ncdc.noaa.gov/pub/has/HAS010816322/

Or use the following script:

# This script downloads the fileList.txt file and iterates through the list of files
# to download each file individually.
#
# "-nv" means 'not verbose'
# "-O -" means send the output to standard out
for x in $(wget -nv -O - http://www1.ncdc.noaa.gov/pub/has/HAS000004599/fileList.txt);
do
wget -nv "http://www1.ncdc.noaa.gov/pub/has/HAS000004599/$x";
done
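A variation on the script above turns the file list into full URLs first and then hands them to a single wget process, which can also resume partial files. This is a sketch under assumptions: the function name list_to_urls is hypothetical, and it assumes each line of fileList.txt is a path relative to the order directory, as the script above implies.

```shell
# Sketch only: convert fileList.txt entries (read from standard input) into full URLs.
list_to_urls() {
    # $1: base URL of the order directory (a trailing slash is tolerated)
    base=${1%/}
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r name || [ -n "$name" ]; do
        name=$(printf '%s' "$name" | tr -d '\r')   # drop Windows-style carriage returns
        [ -n "$name" ] || continue                 # skip blank lines
        printf '%s/%s\n' "$base" "$name"
    done
}

# Example usage ("-i -" reads URLs from standard input, "-c" resumes partial downloads):
#   base=http://www1.ncdc.noaa.gov/pub/has/HAS000004599
#   wget -nv -O - "$base/fileList.txt" | list_to_urls "$base" | wget -nv -c -i -
```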

2.3) Traditional FTP command-line

open ftp.ncdc.noaa.gov (connect to the NCEI system)
anonymous (login userid)
user@internet (enter your e-mail address as the password)
binary (changes transfer mode to binary)
prompt (turns off the confirmation prompt before each file download)
cd /pub/has/HAS010577159/ (changes to the order directory)
cd 0001 (changes to the desired subdirectory)
mget * (downloads all files in the subdirectory)
cd ../0002 (changes to the next subdirectory)
mget *
... and repeat for each remaining subdirectory.
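The repeated cd and mget steps above can be generated rather than typed. The sketch below prints a command list for a given number of subdirectories; the function name ftp_batch is hypothetical, and it assumes subdirectories are numbered 0001, 0002, and so on, as in the session above. The prompt command is left out because the usage line passes -i, which turns off per-file prompting in common command-line ftp clients.

```shell
# Sketch only: print ftp commands that fetch every subdirectory of an order.
ftp_batch() {
    # $1: order directory name, $2: number of subdirectories, $3: e-mail used as password
    printf 'user anonymous %s\n' "$3"
    printf 'binary\n'
    i=1
    while [ "$i" -le "$2" ]; do
        # Assumption: subdirectories are four-digit, zero-padded numbers starting at 0001.
        printf 'cd /pub/has/%s/%04d\n' "$1" "$i"
        printf 'mget *\n'
        i=$((i + 1))
    done
    printf 'bye\n'
}

# Example usage (-n disables auto-login, -i disables per-file prompts):
#   ftp_batch HAS010577159 3 user@example.com | ftp -i -n ftp.ncdc.noaa.gov
```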

2.4) lftp FTP utility

To download with the lftp FTP utility, use either of the following methods.

Interactively:
lftp ftp.ncdc.noaa.gov (connect and log in anonymously to the NCEI system)
cd /pub/has/HAS010577159 (change to correct directory)
mirror . (mirror/download all data including subdirectories)

Single Command:
lftp -c "open ftp.ncdc.noaa.gov:/pub/has/HAS010577159; mirror ."

Additional information on ‘lftp’ usage can be found at: http://lftp.yar.ru/
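For large orders it may help to make the lftp download resumable. The sketch below builds the lftp command string for a given order and adds the mirror command's --continue option so that an interrupted run can pick up where it stopped. The function name lftp_mirror_script is hypothetical; the host and path layout are taken from the examples above.

```shell
# Sketch only: print an lftp command string that mirrors an order and can be re-run to resume.
lftp_mirror_script() {
    # $1: order directory name, e.g. HAS010577159
    printf 'open ftp.ncdc.noaa.gov; cd /pub/has/%s; mirror --continue .\n' "$1"
}

# Example usage:
#   lftp -c "$(lftp_mirror_script HAS010577159)"
```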