Downloading ISCCP files (wget, curl, etc)

ken....@noaa.gov

Jul 19, 2017, 8:24:02 AM
to ISCCP-QA
How do I download lots of files at once?

ken....@noaa.gov

Jul 19, 2017, 8:26:12 AM
to ISCCP-QA
Files can be downloaded in many ways. Here are some ideas:

ISCCP Download Suggestions

Single file transfer

Web browser

Right-click the file link and select “Save link as”.

wget

Using the command “wget <URL>” seems to work for one file. For example:

wget https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp/hgm/ISCCP.HGM.v01r00.GLOBAL.1984.01.99.9999.GPC.10KM.CS00.EQ1.00.nc

curl

Like wget, curl also seems to work well with a single file. Use the “-O” option to save the file using the ISCCP filename.

curl -O https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp/hgm/ISCCP.HGM.v01r00.GLOBAL.1985.07.99.9999.GPC.10KM.CS00.EQ1.00.nc

Bulk file transfers

Web browser

Not really designed for bulk downloads.

wget

We have found the following to work with NCEI https files:

wget -r -c -nH -nd -np -A nc <URL>

The options are:

  • -r = recursive

  • -c = continue (in case it gets interrupted)

  • -nH = don’t create a host directory (such as www.ncei.noaa.gov/)

  • -nd = don’t recreate the remote directory tree; all files are stored in the current directory

  • -np = don’t go to the parent directory

  • -A nc = accept only files whose names end in “nc”, which should grab the netCDF files.

Not all of these options are required, but they make getting the files easier.

Example:

To grab all the ISCCP Basic HGM files:

wget -r -c -nH -nd -np -A nc https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp/hgm/
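If you run the bulk command often, it can help to wrap it in a small shell function. The following is only a sketch, not something we have tested against every product: the names isccp_bulk_get and DRY_RUN are made up for illustration, the -P option (wget's directory prefix) is an addition to the options listed above, and the assumption that other products sit next to hgm under the same base URL should be checked in a browser first.

```shell
#!/bin/sh
# Base URL taken from the HGM example; other product directories are an assumption.
BASE_URL="https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp"

# isccp_bulk_get PRODUCT [DEST_DIR]   (hypothetical helper)
# Runs the recursive wget command for one product directory.
# If DRY_RUN is set, the command is printed instead of executed.
isccp_bulk_get() {
    product="$1"        # e.g. hgm
    dest="${2:-.}"      # local directory for the files (default: current directory)

    # The trailing slash marks the product directory as the starting
    # directory, so -np keeps wget from wandering into sibling directories.
    set -- wget -r -c -nH -nd -np -A nc -P "$dest" "$BASE_URL/$product/"

    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage:
#   DRY_RUN=1 isccp_bulk_get hgm ./hgm    # show the command only
#   isccp_bulk_get hgm ./hgm              # actually download
```

Running it with DRY_RUN=1 first is a cheap way to check the URL before starting a long transfer.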


Some users have found that they need other options for bulk wget downloads. Some options that might be necessary include:

  • --no-check-certificate = don't validate the server's certificate

  • Options for cookies ( --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --keep-session-cookies)

Your system may require some or all of these and possibly others.

curl

Curl can download multiple files in one command when the filenames follow a known pattern, using bracketed ranges such as [01-12]. Note that it tries every combination of the ranges, whether or not the file exists.

Example:

curl -O 'https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp/hgm/ISCCP.HGM.v01r00.GLOBAL.1985.[01-12].99.9999.GPC.10KM.CS00.EQ1.00.nc'

This will download all months of 1985: the [01-12] range in the URL expands to 01 through 12.

Example:

To download all monthly files:

curl -O 'https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp/hgm/ISCCP.HGM.v01r00.GLOBAL.[1983-2009].[01-12].99.9999.GPC.10KM.CS00.EQ1.00.nc'

This will download all monthly files by cycling through the years and the months. However, it will also try to download the first six months of 1983, which have not yet been produced, so those requests will fail.
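One way to avoid requesting the missing 1983 months is to generate the list of URLs yourself and hand it to wget or curl. Here is a rough sketch, assuming the HGM record runs from July 1983 through December 2009 as in the example above; the function name isccp_hgm_urls and the file name urls.txt are just illustrative.

```shell
#!/bin/sh
# Directory URL copied from the HGM examples in this post.
HGM_URL="https://www.ncei.noaa.gov/data/international-satellite-cloud-climate-project-isccp-h-series-data/access/isccp/hgm"

# Print one URL per monthly HGM file, 1983-07 through 2009-12.
# (isccp_hgm_urls is a hypothetical helper, not part of any ISCCP tool.)
isccp_hgm_urls() {
    year=1983
    while [ "$year" -le 2009 ]; do
        month=1
        # The first six months of 1983 are not available, so start in July.
        [ "$year" -eq 1983 ] && month=7
        while [ "$month" -le 12 ]; do
            printf '%s/ISCCP.HGM.v01r00.GLOBAL.%04d.%02d.99.9999.GPC.10KM.CS00.EQ1.00.nc\n' \
                "$HGM_URL" "$year" "$month"
            month=$((month + 1))
        done
        year=$((year + 1))
    done
}

# Usage (uncomment to download):
#   isccp_hgm_urls > urls.txt
#   wget -c -i urls.txt                    # wget reads URLs from a file with -i
#   xargs -n 1 curl -f -O < urls.txt       # or one curl call per URL
```

The list has 318 entries (6 months of 1983 plus 26 full years), so it is easy to eyeball in a text editor before starting the transfer.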


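Reminder: Failed downloads in curl still produce files, each 475 bytes in size. These usually result from bad URLs, and the file most likely holds the server's error page rather than data. Adding curl's -f (--fail) option should make curl report the error instead of saving that page.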
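To find failed downloads after a bulk transfer, you can look for files that do not start like a netCDF file instead of relying on the exact size. This is a sketch under a couple of assumptions: it accepts both the classic netCDF signature (the bytes "CDF") and the HDF5 signature used by netCDF-4 (byte 0x89 followed by "HDF"), since this post does not say which variant the ISCCP files use, and the function name list_bad_nc is made up.

```shell
#!/bin/sh
# list_bad_nc [DIR]   (hypothetical helper)
# Print the names of *.nc files in DIR whose first bytes are not a
# netCDF signature, e.g. saved error pages or empty files.
list_bad_nc() {
    dir="${1:-.}"
    for f in "$dir"/*.nc; do
        [ -f "$f" ] || continue     # skip if the pattern matched nothing
        # Read the first four bytes as hex, e.g. "43444601" or "89484446".
        sig="$(od -An -tx1 -N4 "$f" | tr -d ' \t\n')"
        case "$sig" in
            434446*)  ;;            # "CDF": classic netCDF
            89484446) ;;            # 0x89 "HDF": HDF5 / netCDF-4
            *) printf '%s\n' "$f" ;;
        esac
    done
}

# Usage:
#   list_bad_nc .          # list suspect files in the current directory
```

Any file it prints is worth opening in a text editor; if it is an error page, check the URL and download that file again.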

Other apps

Other apps that simplify bulk file transfers will be listed here as we learn of them.

