How to efficiently download HYCOM reanalysis


Aragon Caminero, German

May 22, 2019, 10:31:08 AM
to fo...@hycom.org

Hello there,

 

I’m having problems trying to download this product: https://www.hycom.org/dataserver/gofs-3pt1/reanalysis with ref.: GLBv0.08 53.X

 

We want to download the full reanalysis (lon/lat/time), taking only 8 vertical layers by means of a stride of 5. We activated nc4 conversion and added lon/lat variables for CF conventions. We selected these variables:

· surf_el
· salinity
· water_temp
· water_u
· water_v

 

An example of the first URL, for the year 1994, would be:

http://ncss.hycom.org/thredds/ncss/GLBv0.08/expt_53.X/data/1994?var=surf_el&var=salinity&var=water_temp&var=water_u&var=water_v&horizStride=1&time_start=1994-01-01T12%3A00%3A00Z&time_end=1994-01-01T21%3A00%3A00Z&timeStride=1&vertStride=5&addLatLon=true&accept=netcdf4

 

I wrote a small Python program for this based on the NetCDF Subset Service (NCSS). We request a 12-hour dataset in each request and loop over the entire year.

The Python 3 code I am using is the “start_serie.py” attached to this email. I am using the requests library to fetch these URLs, but we are having a lot of problems with the server's responses, and it normally gets stuck during a request…
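For reference, this is the kind of loop the script implements: build one NCSS URL per 12-hour window and stream the response to disk, retrying on transient failures. This is a minimal standard-library sketch, not the attached start_serie.py itself; the function names and retry policy are my assumptions.

```python
# Minimal NCSS download loop (illustrative; start_serie.py may differ).
import time
import urllib.parse
import urllib.request
from datetime import datetime

NCSS_BASE = "http://ncss.hycom.org/thredds/ncss/GLBv0.08/expt_53.X/data"

def ncss_url(year, t0, t1):
    """Build one NCSS request URL for the time window [t0, t1]."""
    params = [("var", v) for v in
              ("surf_el", "salinity", "water_temp", "water_u", "water_v")]
    params += [("horizStride", "1"),
               ("time_start", t0.strftime("%Y-%m-%dT%H:%M:%SZ")),
               ("time_end", t1.strftime("%Y-%m-%dT%H:%M:%SZ")),
               ("timeStride", "1"),
               ("vertStride", "5"),
               ("addLatLon", "true"),
               ("accept", "netcdf4")]
    return f"{NCSS_BASE}/{year}?" + urllib.parse.urlencode(params)

def download(url, out_path, retries=3, backoff=60):
    """Stream the response to disk, pausing and retrying on failure."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=300) as resp, \
                 open(out_path, "wb") as f:
                while chunk := resp.read(1 << 20):  # 1 MiB chunks
                    f.write(chunk)
            return
        except OSError:
            time.sleep(backoff * (attempt + 1))
    raise RuntimeError(f"giving up on {url}")
```

Each call to `download(ncss_url(1994, t0, t1), "out.nc4")` fetches one 12-hour subset; the year loop just advances t0/t1.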

I also tried to run several processes on the same machine for different years in order to speed up the download, but that did not work either.

 

Is there any IP/request limit?

 

Is there any way to download this dataset more efficiently?

 

I would very much appreciate your help. Thank you in advance!

 


 

Germán Aragón Caminero

Oceanografía, Estuarios y Calidad del Agua

Instituto de Hidráulica Ambiental de la Universidad de Cantabria "IHCantabria"

 

Parque Científico y Tecnológico de Cantabria

C/ Isabel Torres, Nº 15

C.P. 39011 Santander

Phone: +34 942 20 16 16, ext. 1341

 


 

start_serie.py

Michael McDonald

May 24, 2019, 4:36:39 PM
to Aragon Caminero, German, fo...@hycom.org
This is not what NCSS is intended for. Please use OPeNDAP directly with a client-side subset utility (ncks) to download the entire global reanalysis. Note: this will be about 50 TB of data on disk and will still take a considerable amount of time to download.

The first step is to get a list of the individual files (not the virtually aggregated dataset) via an HTTP/FTP/rsync listing,

e.g.,
...

and then iterate over the lists with ncks, taking an OPeNDAP URL as input and writing a netCDF-4 file as output,

e.g.,
ncks -D 1 -4 -d depth,0,,5 -v surf_el,salinity,water_temp,water_u,water_v \
  -O <input-OPeNDAP-URL> hycom_GLBv0.08_539_2015010112_t000_d5stride_varsubset.nc4

(replace <input-OPeNDAP-URL> with a per-file OPeNDAP URL taken from the listing above)
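The iteration over the per-hour files can be scripted. A sketch in Python (matching the language of start_serie.py): build the ncks command for each hourly file and run it with subprocess. The OPeNDAP base URL and the filename pattern below are placeholders — take the real per-file URLs from the HTTP/FTP/rsync listing.

```python
# Iterate over hourly reanalysis files with ncks (requires NCO installed).
# OPENDAP_BASE and the filename pattern are illustrative assumptions.
import subprocess
from datetime import datetime, timedelta

OPENDAP_BASE = "http://tds.hycom.org/thredds/dodsC/GLBv0.08/expt_53.X/data"  # assumed

VARS = "surf_el,salinity,water_temp,water_u,water_v"

def ncks_command(opendap_url, out_path):
    """Build the ncks invocation: netCDF-4 output, every 5th depth level,
    only the requested variables."""
    return ["ncks", "-D", "1", "-4",
            "-d", "depth,0,,5",
            "-v", VARS,
            "-O", opendap_url, out_path]

def hourly_files(start, hours):
    """Yield (opendap_url, output filename) pairs for consecutive hourly
    files; the filename pattern is an assumption."""
    for h in range(hours):
        t = start + timedelta(hours=h)
        name = t.strftime("hycom_GLBv0.08_53X_%Y%m%d%H.nc")
        yield f"{OPENDAP_BASE}/{name}", name.replace(".nc", "_subset.nc4")

def download_range(start, hours):
    """Run ncks for each hourly file in turn."""
    for url, out in hourly_files(start, hours):
        subprocess.run(ncks_command(url, out), check=True)
```

`download_range(datetime(2015, 1, 1, 12), 24)` would then fetch one day; running a few such loops in parallel (one per year, say) is the natural way to scale it up.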

Please give this a try and let us know how it works for you. In my few tests, each one-hour file took less than 3 minutes to download and produced an output file of ~975 MB.





--
Michael McDonald
HYCOM.org Administrator