Problems accessing new GOM 31.0 data with OPeNDAP

Jason Roberts

May 1, 2012, 11:36:40 AM
to fo...@hycom.org

Hello HYCOM folks,

I’m having trouble accessing the new GOM 31.0 data with OPeNDAP (http://tds.hycom.org/thredds/dodsC/GOMl0.04/expt_31.0.html). Simply opening the URL in the browser takes 5-10 minutes. Querying any variable takes longer, if it completes at all. For example, this morning I was able to query the MT variable (it took a few minutes), but a request for the Depth variable never completed (I cancelled it after 20 minutes).

Do you have any idea when the OPeNDAP version of the data might be usable? It would be very helpful to me if it were usable by May 14. I am supporting someone who is actively using these data to plan NOAA larval surveys in the Gulf. The next cruise that would benefit from these data departs on May 14.

Thanks for your help,

Jason

Michael McDonald

May 3, 2012, 2:08:33 PM
to Jason Roberts, fo...@hycom.org
Jason,

> I’m having trouble accessing the new GOM 31.0 data with OPeNDAP
> (http://tds.hycom.org/thredds/dodsC/GOMl0.04/expt_31.0.html). Simply opening
> the URL in the browser takes 5-10 minutes. Querying any variables takes
> longer, if it completes at all. For example, this morning I was able to
> query the MT variable—it took a few minutes—but a request for the Depth
> variable never completed (I cancelled it after 20 minutes).

The hourly granularity of this GOMl0.04/expt_31.0 dataset is proving
to be quite I/O intensive on our THREDDS server. It's an experiment to
see if all the experiment's data files can reside in a single catalog
and let the THREDDS software auto aggregate (union). With over 15,000
files this can be tricky to optimize. I am working on optimizing this
catalog so that data requests load faster.


> Do you have any idea when the OPeNDAP version of the data might be usable?
> It would be very helpful to me if it was usable by May 14. I am supporting
> someone who is actively using these data to plan NOAA larval surveys in the
> Gulf. The next cruise that would benefit from these data departs on May 14.

In the meantime I would strongly suggest that you query the data
directly (w/o aggregation) using any of the following methods:

* OPeNDAP - legacy mode w/o aggregation
(open the link below in your web browser)
http://dap.hycom.org:8080/opendap/nph-dods/datasets/GOMl0.04/expt_31.0/data/

e.g., OPeNDAP URL,
(use this OPeNDAP URL in your application, just change the "YYYY_DDD_HH")
http://dap.hycom.org:8080/opendap/nph-dods/datasets/GOMl0.04/expt_31.0/data/archv.2012_118_00_3z.nc


* OPeNDAP - THREDDS w/o aggregation
(open the link below in your web browser)
http://tds.hycom.org/thredds/catalog/datasets/GOMl0.04/expt_31.0/data/catalog.html

e.g., OPeNDAP (THREDDS) URL,
(use this OPeNDAP URL in your application, just change the "YYYY_DDD_HH")
http://tds.hycom.org/thredds/dodsC/datasets/GOMl0.04/expt_31.0/data/archv.2012_118_00_3z.nc


* FTP (direct access, each hour of data is only ~170MB)
(open the link below in your web browser or FTP client)
ftp://ftp.hycom.org/datasets/GOMl0.04/expt_31.0/data/

e.g., FTP URL,
(wget/curl/etc this URL in your application, just change the "YYYY_DDD_HH")
ftp://ftp.hycom.org/datasets/GOMl0.04/expt_31.0/data/archv.2012_118_00_3z.nc
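
The "YYYY_DDD_HH" substitution in the URLs above can be scripted. The following is a minimal sketch, assuming DDD is the zero-padded day of year and HH the two-digit hour (this matches archv.2012_118_00_3z.nc, but the convention is not spelled out here). The helper name and the dictionary keys are hypothetical.

```python
from datetime import datetime

# Base URLs copied from the three access methods listed above.
BASES = {
    "opendap_legacy": "http://dap.hycom.org:8080/opendap/nph-dods/datasets/GOMl0.04/expt_31.0/data/",
    "opendap_thredds": "http://tds.hycom.org/thredds/dodsC/datasets/GOMl0.04/expt_31.0/data/",
    "ftp": "ftp://ftp.hycom.org/datasets/GOMl0.04/expt_31.0/data/",
}


def hourly_file_url(when, method="opendap_legacy"):
    """Return the URL of the single-hour file for `when` (hypothetical helper).

    Assumption: DDD is the zero-padded day of year (%j) and HH the hour (%H).
    """
    return BASES[method] + when.strftime("archv.%Y_%j_%H_3z.nc")


# 27 April 2012 is day 118 of the year.
print(hourly_file_url(datetime(2012, 4, 27, 0)))
```

Passing method="opendap_thredds" or method="ftp" yields the corresponding URL for the same hour.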


Hope this helps.

Michael McDonald

May 4, 2012, 12:34:47 PM
to Jason Roberts, fo...@hycom.org
Jason,

> It is possible for us to handle the aggregation on our end.

Not that I am aware of. I am fairly certain that you need direct file
access to aggregate/union these netCDF datasets. Even if it were
possible to remotely aggregate data (THREDDS remote catalog feature,
maybe?), the performance hit you would incur would be far worse than
our current delay of 30 seconds to 5 minutes while the system
scans/aggregates this GOM dataset.


> the legacy OPeNDAP is going to be taken down at some point...

No. The legacy access method will be around for as long as users want
it. We still have a few users that need this version for some legacy
applications. Eventually the machines/software requiring this older
version of opendap will retire and then it can be phased out. The
OPeNDAP server running with THREDDS essentially does the same job,
just with newer code, and it runs under tomcat (i.e., java).

The top two access methods that I previously mentioned essentially do
the same thing. The dap.hycom.org address uses an older OPeNDAP version
(compiled C code hasn't been touched since ~2008: XDODS-Server:
DAP2/DAP2/3.8.02, XOPeNDAP-Server: opendap/DAP2/3.8.02) and runs under
apache. It is very lightweight and can support multiple instances
(i.e., each request via this method gets its own apache PID). The
tds.hycom.org method just uses the latest and greatest OPeNDAP version
packaged with THREDDS (also based on the DAP 2.0 standard). However,
this OPeNDAP+THREDDS portal is susceptible to slowdowns when other
users are querying the aggregation catalogs (served out via this same
tomcat instance), or when the tomcat server is restarted daily @ noon
EST. Apache hardly ever needs to be restarted. There is never a need
to "update the catalog" via this traditional OPeNDAP access method. I
am working on provisioning separate servers and instances of tomcat so
that one access method does not affect the others. We've already done
this for the FTP service and have had great success (running solidly
on a separate server).


> OPeNDAP is going to be taken down at some point...

No. OPeNDAP is and will remain the preferred/recommended method for
accessing HYCOM data. Once the HYCOM global model increases its
resolution to 1/25 deg, the size of the dataset will make it
prohibitive to download directly via traditional FTP/HTTP means.
Subsetting via THREDDS/OPeNDAP will be much more heavily relied upon.
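
Subsetting over OPeNDAP works by appending a DAP2 constraint expression to the dataset URL, so that only the requested index range is transferred instead of the whole file. The following is a minimal sketch of building such a request. The helper name is hypothetical, and the use of the Depth variable with a one-dimensional index range of 0 to 4 is an assumption for illustration; the dataset's .dds response lists the real variables and dimension sizes.

```python
def dap2_subset_url(dataset_url, variable, slices, suffix="ascii"):
    """Build a DAP2 request URL for a hyperslab of one variable.

    `slices` holds one (start, stride, stop) index triple per dimension.
    DAP2 indices are zero-based and `stop` is inclusive.
    (Hypothetical helper, not part of any HYCOM or OPeNDAP library.)
    """
    hyperslab = "".join("[{}:{}:{}]".format(*s) for s in slices)
    return "{}.{}?{}{}".format(dataset_url, suffix, variable, hyperslab)


url = ("http://tds.hycom.org/thredds/dodsC/datasets/GOMl0.04/expt_31.0/"
       "data/archv.2012_118_00_3z.nc")

# Assumption: Depth is one-dimensional with at least five elements.
print(dap2_subset_url(url, "Depth", [(0, 1, 4)]))
```

Most OPeNDAP client libraries generate these constraint expressions themselves when a variable is sliced, so hand-built URLs are mainly useful for quick checks in a browser or with wget/curl.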


> Second question: the prior GOM data only released daily slices on OPeNDAP.
> Do you think you will do that for this new GOM data, either as a separate
> URL beside the hourly one, or as the fallback if you can't get the hourly
> aggregation working?

I suppose that if this hourly aggregation catalog is too sluggish for
everyone, we could try aggregating only @ 00z for the *primary*
catalog (like the previous one), and create a second full/hourly
catalog.
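
For a client reading the per-file (non-aggregated) URLs, the difference between an hourly and a 00z-only series comes down to which files are visited. The following is a minimal sketch, assuming the "archv.YYYY_DDD_HH_3z.nc" naming holds for every hour and that DDD is the zero-padded day of year. The helper name and the two-day date range are illustrative only.

```python
from datetime import datetime, timedelta

# Non-aggregated THREDDS OPeNDAP path for this experiment.
BASE = "http://tds.hycom.org/thredds/dodsC/datasets/GOMl0.04/expt_31.0/data/"


def file_urls(start, end, step_hours=1):
    """Yield per-file URLs from `start` to `end` inclusive (hypothetical helper)."""
    when = start
    while when <= end:
        yield BASE + when.strftime("archv.%Y_%j_%H_3z.nc")
        when += timedelta(hours=step_hours)


first = datetime(2012, 4, 27, 0)
last = datetime(2012, 4, 28, 23)

hourly = list(file_urls(first, last))                    # every hour
daily_00z = list(file_urls(first, last, step_hours=24))  # 00z only

print(len(hourly), len(daily_00z))
```

Over the two days in this example the hourly list has 48 entries and the 00z list has 2, which gives a rough sense of how much a 00z-only series reduces the number of files touched.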


> Finally, it does not look like the legacy OPeNDAP is up to date. The most
> recent file is archv.2012_118_00_3z.nc, with a modification time of
> 04-Apr-2012. Will this data start being regularly updated soon? (If not, we
> will not be able to use it for the NOAA mission that is happening almost
> immediately.)

Thanks. You caught my script error. Data was being published to a
folder not viewable from the web. I've updated the code with the
correct path and the data is now up to date (day 131 of 2012).

Jason Roberts

May 3, 2012, 3:37:46 PM
to Michael McDonald, fo...@hycom.org
Michael,

Thanks for your response. In the short term at least, it may be feasible for
us to use the legacy OPeNDAP. It is possible for us to handle the
aggregation on our end. In the long term, would it be better for us to use
the legacy one and handle aggregation ourselves or should we hold out for
the aggregated OPeNDAP? The aggregated OPeNDAP would be more convenient but
if it is not likely to work, it seems we wouldn't have a choice. On the
other hand, if the legacy OPeNDAP is going to be taken down at some point,
it would be hard for us to standardize on that.

Second question: the prior GOM data only released daily slices on OPeNDAP.
Do you think you will do that for this new GOM data, either as a separate
URL beside the hourly one, or as the fallback if you can't get the hourly
aggregation working?

Finally, it does not look like the legacy OPeNDAP is up to date. The most
recent file is archv.2012_118_00_3z.nc, with a modification time of
04-Apr-2012. Will this data start being regularly updated soon? (If not, we
will not be able to use it for the NOAA mission that is happening almost
immediately.)

Thanks very much for your help,

Jason
