CEDA OPeNDAP failed connections


Daniel Hobley

Feb 16, 2021, 10:24:21 AM
to sup...@ceda.ac.uk, sup...@opendap.org

Hello,

 

I am looking to use a scripted downloader to get at some CEDA (ceda.ac.uk) climate data through the OPeNDAP interface I gather you designed for them. My issue appears to be a duplicate of this, which I found deep in my websearches: https://groups.google.com/a/opendap.org/forum/embed/#!topic/support/V2WX-vclCfs

 

Essentially, I am trying to use curl (and indeed, I’ve tried R and Matlab embedded routines as well) to issue an OPeNDAP call to this server:

http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.html

As in the issue on the website I pasted above, I can get no adequate response from the server at all. If I paste the Data URL the Access Form provides, curl/netCDF recognises it as an invalid url. If I append “.ascii”, I seem to be able to get a response, but am then immediately unable to actually issue a call to OPeNDAP for only part of the data – which is what I need to do. Pressing the manual button in the Access Form works fine in all cases – but I have 1000 of these to work through, and each whole netCDF4 file is ~130Mb. That means an OPeNDAP call is my only hope of automating this.

 

Can you explain what’s going on here? Is this a known issue? It seems like it is, given that previous issue raised. Am I doing something obviously wrong?

 

(For completeness’ sake, the shortest version of what I tried is the instructions here: https://help.ceda.ac.uk/article/4442-ceda-opendap-scripted-interactions#cert. The permissions part of this works fine, but then I get these responses:

Curl –cert XXXX -L -c [url].nc -o testout  gives 400, Unrecognised request (this is exactly what the Data URL is given as on the Form)

Curl –cert XXXX -L -c [url].nc.ascii -o testout  works for some calls, but many of my files are >50Mb so this then gives 403, too big

Curl –cert XXXX -L -c [url].nc?tasmax[0:1:0][0:1:0][0:1:0][0:1:0] -o testout  returns an empty file – again, this is a Data URL output from the form

Curl –cert XXXX -L -c [url].nc.ascii?tasmax[0:1:0][0:1:0][0:1:0][0:1:0] -o testout  returns an empty file)

 

Thank you!

 

Dan Hobley

 

 

Daniel Hobley

Modelling & Informatics

 

RSK Bristol | The Old School | Stillhouse Lane | Bristol | BS3 4EB

 

Tel: +44 (0)7918 888121

 

www.adas.uk | @ADASGroup

 

 

 

 

 


 

Nathan Potter

Feb 16, 2021, 12:08:56 PM
to Daniel Hobley, Nathan Potter, sup...@ceda.ac.uk, sup...@opendap.org
Hi Daniel,


My comments are inline below.

> On Feb 16, 2021, at 7:24 AM, Daniel Hobley <Daniel...@adas.co.uk> wrote:
>
> Hello,
>
> I am looking to use a scripted downloader to get at some CEDA (ceda.ac.uk) climate data through the OPeNDAP interface I gather you designed for them. My issue appears to be a duplicate of this, which I found deep in my websearches: https://groups.google.com/a/opendap.org/forum/embed/#!topic/support/V2WX-vclCfs
>
> Essentially, I am trying to use curl (and indeed, I’ve tried R and Matlab embedded routines as well) to issue an OPeNDAP call to this server:
> http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.html
> As in the issue on the website I pasted above, I can get no adequate response from the server at all. If I paste the Data URL the Access Form provides, curl/netCDF recognises it as an invalid url. If I append “.ascii”, I seem to be able to get a response, but am then immediately unable to actually issue a call to OPeNDAP for only part of the data – which is what I need to do. Pressing the manual button in the Access Form works fine in all cases – but I have 1000 of these to work through, and each whole netCDF4 file is ~130Mb. That means an OPeNDAP call is my only hope of automating this.
>
> Can you explain what’s going on here? Is this a known issue? It seems like it is, given that previous issue raised. Am I doing something obviously wrong?


There are a couple of things going on.

The first issue is a possible misunderstanding about what we refer to as the dataset URL.

In your example it would be:

http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc

This dataset URL may return the underlying netCDF file, or it may return an HTTP 4XX error, depending on server configuration. Many data providers do not wish to provide download access through the DAP service but rather through a separate download endpoint, or not at all.

The dataset URL is utilized as a base URL by people and client software to form access queries for the dataset. For example one might look at the structural metadata for the dataset with this:
http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.dds

And the semantics metadata with this:
http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.das

The netcdf tool ncdump does exactly this when it responds to:

ncdump -h http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc

It's not actually getting into the file directly, but rather utilizing the (OPeN)DAP protocol to interrogate the server on the other end.
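
As a rough illustration of how a client treats the dataset URL as a base, here is a minimal Python sketch that appends a response-type suffix to it. The helper name dap2_request_url is made up for this example; it only builds strings and does not contact the server or handle authentication.

```python
# The dataset URL from this thread, used purely as a base string.
DATASET_URL = (
    "http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/"
    "rcp85/01/tasmax/day/latest/"
    "tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc"
)


def dap2_request_url(dataset_url, suffix):
    """Append a DAP2 response-type suffix such as "dds" or "das".

    Hypothetical helper: accepts the suffix with or without a leading dot.
    """
    return f"{dataset_url}.{suffix.lstrip('.')}"


dds_url = dap2_request_url(DATASET_URL, "dds")  # structural metadata
das_url = dap2_request_url(DATASET_URL, ".das")  # semantic metadata
print(dds_url)
print(das_url)
```

The point of the sketch is only that every request is the same dataset URL plus a suffix; which suffixes a given server accepts depends on that server.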


Second, your subsetting requests are most likely failing because you have not properly encoded the query string. About 18 months ago, the Apache Tomcat web engine (and others) began enforcing the HTTP specification's requirement that special characters be encoded in the query string, and that includes '[' and ']'.

So, for any recent release of Tomcat, this is a non-starter:

> [url].nc.ascii?tasmax[0:1:0][0:1:0][0:1:0][0:1:0]


You can confirm this by adding "-i" to the cURL request. Tomcat will return an HTTP status of 400 along with an empty response body if your URL is not correctly encoded.

I think if you try this:

[url].nc.ascii?tasmax%5B0%3A1%3A0%5D%5B0%3A1%3A0%5D%5B0%3A1%3A0%5D%5B0%3A1%3A0%5D

It will probably work.
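
If you would rather do the encoding in a script than by hand, the following is a minimal Python sketch using the standard library's urllib.parse.quote. The function name encode_constraint is hypothetical, and the sketch assumes a simple projection-only constraint like the one above (it would also encode any '&' or ',' in a more complex constraint, which may or may not be what you want).

```python
from urllib.parse import quote


def encode_constraint(constraint):
    """Percent-encode a DAP2 constraint expression for use after the '?'.

    safe="" tells quote() not to leave any reserved character unencoded,
    so '[', ']' and ':' all become %5B, %5D and %3A.
    """
    return quote(constraint, safe="")


encoded = encode_constraint("tasmax[0:1:0][0:1:0][0:1:0][0:1:0]")
print(encoded)
```

The printed string can then be appended to "[url].nc.ascii?" in place of the bracketed form.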

I find this useful for testing:

https://www.urlencoder.org

But cURL also has "--data-urlencode", which might be useful to you (combine it with "-G" so that the encoded data is sent in the query string of a GET request rather than as a POST body). Unfortunately, support for this encoding has only recently made it into the netcdf-java and netcdf-c libraries, which means that programs that rely on the netcdf-* libraries to access remote data, like R and Matlab, may not be using a netcdf library implementation recent enough to support URL encoding for array subsets. Y.M.M.V.


Fyi:

> tasmax[0:1:0][0:1:0][0:1:0][0:1:0]

Can also be expressed:
tasmax[0][0][0][0]

And:
tasmax[0:1:10][0][0][0]

Can also be expressed:
tasmax[0:10][0][0][0]
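
To make the equivalences above concrete, here is a small Python sketch that expands a single bracket into the indices it selects. The function name hyperslab_indices is invented for this example, and it assumes the usual DAP2 reading of [start:stride:stop] with an inclusive stop.

```python
def hyperslab_indices(spec):
    """Expand one DAP2 bracket, e.g. "[0:1:10]", into a list of indices.

    Forms handled (stop is inclusive):
      [index], [start:stop], [start:stride:stop]
    """
    parts = [int(p) for p in spec.strip("[]").split(":")]
    if len(parts) == 1:
        start, stride, stop = parts[0], 1, parts[0]
    elif len(parts) == 2:
        start, stride, stop = parts[0], 1, parts[1]
    elif len(parts) == 3:
        start, stride, stop = parts
    else:
        raise ValueError(f"not a DAP2 hyperslab: {spec!r}")
    return list(range(start, stop + 1, stride))


# The long and short forms select the same indices.
print(hyperslab_indices("[0:1:0]") == hyperslab_indices("[0]"))
print(hyperslab_indices("[0:1:10]") == hyperslab_indices("[0:10]"))
```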



Please let me know if this helps you move forward.


Sincerely,


Nathan




= = =
Nathan Potter ndp at opendap.org
OPeNDAP, Inc. +1.541.231.3317

CEDA Helpdesk

Feb 17, 2021, 1:31:22 PM
to Daniel Hobley, sup...@opendap.org

William replied

Cc: sup...@opendap.org

Feb 17, 18:29
Hi Dan,

I haven't had a chance to investigate your issue yet but I'll try to get back to you about it as soon as possible.

Thanks for getting in touch.


William
--
William Tucker
CEDA Helpdesk
sup...@ceda.ac.uk
--
We have lots of documentation about our services, take a look here: https://help.ceda.ac.uk

Follow us on Twitter: @cedanews

CEDA Support hours: Monday - Thursday: 9 am-5 pm; Friday: 9 am-4.30 pm (UK, excluding Bank Holidays). Read our privacy and disclaimer policies. 
--



Nathan Potter

Feb 18, 2021, 8:43:07 AM
to Daniel Hobley, Nathan Potter, sup...@opendap.org



Hi Dan,

I will be the first to admit that I do not know very much about PyDap and authentication.

I tried to work through this problem using cURL and I found that the way CEDA has configured their OpenID portal will pretty much block any normative HTTP client from authenticating.
This is an endemic problem with deployments of OAuth2/OpenID that pretty much ignore the HTTP side of the interaction, and I think it's really unfortunate.

CEDA does provide some potentially useful instructions (that you may or may not have read) about how to perform this type of access:

cURL:
https://help.ceda.ac.uk/article/4442-ceda-opendap-scripted-interactions

Python w/netcdf:
https://help.ceda.ac.uk/article/4712-reading-netcdf-with-python-opendap

Have you read these?

If not, I think that if you can wade through those instructions you may have better luck.

If so, and you are still getting blocked, then you will need to work directly with CEDA because I pretty much can't help you with what they have deployed.

I'm sorry to punt here, but the access controls on their service are, as far as I can tell, very much bespoke to CEDA and will require their help to navigate.

I am certainly available to help with the (OPeN)DAP request side of things.


Sincerely,


Nathan



> On Feb 18, 2021, at 1:41 AM, Daniel Hobley <Daniel...@adas.co.uk> wrote:
>
> Hi Nathan,
>
> Thanks for taking all this time! I am having an absolute mare with this. In Python, I can try with a simple xarray open, ignoring authentication, which nets me:
>
>>>> url = [url].nc
>>>> xr.open_dataset(url)
> OSError: [Error -47] NetCDF: NC_UNLIMITED in the wrong index: b'[url].nc'
>
> Or I can try to set up my authentication with the Pydap back end, which nets me a second, different error:
>
>>>> session = requests.Session()
>>>> session.auth = ('uname', 'pwd')
>>>> session.cert = 'creds.pem'
>>>> store = xr.backends.PydapDataStore.open(url, session=session)
>>>> ds = xr.open_dataset(store)
> ---------------------------------------------------------------------------
> UnicodeDecodeError Traceback (most recent call last)
> <ipython-input-6-5b4ae9012b8e> in <module>
> 9 session.auth = ('uname', 'pwd')
> 10 session.cert = 'creds.pem'
> ---> 11 store = xr.backends.PydapDataStore.open(url, session=session)
> 12 ds = xr.open_dataset(store)
>
> ~/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/pydap_.py in open(cls, url, session)
> 76 import pydap.client
> 77
> ---> 78 ds = pydap.client.open_url(url, session=session)
> 79 return cls(ds)
> 80
>
> ~/opt/anaconda3/lib/python3.7/site-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout)
> 65 """
> 66 dataset = DAPHandler(url, application, session, output_grid,
> ---> 67 timeout).dataset
> 68
> 69 # attach server-side functions
>
> ~/opt/anaconda3/lib/python3.7/site-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout)
> 55 if not r.charset:
> 56 r.charset = 'ascii'
> ---> 57 dds = r.text
> 58
> 59 dasurl = urlunsplit((scheme, netloc, path + '.das', query, fragment))
>
> ~/opt/anaconda3/lib/python3.7/site-packages/webob/response.py in _text__get(self)
> 620 decoding = self.charset or self.default_body_encoding
> 621 body = self.body
> --> 622 return body.decode(decoding, self.unicode_errors)
> 623
> 624 def _text__set(self, value):
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)
>
> After some googling, this second bug looks quite malign, related to interaction of the returned netCDF and the innards of Pydap - though I would welcome being wrong!
>
> Any ideas? I am happy to do this in the shell or some other mechanism; I just want it to work!
>
> Dan
>
>
>
>
> -----Original Message-----
> From: Nathan Potter <n...@opendap.org>
> Sent: 17 February 2021 16:55
> To: Daniel Hobley <Daniel...@adas.co.uk>
> Cc: Nathan Potter <n...@opendap.org>
> Subject: Re: [support] CEDA OPeNDAP failed connections
>
>
> XArray and the netcdf module in Python both use these dataset URLs to get data, using the ".dods" response and subsetting with the [] notation.
>
> If they have been recently updated they should just work with the "dataset url":
>
> http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc
>
> Instead of a local file name.
>
> Please try that and let me know what happens.
>
> Caveat: Authentication! While Python clients are not addressed specifically, this page may be helpful:
>
> https://opendap.github.io/hyrax_guide/Master_Hyrax_Guide.html#_authentication_for_dap_clients
>
>
> Sincerely,
>
>
> Nathan
>
>
>
>
>> On Feb 17, 2021, at 8:43 AM, Daniel Hobley <Daniel...@adas.co.uk> wrote:
>>
>> Hi Nathan,
>>
>> I am flexible to some extent. I can work in R if necessary, but was hoping to use xarray in Python (or just the straight netcdf module) to handle the data. I also have a temporary matlab license kicking around if helpful?
>>
>> Dan
>>
>>
>>
>> -----Original Message-----
>> From: Nathan Potter <n...@opendap.org>
>> Sent: 17 February 2021 16:31
>> To: Daniel Hobley <Daniel...@adas.co.uk>
>> Cc: Nathan Potter <n...@opendap.org>
>> Subject: Re: [support] CEDA OPeNDAP failed connections
>>
>> Daniel,
>>
>>
>> The ".dods" response is a binary response that the server can stream, rather than build the entire response in a cache file or in memory.
>>
>>
>> Is R your target application?
>>
>>
>> N
>>
>>
>>
>>
>>> On Feb 17, 2021, at 8:28 AM, Daniel Hobley <Daniel...@adas.co.uk> wrote:
>>>
>>> Hi Nathan,
>>>
>>> Thanks for this, it's really helpful. I'm chasing up with someone from CEDA itself right now, who seems to be pointing me at using the .dods interface instead of just .nc, but things aren't going so well with that approach either! Hopefully they will be able to explain exactly what their server is up to soon...
>>>
>>> Thanks again,
>>>
>>> Dan
>>>
>>>
>>>

Nathan Potter

Feb 18, 2021, 11:38:00 AM
to Daniel Hobley, Nathan Potter, sup...@opendap.org
Hi Dan,

I'm glad you got something to work!

More below.


> On Feb 18, 2021, at 7:08 AM, Daniel Hobley <Daniel...@adas.co.uk> wrote:
>
> Hi Nathan,
>
> This has all been really helpful. I think I have now - largely thanks to your help - convinced myself that the problem here is CEDA's (inadequately documented) service. I got the python automation working fully, and... was able to replicate the exact set of results I sent you in my very first email! i.e., their server isn't responding to pretty sensible requests, including (/especially?) the basic call on the .nc file. This returns a 400 Unrecognised Request, and it _definitely_ isn't a permissions issue this time.

Getting an HTTP status of 400 for dereferencing the dataset URL is not really an error on the service's part.

The server (an instance of the THREDDS Data Server, or TDS) is telling you that you can't have that thing. The DAP2 specification, which is what the TDS has implemented, never specified what the behavior should be for dereferencing the dataset URL. So, HTTP status 400, 403, etc. are some of the ways that a server might reject that request. The 400 makes sense because the server is expecting a response type suffix like ".dds", ".das", ".html", ".dods", or others. These are described in the DAP2 specification and utilized in libraries and applications like netCDF and PyDap.
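
A quick way to see whether a request URL carries one of these suffixes is sketched below in Python. The function name response_suffix and the KNOWN_SUFFIXES tuple are assumptions made for this example (the tuple lists only the suffixes mentioned in this thread, and a real server may accept others); the sketch inspects the URL string only and says nothing about how the TDS itself decides.

```python
from urllib.parse import urlsplit

# Suffixes mentioned in this thread; not an exhaustive list.
KNOWN_SUFFIXES = (".dds", ".das", ".html", ".dods", ".ascii")


def response_suffix(url):
    """Return the response-type suffix of a request URL, or None.

    Only the path is examined, so a constraint after '?' is ignored.
    """
    path = urlsplit(url).path
    for suffix in KNOWN_SUFFIXES:
        if path.endswith(suffix):
            return suffix
    return None


base = "http://dap.ceda.ac.uk/thredds/dodsC/some/dataset.nc"  # placeholder path
print(response_suffix(base))            # bare dataset URL: no suffix
print(response_suffix(base + ".dds"))
print(response_suffix(base + ".ascii?tasmax%5B0%5D%5B0%5D%5B0%5D%5B0%5D"))
```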

Other DAP servers might just transmit the underlying file in response to a request for the dataset URL. In most of the cases where the dataset URL request is rejected, and CEDA is surely one of them, the data provider has an entirely different service and endpoint for downloading the entire data file.

Point your browser at this:

http://dap.ceda.ac.uk/thredds/catalog/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/catalog.html?dataset=badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc

That's the TDS giving a summary of what it can do for you.

The very first thing, HTTPServer, is in fact the download service:
http://dap.ceda.ac.uk/thredds/fileServer/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc


The OPeNDAP service is not a download service and will only respond to valid DAP2 requests and, as I explained above, the dataset URL is not a valid request on this server.
Notice that the link they provide is to the HTML Data Request Form, and not to the dataset URL:
http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.html



Note the difference in the beginning of each URL:

Download Service: http://dap.ceda.ac.uk/thredds/fileServer/
OPeNDAP Service: http://dap.ceda.ac.uk/thredds/dodsC/
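
If you have a list of OPeNDAP URLs and want the matching download URLs, the prefix swap can be scripted. This is only a sketch: the helper name to_download_url is invented here, and it assumes the path after the prefix is identical on both services, as it is for the file in this thread.

```python
OPENDAP_PREFIX = "http://dap.ceda.ac.uk/thredds/dodsC/"
DOWNLOAD_PREFIX = "http://dap.ceda.ac.uk/thredds/fileServer/"

# Response-type suffixes that may trail an OPeNDAP URL (e.g. the form page).
RESPONSE_SUFFIXES = (".html", ".dds", ".das", ".dods", ".ascii")


def to_download_url(opendap_url):
    """Map a dodsC URL (with or without a response suffix) to a fileServer URL."""
    if not opendap_url.startswith(OPENDAP_PREFIX):
        raise ValueError(f"not a dodsC URL: {opendap_url}")
    path = opendap_url[len(OPENDAP_PREFIX):]
    for suffix in RESPONSE_SUFFIXES:
        if path.endswith(suffix):
            # Strip the response suffix to recover the plain file path.
            path = path[: -len(suffix)]
            break
    return DOWNLOAD_PREFIX + path


if __name__ == "__main__":
    form_url = (
        "http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/"
        "12km/rcp85/01/tasmax/day/latest/"
        "tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.html"
    )
    print(to_download_url(form_url))
```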


In DAP4 this is much more clearly defined.



Sincerely,


Nathan



>
> I will chase them specifically for a full description of what they think their interface does!
>
> Thanks so much for all your help - it's been incredibly helpful and I think I have learned a lot about HTML requests at least!
>
> Dan
>
>
>
> -----Original Message-----
> From: Nathan Potter <n...@opendap.org>
> Sent: 18 February 2021 13:43
> To: Daniel Hobley <Daniel...@adas.co.uk>
> Cc: Nathan Potter <n...@opendap.org>; sup...@opendap.org
> Subject: Re: [support] CEDA OPeNDAP failed connections
>
>
>
>
> Hi Dan,
>
> I will be the first to admit that I know not very much about PyDap and authentication.
>
> I tried to work through this problem using cURL and I found that the way CEDA has configured their OpenID portal will pretty much block any normative HTTP client from authenticating.
> This is an endemic problem with deployments of OAuth2/OpenID that pretty much ignore the HTTP side of the interaction, and I think it's really unfortunate.
>
> CEDA does provide some potentially useful instructions (that you may or may not have read) about how to perform this type of access:
>
> cURL:
> https://help.ceda.ac.uk/article/4442-ceda-opendap-scripted-interactions
>
> Python w/netcdf:
> https://help.ceda.ac.uk/article/4712-reading-netcdf-with-python-opendap
>>
>> Instead of a local file name.
>>
>> Please try that and let me know what happens.
>>
>> Caveat: Authentication! While Python clients are not addressed specifically, this page may be helpful:
>>
>> https://opendap.github.io/hyrax_guide/Master_Hyrax_Guide.html#_authentication_for_dap_clients
>>>> km_01_day_19801201-19901130.nc
>>>>
>>>> This dataset URL may return the underlying netcdf file, or it may return a HTTP 4XX error depending on server configuration. Many data providers do not wish to provide download access through the DAP service but rather through a separate download endpoint, or not at all.
>>>>
>>>> The dataset URL is utilized as a base URL by people and client software to form access queries for the dataset. For example one might look at the structural metadata for the dataset with this:
>>>>
>>>> http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.dds
>>>>
>>>> And the semantics metadata with this:
>>>>
>>>> http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.das
>>>>
>>>> The netcdf tool ncdump does exactly this when it responds to:
>>>>
>>>> ncdump -h http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc
>>>>> (For completeness' sake, the shortest version of what I tried is the instructions here: https://help.ceda.ac.uk/article/4442-ceda-opendap-scripted-interactions#cert. The permissions part of this works fine, but then I get these responses:

CEDA Helpdesk

Feb 24, 2021, 7:17:01 AM2/24/21
to Daniel Hobley, sup...@opendap.org

William replied

Cc: sup...@opendap.org

Feb 24, 12:16
Hi Dan,

Thanks for your patience. This is a fairly complex issue which relates to multiple problems. Firstly, the data files you are trying to access use multiple unlimited dimensions. It would seem that THREDDS, our OPeNDAP provider, can only support the classic style of NetCDF file with a single unlimited dimension. This prevents you from using the ".nc" addressed URL and subsequent ".dds" endpoint to subset the file in a normal NetCDF library. Unfortunately, we can't currently do anything about the structure of these data files, or this particular limitation of THREDDS.

Secondly, the ASCII links, as you have discovered, only work when activated manually (or by constructing the equivalent URL encoded address). But these will always cap out at 50MB in size. I don't think we want to support a higher cap, since we would generally prefer people use other methods of accessing data.
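
For illustration, constructing such an address in Python might look like the sketch below. It assumes that "URL encoded" means percent-encoding the square brackets of the constraint expression, and that the constraint uses the [start:stride:stop] form seen in the Data URLs from the Access Form; neither point has been verified against the server here. The function names hyperslab and ascii_request_url are invented for the example.

```python
from urllib.parse import quote


def hyperslab(variable, *ranges):
    """Return a constraint such as tasmax[0:1:0][0:1:9].

    Each range is a (start, stride, stop) tuple, one per dimension.
    """
    index = "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in ranges)
    return f"{variable}{index}"


def ascii_request_url(dataset_url, constraint):
    """Build the .ascii request URL with the constraint percent-encoded."""
    # Keep ':' and ',' readable; '[' and ']' become %5B and %5D.
    return f"{dataset_url}.ascii?{quote(constraint, safe=':,')}"


if __name__ == "__main__":
    base = (
        "http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/"
        "12km/rcp85/01/tasmax/day/latest/"
        "tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc"
    )
    subset = hyperslab("tasmax", (0, 1, 0), (0, 1, 0), (0, 1, 0), (0, 1, 0))
    print(ascii_request_url(base, subset))
```

Any response requested this way is still subject to the 50MB cap.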

I can suggest a couple of solutions. One would be to simply script the download of the whole files, using the CURL method or FTP. It should then be possible to subset them with a NetCDF library of your choice. Alternatively, with a JASMIN account, you can access the files on our archive directly on disk and do analysis using our Conda environments, or with our Notebook service.
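
A minimal sketch of the first option, using only the Python standard library, is given below. It is not CEDA's official method: the function name download_all is made up, and authentication is left out entirely. The opener argument is a hypothetical hook where an opener that carries your credentials could be supplied.

```python
import shutil
import urllib.request
from pathlib import Path


def download_all(urls, out_dir, opener=None):
    """Download each URL into out_dir, skipping files already present.

    opener is any object with an open(url) method, such as the result of
    urllib.request.build_opener(); the default opener adds no credentials.
    """
    opener = opener or urllib.request.build_opener()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = out_dir / url.rsplit("/", 1)[-1]
        if not target.exists():
            # Write to a temporary name first so an interrupted transfer
            # is never mistaken for a complete file on the next run.
            partial = target.with_name(target.name + ".part")
            with opener.open(url) as response, open(partial, "wb") as fh:
                # Stream in chunks rather than holding a whole file in memory.
                shutil.copyfileobj(response, fh)
            partial.replace(target)
        saved.append(target)
    return saved
```

Once the files are on disk, they can be subset with whichever NetCDF library you prefer.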

Sorry I can't be of more help than that. Let me know if you have any more questions.
Hello,


I am looking to use a scripted downloader to get at some CEDA (ceda.ac.uk) climate data through the OPeNDAP interface I gather you designed for them. My issue appears to be a duplicate of this, which I found deep in my websearches: https://groups.google.com/a/opendap.org/forum/embed/#!topic/support/V2WX-vclCfs


Essentially, I am trying to use curl (and indeed, I've tried R and Matlab embedded routines as well) to issue an OPeNDAP call to this server:
http://dap.ceda.ac.uk/thredds/dodsC/badc/ukcp18/data/land-rcm/uk/12km/rcp85/01/tasmax/day/latest/tasmax_rcp85_land-rcm_uk_12km_01_day_19801201-19901130.nc.html
As in the issue on the website I pasted above, I can get no adequate response from the server at all. If I paste the Data URL the Access Form provides, curl/netCDF recognises it as an invalid url. If I append ".ascii", I seem to be able to get a response, but am then immediately unable to actually issue a call to OPeNDAP for only part of the data - which is what I need to do. Pressing the manual button in the Access Form works fine in all cases - but I have 1000 of these to work through, and each whole netCDF4 file is ~130Mb. That means an OPeNDAP call is my only hope of automating this.


Can you explain what's going on here? Is this a known issue? It seems like it is, given that previous issue raised. Am I doing something obviously wrong?


(For completeness' sake, the shortest version of what I tried is the instructions here: https://help.ceda.ac.uk/article/4442-ceda-opendap-scripted-interactions#cert. The permissions part of this works fine, but then I get these responses:
Curl -cert XXXX -L -c [url].nc -o testout gives 400, Unrecognised request (this is exactly what the Data URL is given as on the Form)
Curl -cert XXXX -L -c [url].nc.ascii -o testout works for some calls, but many of my files are >50Mb so this then gives 403, too big
Curl -cert XXXX -L -c [url].nc?tasmax[0:1:0][0:1:0][0:1:0][0:1:0] -o testout returns an empty file - again, this is a Data URL output from the form
Curl -cert XXXX -L -c [url].nc.ascii?tasmax[0:1:0][0:1:0][0:1:0][0:1:0] -o testout returns an empty file)


Thank you!


Dan Hobley



Daniel Hobley
Modelling & Informatics


RSK Bristol | The Old School | Stillhouse Lane | Bristol | BS3 4EB


Tel: +44 (0)7918 888121


www.adas.uk | @ADASGroup





Nathan Potter

Feb 24, 2021, 7:41:02 AM2/24/21
to sup...@opendap.org, James Gallagher, Nathan Potter
Does Hyrax support multiple unlimited dimensions?

Nathan


Begin forwarded message:

James Gallagher

Mar 1, 2021, 2:43:31 PM3/1/21
to Nathan Potter, Gallagher James, sup...@opendap.org

On Feb 24, 2021, at 05:40, Nathan Potter <n...@opendap.org> wrote:

Does Hyrax support multiple unlimited dimensions?

I believe that it does because we use the NetCDF C library and not the Java library that the TDS uses. It appears this was added to NetCDF4 in 2010, so it’s a little odd that the TDS doesn’t support it. 

NB: by ’support it’ I assume you mean reading the data files.

James
James Gallagher


