[Q] OPeNDAP DDX XML output parser in Python?


H. Joe Lee

Oct 5, 2015, 1:57:17 PM
to openda...@opendap.org
Hi,

Has anyone written a Python parser for OPeNDAP DDX?

I'm not looking for a generic XML parser such as lxml.etree. I'm
looking for a parser that correctly handles all OPeNDAP DDX schema
elements, such as the Array, dimension, and Float32 tags, and can
reconstruct netCDF (without data), NcML, or CDL.
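
To make the request concrete, here is a minimal sketch of what such a parser could look like for the Array case only, using the standard library's xml.etree.ElementTree. It is an illustration, not an existing OPeNDAP or PyDAP API: the sample document (including its Dataset wrapper), the function names, the partial type mapping, and the "phony" names invented for unnamed dimensions are all assumptions. Grid, Structure, Sequence, and nested attribute containers are not handled.

```python
import xml.etree.ElementTree as ET

# Partial DAP2 -> CDL type mapping chosen for this sketch (extend as needed).
DAP_TO_CDL = {"Int16": "short", "Int32": "int", "Float32": "float",
              "Float64": "double", "String": "string"}

# Hypothetical sample: the Dataset wrapper is an assumption for this sketch.
SAMPLE_DDX = """<Dataset name="example">
  <Array name="temperature">
    <Attribute name="units" type="String"><value>K</value></Attribute>
    <Float32/>
    <dimension size="4"/>
    <dimension size="8"/>
  </Array>
</Dataset>"""


def local(tag):
    """Drop an XML namespace prefix: '{uri}Array' -> 'Array'."""
    return tag.rsplit("}", 1)[-1]


def parse_arrays(ddx_text):
    """Collect name, element type, dimensions and attributes of each Array."""
    arrays = []
    for elem in ET.fromstring(ddx_text).iter():
        if local(elem.tag) != "Array":
            continue
        info = {"name": elem.get("name"), "type": None, "dims": [], "attrs": []}
        for child in elem:
            tag = local(child.tag)
            if tag == "dimension":
                info["dims"].append((child.get("name"), int(child.get("size"))))
            elif tag == "Attribute":
                values = [v.text or "" for v in child if local(v.tag) == "value"]
                info["attrs"].append((child.get("name"), child.get("type"), values))
            elif tag in DAP_TO_CDL:
                info["type"] = tag
        arrays.append(info)
    return arrays


def to_cdl(arrays, dataset="example"):
    """Render the parsed arrays as a data-free CDL header."""
    dims, var_lines, phony = {}, [], 0
    for arr in arrays:
        names = []
        for name, size in arr["dims"]:
            if name is None:  # unnamed DDX dimension: invent a name
                name, phony = f"phony{phony}", phony + 1
            dims.setdefault(name, size)
            names.append(name)
        cdl_type = DAP_TO_CDL[arr["type"]]
        var_lines.append(f"    {cdl_type} {arr['name']}({', '.join(names)}) ;")
        for attr_name, attr_type, values in arr["attrs"]:
            if attr_type == "String":
                rendered = ", ".join(f'"{v}"' for v in values)
            else:
                rendered = ", ".join(values)
            var_lines.append(f"        {arr['name']}:{attr_name} = {rendered} ;")
    dim_lines = [f"    {n} = {s} ;" for n, s in dims.items()]
    return "\n".join([f"netcdf {dataset} {{", "dimensions:", *dim_lines,
                      "variables:", *var_lines, "}"])


print(to_cdl(parse_arrays(SAMPLE_DDX)))
```

Matching on local tag names keeps the sketch independent of whichever namespace URI a given server puts on the DDX root element.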

Regards,

--
HDF Product Designer: Ideate interoperable Bigdata for Citizen Science and IoT.

H. Joe Lee

Oct 5, 2015, 2:22:14 PM
to Nathan Potter, OPeNDAP Tech
Nathan,

I checked PyDAP and could not find a DDX parser. PyDAP parses DAS and DDS only.

Is there any plan for the OPeNDAP team (and Unidata) to rewrite DDX in
NcML, or merge it into NcML? I don't see a good reason why OPeNDAP
maintains a separate schema for DAP3.2/4.0 and produces DDX that looks
quite difficult to parse.

Yes, I'm working on a DDX parser in Python now and am curious whether
anyone has succeeded in parsing OPeNDAP DDX (particularly with the
DAP4.0 schema).



--
HDF Product Designer: Ideate interoperable Bigdata for Citizen Science and IoT.


On Mon, Oct 5, 2015 at 1:08 PM, Nathan Potter <n...@opendap.org> wrote:
>
>
> Is there one in PyDAP? http://www.pydap.org/
>
> Nathan
>
> = = =
> Nathan Potter ndp at opendap.org
> OPeNDAP, Inc. +1.541.231.3317
>
>
>
>

H. Joe Lee

Oct 5, 2015, 3:03:32 PM
to Nathan Potter, OPeNDAP Tech
Nathan,

Here's my idea about merging (or rewriting) DDX into NcML. I just hope
that future DDX output from Hyrax like the one below:

<Array name="temperature">
<Attribute name="units" type="String">
<value>K</value>
</Attribute>
<Float32/>
<dimension size="4"/>
<dimension size="8"/>
</Array>

becomes NcML-compatible output like below:

<dimension name='phony0' size='4'/>
<dimension name='phony1' size='8'/>
<variable name='temperature' type='float' shape='phony0 phony1'>
  <attribute name='units' value='K'/>
</variable>

OPeNDAP's DDX output looks overly complicated.

Of course, there are some things that NcML can't represent (e.g., DAP
URLs, unnamed dimensions, and attribute containers). I think extending
NcML is Unidata's part. Thus, I'm asking for teamwork to come up
with a single XML format that can cover both netCDF NcML and OPeNDAP DDX.
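
The DDX-to-NcML mapping in the example above is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function name, the partial type table, and the "phony" naming scheme are made up here, and the output follows the attribute names proposed in this message (size on dimension). As far as I know, published NcML calls that attribute length, so it would need renaming before validating against the NcML schema.

```python
import xml.etree.ElementTree as ET

# Partial DAP2 -> NcML type mapping chosen for this sketch.
DAP_TO_NCML = {"Int16": "short", "Int32": "int", "Float32": "float",
               "Float64": "double", "String": "String"}

# The Array element from the message above.
DDX_ARRAY = """<Array name="temperature">
  <Attribute name="units" type="String"><value>K</value></Attribute>
  <Float32/>
  <dimension size="4"/>
  <dimension size="8"/>
</Array>"""


def local(tag):
    """Drop an XML namespace prefix: '{uri}Array' -> 'Array'."""
    return tag.rsplit("}", 1)[-1]


def array_to_ncml(array_elem, first_phony=0):
    """Return [dimension..., variable] elements for one DDX Array element."""
    dims, attrs, dap_type = [], [], None
    for child in array_elem:
        tag = local(child.tag)
        if tag == "dimension":
            # Unnamed DDX dimensions get an invented name, as in the example.
            name = child.get("name") or f"phony{first_phony + len(dims)}"
            dims.append(ET.Element("dimension", name=name, size=child.get("size")))
        elif tag == "Attribute":
            value = " ".join(v.text or "" for v in child if local(v.tag) == "value")
            attrs.append(ET.Element("attribute", name=child.get("name"), value=value))
        elif tag in DAP_TO_NCML:
            dap_type = tag
    if dap_type is None:
        raise ValueError("Array has no element type this sketch understands")
    variable = ET.Element("variable", name=array_elem.get("name"),
                          type=DAP_TO_NCML[dap_type],
                          shape=" ".join(d.get("name") for d in dims))
    variable.extend(attrs)
    return dims + [variable]


for element in array_to_ncml(ET.fromstring(DDX_ARRAY)):
    print(ET.tostring(element, encoding="unicode"))
```

The things NcML cannot express (attribute containers, DAP URLs) are exactly what this sketch ignores, which is why it stays so short.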


--
HDF Product Designer: Ideate interoperable Bigdata for Citizen Science and IoT.


On Mon, Oct 5, 2015 at 1:35 PM, Nathan Potter <n...@opendap.org> wrote:
> Joe,
>
> I think that the DDX is really a dead end. There’s nothing in the DDX that isn’t in the DDS and DAS which are good enough representations of the DAP2 data model. At least in Hyrax I think that the DDX is built from a DDS object to which a DAS has been added. I guess it has the advantage to a client of being a single request to the server instead of two (one for DDS and one for DAS).
>
> Is there something in the DDX you need? I am not sure I understand what you mean about "merge the DDX into NcML”. Is there an idea you have that you want to share?

Nathan Potter

Oct 6, 2015, 5:55:10 AM
to H. Joe Lee, Nathan Potter, OPeNDAP Tech


Is there one in PyDAP? http://www.pydap.org/

Nathan


> On Oct 5, 2015, at 10:57 AM, H. Joe Lee <hyo...@hdfgroup.org> wrote:
>

Nathan Potter

Oct 6, 2015, 5:55:10 AM
to H. Joe Lee, Nathan Potter, OPeNDAP Tech
Joe,

I think that the DDX is really a dead end. There’s nothing in the DDX that isn’t in the DDS and DAS which are good enough representations of the DAP2 data model. At least in Hyrax I think that the DDX is built from a DDS object to which a DAS has been added. I guess it has the advantage to a client of being a single request to the server instead of two (one for DDS and one for DAS).

Is there something in the DDX you need? I am not sure I understand what you mean about "merge the DDX into NcML”. Is there an idea you have that you want to share?

Nathan

Joe Lee

Oct 7, 2015, 9:42:46 AM
to James Gallagher, Nathan Potter, OPeNDAP Tech
Hi, James!

The problem with the current DMR is that the server returns the DAP4 data access form instead of XML [1].
It seems that BES OLFS intercepts the request and translates the response into the DAP4 form.
I expected Hyrax to provide a way to return XML like the examples in [2].

Since you're saying that DDX was tried and dropped, will a future Hyrax ever expose DAP4 XML files [2] to the client?

[1] https://eosdap.hdfgroup.org:8989/opendap/data/hdf5/grid_1_2d.h5.dmr
[2] http://xml.opendap.org/dap/tests-dap4/




Joe Lee

Oct 7, 2015, 10:18:51 AM
to Nathan Potter, James Gallagher, OPeNDAP Tech

So, adding .dmr.xml was the trick. Now I can get the XML that I want from our demo server.
Thank you so much, Nathan!


Nathan Potter

Oct 7, 2015, 10:22:10 AM
to Joe Lee, Nathan Potter, James Gallagher, OPeNDAP Tech

> On Oct 7, 2015, at 6:42 AM, Joe Lee <hyo...@hdfgroup.org> wrote:
>
> Hi, James!
>
> The problem of the current DMR is that the server doesn't return XML but DAP4 data access form [1].
> It seems that BES OLFS intercepts it and translates into DAP4 form.
> I expected that Hyrax provides a way to return XML like the examples in [2].


This is because the server correctly supports HTTP Content negotiation. You are asking for the DMR using a browser, and the browser is telling the server that it prefers HTML, so that’s what the server is sending.

This URL:

http://test.opendap.org:8080/opendap/hyrax/data/hdf4/S2000415.HDF.gz.dmr

Will return HTML to a browser, but if you use “curl” to retrieve it you’ll get the XML document.

You can use the URL to specify the returned media type:

http://test.opendap.org:8080/opendap/hyrax/data/hdf4/S2000415.HDF.gz.dmr.xml

Will return XML to the browser (or curl) because the client has specified (via the URL) the media type it wants.

http://test.opendap.org:8080/opendap/hyrax/data/hdf4/S2000415.HDF.gz.dmr.html

Will return HTML to the browser (or curl) because the client has specified (via the URL) the media type it wants.
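
For a Python client, the two ways of choosing the response format described above (URL suffix versus Accept header) could be exercised roughly as follows. This is a sketch using only the standard library's urllib; the helper names dmr_request and fetch are made up for illustration, and the text/html Accept value merely mimics a browser's stated preference.

```python
import urllib.request

# Dataset URL from this message, without any response suffix.
BASE = "http://test.opendap.org:8080/opendap/hyrax/data/hdf4/S2000415.HDF.gz"


def dmr_request(dataset_url, suffix=".dmr", accept=None):
    """Build (but do not send) a request for a dataset's DMR.

    suffix: ".dmr", ".dmr.xml" or ".dmr.html", as in the URLs above.
    accept: optional Accept header value; none is set when omitted.
    """
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(dataset_url + suffix, headers=headers)


def fetch(request, timeout=30):
    """Send the request and return (Content-Type, decoded body)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", "replace")
        return resp.headers.get("Content-Type"), body


plain = dmr_request(BASE)                              # no stated preference
as_xml = dmr_request(BASE, ".dmr.xml")                 # format chosen by URL
like_browser = dmr_request(BASE, accept="text/html")   # browser-style preference
```

Nothing is sent until fetch() is called, for example fetch(as_xml); that step needs network access, and the test server may no longer host this dataset.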

The DAP4 specification docs describe all of this in much detail.

http://docs.opendap.org/index.php/DAP4:_Specification_Volume_2

In Hyrax this is all a little bit broken because the server comes out of the box with the DAP2 URL patterns as the default behavior. If you want to see correct DAP4 behavior edit the olfs.xml file and remove/comment out the line:

<UseDAP2ResourceUrlResponse />

Hope that helps,

Nathan


>

James Gallagher

Oct 7, 2015, 10:22:11 AM
to (Joe) Lee Hyo-Kyung, Nathan Potter, OPeNDAP Tech

> On Oct 5, 2015, at 8:03 PM, H. Joe Lee <hyo...@hdfgroup.org> wrote:
>
> Nathan,
>
> Here's my idea about merging (or rewriting) DDX into NcML. I just hope
> that the future DDX output by Hyrax like below:
>
> <Array name="temperature">
> <Attribute name="units" type="String">
> <value>K</value>
> </Attribute>
> <Float32/>
> <dimension size="4"/>
> <dimension size="8"/>
> </Array>
>
> becomes NcML-compatible output like below:
>
> <dimension name='phony0' size=4>
> <dimension name='phony1' size=8>
> <variable name='temperature' type='float' shape='phony0 phony1' >
> <attribute name='units' value='K' />
> </variable>
>
> OPeNDAP's DDX output looks overly complicated.

Joe,

The DDX was an initial design we tried and dropped. It’s still in the code and some servers return it, but DAP4 uses a different document called a DMR. It’s similar, but not the same - and similar in some key ways. NcML is not suitable for DAP.

>
> Of course, there are some things that NcML can't represent (e.g., DAP
> url, unnamed dimensions, and attribute container). I think extending
> NcML part belongs to Unidata. Thus, I'm asking team work to come up
> with a single XML that can cover both netCDF NcML and OPeNDAP DDX.

Please take a look at the DMR for DAP4.

Thanks,
James
>
>
--
James Gallagher
jgall...@opendap.org

Nathan Potter

Oct 7, 2015, 10:43:09 AM
to Joe Lee, Nathan Potter, James Gallagher, OPeNDAP Tech

> On Oct 7, 2015, at 7:18 AM, Joe Lee <hyo...@hdfgroup.org> wrote:
>
>
> So, adding the .dmr.xml was the trick. Now I can get XML that I want from our demo server.
> Thank you so much, Nathan!

You’re welcome. Just remember that the default encoding for the DMR is XML, so if your client (curl, for example) doesn’t provide an HTTP Accept header stating its preference (which the browser always does), then you will get XML from:

http://test.opendap.org:8080/opendap/hyrax/data/hdf4/S2000415.HDF.gz.dmr
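
Putting the rules from this thread together, the format a client should expect back can be summarised as a small decision function. This is a toy model of the behaviour described here, not Hyrax's actual negotiation code: the function name is invented, and a real server weighs the q-values in the Accept header rather than doing a substring check. It also assumes the DAP4 URL patterns are enabled.

```python
def expected_dmr_format(url, accept=None):
    """Guess whether a DMR request yields "xml" or "html".

    Rules as described in this thread:
      * a .dmr.xml or .dmr.html suffix decides the format outright;
      * a bare .dmr suffix falls back to the Accept header;
      * with no stated preference, the default encoding is XML.
    """
    if url.endswith(".dmr.xml"):
        return "xml"
    if url.endswith(".dmr.html"):
        return "html"
    if not url.endswith(".dmr"):
        raise ValueError("not a DMR request URL: " + url)
    # Simplification: treat any mention of text/html as a preference for HTML.
    if accept and "text/html" in accept:
        return "html"
    return "xml"


base = "http://test.opendap.org:8080/opendap/hyrax/data/hdf4/S2000415.HDF.gz"
browser_accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"

print(expected_dmr_format(base + ".dmr"))                  # no Accept header
print(expected_dmr_format(base + ".dmr", browser_accept))  # browser-style request
print(expected_dmr_format(base + ".dmr.xml", browser_accept))
```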

Joe Lee

Oct 7, 2015, 10:46:19 AM
to Nathan Potter, James Gallagher, OPeNDAP Tech
Hi, Nathan!

Thanks for the additional info.
I can confirm that XML is returned with curl and just the .dmr suffix.
It's very interesting to play with the new DAP4 DMR!

