XML tag formation


Richard Clark

Dec 22, 2008, 2:52:01 AM
to iPredict Developers
The documentation contains the following construct:

<claims>
<MP.PETERS>
<last>0.0573</last>
<bid>0.3933</bid>
<ask>0.4013</ask>
</MP.PETERS>
...
</claims>

You really don't want to do it that way. XML parsers, especially
low-level ones, work best with properly named tags that can then be
parsed for their attributes or contents. My recommendation would be:

<claims>
<claim code="MP.PETERS" last="0.0573" bid="0.3933" ask="0.4013" />
...
</claims>

This makes using parsers much easier - they need only match the
"claim" tag, and get the contents of the attributes. Getting canonical
attribute contents is simpler - there is no possibility of sub-tags in
attributes.

I don't have a problem with the use of the last,bid,ask as tags if
that is preferred for some other reason, but most definitely you do
not want to create random tags based on the name of the item - it
makes it impossible to create a DTD for example.
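
To make the difference concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The two sample documents are taken from the examples above (with the claim element self-closed). The helper names parse_named_tags and parse_claim_elements are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Documented layout: the claim code is the tag name itself.
NAMED_TAGS = """
<claims>
  <MP.PETERS>
    <last>0.0573</last>
    <bid>0.3933</bid>
    <ask>0.4013</ask>
  </MP.PETERS>
</claims>
"""

# Suggested layout: one fixed tag name, data carried in attributes.
CLAIM_ELEMENTS = """
<claims>
  <claim code="MP.PETERS" last="0.0573" bid="0.3933" ask="0.4013" />
</claims>
"""


def parse_named_tags(xml_text):
    """Every child must be visited, because its tag name is data."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: {field.tag: float(field.text) for field in child}
        for child in root
    }


def parse_claim_elements(xml_text):
    """Only the fixed 'claim' tag is matched; values come from attributes."""
    root = ET.fromstring(xml_text)
    return {
        claim.get("code"): {k: float(claim.get(k)) for k in ("last", "bid", "ask")}
        for claim in root.iter("claim")
    }


named = parse_named_tags(NAMED_TAGS)
flat = parse_claim_elements(CLAIM_ELEMENTS)
```

Both functions return the same dictionary, but the first has to treat whatever tag names it finds as data, while the second can match one known tag name and ignore anything else in the document.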



Hamish Campbell

Dec 22, 2008, 4:20:41 AM
to iPredict Developers
Seconded. In that form you can't use XPath meaningfully, or really any
of the normal DOM operations.

Hamish Campbell

Dec 22, 2008, 4:30:41 AM
to iPredict Developers
Actually, the same applies to the error XML:

"Caused by providing a action that is unknown to the system

<InvalidAction>
<{action}>
<error>Unsupported Version</error>
<version>#</version>
</{action}>
</InvalidAction>"

Also, the xml response for 'book' has the root tag 'Book', and then
child elements 'book'. Yes, XML tags are case sensitive, but it's
confusing. You'd expect the 'entry' tags to be wrapped in an 'entries'
element.

Method

Dec 22, 2008, 2:14:00 PM
to ipredict-...@googlegroups.com
Although I agree, it is still possible to treat it like a DOM tree and crawl through it.
If you know the hierarchy, and assuming it is a valid document, you should be able to recurse through the entire set, referencing DOMNode->nodeName and DOMNode->nodeValue.

Don't take this as a rebuttal, more an addition, because I can see how this can cause problems.
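
The recursive crawl described here can be sketched in Python with xml.dom.minidom, whose nodes expose nodeName and nodeValue much like the DOMNode properties mentioned above. This is an illustration only: the walk function and the slash-separated path format are assumptions, not part of the API.

```python
from xml.dom import minidom

SAMPLE = (
    "<claims><MP.PETERS>"
    "<last>0.0573</last><bid>0.3933</bid><ask>0.4013</ask>"
    "</MP.PETERS></claims>"
)


def walk(node, path=""):
    """Recursively yield (path, text) pairs for every non-empty text node."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            # Element nodes carry the name; recurse into them.
            yield from walk(child, f"{path}/{child.nodeName}")
        elif child.nodeType == child.TEXT_NODE and child.nodeValue.strip():
            # Text nodes carry the value.
            yield path, child.nodeValue.strip()


document = minidom.parseString(SAMPLE)
pairs = list(walk(document))
```

The crawl works on any well-formed document, but the claim code only shows up as one segment of each path, so the caller still has to know which level of the hierarchy holds data rather than structure.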

Could you relate it to another function whose output looks more like what you'd expect?
For example, the History command produces a different schema for its results.
I can see the benefits of <trade symbol="GM.BAILOUT"> over <trade><symbol>GM.BAILOUT</symbol>
But if the latter is OK, I can bring the other things in line with it.
Examples are great, so posting a small snippet of the current output versus what you want will really help me pick out the patterns most people care about :)

Cheers for the feedback!

Method

Dec 22, 2008, 4:37:43 PM
to iPredict Developers
I have made some of the suggested changes on my internal dev
environment... Tell me what you guys think of this:

api.php?action=prices&allstock=true

<?xml version="1.0" encoding="UTF-8"?>
<Prices>
<apiVersion>1</apiVersion>
<allstock/>
<claims>
<claim stock="OCR.75.29JAN">
<last>0.2198</last>
<bid>0.2110</bid>
<ask>0.2198</ask>
</claim>
<!-- ... -->
</claims>
</Prices>


*AND*

api.php?action=book&stock=GM.BAILOUT

<?xml version="1.0" encoding="UTF-8"?>
<Book>
<apiVersion>1</apiVersion>
<stock>GM.BAILOUT</stock>
<entries>
<entry type="buy">
<quantity>11</quantity>
<price>0.9820</price>
</entry>
<entry type="buy">
<quantity>25</quantity>
<price>0.9808</price>
</entry>
<!-- ... -->
</entries>
</Book>
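
As a rough check of how the proposed book layout reads from client code, here is a sketch using Python's xml.etree.ElementTree. The document is the sample above with the elided entries left out. The variable names are illustrative, and treating the highest buy price as the best bid is an assumption about how a client might use the data.

```python
import xml.etree.ElementTree as ET

BOOK = """<?xml version="1.0" encoding="UTF-8"?>
<Book>
<apiVersion>1</apiVersion>
<stock>GM.BAILOUT</stock>
<entries>
<entry type="buy">
<quantity>11</quantity>
<price>0.9820</price>
</entry>
<entry type="buy">
<quantity>25</quantity>
<price>0.9808</price>
</entry>
</entries>
</Book>
"""

# Parse from bytes so the encoding declaration is honoured.
root = ET.fromstring(BOOK.encode("utf-8"))

# The predicate selects entries by their 'type' attribute.
buys = [
    (int(entry.findtext("quantity")), float(entry.findtext("price")))
    for entry in root.findall("./entries/entry[@type='buy']")
]
best_bid = max(price for _, price in buys)
total_quantity = sum(quantity for quantity, _ in buys)
```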



Comments?




Hamish Campbell

Dec 22, 2008, 4:49:52 PM
to iPredict Developers
Looks good.

Richard Clark

Dec 22, 2008, 5:06:28 PM
to iPredict Developers
Much better, more usable now. For aesthetics I would avoid the
Book-style capitalised tags; I tend towards lower case all the way
through, which avoids bad guesses and general confusion when developing
against it. But I'm not really fussed about it; in the end it's now
easy to parse, which is the important bit.

Simon

Dec 22, 2008, 5:23:10 PM
to ipredict-...@googlegroups.com
thanks for this feedback ... great to get your input on this (I'm
working with Keaton - but not directly working on this public API
project)

I know what you mean about the case.

I have seen and even occasionally used some-tag, someTag, or SomeTag
but have also seen some_tag or even sometag (neither of which I like
much)

I guess my favourite is some-tag, maybe someTag would come second (I
do use it) but perhaps that might have more meaning to developers ....


Richard Clark

Dec 22, 2008, 5:54:57 PM
to iPredict Developers
Generally, I recommend some_tag.

The reason for this is that it's the only construct that actually
generalises across languages and formats. Programming languages, for
example, rarely permit some-tag as a variable name because as far as
they're concerned it's two variables and a subtract operation.

Keeping everything consistent allows the programmer to utilise the
same nomenclature in their software as it is presented in the
dataset, and avoids confusion.
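
A small Python sketch of the naming point, using str.isidentifier() to test which of the candidate spellings can be used directly as a variable or field name. The prices element and the Prices namedtuple are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# str.isidentifier() reports whether a name is a valid Python identifier.
names = ["some_tag", "someTag", "SomeTag", "sometag", "some-tag"]
usable = {name: name.isidentifier() for name in names}

# With underscore names, attribute keys can become field names unchanged.
element = ET.fromstring('<prices version="1" all_stock="1" />')
Prices = namedtuple("Prices", sorted(element.attrib))
prices = Prices(**element.attrib)
```

Only some-tag is rejected here. The camel-case spellings are valid identifiers too, so in Python the argument against them rests on case sensitivity rather than on syntax.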

CamelCase, or someTag in this instance, is fairly popular, but the
cross-pollination of concepts between HTML and XML means that many XML
developers fail to realise that XML is in fact case sensitive. Use of
mixed case can sometimes lead to a real debugging headache for those
users.
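
A quick Python illustration of that debugging headache. The document here is made up: a Book root with book children, loosely modelled on the case clash mentioned earlier in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: root 'Book' with lower-case 'book' children.
root = ET.fromstring("<Book><book>1</book><book>2</book></Book>")

exact = root.findall("book")       # matches the two lower-case children
wrong_case = root.findall("Book")  # matches nothing: no child is named 'Book'

# Mismatched case between opening and closing tags is not well-formed at all.
try:
    ET.fromstring("<Book></book>")
    mismatched_ok = True
except ET.ParseError:
    mismatched_ok = False
```

The wrong-case lookup raises no error, it just returns an empty list, which is why this kind of mistake can take a while to spot.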

If I were doing the design, I suspect my output would be something
like:

<prices version="1" all_stock="1" dated="2009-01-01T00:00:00">
<claims>
<claim stock="OCR.75.29JAN" last="0.2198" bid="0.2110" ask="0.2198" />
<claim stock="OCR.75.29FEB" last="0.2198" bid="0.2110" ask="0.2198" />
<claim stock="OCR.75.29MAR" last="0.2198" bid="0.2110" ask="0.2198" />
...
</claims>
</prices>

This kind of simple, attribute-based design simplifies xpath
constructs and xsl transformations, keeps naming dead simple and
obvious, avoids issues like whether "true" == 1 == "True" == "yes" ==
whatever, uses standard XML datetime formatting, and keeps the entire
result-set self contained so that it can be saved as XML and, without
context, handed to XSLT for transformation into something useful.
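
As a sketch of the flag and timestamp points, the root attributes from the layout above can be read in Python without any guessing about spellings. The variable names are illustrative, and datetime.fromisoformat needs Python 3.7 or later.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

root = ET.fromstring(
    '<prices version="1" all_stock="1" dated="2009-01-01T00:00:00" />'
)

# A single agreed spelling for flags means one comparison, not a list of guesses.
all_stock = root.get("all_stock") == "1"

# The ISO 8601 timestamp parses directly with the standard library.
dated = datetime.fromisoformat(root.get("dated"))
```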

For example, the xpath to retrieve the bid price for all claims iff
the version of the result is 1 is:

/prices[@version="1"]//claim/@bid

This is both robust - it won't trigger on any random claim element,
only on one that is a descendant of a version 1 prices element - and
simple: a short, single, easy-to-read line.
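
For anyone without a full XPath engine to hand, here is a rough Python equivalent of that expression using xml.etree.ElementTree. ElementTree supports only a subset of XPath: it cannot apply a predicate to the root element or select attribute nodes, so those two steps are done in plain Python. The bids_v1 helper is hypothetical.

```python
import xml.etree.ElementTree as ET

PRICES = """
<prices version="1" all_stock="1" dated="2009-01-01T00:00:00">
  <claims>
    <claim stock="OCR.75.29JAN" last="0.2198" bid="0.2110" ask="0.2198" />
    <claim stock="OCR.75.29FEB" last="0.2198" bid="0.2110" ask="0.2198" />
  </claims>
</prices>
"""


def bids_v1(xml_text):
    """Rough equivalent of /prices[@version="1"]//claim/@bid."""
    root = ET.fromstring(xml_text)
    # The root test stands in for the /prices[@version="1"] step.
    if root.tag != "prices" or root.get("version") != "1":
        return []
    # './/claim[@bid]' finds descendant claims that have a bid attribute.
    return [claim.get("bid") for claim in root.findall(".//claim[@bid]")]


bids = bids_v1(PRICES)
other_version = bids_v1(PRICES.replace('version="1"', 'version="2"'))
```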

For the existing layout, the following would be roughly equivalent
(it's been a while so don't hate me if I get it wrong)

/Prices[apiVersion/text() = "1"]//claim/bid/text()

This is neither as intuitive, nor is it common usage - tutorials etc
do not always cover this kind of use, and thus you find developers
reparsing the whole tree rather than xpathing intelligently because
they don't know how to predicate correctly.
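
The same lookup against the element-based layout, again as a rough Python sketch with xml.etree.ElementTree rather than a full XPath engine. The sample is a cut-down copy of the proposed Prices response, and the version check is done in plain Python because ElementTree cannot apply a predicate to the root element.

```python
import xml.etree.ElementTree as ET

PRICES = """
<Prices>
  <apiVersion>1</apiVersion>
  <allstock/>
  <claims>
    <claim stock="OCR.75.29JAN">
      <last>0.2198</last>
      <bid>0.2110</bid>
      <ask>0.2198</ask>
    </claim>
  </claims>
</Prices>
"""

root = ET.fromstring(PRICES)

bids = []
# Stand-in for the /Prices[apiVersion = "1"] step.
if root.tag == "Prices" and root.findtext("apiVersion") == "1":
    # './/claim/bid' selects bid children of any descendant claim.
    bids = [bid.text for bid in root.findall(".//claim/bid")]
```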

As I indicated before, I am unconcerned really, what you've proposed
now is sufficiently parsable for my purposes, but it seemed worthwhile
to explain my reasoning for how I'd do it.


Method

Dec 22, 2008, 7:44:07 PM
to ipredict-...@googlegroups.com
I think this:

/Prices[apiVersion/text() = "1"]//claim/bid/text()
is a little overly dramatic. Since each of these elements contains only its text value, the reference to text() is only useful if we had done something crazy like
<claim>GM.BAILOUT<last>0.9999</last></claim>
then the usage of text() would be needed since it's a mixed element...
True we don't have a true XSD to validate against at the moment..
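
The mixed-content case can be illustrated in Python with xml.etree.ElementTree. In XPath 1.0 the string-value of an element is the concatenation of all its descendant text, while text() selects only the element's own text nodes. ElementTree's .text and itertext() are used here as approximate stand-ins for those two ideas, not exact equivalents.

```python
import xml.etree.ElementTree as ET

mixed = ET.fromstring("<claim>GM.BAILOUT<last>0.9999</last></claim>")

# Text sitting directly inside <claim>, before its first child element.
own_text = mixed.text
# All descendant text joined together, like an XPath string-value.
string_value = "".join(mixed.itertext())

# For a simple element with no children the two are identical,
# which is why text() adds nothing in the current layout.
simple = ET.fromstring("<bid>0.2110</bid>")
simple_same = simple.text == "".join(simple.itertext())
```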

but anyhow ... this:
/Prices[apiVersion=1]//claim/bid
works, which is equiv to your proposed:
/prices[@version="1"]//claim/@bid

With the recent changes I have made, this also works:
/prices[apiVersion=1]/claims/claim/bid
or shortened..
///bid

:)

Method

Dec 22, 2008, 7:46:07 PM
to ipredict-...@googlegroups.com
Correction: the shortened form should be
//bid
(///bid is not valid XPath).

Richard Clark

Dec 22, 2008, 8:04:03 PM
to iPredict Developers
True! It has been a while since I had to XPath anything more than a
relatively trivial thing :)
