microformats

Richard Boulton

Dec 9, 2011, 3:39:32 PM

to Guardian API Talk

Not sure if this is quite the right forum for this question, which
concerns the microformat / opengraph markup in articles rather than
the API directly. Apologies if there's a better place I could have
asked this.

I was taking a look at the markup applied on Guardian articles,
specifically the rel="tag" link markup, and the opengraph news tags.
Taking an example, say http://www.guardian.co.uk/world/2011/dec/09/uk-leading-role-europe-hague,
each tag is marked up in two places:

and

<a href="http://www.guardian.co.uk/business/debt-crisis"
rel="tag">Eurozone crisis</a>

This seems fairly reasonable, except that the rel="tag" specification
at http://microformats.org/wiki/rel-tag clearly says "the last segment
of the path portion of the URI (after the final "/" character)
contains the tag value". In other words, according to the
specification, the meta-property markup is for the tag "Eurozone
crisis", but the rel="tag" markup is for the tag "debt-crisis".
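
To make the mismatch concrete, here is a minimal sketch of how I read the rule, using only the Python standard library. The function name rel_tag_value is my own, and the handling of trailing slashes and of "+" as a space is my assumption about sensible normalization rather than something I have checked against every consumer of rel="tag".

```python
from urllib.parse import urlparse, unquote


def rel_tag_value(href: str) -> str:
    """Return the tag value implied by a rel="tag" link.

    Per the quoted rule, this is the last segment of the URL path.
    """
    path = urlparse(href).path
    # Assumption: a trailing slash does not change the tag value.
    segment = path.rstrip("/").rsplit("/", 1)[-1]
    # Assumption: "+" stands for a space; then undo percent-encoding.
    return unquote(segment.replace("+", " "))


href = "http://www.guardian.co.uk/business/debt-crisis"
link_text = "Eurozone crisis"

tag_from_href = rel_tag_value(href)
# The tag derived from the URL differs from the visible link text.
print(tag_from_href, "vs", link_text)
```

A consumer that follows the specification would therefore record "debt-crisis", while one that reads the link text (or the meta-property markup) would record "Eurozone crisis".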

I wonder if this is an intentional decision by the Guardian, or an
accident, or just technically unavoidable. I also wonder which tags
the Guardian staff would recommend I extract from the page.

I also wonder if the last segment of these paths is unique for a given
tag; e.g., it looks like they might be scoped by the section of the
newspaper ("business" in the above example), so is it possible that
two different tags might be given the same final segment?
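
For what it's worth, this is roughly how I would check for such collisions once I had a list of tag URLs. It is only a sketch: colliding_segments is my own helper, and the "/world/debt-crisis" and "/world/eu" URLs in the sample list are hypothetical, made up purely to show what a collision would look like.

```python
from collections import defaultdict
from urllib.parse import urlparse


def colliding_segments(hrefs):
    """Map each final path segment to the distinct paths that end in it,
    keeping only segments shared by more than one path."""
    by_segment = defaultdict(set)
    for href in hrefs:
        path = urlparse(href).path.rstrip("/")
        by_segment[path.rsplit("/", 1)[-1]].add(path)
    return {
        segment: sorted(paths)
        for segment, paths in by_segment.items()
        if len(paths) > 1
    }


sample = [
    "http://www.guardian.co.uk/business/debt-crisis",  # from the article above
    "http://www.guardian.co.uk/world/debt-crisis",     # hypothetical
    "http://www.guardian.co.uk/world/eu",              # hypothetical
]
collisions = colliding_segments(sample)
print(collisions)
```

If real data ever produced a non-empty result here, the final segment alone would not be enough to identify a tag, and I would need to key on the full path instead.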