Hi Michael,
On Jul 10, 2:31 pm, michael <mich...@ifelse.org> wrote:
> First, let me introduce myself, My name is Michael Donohoe and I've
> been at The New York Times for five years or so. My first role was
> working with the CMS, both internals and templating and now I'm
> focused on frontend dev work, from Javascript, HTML, XSLT, PHP and all
> the fun templating and framework options that provides.
We're glad to have someone from the NYT joining the discussion!
>
> My first impression was 'oh no, not another microformat' but very
> pleased you're looking to extend hAtom instead of a competitor.
It's worth noting that we're not (currently) a microformat. A couple
of initial conversations with the microformats community indicated a
bit of a "chicken and egg" problem - it only becomes interesting from
the point of view of the community when something has been done and
people are starting to use it. We're looking at putting our
suggestions through the microformats process in the next couple of
weeks, but we don't need to do so in order for these proposals to be
useful. (The microformats community would currently call us a
"poshformat" - we're just calling it Value Added News).
> Would it be beneficial to allow a limited amount of meta-data also
> (example: 'politics', 'gop', 'pelosi, nancy' )?
Potentially. For someone using data from, for example, your news site
and only your news site, having some limited access to your taxonomy
would be incredibly useful. For someone working with articles from
several publishers, access to a mixed taxonomy would be less useful
(to borrow from your example, the NYT might mark up Nancy Pelosi as
'pelosi, nancy', and another organisation may mark her up as 'nancy
pelosi').
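To make the mismatch concrete, here is a minimal sketch in Python of
what a consumer of several feeds might have to do. The function name is
made up for illustration, and it assumes the only difference between
publishers is "Surname, Forename" versus "Forename Surname" ordering,
which will not hold for real taxonomies in general.

```python
def normalise_name_tag(tag: str) -> str:
    """Reduce 'Surname, Forename' and 'Forename Surname' to one key.

    Hypothetical helper: it only handles name ordering, case and spacing.
    """
    tag = tag.strip().lower()
    if "," in tag:
        # 'pelosi, nancy' -> surname 'pelosi', forenames 'nancy'
        surname, _, forenames = tag.partition(",")
        tag = f"{forenames.strip()} {surname.strip()}"
    # Collapse any repeated whitespace.
    return " ".join(tag.split())


nyt_tag = "pelosi, nancy"
other_tag = "nancy pelosi"

print(nyt_tag == other_tag)  # False
print(normalise_name_tag(nyt_tag) == normalise_name_tag(other_tag))  # True
```

Anything beyond this simple case (nicknames, titles, different subject
vocabularies) needs a real mapping between taxonomies, which is why a
mixed taxonomy is less useful across publishers.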
Is the category/rel-tag part of the hAtom specification
(http://microformats.org/wiki/hatom#Entry_Category), which we extend
and which is therefore already available, sufficient for what you're
talking about, or do you think there is a general requirement for a
more sophisticated way of exposing internal taxonomies?
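For anyone who hasn't used rel-tag, here is a rough sketch in Python of
how a consumer could read entry categories. It relies on the rel-tag
convention that the tag is the last segment of the link's URL path. The
sample markup, the example.com URLs and the class name RelTagParser are
invented for illustration, and the sketch does not check that the links
actually sit inside an hentry.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote_plus


class RelTagParser(HTMLParser):
    """Collect tags from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").split()
        href = attr.get("href")
        if "tag" in rels and href:
            # The tag is the last path segment of the link target.
            path = urlparse(href).path.rstrip("/")
            segment = path.rsplit("/", 1)[-1]
            # Decode '+' and %XX escapes back to plain text.
            self.tags.append(unquote_plus(segment))


SAMPLE = """
<div class="hentry">
  <h2 class="entry-title">Example headline</h2>
  <a rel="tag" href="http://example.com/tags/politics">Politics</a>
  <a rel="tag" href="http://example.com/tags/nancy+pelosi">Nancy Pelosi</a>
  <a href="http://example.com/about">About</a>
</div>
"""

parser = RelTagParser()
parser.feed(SAMPLE)
print(parser.tags)  # ['politics', 'nancy pelosi']
```

Note that the tag comes from the URL rather than the link text, so the
publisher's URL scheme is effectively the taxonomy being exposed.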
> I would see it as helpful as providing context to an article and to
> say that while this article might talk about many things this is
> really what it centers around.
In my absolutely ideal world, those subjects would be marked up in
microformats or RDFa inside the article text, for example by marking
up people inside news articles with hCard. My concern is that the kind
of content enrichment this requires is very much not in the workflow
of most news organisations, and doing it automatically with tools like
Reuters Open Calais provides no way of making sure that only important
subjects are marked up (a relevance threshold might help, but
something like that is only reliable to a certain extent).
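As a rough sketch of what I mean by a relevance threshold (in Python,
with invented subjects and scores; this is not the API of any real
extraction tool, and the names Subject and important_subjects are
hypothetical):

```python
from typing import NamedTuple


class Subject(NamedTuple):
    name: str
    relevance: float  # assumed scale: 0.0 (passing mention) to 1.0 (central)


def important_subjects(subjects, threshold=0.5):
    """Keep only the subjects scored at or above the threshold."""
    return [s.name for s in subjects if s.relevance >= threshold]


# Invented scores, for illustration only.
extracted = [
    Subject("main subject", 0.9),
    Subject("borderline subject", 0.49),
    Subject("passing mention", 0.2),
]

print(important_subjects(extracted))  # ['main subject']
```

The borderline entry shows the weakness: wherever the cut-off sits,
something just below it may matter and something just above it may
not, and the result is only as good as the scores themselves.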
> Obviously this could be a can of worms, from SEO, or if you would want
> to put a standardized taxonomy in front of that, and unknown forms of
> abuse... But I'd like to throw it out there to get other people's
> opinions.
It's still information that the search engines can spider if they
choose to, and act accordingly.
>
> Would the negatives outweigh any positive gains?
I don't think there are too many negatives in providing more useful
machine-readable categorisation and subject information, and I think
the potential gains are well worthwhile. A good example came up in a
presentation at NewsInnovationLondon yesterday: on a certain UK news
outlet's summary page for Gordon Brown, the top entry was a caravan
review, because it contained a complaint that everyone was going to
have to take caravan holidays because Gordon Brown is ruining the
economy.
Mark