[Fwd: Re: Glimmers of RDF/XML support]


Ben Campbell

Apr 21, 2010, 4:23:44 AM
to jl-...@googlegroups.com
(just forwarding Dan's reply)

-------- Original Message --------
Date: Sun, 18 Apr 2010 00:24:47 -0700 (PDT)
Subject: Re: Glimmers of RDF/XML support
From: danbri
To: Ben Campbell

On Apr 16, 5:50 pm, Ben Campbell <b...@scumways.com> wrote:
> Hi all (but mostly, I suspect, Dan ;-)
[snip]

Hey, this is a great start :) If this data can be linked to
Wikipedia/Freebase etc, there are all kinds of fun things we could do...

Some quick fixes -

<doac:Education rdf:about="#edu2">
<foaf:organization>Marple Hall County High School</foaf:organization>

... here you probably want foaf:name; 'Organization' (note the
capital) is a class not a property btw.
So that would be ...
<foaf:name>Marple Hall County High School</foaf:name>

I've not looked closely at DOAC yet, will check it out.

I looked up Robert Fisk btw just as another interesting journalist.
The results from http://journalisted.com/robert-fisk?fmt=rdfxml are
rather slim; is something wrong or is this just work-in-progress?

Also re my comment about linkage to Wikipedia, ... I notice at least
Fisk's page there has a journalisted entry -
http://en.wikipedia.org/wiki/Robert_Fisk#External_links - has anyone
taken a look to see what % of journalisted entries are cited in
wikipedia? I don't see reciprocal links from JL to Wikipedia yet, but
maybe you could just harvest these from the wiki?

Once we have this link, it gets us to DBpedia, eg
http://dbpedia.org/page/Robert_Fisk
where you'll find info like birthplace, education, ethnicity and
various other factoids; somewhat messy as it's wiki-scraped. DBpedia
in turn gives you a sameAs link to the description over on Freebase
http://www.freebase.com/view/en/robert_fisk (which is often a little
more carefully curated). So there you get info like a film appearance
(in http://www.freebase.com/view/en/peace_propaganda_the_promised_land),
publications (which you seem to have already in Journalisted; but
without any outgoing links - isbns, amazon etc.), oh also they have
'gender'; is that tracked in JL yet?
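
A minimal sketch of the Wikipedia-to-DBpedia hop described above, assuming the usual DBpedia convention that resource URIs reuse the English Wikipedia page title (the /page/ URL quoted earlier is the HTML view of the same resource). The function name is hypothetical and nothing here is part of Journalisted.

```python
from urllib.parse import urlsplit

# Assumption: DBpedia resource URIs mirror English Wikipedia page titles.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url):
    """Map an English Wikipedia article URL to a DBpedia resource URI."""
    parts = urlsplit(url)  # the '#External_links' fragment is split off here
    prefix = "/wiki/"
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(prefix):
        raise ValueError("not an English Wikipedia article URL: %s" % url)
    title = parts.path[len(prefix):]
    if not title:
        raise ValueError("no article title in URL: %s" % url)
    return DBPEDIA_RESOURCE + title


print(wikipedia_to_dbpedia(
    "http://en.wikipedia.org/wiki/Robert_Fisk#External_links"))
```

With a link like this stored per journalist, the sameAs chain to Freebase can be followed from the DBpedia side rather than maintained by hand.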

Back re the JL RDF,

<foaf:Person>

<foaf:name>Robert Fisk</foaf:name>
<foaf:givenname>robert</foaf:givenname>
<foaf:family_name>fisk</foaf:family_name>

...we have finally tidied up this bit of FOAF, so foaf:givenName and
foaf:familyName are preferable now.
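
As a rough illustration of that rename, here is a sketch using only the Python standard library. It assumes the FOAF namespace http://xmlns.com/foaf/0.1/ and a hypothetical helper name; the real fix would more likely be made in whatever template emits the RDF.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"

# Older FOAF property names mapped to the preferred spellings.
RENAMES = {"givenname": "givenName", "family_name": "familyName"}


def modernise_foaf(xml_text):
    """Return xml_text with the older FOAF name properties renamed."""
    ET.register_namespace("foaf", FOAF)  # keep the 'foaf:' prefix on output
    root = ET.fromstring(xml_text)
    ns_prefix = "{%s}" % FOAF
    for el in root.iter():
        if el.tag.startswith(ns_prefix):
            local = el.tag[len(ns_prefix):]
            if local in RENAMES:
                el.tag = ns_prefix + RENAMES[local]
    return ET.tostring(root, encoding="unicode")


sample = (
    '<foaf:Person xmlns:foaf="http://xmlns.com/foaf/0.1/">'
    '<foaf:name>Robert Fisk</foaf:name>'
    '<foaf:givenname>robert</foaf:givenname>'
    '<foaf:family_name>fisk</foaf:family_name>'
    '</foaf:Person>'
)
print(modernise_foaf(sample))
```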

Since I went off on a tangent re wikipedia linkage, it's probably
clear my main concern with all this is linking together datasets so we
can combine them for a larger perspective.

At the moment you're emitting markup like

<doac:has_experience>
<doac:Experience rdf:about="#exp3">
<foaf:organization>Sunday Times</foaf:organization>
<doac:title>Education Correspondent</doac:title>
<doac:date_starts>1990</doac:date_starts>
<doac:date_ends>1990</doac:date_ends>
</doac:Experience>

... do you have internal (or public) IDs for some of these
organizations? eg. a homepage url for Sunday Times (or wikipedia link)
would make it a much more solid hub for running queries around. Linked
with the freebase/dbpedia info, you could run SPARQL queries to pull
out gender-info per topic per newspaper, for example.
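
One possible shape for such an organization node, sketched with the Python standard library: a foaf:Organization carrying its own URI, a foaf:name and a foaf:homepage. The URIs below are example.org placeholders, not real Journalisted or Sunday Times identifiers, the function name is hypothetical, and how a doac:Experience should point at the node is left open here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"


def organization_xml(org_uri, name, homepage):
    """Serialise a foaf:Organization node that other data can link to."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("foaf", FOAF)
    org = ET.Element("{%s}Organization" % FOAF,
                     {"{%s}about" % RDF: org_uri})
    ET.SubElement(org, "{%s}name" % FOAF).text = name
    # foaf:homepage points at a resource, so it uses rdf:resource, not text.
    ET.SubElement(org, "{%s}homepage" % FOAF,
                  {"{%s}resource" % RDF: homepage})
    return ET.tostring(org, encoding="unicode")


# Placeholder URIs for illustration only.
print(organization_xml(
    "http://example.org/org/sunday-times",
    "Sunday Times",
    "http://example.org/sunday-times-homepage",
))
```

Giving each organization a stable URI like this means every experience entry for the same paper resolves to one node, which is what makes per-newspaper queries practical.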

BTW are there any other sites out there this could be usefully linked
with? eg. other countries, or info from govt perhaps?

Give me a shout if you have any RDF'y questions...

cheers,

Dan

ps. any chance of an SQL dump so I can set up a test installation here
to tweak?
