RDF and URIs

Bruce D'Arcus

Apr 7, 2007, 1:23:36 PM
to zoter...@googlegroups.com
I'm looking at the RDF export again, and noting it could use a lot of
work if it's going to be really useful beyond local Zotero data
stores.

Just to give an example, here's what I see now, with inline comments:

<dcterms:URI RDF:about="rdf:#$9Un5f2"
RDF:value="http://judiciary.senate.gov/testimony.cfm?id=2416&amp;wit_id=5775"
/>

This is pretty bizarre modeling and identification (the "rdf:"
identifier is really bad: it is a local value, not a global URI).

<z:Attachment RDF:ID="item:3333"
z:itemType="attachment"
dcterms:dateSubmitted="2006-10-02 15:43:12"
link:type="text/html"
link:charset="1">
<dc:identifier RDF:resource="rdf:#$9Un5f2"/>
</z:Attachment>

Again, we see some funky indirection here; no doubt a consequence of
Mozilla code, but not very helpful.

<bib:Document
RDF:about="http://judiciary.senate.gov/testimony.cfm?id=2416&amp;wit_id=5775"
z:itemType="webpage"
dc:title="Testimony of Bradford Berenson"
dc:date="2006-09-25 2006-09-25"
dcterms:dateSubmitted="2006-10-02 00:00:00">
<dcterms:isPartOf RDF:resource="rdf:#$6Un5f2"/>
<bib:authors RDF:resource="rdf:#$8Un5f2"/>
<dcterms:isReferencedBy RDF:resource="#item:13871"/>

This is how you're linking a note to an item, but it's not the way to
do it. The note should contain a property that references the item, in
the same way that in the DB it is a separate table with foreign key
references to the item.

E.g.:

<b:Note rdf:about="http://zotero.org/users/doej/notes/1">
<b:annotates rdf:resource="http://judiciary.senate.gov/testimony.cfm?id=2416&amp;wit_id=5775"/>
...
</b:Note>

<link:link RDF:resource="#item:3333"/>
<dc:subject>detention</dc:subject>
<dc:subject>law</dc:subject>
<dc:identifier RDF:resource="rdf:#$aUn5f2"/>

Again: try to use full -- and correct -- global URIs.

</bib:Document>
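
To make the note-to-item link above concrete, here is a minimal Python sketch that emits a note carrying its own reference to the item, much like a foreign key. The note and item URIs are the ones from my example; the namespace URI bound to "b" is a placeholder I made up for illustration, as is the function name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the post never gives the namespace behind the "b" prefix,
# so this URI is a hypothetical placeholder.
B_NS = "http://example.org/bib#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("b", B_NS)


def note_element(note_uri, item_uri):
    """Build a b:Note that points at the item it annotates."""
    note = ET.Element(f"{{{B_NS}}}Note", {f"{{{RDF_NS}}}about": note_uri})
    # The note holds the reference to the item, not the other way around.
    ET.SubElement(
        note,
        f"{{{B_NS}}}annotates",
        {f"{{{RDF_NS}}}resource": item_uri},
    )
    return note


note = note_element(
    "http://zotero.org/users/doej/notes/1",
    "http://judiciary.senate.gov/testimony.cfm?id=2416&wit_id=5775",
)
note_xml = ET.tostring(note, encoding="unicode")
print(note_xml)
```

The item description then needs no dcterms:isReferencedBy pointing at a local "#item:..." fragment; anything that wants the notes for an item can look for notes whose b:annotates matches the item URI.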

I know there are plans to address this, and also that some of these
problems are because of the rather bizarre (and old) support for RDF
in Mozilla. Any sense of when, and how?

This business of identification and URIs is particularly critical now
(more so than the particular modeling and serialization), because it
will have consequences for data and document portability.

Right now, for example, Zotero heavily relies on a whole lot of local
identifiers. They're not meaningful, though, when you move documents
around to different machines, or start implementing web services,
collaboration, and so forth.

So I'd urge you to settle on an identification and URI strategy sooner
rather than later. And notes, references, and so forth all need to get
smart (global) URIs.

As an example, you can say this (my strawman):

1. users each have a default URI, which may also be an OpenID. Let's
say "http://zotero.org/users/doe" for an example.

2. all user notes concatenate a "/notes/" and then some sequential
integer to the user base URI

3. all references get a global URI. If it is a web-sourced document,
the URI is the URL, else:

- if a doi, construct an info URI from it
- if collected from a handful of really trusted sources with reliable
URI infrastructure (the only one I know of is worldcat.org), use its URI
- else use ISBN as URN
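
The three rules above can be sketched in a few lines of Python. This is only an illustration of the strawman, not a proposal for actual Zotero code: the function and parameter names are invented here, the DOI is passed through without any percent-encoding, and the ISBN is used exactly as given.

```python
def note_uri(user_base, n):
    """Rule 2: user base URI + "/notes/" + a sequential integer."""
    return f"{user_base.rstrip('/')}/notes/{n}"


def reference_uri(url=None, doi=None, trusted_uri=None, isbn=None):
    """Rule 3: pick a global URI for a reference, in priority order.

    All parameter names are hypothetical. trusted_uri stands for a URI
    taken from a trusted source such as worldcat.org.
    """
    if url:
        # Web-sourced document: the URL is the URI.
        return url
    if doi:
        # Assumption: no escaping of special characters in the DOI.
        return "info:doi/" + doi
    if trusted_uri:
        return trusted_uri
    if isbn:
        return "urn:isbn:" + isbn
    # The strawman does not say what to do when nothing is available.
    return None


user = "http://zotero.org/users/doe"  # rule 1: the user's default URI
print(note_uri(user, 1))
print(reference_uri(doi="10.1000/182"))
print(reference_uri(isbn="0451450523"))
```

Returning None for the last case is deliberate: items with no URL, DOI, trusted-source URI, or ISBN are exactly where the scheme stops, and they would need some fallback under the user's own URI space.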

Obviously that would only go so far, but it ought to be a decent
start. Those URIs can then be used as citation identifiers in
documents and such, and they'll actually work across different local
databases, and so forth.

Bruce
