[periodicals] New interim dataload


Chris Clarke

May 5, 2009, 3:21:57 PM
to datain...@googlegroups.com
Still working on refining the periodicals dataset; here's an example
of the progress with the latest cut:

http://periodicals.dataincubator.org/journal/about-campus

* Slugging on URIs as suggested
* dx.doi.org URI moved from bibo:uri to foaf:homepage (more like the
earlier work Leigh did on modeling the NLM and HighWire data)
* dct:publisher is now a foaf:Organization (see http://periodicals.dataincubator.org/organization/wiley-blackwell-john-wiley--sons)
* where applicable, the publishers are foaf:member of http://periodicals.dataincubator.org/organization/crossref
* Each publisher has a list of the titles for which it is the
dct:rightsHolder.
* dct:isPartOf void:Dataset relationships removed (following IanD's
lead on the same in the [ol] data)
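
As a rough illustration of the modeling described in the list above, here is a minimal Ruby sketch that prints the relationships as N-Triples. It assumes the standard FOAF and DC Terms namespaces and the usual FOAF direction for foaf:member (group to member); the DOI URI is a made-up placeholder, and the actual data in the store may differ.

```ruby
# Sketch only: these triples are inferred from the list above, not dumped from the store.
FOAF     = 'http://xmlns.com/foaf/0.1/'
DCT      = 'http://purl.org/dc/terms/'
RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'
BASE     = 'http://periodicals.dataincubator.org'

JOURNAL   = "#{BASE}/journal/about-campus"
PUBLISHER = "#{BASE}/organization/wiley-blackwell-john-wiley--sons"
CROSSREF  = "#{BASE}/organization/crossref"
DOI_PAGE  = 'http://dx.doi.org/10.0000/placeholder' # hypothetical DOI, not from the dataset

TRIPLES = [
  [JOURNAL,   "#{FOAF}homepage",    DOI_PAGE],              # was bibo:uri
  [JOURNAL,   "#{DCT}publisher",    PUBLISHER],
  [JOURNAL,   "#{DCT}rightsHolder", PUBLISHER],
  [PUBLISHER, RDF_TYPE,             "#{FOAF}Organization"],
  [CROSSREF,  "#{FOAF}member",      PUBLISHER]              # assumed direction: group -> member
].freeze

# Serialize [subject, predicate, object] URI triples as N-Triples lines.
def to_ntriples(triples)
  triples.map { |s, p, o| "<#{s}> <#{p}> <#{o}> ." }.join("\n")
end

puts to_ntriples(TRIPLES)
```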

More work to do (hopefully tomorrow):

* Load the rest of the crossref set (only about 25% loaded tonight)
* Sort out minor issues with the slugging - still not quite right -
note double '--' in http://periodicals.dataincubator.org/organization/wiley-blackwell-john-wiley--sons
* Add foaf:name to the index fpmap, which means users can search for
organisations as well as titles

Chris


Ian Davis

May 5, 2009, 3:44:43 PM
to datain...@googlegroups.com
On Tue, May 5, 2009 at 8:21 PM, Chris Clarke <chris....@talis.com> wrote:
> * Sort out minor issues with the slugging - still not quite right -
> note double '--' in http://periodicals.dataincubator.org/organization/wiley-blackwell-john-wiley--sons

This is the slugify algorithm I tend to use:

http://www.djangosnippets.org/snippets/29/
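
For readers working in Ruby, here is a rough sketch of that style of slugify: lowercase, treat punctuation as a separator, drop stop words, join with single hyphens. It is not a port of the linked snippet; the `slugify` name and the stop-word list are assumptions made for illustration.

```ruby
# Hypothetical stop-word list; the linked snippet's own list may differ.
STOP_WORDS = %w[a an and as at by for in of on or the to with].freeze

def slugify(text, stop_words: STOP_WORDS)
  words = text.downcase
              .gsub(/[^a-z0-9\s-]/, ' ') # punctuation becomes a separator
              .split(/[\s-]+/)           # split on runs of whitespace or hyphens
              .reject(&:empty?)
  kept = words.reject { |w| stop_words.include?(w) }
  kept = words if kept.empty?            # keep something if every word was a stop word
  kept.join('-')
end

puts slugify('Wiley-Blackwell (John Wiley & Sons)') # wiley-blackwell-john-wiley-sons
puts slugify('The Journal of Foo - Bar')            # journal-foo-bar
```

Because the words are split apart and rejoined, a doubled hyphen cannot appear in the result, whatever punctuation the name contains.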

 

Ross Singer

May 5, 2009, 8:37:17 PM
to datain...@googlegroups.com
Ian, I like the stop words in that.

Chris, if you move your .gsub(/\s/,'-') to the end of that line (that
is, swap the two gsub calls), you should be fine.

I assume that the title is 'Foo - Bar' or something.
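
A sketch of the reordering, under loud assumptions: Chris's actual code is not shown in this thread, so both methods below are hypothetical reconstructions, and the publisher name is guessed from the slug. Depending on the real patterns, a straight swap can still leave a double hyphen when stripped punctuation sat between two spaces, so this version also collapses runs of whitespace and hyphens; that part is an addition, not something stated above.

```ruby
# Hypothetical "before": whitespace becomes hyphens first, then punctuation is stripped.
def slug_dash_first(name)
  name.downcase.gsub(/\s/, '-').gsub(/[^a-z0-9-]/, '')
end

# Hypothetical "after": strip punctuation first, add hyphens last.
def slug_dash_last(name)
  name.downcase
      .gsub(/[^a-z0-9\s-]/, '') # drop punctuation, keep whitespace and hyphens
      .strip
      .gsub(/[\s-]+/, '-')      # one hyphen per run of whitespace/hyphens
end

name = 'Wiley-Blackwell (John Wiley & Sons)' # assumed source name
puts slug_dash_first(name) # wiley-blackwell-john-wiley--sons
puts slug_dash_last(name)  # wiley-blackwell-john-wiley-sons
puts slug_dash_last('Foo - Bar') # foo-bar
```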

Keep the Ruby comin' :)

-Ross.