library was to parse and serialize JSTOR? If so, then do these JSTOR
resource maps exist in some form that is accessible on the web?
2. When running "from foresite import *" in Python, I had to delete
the line "from utils import generateAtomContent" in parser.py before
the library would import.
3. I have tried to install your library and have run into trouble
asking it to download an existing resource map, specifically the one
mentioned in your documentation:
"http://www.openarchives.org/ore/0.9/atom-examples/atom_dlib_maxi.atom"
My question: When following your link to the JSTOR ReM (Resource Map, I
presume) in my browser, I didn't find an RDF or Atom serialization as I
expected. Also, when treating the example JSTOR link as a resource map
in the foresite library, the parser was unable to parse the URI,
making me think that this URI is not a resource map. How does one find
the serialized version of the resource map for, say,
http://foresite.cheshire3.org/stable/ore/j100378
?
-Jessica
>> 1. Am I correct in understanding that the original intent of the
>> library was to parse and serialize JSTOR? If so, then do these JSTOR
>> resource maps exist in some form that is accessible on the web?
>
> Yes, if you go to:
> http://foresite.cheshire3.org/stable/ore/(identifier)
>
> you'll get the ReM for the object with the (identifier) id in JSTOR.
> eg http://foresite.cheshire3.org/stable/ore/j100378
interesting. when following this lin
Ok, so here's what seemed to work given the original JSTOR example URI,
URI-A = http://foresite.cheshire3.org/stable/ore/j10037
using the canonical script:
rm = ReMDocument('URI-A/[file]')
rp = RdfLibParser()
ap = AtomParser()
rd = rp.parse(rm)  # or, for Atom: rd = ap.parse(rm)
> URI-A +
> /rdfa.html --> simple block of RDFa, suitable for importing into HTML
> /rem.n3 --> n3 style rdf
> /rem.nt --> ntriples
After setting the format type ('rdfa', 'n3', 'nt'), these parsed fine.
> /rdf.xml --> RDF/XML (striped)
> /pretty.xml --> RDF/XML (a more 'pretty' style)
The RDF parser worked without setting the ReMDocument format type.
> /rem.turtle --> ttl
This doesn't work, but that makes sense since the rdflib you use for
Python doesn't have a turtle parser (just a generator).
> /atom.xml --> atom
Here, however, there seems to be some sort of problem. Looking at
the file pulled from URI-A/atom.xml, it doesn't seem to be an Atom
file.
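One way to check that suspicion without going through the foresite parser is to look at the root element of the downloaded file. What follows is only a sketch using the Python 3 standard library: the function name looks_like_atom and the two sample strings are made up for illustration, and it assumes the Atom 1.0 namespace.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # Atom 1.0 namespace


def looks_like_atom(text):
    """Return True if text is well-formed XML rooted at an Atom feed or entry."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False  # not even well-formed XML
    return root.tag in ("{%s}feed" % ATOM_NS, "{%s}entry" % ATOM_NS)


# Hypothetical samples, not the actual response from URI-A/atom.xml
atom_sample = '<entry xmlns="http://www.w3.org/2005/Atom"><id>urn:x</id></entry>'
rdf_sample = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
```

Feeding the text fetched from URI-A/atom.xml to looks_like_atom would tell you whether the server returned Atom at all, which separates a server-side problem from a parser problem.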
To show exactly what I did, I have attached a file to this message, so
it should be completely obvious what I'm doing.
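For anyone following along without the attachment, the suffixes quoted above can be collected into a small lookup. This is a sketch, not part of foresite: the names SUFFIXES and serialization_urls are hypothetical, and the labels are just the descriptions from the quoted list.

```python
BASE = "http://foresite.cheshire3.org/stable/ore/"

# suffix -> description, copied from the list quoted above
SUFFIXES = {
    "rdfa.html": "RDFa",
    "rem.n3": "n3",
    "rem.nt": "ntriples",
    "rdf.xml": "RDF/XML (striped)",
    "pretty.xml": "RDF/XML (pretty)",
    "rem.turtle": "ttl",
    "atom.xml": "atom",
}


def serialization_urls(identifier, base=BASE):
    """Map each suffix to the full URL of that serialization."""
    root = base + identifier
    return {suffix: root + "/" + suffix for suffix in SUFFIXES}


urls = serialization_urls("j100378")
```

Each resulting URL is what would be passed to ReMDocument in the script above.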
Another unrelated question:
Is there an aggregation describing all these aggregations (i.e.
URI-A)? Basically, I'm curious if your repository has some sort of
batch discovery mechanism. I see that there are some standard
approaches outlined here:
http://www.openarchives.org/ore/1.0/discovery
but all require access to a sitemap, some sort of aggregated resource.
I see there are a number of data providers out there and I imagine
there is some way to browse all those resource aggregations that's
standard.
-Jessica
Ok, that makes sense that they want control. I could make the case
that all an aggregation does is give away the metadata, but I could
see how that in and of itself could be considered too much. Heck,
Facebook doesn't want its social network given away!
> 2. The use case for doing this is to enable visualization and exploration
> of known documents, rather than a discovery/publication service.
Yep! Don't want to set up jstor, just want some examples of Resource
Maps to experiment with. You wouldn't happen to have a lead on some
people who are willing to allow me to use their data . . . ? I've just
started digging into the data providers using the PMH protocol. I ran
a perl script discovering the different metadata formats available
from these data providers
(http://www.openarchives.org/Register/BrowseSites) but no one seems to
be using the ORE resource map format to publish their metadata
content.
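For anyone who wants to repeat the survey in Python rather than Perl, the request involved is the OAI-PMH ListMetadataFormats verb. The sketch below is not my original script: the function names and the sample response are made up, it targets Python 3, and it assumes the repository answers with a standard OAI-PMH 2.0 response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def list_formats_url(base_url):
    """Build the ListMetadataFormats request URL for a repository base URL."""
    return base_url + "?" + urlencode({"verb": "ListMetadataFormats"})


def metadata_prefixes(response_xml):
    """Extract every metadataPrefix value from an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(OAI_NS + "metadataPrefix")]


# Hypothetical response body, trimmed to the parts the parser needs
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>rdf</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""
```

Fetching list_formats_url(...) for each registered base URL and passing the body to metadata_prefixes gives the per-repository format list.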
>
> Technically:
> 3. It's all static data, so PMH wouldn't be very useful as the publication
> time for every document would be the same.
> 4. Sitemaps have a limit of 50,000 URLs and there's 4+ million to deal
> with! The journal aggregations could go in to a sitemap however.
> 5. Atom Feeds would be static and I'm not sure what would actually go in
> them ... a feed with just the journal aggregations as per sitemaps?
> 6. As the lowest level of AR are all on JSTOR, it's the zero knowledge case
> for Resource Embedding, so that doesn't help either.
>
> And:
> 7. I haven't gotten around to it yet! :)
>
:P Fair enough.
I can't say that I exactly follow all your points about the difficulty
of these formats, but the foresite forum doesn't seem like the best
place for that kind of back-and-forth chat.
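To make point 4 concrete for myself, here is the arithmetic as a small sketch. The 50,000 limit is the figure quoted above; the total of 4,000,000 is an assumption, since "4+ million" is only a lower bound, and the function names are made up.

```python
SITEMAP_URL_LIMIT = 50000  # per-sitemap limit quoted in point 4


def sitemap_files_needed(total_urls, limit=SITEMAP_URL_LIMIT):
    """Number of sitemap files needed to list total_urls (ceiling division)."""
    return (total_urls + limit - 1) // limit


def chunk(urls, limit=SITEMAP_URL_LIMIT):
    """Yield consecutive slices of urls, each at most limit long."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


files_for_four_million = sitemap_files_needed(4000000)
```

With those assumptions the collection would need at least 80 separate sitemap files. The sitemap protocol does define index files that list several sitemaps, so splitting is possible in principle, but that is clearly more work than a single file.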
>
>>
>> but all require access to a sitemap, some sort of aggregated resource.
>> I see there are a number of data providers out there and I imagine
>> there is some way to browse all those resource aggregations that's
>> standard.
>
> You might be interested, on the browser front, in my firefox/greasemonkey
> plugin:
>
> http://www.csc.liv.ac.uk/~azaroth/foresite-explorer.user.js
>
> Which finds the jstor ID in the URL and adds an SVG viz layer into the
> page. It also works (99%) for flickr and amazon wishlists, as examples
> which everyone can experiment with. Any feedback would be greatly
> appreciated, other than about the lack of documentation, which is coming :)
Hmmm. So, while I have used greasemonkey in the past I've never
tinkered with the scripts myself. I'm not quite clear how this script
is supposed to alter what I see. I imagine it visualizes resource maps
but when I go to a site through firefox (say
http://foresite.cheshire3.org/stable/ore/j100378/rdf.xml) nothing
seems to be different. Perhaps a loading problem? (Specs: Firefox
3.01, Mac OSX) The script does seem to be installed. On the off chance
the script was supposed to alter the way Flickr/JSTOR sites are
displayed, I went to those sites as well but didn't notice a
difference there either.
>
> I'm also prodding Ross (a PhD student in my dept who wrote the OREsome viz
> client for the RepoCamp challenge) to tidy up and release the processing
> java code for his stuff.
I saw the page for the RepoCamp Challenge and that's what led me to
the foresite library! Congrats to Ross! I didn't realize he was your
student.
>
> Hope that helps!
>
> Rob
Oh yeah, here are most of the formats supported by the various
repositories (see attached file; the numbers are their counts). I see
now that two support the rdf format, which may or may not be a resource
map (DSpace at MIT and edocUR - Universidad del Rosario). I guess I
should check them out and see whether they contain serialized resource
maps.
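The counts in the attachment are just a tally of how many repositories advertise each format. A minimal sketch of that tally, with made-up repository names and formats rather than the real survey data:

```python
from collections import Counter


def tally_formats(formats_by_repo):
    """Count how many repositories advertise each metadata format."""
    counts = Counter()
    for prefixes in formats_by_repo.values():
        counts.update(set(prefixes))  # count each format once per repository
    return counts


# Hypothetical input, not the attached results
example = {
    "repo-a": ["oai_dc", "rdf"],
    "repo-b": ["oai_dc"],
    "repo-c": ["oai_dc", "mets"],
}
counts = tally_formats(example)
```

Calling counts.most_common() then lists the formats from most to least widely supported.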
-Jessica
Wow! Ok, I see the addition now. There's a mini button with the ORE
symbol on it when I go to this page. I like it! It's a nice way to
navigate and certainly easier than browsing the xml. It makes it
easier to think about aggregation objects.
-Jessica
It does, however, work for this journal:
http://www.jstor.org/stable/j100001
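As I understand it, the script picks the identifier out of a URL like this one and asks foresite for the matching ReM. The real userscript is JavaScript and I haven't read its matching logic, so the Python below is only my guess at the idea: the regular expression and the name rem_url_for are assumptions.

```python
import re

# Assumed pattern for a JSTOR stable URL; the userscript may match differently
JSTOR_STABLE = re.compile(r"^https?://www\.jstor\.org/stable/([A-Za-z0-9.]+)/?$")
FORESITE_BASE = "http://foresite.cheshire3.org/stable/ore/"


def rem_url_for(jstor_url):
    """Return the foresite ReM URL for a JSTOR stable URL, or None if no match."""
    match = JSTOR_STABLE.match(jstor_url)
    if match is None:
        return None
    return FORESITE_BASE + match.group(1)


rem = rem_url_for("http://www.jstor.org/stable/j100001")
```

That would also explain why nothing happened when I visited the rdf.xml URL directly: it is not a JSTOR page, so there is no identifier to find.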
Feedback-ish thoughts:
1. On the UI level you might consider putting in a way to "exit" the
visualization mode.
2. Does this mean that citations are not included in the JSTOR
aggregations? The couple of JSTOR articles I've looked at don't seem to
have citation relationships with anything. (Looking at j100378, the
Stanford Law Review, again.)
3. The "included in" relationship makes for a strange graph because
two different nodes represent the same thing after expanding the
"included in" node.
4. When there are many issues in a journal, the number of nodes seems
overwhelming. Perhaps you should be able to type in a number?
Cheers! I hope you find some of the feedback helpful.