Accessing Fedora RDF triples (RELS-EXT or Resource Index) to selectively add Islandora Collection Model relationships to objects

Jonathan M. Lane

Jun 2, 2011, 2:15:14 PM
to islandora
Hello,

I've run into an issue when migrating large data sets from an old
Fedora 2.x repository that used an early version of Islandora where
the RDF <hasModel xmlns="info:fedora/fedora-system:def/model#"
rdf:resource="info:fedora/islandora:collectionCModel"></hasModel>
statement was not used to build collections in the Islandora
repository views. Without this RDF statement in the RELS-EXT
datastreams of my collection objects (which are not named in a way
that makes them easy to identify), I cannot see these collection
objects in Drupal, nor any of their child (read: member) objects.

Adding the missing RDF statement to the collection objects will fix my
problem; however, I am seeking a way to use the Fedora REST APIs to
pull out the information I need to build the appropriate RDF
statements and update the RELS-EXT datastreams on the appropriate
objects. I'm working with some Ruby libraries that will let me do what
I need to do via the Fedora REST APIs, once I acquire a list of the
unidentified collection objects.
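
For illustration, the RELS-EXT change itself can be sketched in Python
(a rough sketch only, not the Ruby code referred to above). The function
name add_collection_model is hypothetical, and it assumes the RELS-EXT
datastream has already been fetched as an XML string with a single
rdf:Description element. Writing the result back to Fedora is left out.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MODEL_NS = "info:fedora/fedora-system:def/model#"
COLLECTION_CMODEL = "info:fedora/islandora:collectionCModel"


def add_collection_model(rels_ext_xml: str) -> str:
    """Return RELS-EXT XML that asserts the collection content model.

    Hypothetical helper: assumes one rdf:Description per RELS-EXT document.
    """
    # Keep readable prefixes when the tree is serialized again.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("fedora-model", MODEL_NS)

    root = ET.fromstring(rels_ext_xml)
    description = root.find(f"{{{RDF_NS}}}Description")
    if description is None:
        raise ValueError("RELS-EXT has no rdf:Description element")

    # Do nothing if the statement is already present.
    for existing in description.findall(f"{{{MODEL_NS}}}hasModel"):
        if existing.get(f"{{{RDF_NS}}}resource") == COLLECTION_CMODEL:
            return rels_ext_xml

    has_model = ET.SubElement(description, f"{{{MODEL_NS}}}hasModel")
    has_model.set(f"{{{RDF_NS}}}resource", COLLECTION_CMODEL)
    return ET.tostring(root, encoding="unicode")
```

Calling the function a second time on its own output returns the input
unchanged, so it should be safe to re-run over a partially fixed list.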

The only way I can think of to identify these collection objects
easily is by searching the Resource Index (Mulgara) for triples whose
subjects are in a specific namespace, with an
info:fedora/fedora-system:def/relations-external#isMemberOfCollection
predicate and an object matching one of my
info:fedora/problematic:collections. I can then reduce that list to
the unique instances of info:fedora/problematic:collections to
generate a reasonably accurate list of the collection objects needing
to be updated.

Finally, my question is whether there is a more efficient way
to pull down this list of collection object PIDs than just getting a
flat list of all triples across all namespaces and locally filtering
or matching by my parameters, for example by using some sort of
pattern-matching query language on the Resource Index, or some other
mechanism that would allow me to determine which objects should be
collections without the hasModel predicate. Ideally, pattern matching
or namespace filtering in the queries to the Resource Index would help
me get the job done without locally parsing all RDF statements from my
repository demonstrating membership in a collection.
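
To make the request concrete, here is a small Python sketch of how a
predicate-only query could be sent to risearch over HTTP. It assumes the
risearch endpoint accepts the type, lang, format, query and limit
parameters and an SPO query with * wildcards; the function name and the
base URL in the usage note are hypothetical.

```python
from urllib.parse import urlencode

IS_MEMBER_OF_COLLECTION = (
    "info:fedora/fedora-system:def/relations-external#isMemberOfCollection"
)


def build_membership_query_url(base_url: str, limit: int = 0) -> str:
    """Build a risearch URL asking for every isMemberOfCollection triple.

    Assumption: the server exposes risearch under <base_url>/risearch.
    """
    # SPO query: any subject, fixed predicate, any object.
    spo_query = f"* <{IS_MEMBER_OF_COLLECTION}> *"
    params = {
        "type": "triples",
        "lang": "spo",
        "format": "N-Triples",
        "query": spo_query,
    }
    if limit > 0:
        params["limit"] = str(limit)
    return f"{base_url.rstrip('/')}/risearch?{urlencode(params)}"
```

For example, build_membership_query_url("http://localhost:8080/fedora")
yields a URL that can be fetched with any HTTP client. This narrows the
result to membership triples only, but does not filter by namespace.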

Thanks,

Jonathan M. Lane
Information Technology Coordinator
-----
AIRS: Advancing Interdisciplinary Research in Singing (SSHRC MCRI)
Department of Psychology, University of Prince Edward Island
Charlottetown, Prince Edward Island, Canada, C1A 4P3
-----
+1 (902) 566-6023—Lab
+1 (902) 940-2320—Mobile
http://www.airsplace.ca/

Alexander O'Neill

Jun 2, 2011, 2:30:51 PM
to isla...@googlegroups.com
Hi Jonathan,

If you go to the risearch application on the Fedora server, you should be able to do a search for all objects that are members of a collection by going to the 'Find Triples' tab and searching for something like 

* <info:fedora/rel:isMemberOfCollection> <info:fedora/some:pid>

(syntax isn't exact since I no longer run Fedora - this is from memory.) 

This will get you a subset of the whole triple store that is relevant to what you're looking for.

Cheers,

 -- alexander


--
Alexander O'Neill - alon...@gmail.com
Blog: http://tvt.blogspot.com/
Twitter: http://twitter.com/alxp

Jonathan M. Lane

Jun 2, 2011, 3:08:38 PM
to isla...@googlegroups.com
On Thu, Jun 2, 2011 at 15:30, Alexander O'Neill <alon...@gmail.com> wrote:
> Hi Jonathan,
> If you go to the risearch application on the Fedora server, you should be
> able to do a search for all objects that are members of a collection by
> going to the 'Find Triples' tab and searching for something like
> * <info:fedora/rel:isMemberOfCollection> <info:fedora/some:pid>
> (syntax isn't exact since I no longer run Fedora - this is from memory.)
> This will get you a subset of the whole triple store that is relevant to
> what you're looking for.
> Cheers,
>  -- alexander

Indeed you are right, Alexander. The root of the problem is that the
collection objects' PIDs are unknown, whereas the subject (member)
objects' PIDs are easy to access (a long list of PIDs in a specific
namespace), effectively the inverse of your scenario. This is because
the "member" objects all point to the PID of their parent
(collections), but those parents lack the proper RDF statement to
assert their collection status directly (which is causing a problem in
the Islandora scope).

I'd like to avoid pulling down a list of all RDF triples across all
namespaces, by doing something like this in risearch, represented by
pseudo-SPO: "<info:fedora/myns:*>
<info:fedora/fedora-system:def/relations-external#isMemberOfCollection>
*".

I am starting to suspect that filtering the Resource Index the way I
wish to do is only possible with some custom code.
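
As a rough idea of what that custom code might look like, here is a
Python sketch that filters N-Triples output locally: it keeps only
isMemberOfCollection triples whose subject is in one namespace and
returns the unique collection PIDs. The function name and the simple
URI-only line pattern are assumptions made for illustration.

```python
import re

IS_MEMBER_OF_COLLECTION = (
    "info:fedora/fedora-system:def/relations-external#isMemberOfCollection"
)
FEDORA_PREFIX = "info:fedora/"

# Matches N-Triples lines in which all three terms are URIs.
TRIPLE_RE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def collection_pids(ntriples: str, member_namespace: str) -> list:
    """Return sorted unique collection PIDs referenced by one namespace."""
    subject_prefix = f"{FEDORA_PREFIX}{member_namespace}:"
    found = set()
    for line in ntriples.splitlines():
        match = TRIPLE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and literal-valued triples
        subject, predicate, obj = match.groups()
        if predicate != IS_MEMBER_OF_COLLECTION:
            continue
        if not subject.startswith(subject_prefix):
            continue
        if obj.startswith(FEDORA_PREFIX):
            # Strip the info:fedora/ prefix to leave the bare PID.
            found.add(obj[len(FEDORA_PREFIX):])
    return sorted(found)
```

With "myns" as the namespace, every parent named by a myns member is
reported once, however many members point at it.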

Thanks for your response.

Alexander O'Neill

Jun 2, 2011, 3:13:18 PM
to isla...@googlegroups.com
You could do something like

* <info:fedora/rel:isMemberOfCollection> *

Then capture all of the results from the 3rd column; these will all be collection objects that you can then add a hasModel statement to.

Jonathan M. Lane

Jun 6, 2011, 11:06:47 AM
to isla...@googlegroups.com
On Thu, Jun 2, 2011 at 16:13, Alexander O'Neill <alon...@gmail.com> wrote:
> You could do something like
> * <info:fedora/rel:isMemberOfCollection> *
> Then capture all of the results from the 3rd column, these will all be
> collection objects that you can then add a hasModel attribute to.

I am doing something similar (iTQL query with a template to give me
triple results). Thanks for the advice.

I think some of the matching functionality I'd like to use
in RISearch is available in SPARQL's regex function
<http://www.w3.org/TR/rdf-sparql-query/#funcex-regex>, though I am
uncertain whether SPARQL support in the Fedora-bundled
Mulgara is adequate to expect this functionality to be present.

From what I can tell, SPARQL seems to be the future of RDF query
languages, so perhaps I shouldn't rely on iTQL or SPO queries in the
future.
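
For reference, the kind of SPARQL query I have in mind could be built
like this in Python. This is only a sketch: the function name is
hypothetical, and whether the Mulgara bundled with Fedora actually
evaluates a regex FILTER is exactly the open question above, so treat it
as untested against a real repository.

```python
import re

IS_MEMBER_OF_COLLECTION = (
    "info:fedora/fedora-system:def/relations-external#isMemberOfCollection"
)

# Conservative check so the namespace can be pasted into the regex as-is.
SAFE_NAMESPACE_RE = re.compile(r"^[A-Za-z0-9-]+$")


def build_sparql_query(member_namespace: str) -> str:
    """Build a SPARQL query for collections referenced by one namespace."""
    if not SAFE_NAMESPACE_RE.match(member_namespace):
        raise ValueError(f"unexpected namespace: {member_namespace!r}")
    pattern = f"^info:fedora/{member_namespace}:"
    return (
        "SELECT DISTINCT ?collection WHERE { "
        f"?member <{IS_MEMBER_OF_COLLECTION}> ?collection . "
        f'FILTER regex(str(?member), "{pattern}") '
        "}"
    )
```

If the server supports it, the DISTINCT and the FILTER would do on the
server what otherwise has to be done by parsing triples locally.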
