Fedora::Repository find

Eric James

Apr 5, 2010, 5:27:13 PM

to active...@googlegroups.com

Hi,

I would like to do the following using RubyFedora:

http://localhost:8080/fedora/objects?pid=true&title=true&terms=mssa*&maxResults=20&resultFormat=xml

This returns the following:
<?xml version="1.0" encoding="UTF-8" ?>
<result xmlns="http://www.fedora.info/definitions/1/0/types/">
  <resultList>
    <objectFields>
      <pid>mssa:ms.0001</pid>
      <title>Huntington (Ellsworth) Papers</title>
    </objectFields>
....con't...
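
For reference, a response of this shape can be read with REXML from Ruby's standard library. This is only a sketch: the sample document is closed off by hand (the real response is truncated above), the XML declaration is left out, and `parse_find_objects` is a hypothetical helper name, not part of RubyFedora or ActiveFedora.

```ruby
require "rexml/document"

# Sample response closed off by hand; only the first <objectFields>
# entry comes from the message above.
SAMPLE_RESPONSE = <<~XML
  <result xmlns="http://www.fedora.info/definitions/1/0/types/">
    <resultList>
      <objectFields>
        <pid>mssa:ms.0001</pid>
        <title>Huntington (Ellsworth) Papers</title>
      </objectFields>
    </resultList>
  </result>
XML

# Hypothetical helper: returns one hash per <objectFields> element,
# keyed by the child element names (:pid, :title, ...).
def parse_find_objects(xml)
  doc = REXML::Document.new(xml)
  results = []
  # Walk the tree by element name to sidestep default-namespace XPath issues.
  doc.root.each_recursive do |el|
    next unless el.name == "objectFields"
    fields = {}
    el.elements.each { |child| fields[child.name.to_sym] = child.text }
    results << fields
  end
  results
end

parse_find_objects(SAMPLE_RESPONSE).each do |obj|
  puts "#{obj[:pid]}: #{obj[:title]}"
end
```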

I try to do this in the console:
require "ruby-fedora"
=>["Fedora"]
Fedora::Repository.register("fedoraAdmin:fedorapw@localhost:8080/fedora")
=>...

ro = Fedora::Repository.new("fedoraAdmin:fedorapw@localhost:8080/fedora")

=> ...

roxml = ro.find_objects("pid=true","title=true","terms=mssa*","maxResults=20","resultFormat=xml")

=> Fedora::ServerError: Failed with 500 Internal Server Error...

Can anyone point out how to get this code working?

Thanks,

Eric

Matthew Zumwalt

Apr 5, 2010, 6:45:25 PM

to active...@googlegroups.com

Eric,

Could you say more about what you want to achieve? There might be higher-level methods that help you to do it. Most people don't actually use Fedora's basic search utility for anything besides getting a list of pids because it can only index a very limited set of metadata. In particular, it only indexes Fedora's DC datastream, which you should think of as a reserved datastream for Fedora internal use. Any of your own descriptive metadata, even DC metadata, should go in separate datastreams. I often put DC metadata into a datastream with dsid "dublin_core". Others use "descMetadata".

Instead of relying on Fedora's built-in basic search, you should be indexing your objects and their metadata in something like Solr and then using that to construct any search-driven functionality. ActiveFedora provides you with many methods to assist with this.

Also, please make sure that you have the latest version of ActiveFedora (1.1.2) installed.

You also might want to tour through the Hydrangea code (work in progress, slated for beta release around July) for some guiding examples.

Matt Zumwalt

MediaShelf, LLC

http://www.yourmediashelf.com

Matthew Zumwalt

Apr 5, 2010, 6:49:50 PM

to active...@googlegroups.com

BTW: Have you already tried the ActiveFedora console tour? http://projects.mediashelf.us/wiki/active-fedora/ActiveFedora_Console_Tour

The hydrangea code is here: http://github.com/projecthydra/hydrangea

Matt Zumwalt

MediaShelf, LLC

http://www.yourmediashelf.com

Eric James

Apr 6, 2010, 4:26:42 PM

to active...@googlegroups.com

Matt,

The overall goal is to become familiar with ruby/rails and rubyfedora/activefedora. The specific goal is to add proai support to our finding aids collection by:

1) getting the PIDs for each collection (each collection has its own PID prefix, so a wildcard search should do it).
2) adding the relationships to RELS-EXT that proai requires:
   a) a fedora:isMemberOfCollection predicate pointing to the collection object
   b) an oai:itemID predicate set to the itemID
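
To make step 2 concrete, here is a rough sketch of the RELS-EXT RDF those two relationships would produce, built with REXML from Ruby's standard library rather than with ActiveFedora. The helper name `rels_ext_xml`, the collection PID `mssa:collection`, and the item ID `oai:example.org:mssa:ms.0001` are made up for illustration; the `rel` prefix stands in for the Fedora relations namespace written as `fedora:` above.

```ruby
require "rexml/document"

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL_NS = "info:fedora/fedora-system:def/relations-external#"
OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Hypothetical helper: returns RELS-EXT RDF/XML for one object as a string.
def rels_ext_xml(pid, collection_pid, item_id)
  doc = REXML::Document.new
  rdf = doc.add_element("rdf:RDF",
                        "xmlns:rdf" => RDF_NS,
                        "xmlns:rel" => REL_NS,
                        "xmlns:oai" => OAI_NS)
  desc = rdf.add_element("rdf:Description", "rdf:about" => "info:fedora/#{pid}")
  # a) membership in the collection object (a resource reference)
  desc.add_element("rel:isMemberOfCollection",
                   "rdf:resource" => "info:fedora/#{collection_pid}")
  # b) the OAI item identifier (a literal value)
  desc.add_element("oai:itemID").text = item_id
  doc.to_s
end

# Illustrative values only; the collection PID and item ID are invented.
puts rels_ext_xml("mssa:ms.0001", "mssa:collection", "oai:example.org:mssa:ms.0001")
```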

I am familiar with Java and could do this pretty easily using the fedora-client with API-A and API-M, but I would like to try it with ActiveFedora.

The finding aids collection is currently set up with a Solr index populated by Fedora GenericSearch, but the changes needed for the task above won't affect the fields indexed by Solr. Thanks for the Hydra link, though. I'm very interested in the Shelver, and in how to customize the FOXML->Solr indexing in Ruby.

Eric

Matthew Zumwalt

Apr 6, 2010, 5:21:28 PM

to active...@googlegroups.com

Hi Eric,

Shelver & Hydrangea are works in progress and will be changing substantially over the next 3 months. If possible, it would be best to wait for the Beta release around Open Repositories, which will be accompanied by documentation.

I recommend going through the ActiveFedora console tour. It will show you how to use active fedora to define object models, load fedora objects based on those models, add relationships, and save those objects. I have not updated it in a while. Let the list know if anything does not work properly.

Since you're using gsearch, you should turn off activefedora's solr indexing. Right now you do that by setting UPDATE_SOLR_INDEX = false somewhere in your code.

The latest version of active fedora supports custom mapping of metadata fields to solr field names. You could use that to get activefedora to use the gsearch-populated solr info as-is. As with Shelver and Hydrangea, this feature will be better documented in July, but it is there now. If you want to explore, try tinkering with the solr_mappings.yml file in the config directory (either within your rails app or within the activefedora gem itself).

In the meantime, as far as search stuff, activefedora is written with the assumption that you will probably prefer to use something like RSolr to get directly at the solr index. You can rely on ActiveFedora's minimal methods, but eventually you will want to use a more task-specific library.

As for your first item, getting the pids of the objects with a specific pid prefix:

Fedora::Repository.instance.find_objects("pid~druid*",:limit=>20, :select => [:pid, :title]).each {|fo| puts "#{fo.pid}, #{fo.label} "}

Originally, find_objects gave you an option of grabbing the raw fedora response, but that seems to have disappeared as everyone has drifted in the direction of using higher level libraries. Instead, it returns a wrapper for the Fedora objects in the result set. As a result, the :select limiter doesn't really do much for you. Here is how you would iterate through the result set outputting the pid and the label for each object:

result = Fedora::Repository.instance.find_objects("pid~druid*", :limit => 20)

result.each do |fo|
  puts "PID: #{fo.pid}, Label: #{fo.label}"
end
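
If the raw Fedora response is still what you need, one option outside ActiveFedora is to call the REST search endpoint from the first message directly. The sketch below uses only Ruby's standard library to rebuild that URL; `find_objects_uri` is a hypothetical helper name and the base URL and defaults are simply the values quoted earlier in the thread.

```ruby
require "uri"

# Hypothetical helper: builds the findObjects URL shown in the first message.
# Each entry in `fields` becomes a "<field>=true" parameter.
def find_objects_uri(base_url, terms, fields: %w[pid title], max_results: 20)
  params = fields.map { |field| [field, "true"] }
  params << ["terms", terms]
  params << ["maxResults", max_results]
  params << ["resultFormat", "xml"]

  uri = URI.parse("#{base_url}/objects")
  uri.query = URI.encode_www_form(params)
  uri
end

uri = find_objects_uri("http://localhost:8080/fedora", "mssa*")
puts uri

# Fetching requires a running Fedora, so it is left commented out:
# require "net/http"
# xml = Net::HTTP.get(uri)
```

Keeping the URL construction in one small function makes it easy to swap the search term per collection prefix before handing the response to an XML parser.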

Getting the title is another issue, since that's metadata in a datastream. In that case, you should define an activefedora model corresponding to your objects and then use it to access the metadata.

class MyEADModel < ActiveFedora::Base
  # Of course, you could just use the QualifiedDublinCoreDatastream class here, but I'm just giving an example.
  # Caveat: doing stuff with the DC datastream might break stuff. I recommend putting your descriptive metadata elsewhere.
  has_metadata :name => "DC", :type => ActiveFedora::MetadataDatastream do |m|
    m.field "title"
  end
end

result = Fedora::Repository.instance.find_objects("pid~druid*", :limit => 20)

result.each do |fo|
  # Load each search hit as an instance of the model so its datastreams are accessible.
  ead_object = MyEADModel.load_instance(fo.pid)
  puts ead_object.datastreams["DC"].title_values
end

There are numerous other ways to load objects from fedora, but this one spells out many of the low-level methods you could use.

Matt Zumwalt

MediaShelf, LLC

http://www.yourmediashelf.com
