Rick Johnson
Oct 13, 2010, 4:51:01 PM
to Rick Johnson, matt.z...@yourmediashelf.com, hydra...@googlegroups.com, active...@googlegroups.com
Hi Matt,
Resending now to hydra-tech and active-fedora...
To answer your questions, the primary motivation for load_instance_from_solr is to avoid reading datastreams directly from Fedora, and it should only be used in read-only views. It was most useful for us in generating browse views based on relationships and metadata values (we built the browse views on ActiveFedora, not Blacklight). So, while only touching solr, we still had access to all the active-fedora helper methods instead of dealing with the solr symbols that active-fedora generated for us anyway, which made our code a lot cleaner. It was also pretty efficient when working only with the metadata datastream fields defined within model classes.
Until now, I have not tried to add support for any NokogiriDatastreams, but I think I have a decent grasp of how they work after looking at the examples Banu brought back from Hydra Boot Camp last week. Ideally, with a NokogiriDatastream as well, it would not actually parse or generate any XML (again, what is retrieved and stored is never meant to be saved back). Instead, it would just store values in memory for use in the UI.
I have not tested any of the code yet, and because of the new wrinkle of dealing directly with hierarchical structures, it may not increase performance. I have the code mostly written and will be testing soon. I am essentially doing the same thing as the solrize_term and solrize_node methods, except that the final step writes to the datastream (again, in memory only) instead of writing to the solr_doc.
If all goes well, I should have some code to review soon.
Thanks,
Rick
On Wed, Oct 13, 2010 at 4:37 PM, Rick Johnson
<rick.j...@nd.edu> wrote:
________________________________________
From: Matthew Zumwalt [matt.z...@yourmediashelf.com]
Sent: Wednesday, October 13, 2010 4:21 PM
To: Rick Johnson
Subject: Re: Update to NokogiriDatastream and Solrizer
Hi Rick,
Could you re-send this to either the active-fedora list or the hydra-tech list? We need this type of conversation floating out in the open so people will know what's going on. I will resend the info below in response:
Wasn't the motivation for load_instance_from_solr to expedite loading content into the application? The process you described sounds substantially slower and more prone to bugs than just loading the XML with nokogiri and accessing the values using OM.
I think of Solrizer as a tool to provide to_solr behaviors so that you can transform content into solr documents. Until now I haven't thought of it as a library that would provide from_solr behaviors.
Are you sure that it's even possible to roundtrip data between hierarchical xml and a solr document? That's a difficult thing to navigate and might be even more difficult to support over the long term.
Matt Zumwalt
MediaShelf, LLC
http://www.yourmediashelf.com
On Oct 13, 2010, at 2:14 PM, Rick Johnson wrote:
Hi Matt,
I am working through adding support to load_instance_from_solr for Nokogiri datastreams. I have figured out most of the ins and outs and am ready to start coding. I am going to mimic the behaviour of solrize_term and solrize_node in order to populate a Nokogiri datastream object from solr. The idea is to pass in a solr doc that contains the object's data, iterate through all mappings defined in the terminology, and check whether the appropriate solr name exists in the doc. Then, instead of updating the solr doc (as is the case with the to_solr-related methods), it calls update_indexed_attributes with the right term_pointer and the value from the solr_doc.
So, my question is where the code should live. I am leaning towards putting from_solr and the methods it uses in Solrizer::XML::TerminologyBasedSolrizer (especially since the solrize_node method in NokogiriDatastream does not appear to be used anymore). Then, I would remove any implementation of from_solr from the NokogiriDatastream class itself. Make sense?
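A minimal sketch of the loop described above, in plain Ruby so it runs without ActiveFedora or Solrizer. Everything here is an assumption made for illustration: the InMemoryDatastream class, the TERMINOLOGY hash, the solr field names such as "title_t", and the hash-based signature given to update_indexed_attributes are hypothetical stand-ins, not the libraries' actual API.

```ruby
# Hypothetical stand-in for a Nokogiri-backed datastream. It only keeps
# values in memory and never parses or generates XML.
class InMemoryDatastream
  attr_reader :values

  def initialize
    @values = {}
  end

  # Shares its name with the method mentioned in the thread, but this
  # signature (a hash of term pointer => values) is an assumption.
  def update_indexed_attributes(params)
    params.each { |term_pointer, vals| @values[term_pointer] = Array(vals) }
  end
end

# Hypothetical terminology: term pointer => solr field name.
TERMINOLOGY = {
  [:title]       => "title_t",
  [:name, :last] => "name_last_t",
  [:abstract]    => "abstract_t"
}.freeze

# Walk every mapping in the terminology; when the solr doc has the matching
# field, write the value to the datastream instead of to a solr doc.
def from_solr(solr_doc, datastream, terminology = TERMINOLOGY)
  terminology.each do |term_pointer, solr_name|
    next unless solr_doc.key?(solr_name)
    datastream.update_indexed_attributes({ term_pointer => solr_doc[solr_name] })
  end
  datastream
end

solr_doc = { "title_t" => ["A Title"], "name_last_t" => ["Johnson"] }
ds = from_solr(solr_doc, InMemoryDatastream.new)
```

Terms with no matching solr field (here [:abstract]) are simply skipped, so the datastream holds only what the solr doc actually contained.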
Thanks,
Rick
--
----------------------------------------------------------
Rick Johnson
Unit Manager, Digital Library Applications and Local Programming Unit
Library Information Systems
University of Notre Dame
Michiana Academic Library Consortium
Notre Dame, IN USA 46556
http://www.library.nd.edu
574-631-1086
------------------------------------------------------------