MST to Blacklight


David Kennedy

Mar 2, 2010, 9:57:58 AM
to The eXtensible Catalog
I am really excited about the MST after seeing it at code4lib last
week. And so we have installed it here at Duke, and are pulling from
some of our local OAI-PMH capable data sources into the MST. Kudos on
the design and the interface, very well done.

I would then like to pull from the MST into a separate Solr instance
that uses the Blacklight schema. I am writing to see if anyone else has
done this, or if I am conceptually missing something in the XC
toolkits that already does some of this.

It seems to me that the model is to regularly harvest from the MST with
OAI-PMH and update the separate Solr instance from there, but I thought
I would check to see if I am missing something.
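
To make the harvest-then-update model concrete, here is a minimal sketch in Python. It only builds an OAI-PMH ListRecords request URL and parses the record headers and resumption token out of a response; fetching over HTTP and posting to Solr are left out. The base URL and the oai_dc metadata prefix are assumptions for illustration, not something the MST is known to require, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, from_date=None, token=None):
    """Build a ListRecords URL (hypothetical helper).

    A resumptionToken is an exclusive argument in OAI-PMH, so when one is
    given no other arguments are sent alongside it.
    """
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record headers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", OAI_NS):
        header = rec.find("oai:header", OAI_NS)
        records.append({
            "id": header.findtext("oai:identifier", namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            # Deleted records should be removed from the Solr index.
            "deleted": header.get("status") == "deleted",
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    return records, (token or None)
```

A regular harvest would call list_records_url with the date of the last run, follow resumption tokens until none is returned, and then add or delete the corresponding Solr documents.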

Thanks
Dave Kennedy

Király Péter

Mar 2, 2010, 3:25:51 PM
to The eXtensible Catalog
Hi David,

There is a module in the XC system called Drupal Toolkit. It is a set of
Drupal modules, and available at http://drupal.org/project/xc.

One part of the module does the same thing you intend to do:
it harvests the MST (or any other OAI-PMH data source) and stores the
records in Solr (and in MySQL). I can imagine a situation where you don't
want to change your existing Blacklight installation and want to use the
Drupal Toolkit only for this purpose.

We use Solr in a kind of schema-less manner: the only mandatory
field is the identifier of the record, and the rest is based on dynamic
fields. We initially map each XC record field to a dynamic type (text,
phrase, date, etc.), and use suffixes so that Solr knows the proper
type. For example, 'dc:title' becomes dc__title_t. The conversion back
and forth is automatic and based on simple rules, so it can be imitated
in Blacklight.
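
As an illustration of how such a rule could be imitated, here is a small Python sketch. It is inferred only from the single example above ('dc:title' to dc__title_t): the colon becomes a double underscore and a type suffix is appended. The function names are hypothetical, only the "t" suffix comes from this thread, and the actual Drupal Toolkit rules may differ.

```python
def to_solr_field(xc_name, suffix="t"):
    """Map an XC field name to a Solr dynamic field name.

    Assumed rule: ':' becomes '__', then '_' plus a type suffix is appended.
    Only the 't' (text) suffix is confirmed by the example in this thread.
    """
    return xc_name.replace(":", "__") + "_" + suffix


def from_solr_field(solr_name):
    """Reverse the mapping: return (XC field name, type suffix)."""
    base, suffix = solr_name.rsplit("_", 1)  # split off the type suffix
    return base.replace("__", ":"), suffix


print(to_solr_field("dc:title"))       # dc__title_t
print(from_solr_field("dc__title_t"))  # ('dc:title', 't')
```

On the Solr side, a dynamic field pattern such as *_t in the schema is what lets these names be indexed without declaring each field.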

It is possible to write a distinct Solr schema, but the XC record schema is
based upon the draft of RDA, so it may be modified in the future. That is
one of the reasons why we use dynamic fields. The other is that we like to
keep Drupal Toolkit schema-independent and easily pluggable.

If you want to try Drupal Toolkit, please contact me: it is in alpha
state, and the suggested version keeps changing between the development
package, the alpha package, and a direct checkout from CVS.
My email is pki...@tesuji.eu, and you can reach me on Skype as
kirunews.

Hope this helps!

Regards,
Péter Király


David Kennedy

Mar 4, 2010, 9:31:14 AM
to extensibl...@googlegroups.com
Kiraly,

Thank you for the tip. Before I read your response, I went a different
route, probably a little unorthodox, but it seems to work.

I have an indexing engine running separately from the MST and Solr. The
indexing engine acts as a queue, and items can be queued up from multiple
data sources for indexing. The queueing of items is done through web service
calls. The indexing of the queued items is then done by callbacks to services
running on the various data sources.

In the case of the MST as a data source, I have written an MST service that
queues records in the indexing queue as they are harvested from a couple of
different local data sources. The second part I haven't completed, but I
intend to have the indexing engine then contact the MST's OAI-PMH service to
retrieve the harvested and possibly normalized, frbrized (whatever) record
for building the Solr index document. I haven't figured out yet how the MST's
OAI-PMH service works, but that is a subject for another post.
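
The queue-and-callback arrangement described above could be sketched roughly as follows in Python. This is an in-memory stand-in, not the actual engine: the class and method names are hypothetical, the web service calls are replaced by plain method calls, and the fetch callback stands in for a request to a data source such as the MST's OAI-PMH service.

```python
from collections import deque


class IndexingQueue:
    """Hypothetical in-memory model of a queue-based indexing engine."""

    def __init__(self):
        self._fetchers = {}      # source name -> callback returning a document
        self._pending = deque()  # (source name, record id) awaiting indexing

    def register_source(self, name, fetch):
        """Register the callback used to retrieve a record from a source."""
        self._fetchers[name] = fetch

    def enqueue(self, source, record_id):
        """Queue a record; in the real setup this would be a web service call."""
        if source not in self._fetchers:
            raise KeyError("unknown data source: " + source)
        self._pending.append((source, record_id))

    def process(self, index):
        """Call back to each source for its record and pass it to `index`."""
        count = 0
        while self._pending:
            source, record_id = self._pending.popleft()
            document = self._fetchers[source](record_id)
            index(document)  # e.g. post the document to Solr
            count += 1
        return count
```

For example, a source named "mst" could be registered with a callback that issues an OAI-PMH GetRecord request for the queued identifier, with process() handed a function that posts the resulting document to Solr.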

Dave
