PCI within Metalib/Xerxes

Jonathan Rochkind

May 1, 2012, 2:56:33 PM
to xerxes-portal
We're doing a bit of playing around with Primo Central Index as a
Metalib resource, via Xerxes.

* One thing we note is that, in Xerxes at least, the publisher/database
provider is often showing up in the displayed "language" field.

David, have you noticed that or looked into it at all? Bug in PCI? Bug
in Metalib? Anything we can do to easily work around in Xerxes?


* The other thing we note is that "de-duping" is particularly bad for PCI
via Metalib -- the same article often shows up multiple times, right
next to each other. I know PCI tries to do some de-duping within their
index; obviously that's failing. But I also know Metalib tries to do
some de-duping of its fetched source results, and that seems to be failing
too. Xerxes doesn't disable Metalib's de-duping in any way, does it?

(Yes, I am doing a multi-resource search in Metalib, PCI + something
else. But that brings us to the next question....)

* In Xerxes/Metalib, if you are searching only _one_ Metalib resource,
you lose the facets, sorting, etc. I know this was a flaw/lack of
feature in Metalib. But I sort of vaguely remember Metalib maybe fixed
or improved this at some point; is there any way these days to get
Xerxes to use Metalib faceting/ranking/sorting/etc even for a single
database search, or is it still missing func in Metalib?



Thanks!

Jonathan

Walker, David

May 2, 2012, 7:34:52 PM
to xerxes...@googlegroups.com
Hey Jonathan,

I haven't looked at Primo Central via Metalib in a long time -- at least a year or maybe even a year and a half.

I'm certain we can map the database/provider info to the correct field in Xerxes so it doesn't appear as a language note. Any legwork you can do on that (like taking a peek at the Metalib MARC fields for these records) would be appreciated.

We thought the de-duping in PCI was pretty bad too.

We are not (purposefully) doing anything to change Metalib de-duping in Xerxes. But there's always the possibility of a Xerxes, or (more likely) a Metalib X-Server, bug. A quick comparison to results in the native Metalib interface should allow you to rule that out.

The "loss" of facets and sorting when searching a single database is not exactly a Metalib limitation, so much as a choice I made in Xerxes. We could, if we wanted to, create a "merged" set of that individual database result set, and Metalib would dutifully create clusters and sorting for it.

The problem, of course, is that those clusters and sorting only apply to the first 30 records, and, in that way, have always felt like false advertising. You're not really sorting the whole result set. And the "facets" aren't really facets. They're just clustered information extracted from terms in the first few records, and only show you results from the first few records.
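
To illustrate the point about clusters built from only the first few records, here is a minimal Python sketch. It is not Metalib or Xerxes code; the `cluster_counts` function, the `year` field, and the sample data are all made up for illustration. It only shows how counts taken from the first 30 records can misrepresent the full result set.

```python
from collections import Counter

def cluster_counts(records, field, limit=30):
    """Count field values in only the first `limit` records (hypothetical helper)."""
    return Counter(r[field] for r in records[:limit] if field in r)

# Made-up result set: the first 30 hits are from 2012, the remaining 70 from 2011.
records = [{"year": "2012"}] * 30 + [{"year": "2011"}] * 70

clustered = cluster_counts(records, "year")                  # sees the first 30 only
true_facets = cluster_counts(records, "year", limit=len(records))

print(clustered)    # only 2012 appears
print(true_facets)  # 2011 is actually the larger group
```

In this contrived case the "facet" list would never offer 2011 at all, even though most of the results fall there.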

It's not a horrible feature, especially if you are applying it to an actual merged set of, say, four or five databases. But it just seems less useful in the individual result set. You also lose the ability to go beyond the first 30 results, something that is only possible in Xerxes when you are in an individual database result set (another decision of mine, but one also based on the odd behavior of Metalib).

Who's ready for a discovery system? ;-)

--Dave

-------------------------
David Walker
Interim Director, Systemwide Digital Library Services
California State University
562-355-4845

Jonathan Rochkind

May 3, 2012, 9:49:42 AM
to xerxes...@googlegroups.com, Walker, David
Yeah, we're looking at using PCI in Metalib as a temporary stopgap while
we investigate other 'article discovery' options--which could take a
while.

On the "database/provider in language" -- I checked Metalib/V, and it
has the same problem -- although the UI makes it so hard to find the
'language' field in Metalib, that few might notice.

I have filed a support ticket with Ex Libris. I expect to never hear back.

So meanwhile, I know Xerxes somehow works around metadata problems in a
resource-specific way, and I think it might be worth investigating whether
this can easily be done for PCI. What's the easiest way for me to get access
to the individual MARC records returned by Metalib for PCI?

As far as applying facets/sorting to a single db search -- I hear what
you're saying. But still think it could be useful, especially for
certain resources like PCI. How hard would it be to modify Xerxes to
allow optionally applying "merged set" behavior to a single resource
search? Possibly configured on a per-resource basis?

Is it possible to increase the "num items fetched" for a _particular_
resource to be different than the default? PCI is so fast, I'd like to,
if possible, increase the 'num items fetched' for 'merged set' to
significantly more than the default 50 --- either as general 'always'
configuration, or even possibly _just_ to be applied when searching PCI
as a single resource 'merged set'. I don't know if Xerxes can supply the
'num items fetched' on a per-database or per-search basis, or if it's
just global config.
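
A rough sketch of the kind of per-resource override being asked about, written in Python purely for illustration. None of these names are real Xerxes or Metalib settings: `merged_set_for_single_resource`, `records_to_fetch`, the resource ID `PCI`, and the value 200 are all hypothetical. The idea is just a global default that a single resource can override.

```python
# Hypothetical settings; these are NOT real Xerxes configuration options.
GLOBAL_DEFAULTS = {
    "merged_set_for_single_resource": False,
    "records_to_fetch": 50,
}

# Hypothetical per-resource overrides, keyed by an assumed resource ID.
PER_RESOURCE_OVERRIDES = {
    "PCI": {"merged_set_for_single_resource": True, "records_to_fetch": 200},
}

def settings_for(resource_id):
    """Return the global defaults with any overrides for this resource applied."""
    settings = dict(GLOBAL_DEFAULTS)  # copy so the defaults are never mutated
    settings.update(PER_RESOURCE_OVERRIDES.get(resource_id, {}))
    return settings

print(settings_for("PCI"))
print(settings_for("SOME_OTHER_DB"))
```

Whether Metalib would actually honor a larger fetch size for one resource is a separate question that this sketch does not answer.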

Jonathan

Walker, David

May 8, 2012, 9:37:23 PM
to Jonathan Rochkind, xerxes...@googlegroups.com
Probably the simplest way to look at the MARC is in /V, where there is some kind of link to that in the record.

Alternately, you can add &format=xerxes to the URL in Xerxes, and buried in the output is a URL to Metalib. Cut and paste that into your browser, and you can see the MARC-XML.
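
For anyone scripting this rather than editing the address bar by hand, here is a minimal Python sketch of appending the `format=xerxes` parameter to a record URL. The example host and the other query parameters are placeholders, not a real Xerxes installation; only the `format=xerxes` parameter comes from the message above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_xerxes_format(url):
    """Return `url` with format=xerxes set, replacing any existing format value."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "format"]
    query.append(("format", "xerxes"))
    return urlunsplit(parts._replace(query=urlencode(query)))

# Placeholder URL; substitute the record URL from your own Xerxes instance.
example = "http://example.edu/xerxes/?base=metasearch&action=record&id=1"
print(with_xerxes_format(example))
```

The Metalib URL still has to be picked out of the resulting XML by eye, since its exact location in the output is not described here.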

You can definitely address any mapping issues in lib/Xerxes/metalib/MetalibRecord

Everything else you mentioned is definitely possible. Not sure you want to devote this much effort for PCI in Metalib, though. Just my opinion.

Jonathan Rochkind

May 9, 2012, 10:52:58 AM
to Walker, David, xerxes...@googlegroups.com
Yeah, don't want to devote too much effort to PCI in Metalib (at least
not yet), but a bit.

Wanna start by seeing if I can fix the Language/Publisher issue.

Now I'm having a problem where for some reason my Metalib /X won't let
me access from my browser; I guess it does IP-based authentication
somehow? Okay, time to check the Metalib docs.

Jonathan Rochkind

May 9, 2012, 10:57:14 AM
to xerxes...@googlegroups.com
Okay, found the original MARC in Metalib/V.

Indeed, this is all jammed into one terrible MARC field in the
Metalib-provided MARC, so there's not much of anything we can do about it.

546 |a EnglishNaturePublishingGroupCengageLearning,Inc
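
A partial, heuristic cleanup is at least conceivable: peel a known language name off the front of the 546 value and treat the rest as provider text. The Python sketch below is only an illustration under that assumption. The language list is an arbitrary sample, the function name is made up, and any real fix in Xerxes would live in its own PHP record-mapping code rather than here.

```python
from typing import Optional, Tuple

# Arbitrary sample list; a real one would need to cover whatever PCI emits.
KNOWN_LANGUAGES = ("English", "French", "German", "Spanish", "Chinese")

def split_language_note(value: str) -> Tuple[Optional[str], str]:
    """Split a leading known language name from the rest of a 546 |a value."""
    # Try longer names first so a short name never shadows a longer one.
    for lang in sorted(KNOWN_LANGUAGES, key=len, reverse=True):
        if value.startswith(lang):
            return lang, value[len(lang):]
    return None, value

language, remainder = split_language_note(
    "EnglishNaturePublishingGroupCengageLearning,Inc"
)
print(language)   # English
print(remainder)  # NaturePublishingGroupCengageLearning,Inc
```

This recovers the language but not clean provider names, since the remainder has no delimiters, and it would misfire on a provider whose name happens to begin with a language name.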


I have a support ticket with EL that they're ignoring as usual; I'll add
this info to it.

Oh well.
