New MarcCombiningReader feature

Demian Katz

Sep 3, 2010, 11:54:36 AM
to solrma...@googlegroups.com

As a follow-up to yesterday’s conversation with Bob, I have attached a patch here:

http://jira.projectblacklight.org/jira/browse/SOLRMARC-4

This patch adds a new feature to the MarcCombiningReader: you can now configure which fields are used to determine matches when combining records. This allows, for example, merging a bibliographic record with an associated holdings record whose 004 field matches the bibliographic record’s 001 field (useful when dealing with full Voyager exports).

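To make the matching idea concrete, here is a minimal sketch in plain Java. It is not the patch itself and does not use marc4j: SimpleRecord and CombineSketch are hypothetical names invented for illustration, records are reduced to a map of tag to values, and the rule for which fields get copied during a merge is an assumption, not a description of what MarcCombiningReader actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a MARC record (tag -> values); NOT marc4j's Record.
final class SimpleRecord {
    final Map<String, List<String>> fields = new LinkedHashMap<>();

    SimpleRecord add(String tag, String value) {
        fields.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
        return this;
    }

    // First value for a tag, or null if the record has no such field.
    String first(String tag) {
        List<String> values = fields.get(tag);
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }
}

final class CombineSketch {
    // Walks the records in file order. A record is folded into the record
    // before it when its rightTag value equals that earlier record's leftTag
    // value; otherwise it starts a new output record.
    static List<SimpleRecord> combine(List<SimpleRecord> input,
                                      String leftTag, String rightTag) {
        List<SimpleRecord> output = new ArrayList<>();
        SimpleRecord current = null;
        for (SimpleRecord next : input) {
            String leftId = (current == null) ? null : current.first(leftTag);
            if (leftId != null && leftId.equals(next.first(rightTag))) {
                // Assumption: copy only non-control fields (tags not starting
                // with "00"), so the bib record keeps its own 001.
                for (Map.Entry<String, List<String>> e : next.fields.entrySet()) {
                    if (e.getKey().startsWith("00")) continue;
                    for (String value : e.getValue()) current.add(e.getKey(), value);
                }
            } else {
                current = next;
                output.add(current);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Two bib records, each followed by one holdings record (made-up data).
        List<SimpleRecord> dump = Arrays.asList(
            new SimpleRecord().add("001", "100").add("245", "Title A"),
            new SimpleRecord().add("001", "500").add("004", "100").add("852", "Main stacks"),
            new SimpleRecord().add("001", "101").add("245", "Title B"),
            new SimpleRecord().add("001", "501").add("004", "101").add("852", "Annex"));
        System.out.println(combine(dump, "001", "001").size()); // prints 4
        System.out.println(combine(dump, "001", "004").size()); // prints 2
    }
}
```

In this toy data, comparing 001 to 001 merges nothing and four records come out, while comparing the left record's 001 to the right record's 004 folds each holdings record into the bib record before it, leaving two.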

I would love to commit this so it can go into the next release; it’s going to be extremely useful here at Villanova. However, since I’m new to marc4j and am unfamiliar with testing procedures related to MarcCombiningReader, I didn’t want to commit it myself for fear of breaking something.

If anybody could apply the patch, review and test my changes, and provide feedback, I would really appreciate it! Some additional notes and comments about my work so far are attached to the JIRA ticket linked above.

Thanks,

Demian

Robert Haschart

Sep 3, 2010, 1:18:36 PM
to solrma...@googlegroups.com
Demian,

I'll happily review, test, and check in your patch.  Thanks.  Do you have any sample records that I can use to test the feature for matching on other fields?

-Bob

--
You received this message because you are subscribed to the Google Groups "solrmarc-tech" group.
To post to this group, send email to solrma...@googlegroups.com.
To unsubscribe from this group, send email to solrmarc-tec...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/solrmarc-tech?hl=en.

Demian Katz

Sep 3, 2010, 1:22:02 PM
to solrma...@googlegroups.com

Thanks, Bob!

I have attached the sample Voyager dump that I used while testing my change.  When I do not use the new custom setting, SolrMarc imports 88 records.  When I match left 001 to right 004, it merges holdings and imports only 44 records.

One test case that I haven’t tried yet, but which we should probably look at, is what happens if one bib record is followed by multiple holdings records.  I can create a fake sample if you want.

- Demian

holdings.mrc

Jonathan Rochkind

Sep 3, 2010, 1:38:17 PM
to solrma...@googlegroups.com
Just so Naomi doesn't have to say it: since you already have test records and everything, is it possible to write an automated test or two? I know writing tests for SolrMarc is still kind of tricky, though.

Robert Haschart

Sep 3, 2010, 1:54:58 PM
to solrma...@googlegroups.com
Demian,

Something seems to be wrong with the sample file you attached.  When saved, the file has a size of zero bytes.

-Bob

Demian Katz

Sep 3, 2010, 2:02:49 PM
to solrma...@googlegroups.com
I'm willing to write an automated test if somebody will give me some guidance on how to do it -- in fact, that's one of my comments on the JIRA ticket! I haven't even tried yet, though, since I've gathered from your past comments that it is difficult, and I have no prior experience with JUnit.

- Demian

Demian Katz

Sep 3, 2010, 2:07:06 PM
to solrma...@googlegroups.com

Strange… I was just able to download it successfully from the web archive of the Google Group:

http://groups.google.com/group/solrmarc-tech/attach/564cecfdba723939/holdings.mrc?part=4

(it should be around 55k).

I also tried to send it to you via Skype since you appear to be online… but maybe you aren’t seeing my messages for some reason.  If the Google link above doesn’t work, just ping me via Skype and I can try again that way.

thanks,
Demian

Naomi Dushay

Sep 3, 2010, 2:48:13 PM
to solrma...@googlegroups.com
I haven't looked at it in a long while, but

http://code.google.com/p/solrmarc/wiki/Testing

might be helpful.

I do have it on my plate to make it easier to write and run tests.

- Naomi

Demian Katz

Sep 3, 2010, 3:55:39 PM
to solrma...@googlegroups.com
It's entirely possible that I'm missing something, but it seems that the existing tests and related documentation focus on field mapping and indexing... I'm not sure if there's an established methodology for testing the combining reader, and I'm probably not the person to establish it... at least until I take a crash course in JUnit! But maybe this is another subject we should add to the agenda for our next call. I'm willing and interested to help with this... but a bit of real-time discussion will probably get me pointed in the right direction faster than me fumbling around on my own.

- Demian

Robert Haschart

Sep 3, 2010, 5:57:14 PM
to solrma...@googlegroups.com
The attachment on the mail message was 0k in size; however, using that URL worked, and the patched code for the MarcCombiningReader is working.

Also, that code is now included in the _just_ released 2.1.2 version of SolrMarc.

-Bob Haschart