deleting records on import


Walker, David

Jul 27, 2010, 10:58:09 AM
to solrma...@googlegroups.com
So, I'm getting a nightly dump of records that have changed in our catalog. And that dump will occasionally include records that have been marked as 'deleted' or 'suppressed'.

This is for an Innovative system, so the 907c will include values indicating if the record is deleted or suppressed. Following this message in the archive [1], I've added this to my indexing properties file:

# remove deleted and suppressed records

bcode3 = 907c, (map.delete_record_map), DeleteRecordIfFieldEmpty
map.delete_record_map.d = null
map.delete_record_map.n = null
map.delete_record_map.s = null
map.delete_record_map = keep

But this doesn't seem to delete existing records in the Solr index (that is, records that I indexed previously but now need to remove from the index). Maybe I've done something wrong?

Or is the above code simply telling SolrMarc to 'skip' (that is, not index) these records, and I need something different that tells it to go back and delete existing records?

--Dave

[1] http://groups.google.com/group/solrmarc-tech/browse_thread/thread/5d31f0f8f3758bfc

==================
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

Robert Haschart

Jul 27, 2010, 12:31:37 PM
to solrma...@googlegroups.com

Dave,

The DeleteRecordIfFieldEmpty should work exactly as you expect. If the
907c field contains a 'd', 'n', or 's', the record should be deleted from
the existing index. This functionality was added because we relied on it
here at UVa for hiding records that were designated as "shadowed". More
recently it was decided to still add those records to the index, but to
flag them as "shadowed" and have the Blacklight interface limit its
searches to not-shadowed items. However, the functionality should still
be in there and should still work. If this is not the behavior you are
seeing, then something must be wrong. I'll look at my local
implementation and see whether the records seem to be deleted.
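
The rule described above can be pictured with a minimal sketch. This is not SolrMarc's actual code: the class and method names (DeleteMapSketch, mapStatus, shouldDelete) are hypothetical, and it assumes only what the thread states, namely that the map turns the 907c codes 'd', 'n', and 's' into null, that any other code maps to "keep", and that an empty result is what marks the record for deletion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from SolrMarc.
class DeleteMapSketch {
    // Mirrors the map.delete_record_map entries from the original post.
    private static final Map<String, String> DELETE_RECORD_MAP = new HashMap<>();
    static {
        DELETE_RECORD_MAP.put("d", null);
        DELETE_RECORD_MAP.put("n", null);
        DELETE_RECORD_MAP.put("s", null);
    }

    // Assumed default for codes that are not listed in the map.
    private static final String DEFAULT_VALUE = "keep";

    // Translate a 907c code through the map, falling back to the default.
    static String mapStatus(String code) {
        if (DELETE_RECORD_MAP.containsKey(code)) {
            return DELETE_RECORD_MAP.get(code);
        }
        return DEFAULT_VALUE;
    }

    // An empty (null) mapped value is treated as "delete this record".
    static boolean shouldDelete(String code) {
        String mapped = mapStatus(code);
        return mapped == null || mapped.isEmpty();
    }

    public static void main(String[] args) {
        for (String code : new String[] {"d", "n", "s", "-"}) {
            System.out.println(code + " -> delete? " + shouldDelete(code));
        }
    }
}
```

The sketch only models the decision; whether the deletion then reaches the existing Solr index is a separate step.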

-Bob

Simon Lamb

Jun 19, 2012, 10:53:47 AM
to solrma...@googlegroups.com
Okay, I think I've figured this out...

We have a local oddity in that sometimes our 001 field doesn't contain a reference or number that can be used as an ID. Therefore we decided to use our bibliographic record number (907a), which all of our records do have, as the basis of the ID.

Having just tested reverting the ID back to 001 and following the same index/delete pattern, the records have actually disappeared from Solr. So it looks like it defaults to deleting records by the 001 field, even if the ID is set to something different in the index configuration.

I'm guessing this may have stung other people in the past, so it might be a case of locally adapting the source.

Will investigate further...

Thanks
Simon



On Thursday, 14 June 2012 15:07:09 UTC+1, Simon Lamb wrote:
Hi all,

I know this post was a long while back, however I'm running into something very similar and was wondering if the issue was ever resolved (or the cause found).

I have the following configuration in my index file:-

#If the custom returnSuppressedRecordAsNull returns null, the record will be deleted...
record_status_t = customDeleteRecordIfFieldEmpty, returnSuppressedRecordAsNull

The returnSuppressedRecordAsNull is a simple custom Java method which checks our suppression fields and returns null if we want the record to be suppressed. (Note: the reason I didn't use the standard 'record_status_display = 998f, (map.suppressed_record_map), DeleteRecordIfFieldEmpty' approach is that we needed to get the first of potentially multiple 998f fields.)

I'm finding that the function works perfectly and it is stopping the indexing of the suppressed records; however, it doesn't attempt to delete any records from Solr. As we are going to use this routine for nightly updates, it's important that newly suppressed records are caught and deleted from the Solr index.

Here are the last few lines of output from the SolrMarc indexfile routine:-

INFO:  Adding 0 of 46 documents to index
14-Jun-2012 15:02:33 org.solrmarc.marc.MarcImporter handleAll
INFO:  Deleting 0 documents from index
14-Jun-2012 15:02:33 org.solrmarc.marc.MarcImporter finish
INFO: Calling commit (with optimize set to false)
14-Jun-2012 15:02:33 org.solrmarc.marc.MarcImporter finish
INFO: Done with the commit, closing Solr
14-Jun-2012 15:02:33 org.solrmarc.marc.MarcImporter f

Am I missing something (config, java source)? 

Thanks in advance,
Simon

Demian Katz

Jun 20, 2012, 10:19:15 AM
to solrma...@googlegroups.com

Maybe I’m misunderstanding something, but shouldn’t the configuration read:

record_status_t = custom, returnSuppressedRecordAsNull, DeleteRecordIfFieldEmpty

?

I have never used this feature, so maybe I’m confused, but the index configuration you shared doesn’t look quite right to me.

- Demian

--
You received this message because you are subscribed to the Google Groups "solrmarc-tech" group.
To view this discussion on the web visit https://groups.google.com/d/msg/solrmarc-tech/-/oYN6MJDJgoUJ.
To post to this group, send email to solrma...@googlegroups.com.
To unsubscribe from this group, send email to solrmarc-tec...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/solrmarc-tech?hl=en.

Robert J. Haschart

Jun 20, 2012, 11:23:29 AM
to solrma...@googlegroups.com
I think the syntax Simon is using is correct. In fact, the results he is
seeing might actually be correct, and the log messages may simply be
misleading.

This log message:
> INFO: Deleting 0 documents from index
only counts records deleted because they were listed in a .del file. It does
not count deletions triggered by data in the MARC records themselves (in a
MARC record file or a MARCXML file).

Simon, can you try to verify the actual behavior that is occurring? That is,
add a couple of records, then process a couple of records that ought to
trigger the deletion via the custom rule, and then check whether the records
are still present in the Solr index.

-Bob Haschart

Simon Lamb

Jun 20, 2012, 12:18:23 PM
to solrma...@googlegroups.com
Thanks for the replies guys.

Just to sum up my findings so far... With the following configuration:-

id = custom, getBibRecordNo
#If the custom returnSuppressedRecordAsNull returns null, the record will be deleted...
record_status_t = customDeleteRecordIfFieldEmpty, returnSuppressedRecordAsNull

My custom methods are here - https://github.com/uohull/blah-solrmarc/blob/master/src/BlahIndexer.java

I can index a group of MARC records with no issue, and the IDs of the Solr documents are the bibliographic record IDs from the MARC records (907a). However, when I run the code again with the records returning null for record_status_t, SolrMarc correctly doesn't add the records (because they are suppressed), but it doesn't delete those records from Solr.


If I change the configuration to this:-

id = 001, first 
record_status_t = customDeleteRecordIfFieldEmpty, returnSuppressedRecordAsNull

I then re-index the fresh batch of MARC records so that the Solr ID uses the 001 field. When I run the SolrMarc indexfile again with returnSuppressedRecordAsNull returning null for each of the records, it correctly doesn't add them to the index, and it also correctly deletes them from the Solr index.

Therefore I surmise that the delete is removing records from Solr based on the 001 field rather than on the id defined in the _index.properties file. So when my IDs are set to 907a, it won't delete them because they don't match the 001 field.

This is unfortunate because we do need to use the 907a field as the basis for our Solr IDs, as it is the one consistently used within our catalog. I'm guessing that somewhere the delete code is deleting a Solr document using String id = record.getControlNumber(); ?
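
The mismatch described above can be reproduced with a small in-memory sketch. It is only an illustration under stated assumptions: a HashMap stands in for the Solr index, the class name IdMismatchSketch and both identifier values are made up, and nothing here is SolrMarc or SolrJ code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a Solr index keyed by document id.
class IdMismatchSketch {
    final Map<String, String> index = new HashMap<>();

    // Index a document under the given id.
    void add(String id, String title) {
        index.put(id, title);
    }

    // Returns true only if a document with exactly this id existed.
    boolean deleteById(String id) {
        return index.remove(id) != null;
    }

    public static void main(String[] args) {
        IdMismatchSketch solr = new IdMismatchSketch();
        String controlNumber = "ocm12345"; // made-up 001 value
        String bibRecordNo = "b1000001";   // made-up 907a value

        // The document was indexed under the 907a-based id.
        solr.add(bibRecordNo, "Example title");

        // Deleting by the 001 value finds nothing, so the document stays.
        System.out.println(solr.deleteById(controlNumber)); // prints false
        // Deleting by the id that was actually indexed removes it.
        System.out.println(solr.deleteById(bibRecordNo));   // prints true
    }
}
```

A delete by a non-matching id fails silently in this model, which would fit the symptom of suppressed records being skipped but never removed.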

Again many thanks for the response,
Simon

   





Robert J. Haschart

Jun 20, 2012, 5:57:38 PM
to solrma...@googlegroups.com, Simon Lamb
That is a helpful analysis. I am on vacation in Maine this week and
probably won't be able to fix anything until I get back, but if I run out of
stuff to read I might read through the code to see whether I can see the
problem.

-Bob Haschart

Simon Lamb

Jun 20, 2012, 6:35:34 PM
to solrma...@googlegroups.com
It can certainly wait a week or more, so please enjoy your vacation :-)

Thanks again for responding,
Simon



Robert J. Haschart

Jun 21, 2012, 10:01:39 AM
to solrma...@googlegroups.com
A quick glance at the code shows that you are right. The file
MarcImporter.java contains the following method: if the indexer.map call
throws an exception indicating that the record ought to be deleted, it then
attempts to delete it by the record's Control Number (aka the 001 field
value).

This is further evidence of the need for a method to get the value assigned
for the SolrID or (as Demian suggested) to get the value for any Solr field.
The latter would be much more difficult, though, given the random order in
which the Solr field values are computed.

I'll get on that as soon as I return so that it can be included in a release
before the beginning of July.

-Bob Haschart


private boolean addToIndex(Record record) throws IOException
{
    try {
        Map<String, Object> fieldsMap = indexer.map(record, errors);
        String docStr = addToIndex(fieldsMap);

        if (verbose || justIndexDontAdd)
        {
            if (verbose)
            {
                System.out.println(record.toString());
                logger.info(record.toString());
            }
            System.out.println(docStr);
            logger.info(docStr);
        }
        return(true);
    }
    catch (SolrMarcIndexerException e)
    {
        if (e.getLevel() == SolrMarcIndexerException.IGNORE)
        {
            throw(e);
        }
        else if (e.getLevel() == SolrMarcIndexerException.DELETE)
        {
            String id = record.getControlNumber(); // <--- get the control number
            if (id != null)
            {
                solrProxy.delete(id, true, true); // <-- and delete a record based on it. FAIL.
            }
            throw(e);
        }
        else if (e.getLevel() == SolrMarcIndexerException.EXIT)
        {
            throw(e);
        }
    }
    return(true);
}


Demian Katz

Jun 21, 2012, 10:04:14 AM
to solrma...@googlegroups.com
Thanks, Bob -- sounds like a plan.

Out of curiosity, what determines the order in which the field values are computed? Is it not related to the order of the defined properties?

Anyway, don't go out of your way to answer that question -- it's just idle curiosity. Enjoy your vacation for now!

- Demian


Robert J. Haschart

Jun 21, 2012, 10:30:42 AM
to solrma...@googlegroups.com
Since they are properties, and are traversed by an iterator over the set of
properties, the order is determined by the Java iterator code, the
documentation of which states:

public Iterator iterator()
Returns an iterator over the elements in this set. The elements are returned
in no particular order (unless this set is an instance of some class that
provides a guarantee).

So imposing an overall ordering may be difficult, given that the
underlying mechanisms discard the ordering. Instead, it may be better to
implement a cache for values: if one Solr value relies on another that
hasn't been evaluated yet, compute that value as needed and cache the
result for later. This does introduce the possibility of dependency
loops which (if not caught) could lead to infinite recursion.
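
The caching idea above could look roughly like the following sketch. It is not taken from SolrMarc: the class LazyFieldCache, its methods, and the field names in main are all hypothetical, and each field rule is modelled simply as a function that may ask the cache for other fields. A set of fields currently being computed is used to turn a dependency loop into an exception instead of infinite recursion.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: compute field values on demand, cache them,
// and detect dependency loops between fields.
class LazyFieldCache {
    private final Map<String, Function<LazyFieldCache, String>> rules = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();
    private final Set<String> inProgress = new HashSet<>();

    // Register the rule that computes a field; the rule may call get() for other fields.
    void define(String field, Function<LazyFieldCache, String> rule) {
        rules.put(field, rule);
    }

    // Return the cached value, computing it first if needed.
    String get(String field) {
        if (cache.containsKey(field)) {
            return cache.get(field);
        }
        Function<LazyFieldCache, String> rule = rules.get(field);
        if (rule == null) {
            throw new IllegalArgumentException("No rule for field: " + field);
        }
        // add() returns false if the field is already being computed: a loop.
        if (!inProgress.add(field)) {
            throw new IllegalStateException("Dependency loop at field: " + field);
        }
        try {
            String value = rule.apply(this);
            cache.put(field, value);
            return value;
        } finally {
            inProgress.remove(field);
        }
    }

    public static void main(String[] args) {
        LazyFieldCache fields = new LazyFieldCache();
        // Definition order does not matter: "record_key" pulls in "id" when needed.
        fields.define("record_key", f -> "rec-" + f.get("id"));
        fields.define("id", f -> "b1000001");
        System.out.println(fields.get("record_key")); // prints rec-b1000001
    }
}
```

With this shape, the order in which the properties happen to be iterated stops mattering, at the cost of having to guard against loops.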

-Bob Haschart



Demian Katz

Jun 21, 2012, 11:10:31 AM
to solrma...@googlegroups.com
Sounds like a classical computer science problem of some sort or other -- hopefully it will prove fun to solve. :-)

Jonathan Rochkind

Jun 21, 2012, 11:25:37 AM
to solrma...@googlegroups.com
If ordering ought to matter, can't you just store these in an ordered
data structure instead of a Java Set? Or is that a hard-to-change
consequence of using the standard Java properties parser?

Just storing them in an ordered data structure seems preferable to storing
them in a set and then adding workarounds on top, with caching and attempts
to resolve inter-property dependencies, if simply storing them in order
would work instead.
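
For illustration, here is a minimal sketch of the ordered alternative. java.util.Properties extends Hashtable, which makes no promise about iteration order, whereas a LinkedHashMap iterates in insertion order. The class OrderedPropsSketch and its parse method are hypothetical, and the parser is deliberately simplified: it handles only "key = value" lines, blank lines, and # comments, not the full .properties syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: keep index specifications in the order they were written.
class OrderedPropsSketch {
    // Simplified parser; not a replacement for java.util.Properties.load().
    static Map<String, String> parse(String text) {
        Map<String, String> ordered = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                continue; // skip lines without a key/value separator
            }
            String key = trimmed.substring(0, eq).trim();
            String value = trimmed.substring(eq + 1).trim();
            ordered.put(key, value);
        }
        return ordered;
    }

    public static void main(String[] args) {
        String config = "id = 001, first\n"
                + "# delete suppressed records\n"
                + "record_status_t = customDeleteRecordIfFieldEmpty, returnSuppressedRecordAsNull\n";
        // Entries come back in file order: id first, then record_status_t.
        for (Map.Entry<String, String> e : parse(config).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Whether SolrMarc could swap its property handling for something like this is exactly the open question raised above; the sketch only shows that the ordered structure itself is cheap.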

Simon Spero

Jun 21, 2012, 12:58:24 PM
to solrma...@googlegroups.com

I don't think this is an ordering issue per se; the problem is the use of getControlNumber() as the record ID (this is the only place where it is used for something other than error messages).

The way that record deletion is handled is problematic. An exception is used to short-circuit the construction of the index map if the record is going to be deleted. Throwing the exception discards the index map that was being constructed. If the thrown exception were to include the map constructed prior to that point, it would be easy to implement an order-related bug :-)

It is straightforward to add a method to SolrIndexer.java to find the value that will be added to the index "id":

    public String getIdFromRecord(Record record) {
        String[] fieldVal = fieldMap.get("id");
        if (fieldVal == null) {
            return null;
        }
        String indexParm = fieldVal[2];
        String mapName = fieldVal[3];

        String recordId = getFirstFieldVal(record, mapName, indexParm);
        return recordId;
    }

The line in the exception handler in MarcImporter.java then changes from:

    String id = record.getControlNumber();

To:

    String id = indexer.getIdFromRecord(record);


Observation:

The 001 field is required, and is supposed to be unique within a given organisation (in the 003 field).  So, if the records are scoped to a single org, the lack of a valid 001 field is a sign of bad MARC data. Is there any other kind?
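
As a small illustration of that scoping, a sketch that combines the two values into one key and treats a missing or blank 001 as bad data. The class and method names are made up, and the parenthesised prefix simply borrows the style used for system control numbers in 035 fields:

```java
class ControlNumberSketch {

    /**
     * Combines the 003 (organisation) and 001 (control number) values into one key.
     * Returns null when the 001 is missing or blank, i.e. the record is unusable.
     */
    static String scopedId(String field003, String field001) {
        if (field001 == null || field001.trim().isEmpty()) {
            return null;
        }
        String id = field001.trim();
        if (field003 == null || field003.trim().isEmpty()) {
            return id; // single-organisation data: the 001 alone is the key
        }
        return "(" + field003.trim() + ")" + id;
    }

    public static void main(String[] args) {
        System.out.println(scopedId("OCoLC", "12345"));
        System.out.println(scopedId(null, "12345"));
        System.out.println(scopedId("OCoLC", ""));
    }
}
```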

Simon

Robert Haschart

Jul 5, 2012, 5:11:48 PM7/5/12
to solrma...@googlegroups.com, Simon Lamb
Simon,

I have implemented a solution that should fix this problem. The required code is checked into the SolrMarc Google Code repository.
I plan to create an official release in the next day or so containing this new code, as well as several other unrelated changes.
So you can either build from an SVN checkout or grab the new release in a day or so.

-Bob Haschart

--
You received this message because you are subscribed to the Google Groups "solrmarc-tech" group.
To view this discussion on the web visit https://groups.google.com/d/msg/solrmarc-tech/-/oK-nu4ui0y4J.
To post to this group, send email to solrma...@googlegroups.com.
To unsubscribe from this group, send email to solrmarc-tec...@googlegroups.com.

For more options, visit this group at http://groups.google.com/group/solrmarc-tech?hl=en.



Simon Lamb

Jul 5, 2012, 5:41:17 PM7/5/12
to solrma...@googlegroups.com
Bob, Simon et al. 

Apologies, I've been busy re-skinning our Blacklight instance over the last few days, so had taken my eye off this thread. Thank you for all your contributions, and indeed for coming up with a fix; this will help us at Hull greatly with our deployment of SolrMarc and Blacklight.

I'll be returning to this area of work in a few days, so I'll grab the new release when it arrives and hit it with some test data.    

Again, many thanks for looking at this Bob.

Simon

Simon Lamb

Jul 27, 2012, 10:25:01 AM7/27/12
to solrma...@googlegroups.com
Just to follow up (better late than never!). This week I've managed to test the latest version of SolrMarc and all seems to be great: delete now works perfectly with configurations that use non-standard id fields.

Thank you for the fix.

Simon  