Using copyfield for sorting


Chris Burns

Dec 20, 2017, 9:55:35 AM
to islandora
Hi all,

I thought I had finally figured out how to create a copyfield for sorting. However, a month after what seemed like a successful implementation, we are now getting warnings like the following from Solr:

Field mods_originInfo_dateCreated_sort_s is not multivalued and destination for multiple copyFields (14)

I'm not sure what to do about this. I pulled from _s and _dt fields, which I thought were single valued, but perhaps pulling from so many single valued fields to one copyfield (although it shouldn't be pulling more than one per record) is now triggering this warning. Can we ignore the warning?

Here are the field definition and the copyField rules that triggered that warning.

<field name="mods_originInfo_dateCreated_sort_s" type="string" multiValued="false" indexed="true" stored="true" />
   <copyField source="mods_originInfo_keyDate_yes_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_dateIssued_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_relatedItem_original_originInfo_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_keyDate_yes_qualifier_approximate_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_keyDate_yes_point_start_qualifier_inferred_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_keyDate_yes_point_start_qualifier_questionable_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_keyDate_yes_qualifier_questionable_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_keyDate_yes_qualifier_inferred_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_keyDate_yes_dateIssued_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_relatedItem_host_originInfo_dateIssued_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_encoding_w3cdtf_keyDate_yes_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_encoding_w3cdtf_keyDate_yes_qualifier_inferred_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />
   <copyField source="mods_originInfo_encoding_w3cdtf_keyDate_yes_point_start_qualifier_inferred_dateCreated_s" dest="mods_originInfo_dateCreated_sort_s" />

Has anyone else run into this? I came upon this solution from other Islandora threads, but I'd also welcome any alternative solutions that handle this problem more efficiently.

Thanks.

Chris


dp...@metro.org

Dec 22, 2017, 9:43:52 AM
to islandora
Hi Chris,

I don't think what you are doing is actually possible.
If you copy multiple fields into a single one and more than one of the source fields has a value, the destination needs to be multivalued, which is why Solr is complaining.
Probably one of two things was true before. Either mods_originInfo_dateCreated_sort_s was at some point (implicitly) multivalued, meaning you had not declared it single valued and, by default, Solr 4.x and below makes a field multivalued when it gets multiple values for it, even an _s field; or only one of the source fields had a value. Either would explain why it was not complaining.
In this case mods_originInfo_dateCreated_sort_s is getting data from multiple sources, so even if what you want were possible, it would make little sense for sorting.
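
To make the trade-off concrete, here is a rough sketch of what "making the destination multivalued" would look like. The field name mods_originInfo_dateCreated_sort_ms is hypothetical (it is not in the original schema), and only two of the source fields are shown. This is an illustration of the point above, not a recommendation: as far as I know, older Solr versions refuse to sort on a multivalued field, so this would quiet the warning without giving you a usable sort key.

```xml
<!-- Hypothetical multivalued destination: accepts a value from every source field.
     Assumption: the field name below is invented for this sketch. -->
<field name="mods_originInfo_dateCreated_sort_ms" type="string" multiValued="true" indexed="true" stored="true" />
<!-- Each copyField adds one more value to the destination when its source is populated. -->
<copyField source="mods_originInfo_dateCreated_s" dest="mods_originInfo_dateCreated_sort_ms" />
<copyField source="mods_originInfo_dateIssued_s" dest="mods_originInfo_dateCreated_sort_ms" />
```

With the single-valued declaration, the setup only works as long as each record populates at most one of the source fields; a record with two populated sources would try to put two values into a field that allows one.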

Some ideas:

1. Do it on the Solr side with an update processor chain.
I found this example, which clones two source fields into another field and then concatenates the values into a single one. You can research how to bring it closer to your needs.
<updateRequestProcessorChain name="composite-position">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">lat</str>
    <str name="source">lng</str>
    <str name="dest">store</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">store</str>
    <str name="delimiter">;</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
2. Do it on the fedoragsearch side.
Actually, I like this approach better. Keep track of all the dates inside your MODS slurp-all XSLT file and do the logic there: discern, count, manipulate, etc. (you have a lot of power there). Finally, output just a single _dt date directly from the XSLT into one field, e.g. mods_dateCreated_iwanttosort_dt, that you know will contain only the single best candidate, hopefully formatted the way you want.

3. Do it in PHP.
Add an alter hook to any MODS ingest that calculates, in PHP code, the best possible date for sorting (among all dates) and adds a single, always single, date to an XML element of your own invention, something like a MODS extension.
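
For the update processor chain idea, a sketch closer to the date case might look like the following. It is untested and rests on several assumptions: the chain name date-sort is invented, only three of the fourteen source fields are listed, the copyField rules for the destination would have to be removed, and the chain would still need to be wired to the update handler that fedoragsearch posts to. It swaps the concatenation step for solr.FirstFieldValueUpdateProcessorFactory, which keeps only the first value of a field. Which value ends up first when several sources are populated is something to verify on your own data.

```xml
<!-- Sketch only: the chain name and the choice of source fields are assumptions. -->
<updateRequestProcessorChain name="date-sort">
  <!-- Copy whatever values exist in the listed source fields into the sort field. -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">mods_originInfo_keyDate_yes_dateCreated_s</str>
    <str name="source">mods_originInfo_dateCreated_s</str>
    <str name="source">mods_originInfo_dateIssued_s</str>
    <str name="dest">mods_originInfo_dateCreated_sort_s</str>
  </processor>
  <!-- Drop all but the first value so the field stays single valued. -->
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">mods_originInfo_dateCreated_sort_s</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```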
Best,

Diego Pino
Metro

Chris Burns

Dec 22, 2017, 10:32:49 AM
to islandora
Hi Diego,

Thanks for all of this.

I stumbled onto creating a copyfield for sorting dates here - https://groups.google.com/forum/#!searchin/islandora/sort%7Csort:date/islandora/eBhYg7A9TeE/Mr1wZMRsAwAJ.

Digging a little deeper, it looks like Barnard (perhaps with the assistance of DiscoveryGarden) may have ended up using a different approach, similar to what you outlined in #2. If I'm understanding this approach correctly, it looks like it will grab all the originInfo/dateCreated variations and stick them in a new single valued field. We're going to test that first and if it works we will have to figure out how to also grab originInfo/dateIssued variations.

https://github.com/discoverygarden/barnard-basic-solr-config

schema.xml
<!-- Custom sorting field for Barnard -->
<field name="mods_originInfo_dateCreated_sort" type="date" indexed="true" stored="true" multiValued="false" sortMissingLast="true"/>
</fields>

slurp_all_MODS_to_solr.xslt
<!-- Custom Barnard date sorting field -->
<xsl:template match="mods:originInfo/mods:dateCreated" mode="barnard_slurping_MODS">
  <xsl:variable name="textValue">
    <xsl:call-template name="get_ISO8601_date">
      <xsl:with-param name="date" select="normalize-space(text())"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:if test="not(normalize-space($textValue)='')">
    <xsl:variable name="field_name_sort">mods_originInfo_dateCreated_sort</xsl:variable>
    <xsl:if test="java:add($single_valued_hashset, $field_name_sort)">
      <field>
        <xsl:attribute name="name">
          <xsl:value-of select="$field_name_sort"/>
        </xsl:attribute>
        <xsl:value-of select="$textValue"/>
      </field>
    </xsl:if>
  </xsl:if>
</xsl:template>
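
For the dateIssued variations, one untested possibility is to widen the match pattern of that same template so it also fires on mods:originInfo/mods:dateIssued. This is my own guess at an extension, not something from the Barnard repository. It assumes the template replaces the one above rather than sitting alongside it, and it relies on get_ISO8601_date, $single_valued_hashset and the java namespace prefix already being defined in the surrounding stylesheet. Because adding to a Java set returns true only the first time a given name is added, the first matching date that yields a non-empty value would win, and any later dateCreated or dateIssued in the same record would be skipped.

```xml
<!-- Sketch: same body as the template above, with dateIssued added to the match pattern. -->
<xsl:template match="mods:originInfo/mods:dateCreated | mods:originInfo/mods:dateIssued" mode="barnard_slurping_MODS">
  <xsl:variable name="textValue">
    <xsl:call-template name="get_ISO8601_date">
      <xsl:with-param name="date" select="normalize-space(text())"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:if test="not(normalize-space($textValue)='')">
    <xsl:variable name="field_name_sort">mods_originInfo_dateCreated_sort</xsl:variable>
    <!-- The add succeeds only once per record, so the field is emitted at most once. -->
    <xsl:if test="java:add($single_valued_hashset, $field_name_sort)">
      <field>
        <xsl:attribute name="name">
          <xsl:value-of select="$field_name_sort"/>
        </xsl:attribute>
        <xsl:value-of select="$textValue"/>
      </field>
    </xsl:if>
  </xsl:if>
</xsl:template>
```

If dateCreated should always be preferred over dateIssued regardless of where each appears in the MODS record, this pattern alone would not guarantee it, and the selection logic would need to be more explicit.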

Chris