Documentation question: list of indexed fields for Islandora Solr settings and metadata display

Robin Dean

Apr 29, 2015, 7:52:21 PM
to isla...@googlegroups.com

Hi all,

 

The Islandora documentation wiki has an out-of-date Appendix that lists terms for all the indexed fields in Solr to help with Solr configuration:

https://wiki.duraspace.org/display/ISLANDORA714/APPENDIX+D+-+SOLR+SCHEMA+%28SEARCH%29+Term+Reference

 

Repository administrators need to be able to enter Solr field terms in the interface in order to configure Solr Settings (admin/islandora/search/islandora_solr/settings) and Metadata Display (admin/islandora/search/islandora_solr/metadata).

 

Those pages in Drupal have “add another item” fields that suggest terms after a few characters are typed, but there is no list of all possible terms and their meanings. We would like to add documentation to the wiki to help repository administrators know what the terms mean and how they should be used when configuring search results and metadata displays.

 

Can anyone help the Documentation Interest Group with answers to the following questions:

 

1.      How are the Solr field terms being generated?

2.      Is there a way to generate or export a list of all possible Solr field terms from an Islandora installation?

3.      How might the Solr field terms change if you edit or customize certain Solr files?

4.      What do the suffixes on the terms mean? (_ms, _mt, _mdt, etc)

 

Also, can the committers confirm that the 7.x-1.5 release is using https://github.com/discoverygarden/basic-solr-config? Are the Solr field values being generated from the islandora_transforms somehow?

 

Thanks!

 

Robin

 

Jared Whiklo

Apr 29, 2015, 8:18:22 PM
to isla...@googlegroups.com
Hi Robin,

1) The field names are generated by the gsearch configuration.

2) You could generate a list based on a default installation. Not sure
that it would be very easy (read: non-manual process).

3) You can change everything about the fields being fed to Solr by
customizing your GSearch configuration.

4) Generally I understand those suffixes to mean
_ms : multi-value string
_mt : multi-value text
_mdt : multi-value date
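
A rough sketch of how those suffixes could be decoded, assuming the pattern is an optional "m" (multi-value) followed by a type code. The function name and the table of type codes are made up for illustration and only cover the suffixes mentioned in this thread; the real list depends on the dynamic fields in your schema.xml.

```python
import re

# Assumed type codes, limited to the suffixes mentioned in this thread.
TYPE_CODES = {"s": "string", "t": "text", "dt": "date"}


def describe_suffix(field_name):
    """Return (multi_valued, type_name), or None if the suffix is not recognized."""
    # An underscore, an optional "m" for multi-value, then the type code at the end.
    match = re.search(r"_(m?)(dt|s|t)$", field_name)
    if match is None:
        return None
    return (match.group(1) == "m", TYPE_CODES[match.group(2)])


for name in ("mods_abstract_ms", "mods_abstract_s", "dc.description"):
    print(name, describe_suffix(name))
```

Fields with no recognized suffix (for example dc.description) come back as None rather than being guessed at.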

To see all the possible values you have to know what data is in each
object.

For example, the discoverygarden/basic-solr-config MODS_to_solr.xslt
uses a simple naming pattern:

mods_<element localname>_<suffix>

So this XSLT

https://github.com/discoverygarden/basic-solr-config/blob/modular/islandora_transforms/MODS_to_solr.xslt#L81-L88

says to grab every non-empty mods:abstract element. Then it generates
a solr field with a name equal to $prefix + local-name() + $suffix.

$prefix and $suffix are defined near the top as "mods_" and "_ms"

https://github.com/discoverygarden/basic-solr-config/blob/modular/islandora_transforms/MODS_to_solr.xslt#L23-L24

local-name returns the element name without any namespace prefixes, so
in this case "abstract".

So this generates a Solr field "mods_abstract_ms" with the text of the
abstract element.
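
The same naming rule can be imitated outside GSearch to see what it produces. This is only an illustrative Python sketch of the prefix + local-name() + suffix idea, not the XSLT itself: the function names and the sample record are invented, and it only looks at mods:abstract elements directly under the root.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"


def local_name(tag):
    # ElementTree writes namespaced tags as "{uri}name"; keep only the name part.
    return tag.rsplit("}", 1)[-1]


def mods_to_fields(mods_xml, prefix="mods_", suffix="_ms"):
    """Map each non-empty mods:abstract to a field named prefix + local name + suffix."""
    root = ET.fromstring(mods_xml)
    fields = {}
    for element in root.findall(f"{{{MODS_NS}}}abstract"):
        text = (element.text or "").strip()
        if text:  # skip empty or whitespace-only elements
            name = prefix + local_name(element.tag) + suffix
            fields.setdefault(name, []).append(text)
    return fields


# Invented sample record: one real abstract and one empty one.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <abstract>A study of feldspar.</abstract>
  <abstract>   </abstract>
</mods>"""

print(mods_to_fields(SAMPLE))
```

Because the suffix is multi-valued, the values are collected in a list under the one field name.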

I hope that helps.

cheers,
jared

--
Jared Whiklo
jwh...@gmail.com

Robin Dean

Apr 29, 2015, 8:30:32 PM
to isla...@googlegroups.com
Hi Jared,

Thanks, this does help!

If I understand your answers correctly, in order to generate a list of all the possible fields, one would have to look through all the XSLTs for the fields being fed to Solr, and it would also vary based on what metadata existed in each Islandora instance.

That makes me think that the Documentation group should take a different approach to this Appendix. Instead of a list of terms, we should document the methods and patterns used to generate the terms, give a few examples of commonly used terms or patterns, and link to the XSLTs included with the Islandora release.

Thoughts on this?

Best,
Robin

Bridger Dyson-Smith

Apr 29, 2015, 9:22:05 PM
to isla...@googlegroups.com
Hi Robin and Jared,

What I've done is visit the Solr admin interface and run queries there. For example, if I want to know all of the fields that start with mods_*, then I'd run the following (as a URL - I'll try to get a breakdown of the admin interface later):

http://localhost:8080/solr/collection1/select?q="PID:42"&fl=mods_name_*&df=PID&wt=xml&indent=true

This should return an XML document of all fields that start with mods_name in the PID:42 Solr doc. If you wanted to focus on the _ms fields, you could run the following:

http://localhost:8080/solr/collection1/select?q="PID:42"&fl=mods_*_ms&df=PID&wt=xml&indent=true

Granted, these are manual and they only pull fields for a single PID. I'm not sure, off the cuff, how you would structure the URL to include multiple PIDs, but I'm sure it's possible. Also, you'll want to use the URL of your Solr instance (and a real PID, too). :)

You can probably suss out how to use the text inputs in the Solr admin interface based on these examples, but quickly:
'q' = your query
'fq' = filter query; not used here
'sort' = if I'm returning multiple docs, I'll usually sort by 'PID asc' ($field-name + asc|desc (ascending or descending, respectively))
'start, rows' = paging; 'start' is the offset of the first doc returned and 'rows' is the number of docs you'd like back from Solr. If you're querying for a single PID then don't worry about these. Otherwise, use 'rows' at your own risk. :)
'fl' = "field list" are the fields you want returned for the query; e.g. as above when you want to see mods_name_namePart_*_ms for PID:231. You can put in multiple values, separated by commas; e.g. mods_name_namePart_*_ms,mods_subject_term_*_s,dc.description.
'df' = default field to be queried; i.e. if your q above is a PID, use PID here. If you're looking for the mods_*_ms fields for a document with 'feldspar' in a mods_abstract, you might try mods_abstract_s.

I've not incorporated the 'Raw Query Parameters' option into my limited testing, so I'm not sure what to do with this.

'wt' = how would you like your response returned? As XML? CSV? There are a number of choices.

I leave the radio buttons for dismax, edismax, hl, facet, spatial, and spellcheck alone.
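
For anyone who would rather script these queries than type them into the admin form, here is a minimal sketch that builds the same kind of select URL, including one possible way to ask for several PIDs at once by OR-ing quoted values inside PID:( ... ). The function name, the base URL and the PIDs are placeholders, and it puts the field name in the query itself instead of relying on 'df'.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, pids, field_list, wt="xml"):
    """Build a /select URL that returns the given fields for the given PIDs."""
    # Quote each PID so the colon inside it is not read as field:value syntax.
    quoted = " OR ".join('"%s"' % pid for pid in pids)
    params = {
        "q": "PID:(%s)" % quoted,
        "fl": ",".join(field_list),  # comma-separated field list; globs are allowed
        "rows": len(pids),           # ask for one row per requested PID
        "wt": wt,
        "indent": "true",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Placeholder Solr URL and PIDs; substitute your own.
example_url = solr_select_url(
    "http://localhost:8080/solr/collection1",
    ["islandora:1", "islandora:2"],
    ["PID", "mods_*_ms"],
)
print(example_url)
```

urlencode takes care of escaping the quotes, spaces and asterisks, so the URL can be pasted into a browser or passed to curl as is.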

---
All of this has been helpful for me in addition to reviewing the MODS stylesheets. One caveat, that may or may not apply to your case, is that the MODS_to_Solr.xslt has a note that it is deprecated in favor of the slurp_all_MODS_to_solr stylesheet. YMMV. 

I hope this is somewhat helpful.
Best,
Bridger

Bridger Dyson-Smith

Apr 29, 2015, 9:47:54 PM
to isla...@googlegroups.com

And of course I overlook the 'documentation' in the title. Apologies.

Maybe having a Solr query section in the documentation would be helpful?

Sorry for the noise.
Best,
Bridger

Nick Ruest

Apr 29, 2015, 10:33:44 PM
to isla...@googlegroups.com
> That makes me think that the Documentation group should take a
> different approach to this Appendix. Instead of a list of terms, we
> should document the methods and patterns used to generate the terms,
> give a few examples of commonly used terms or patterns, and link to the
> XSLTs included with the Islandora release.

+1

My vote would be to remove the current index. Solr fields are wholly
dependent on the repository implementation, and said repository's metadata.

I think what Jared outlined would be very helpful :-)

-nruest

Daniel Aitken

Apr 30, 2015, 8:44:40 AM
to isla...@googlegroups.com
+1 for removing the appendix. Since it's https://github.com/discoverygarden/basic-solr-config -centric, a better set of documentation could be created for that repository with an overview specific to that Solr config. Other than that, something to that effect (here's how the XSLTs work! Here's what Solr fields look like! Good luck!) might be helpful.

Some other tips:
  • If you go to yoursite.com:8080/solr/admin (replacing 8080 with whatever port Tomcat/Jetty/whatever is serving Solr on), you'll be provided with a Solr admin interface, including a schema browser. In Solr 3.x, the schema browser is at solr/admin/schema.jsp, and in 4.x it's at solr/#/whatever-the-collection-name-is/schema. In 3.x it's horribly themed and difficult to navigate, and in 4.x it's AJAX-y, difficult to navigate, and breaks on field names with forward slashes in them (really? you don't escape those characters in the callback?), which means you often can't browse RDF fields. Thanks Solr!
  • Because I'm a lazy turd (quote-unquote 'streamlining my job'), I made https://github.com/qadan/islandora_solr_devel, which when enabled adds a menu option to objects' Manage tabs (at the menu path islandora/object/PID/manage/solr_fields). It just segments all the Solr fields for the object into vertical tabs based on the underscored- or dotted-prefix, and then throws the field names and values into a table for easy viewing. That might be useful if you don't have access to the Solr port from your browser, but it should ONLY be used for development and should NOT be enabled on production boxes.
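
To give an idea of the grouping described above, here is a small Python sketch that buckets field names by the part before the first underscore or dot. It is not code from islandora_solr_devel, just an imitation of the idea; the function name and the sample document are invented.

```python
import re
from collections import defaultdict


def group_by_prefix(solr_doc):
    """Bucket a Solr doc's fields by the text before the first '_' or '.'."""
    groups = defaultdict(dict)
    for name, value in solr_doc.items():
        prefix = re.split(r"[_.]", name, maxsplit=1)[0]
        groups[prefix][name] = value
    return dict(groups)


# Invented sample document.
sample_doc = {
    "PID": "islandora:1",
    "mods_abstract_ms": ["A study of feldspar."],
    "mods_titleInfo_title_ms": ["Feldspar"],
    "dc.description": ["A study of feldspar."],
}

for prefix, fields in group_by_prefix(sample_doc).items():
    print(prefix, sorted(fields))
```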
Hope something there helps!

- QA Dan

Jared Whiklo

Apr 30, 2015, 9:23:02 AM
to isla...@googlegroups.com
+1
I like this, it may have the side benefit of helping people to
understand how these gsearch XSLTs work and allow them to get a
starting point for doing any customizations of their own.

Perhaps there could be a small section of tools/tips for the things
that Bridger and Daniel both mentioned to help understand your Solr
records.

cheers,
jared

--
Jared Whiklo
jwh...@gmail.com

Diego Pino

Apr 30, 2015, 9:42:44 AM
to isla...@googlegroups.com
Also, for reference, users can check which fields are available in their Solr index, along with histograms, values, etc. (for a particular core/collection).


If you want only the schema defined ones, then 

Best

Diego
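
One possible way to pull a field list programmatically, assuming the core exposes Solr's Luke request handler at admin/luke (an assumption about the local Solr setup, not something confirmed in this thread). The function names, the base URL and the sample response are made up; the sketch only builds the request URL and reads the field names out of a JSON response shaped like the sample.

```python
import json
from urllib.parse import urlencode


def luke_url(core_url, schema_only=False):
    """Build a Luke handler URL for indexed fields, or schema-defined fields only."""
    params = {"wt": "json"}
    if schema_only:
        params["show"] = "schema"  # only what the schema declares
    else:
        params["numTerms"] = 0     # skip the per-field top-terms lists
    return core_url.rstrip("/") + "/admin/luke?" + urlencode(params)


def field_names(luke_response):
    """Return the sorted field names from a Luke JSON response body."""
    body = json.loads(luke_response)
    # Assumes fields sit under "fields", or under "schema" -> "fields" for show=schema.
    fields = body.get("fields") or body.get("schema", {}).get("fields", {})
    return sorted(fields)


# Made-up response body, trimmed to the part this sketch reads.
sample_response = '{"fields": {"PID": {"type": "string"}, "mods_abstract_ms": {"type": "string"}}}'

print(luke_url("http://localhost:8080/solr/collection1"))
print(field_names(sample_response))
```

The list reflects whatever has actually been indexed in that core, so it will differ from one repository to the next.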

Mark Jordan

Apr 30, 2015, 9:47:55 AM
to isla...@googlegroups.com
+1 from here too. If https://github.com/discoverygarden/basic-solr-config is required, or even recommended, for a starter Islandora install, that should be clear in the documentation. If it's not, that should also be clear, and the implications of not using it should also be clear.

Mark


Robin Dean

Apr 30, 2015, 1:42:34 PM
to isla...@googlegroups.com

I would like to echo Mark’s question at this point about whether basic-solr-config is assumed to be part of a default recommended Islandora installation.

 

1.      Is it included in the sandbox and the Islandora VM?

2.      Is it included in the official Islandora releases?

3.      If it isn’t included in any of the “default” installation instructions for the releases, what files are being used instead?

 

The 7.x-1.5 documentation currently says “The xslt and solrschema.xml documents that come packaged with GSearch should be used in configuration. These files are designed to work with our solution packs.”

 

Can anyone confirm that all the MODS fields created by the default Islandora solution pack metadata forms will be indexed if you only install the default files in these directories:

 

/usr/local/fedora/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal

/usr/local/fedora/solr/conf/schema.xml

 

Thanks to everyone who shared their Solr troubleshooting expertise! I will work on turning Appendix D into a guide to configuring Solr locally in Islandora.

 

Best,

Robin

 


Nick Ruest

May 1, 2015, 11:16:51 AM
to isla...@googlegroups.com
> basic-solr-config is assumed to be part of a default recommended
> Islandora installation.

This is the tough question we've never really come to an agreement on,
community-wise.

My feelings are that every repository is different w/r/t metadata
requirements (MODS, PBCore, DarwinCore, etc.), indexing, and search. So,
it is difficult to come up with a generic solution. But, what dgi
provides in their repo is an excellent starting point, or could just
work for lots of folks.

1. It is now a part of the release VM. Not sure about the sandbox;
Melissa might know.

2. No.

3. Those are the files created during a baseline GSearch deployment.

> Can anyone confirm that all the MODS fields created by the default
> Islandora solution pack metadata forms will be indexed if you only
> install the default files in these directories:

I would assume no, because there are no xslts to do the actual
transformation to create a solrdoc to push to Solr.

-nruest


Jaime Pinto

May 1, 2015, 12:34:36 PM
to isla...@googlegroups.com
I'd like to go a step further:

On the Enterprise installation doc [1] I'm still using GSearch 2.6; however, in the Release Notes and Downloads we're already listing support for GSearch 2.6.2, HEAD.

I could use a hand with that page [1], for example with the link for downloading GSearch 2.6.2, as well as with a section on upgrading from GSearch 2.6 to GSearch 2.6.2 in the Migration/Upgrade area [2].

Everything Solr and GSearch based is always obscure, in particular the Solr index and Customization. I find it hard to get a sense of cause and effect from the perspective of the person doing the installation.

Thanks
Jaime

[1] https://wiki.duraspace.org/display/ISLANDORA715/milestone+6+-+Installing+Solr+and+GSearch

[2] https://wiki.duraspace.org/pages/viewpage.action?pageId=68063509
