Group facets? Highlight searched terms? Multilingual facets?


Irene Vagionakis

Aug 8, 2020, 7:44:03 AM
to EFES users
Dear all,
does anyone know if there is a way to:
- group the facets in the search page in separate sections? (e.g. a group of facets pertaining to material aspects of the inscriptions, another one for the elements mentioned in the inscriptions etc.)
- highlight the searched terms in the results pages? (e.g. displaying in red the searched terms in the single inscription pages after opening them from the results list)

Also, does anyone know how to make the facets multilingual? I have seen that in IOSPE this was done by creating several facet_query.xml files, one per language (e.g. facet_query.en.xml, facet_query.ru.xml), in each of which the field names end with '-' and the language code (e.g. <facet.field>location-en</facet.field>), but I don't know what else has to be changed to make them work.

Thanks in advance for any suggestion!
Best,
Irene

Jamie Norrish

Aug 8, 2020, 11:28:09 PM
to efes-...@googlegroups.com
On Sat, 2020-08-08 at 04:44 -0700, Irene Vagionakis wrote:

> - group the facets in the search page in separate sections?

Modify the search.xml template to include the sections and explicitly
apply templates to each facet in the place and order that you wish,
rather than using the current apply-templates to all facets.

> - highlight the searched terms in the results pages?

You need to pass the searched terms from the search results link to the
inscription page in the querystring, and then either modify the
map:match for an inscription to read this information and add
appropriate markup, or use some JavaScript to handle the highlighting
on the client end. I would probably go the latter route.
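
For the client-side route, the first step is reading the searched terms back out of the querystring. A minimal sketch, assuming the results link appends the search as a parameter named "q" (the parameter name and the helper extractSearchTerms are illustrative, not part of EFES):

```javascript
// Hypothetical helper: return the individual search terms found in a
// querystring such as "?q=%22foo+bar%22". The parameter name "q" is an
// assumption; use whatever name the results link actually passes.
function extractSearchTerms(search, param = "q") {
  const raw = new URLSearchParams(search).get(param);
  if (!raw) {
    return []; // parameter missing or empty
  }
  return raw
    .replace(/["*]/g, "") // drop phrase quotes and wildcards
    .split(/\s+/) // split on whitespace
    .filter((term) => term.length > 0);
}
```

In the browser this would be called as extractSearchTerms(window.location.search), and each returned term handed to whatever highlighting routine is used.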

> Also, does anyone know how to make the facets multilingual?

Do you mean the facet names, or the values? For the former, you just
need to supply a translation as for any other bit of translated text.
The i18n keys are of the form "facet-<name of facet>".

For the facet values, if you are storing references to authority files
that are harvested into RDF with language information, this will happen
automatically if the facet name is listed in rdf-facet-lookup-fields in
config.xmap.

Jamie

Irene Vagionakis

Aug 11, 2020, 11:56:40 AM
to EFES users
Thank you so much for your help, your suggestions have been extremely useful!

As for the facet values, in the two tei-to-solr.xsl files I have some templates like the following, pointing to the relevant element of the authority list item (in this case a <place> containing a <placeName xml:lang="en"> and a <placeName xml:lang="it">):

  <xsl:template match="tei:repository" mode="facet_current_location">
    <xsl:variable name="repo-id" select="substring-after(@ref,'#')"/>
    <xsl:variable name="repository-id" select="document('../../../content/xml/authority/listPlace.xml')//tei:place[@xml:id=$repo-id]/tei:placeName[@xml:lang='en']"/>
    <field name="current_location">
      <xsl:choose>
        <xsl:when test="$repository-id"><xsl:value-of select="$repository-id" /></xsl:when>
        <xsl:otherwise><xsl:apply-templates select="."/></xsl:otherwise>
      </xsl:choose>
    </field>
  </xsl:template>

What can I use instead of the language code in `tei:placeName[@xml:lang='en']` in order to point to the currently used language?

Thanks,
Irene

Jamie Norrish

Aug 11, 2020, 6:10:06 PM
to efes-...@googlegroups.com
On Tue, 2020-08-11 at 08:56 -0700, Irene Vagionakis wrote:

> What can I use instead of the language code in
> `tei:placeName[@xml:lang='en']` in order to point to the currently
> used language?

When indexing, there is no currently used language.

Rather than doing any trickery with document(), just index your
$repo-id, and leave it up to stylesheets/solr/results-to-html.xsl (the
display-facet-value named template) to get the correct language (at that
point there is a currently used language). All this requires is
harvesting your listPlace authority file into RDF and listing the
"current_location" field in the rdf-facet-lookup-fields in
sitemaps/config.xmap.

So your template would look something like:


  <xsl:template match="tei:repository" mode="facet_current_location">
    <field name="current_location">
      <xsl:value-of select="substring-after(@ref, '#')" />
    </field>
  </xsl:template>


(I don't think there's much point supplying a fallback in case the
tei:repository lacks an @ref; rather, I'd enforce that it must have one
through the schema or whatever other means.)

Jamie

Irene Vagionakis

Aug 12, 2020, 7:18:18 AM
to EFES users
No, without document() I am getting as facet values the values of the @ref attributes without the #.
I thought that this could be caused by the fact that each authority list (AL) item contains not only the en/it item names but also additional data, e.g.
<item xml:id="block">
  <term xml:lang="en">Block</term>
  <term xml:lang="it">Blocco</term>
  <ref>https://www.eagle-network.eu/voc/objtyp/lod/189</ref>
</item>
but now that I am trying again, the same happens also for the AL items that have just the en/it <term>s or <placeName>s...

Jamie Norrish

Aug 12, 2020, 5:18:36 PM
to efes-...@googlegroups.com
On Wed, 2020-08-12 at 04:18 -0700, Irene Vagionakis wrote:

> No, without document() I am getting as facet values the values of the
> @ref attributes without the #.

Whenever this happens, the problem is going to be somewhere along the
rendering path:

* The RDF hasn't been harvested from the source of the reference.
* The RDF is incorrect. Look in stylesheets/rdf/authority-to-rdf.xsl
* The RDF hasn't been fetched from the triple store.
* The lookup within the RDF data hasn't worked. Look in
stylesheets/solr/results-to-html.xsl
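
The symptom is easier to picture with a toy model of the last step. The following is a JavaScript illustration only, not EFES code: it assumes the display step falls back to the indexed identifier when no label is found, and the function displayFacetValue and the labels object are hypothetical stand-ins for the RDF lookup.

```javascript
// Labels keyed by identifier and language. The "block" entry reuses the
// authority list item quoted earlier in the thread.
const labels = {
  block: { en: "Block", it: "Blocco" },
};

// Hypothetical stand-in for the display-time lookup: return the label in the
// requested language, or the bare identifier when nothing is found.
function displayFacetValue(id, lang, lookup) {
  const entry = lookup[id];
  return entry && entry[lang] ? entry[lang] : id;
}
```

With the data present, displayFacetValue("block", "it", labels) yields "Blocco". With an empty lookup, as when the RDF was never harvested or fetched, the same call yields the bare "block", which matches the behaviour reported above.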

A handy tool for debugging is to use Cocoon views[1] to see the XML
content for a page, without any subsequent processing. So for example:

http://localhost:9999/en/search/?cocoon-view=content

will give you the menu, Solr search results, HTTP request data, and RDF
facet data in XML. That allows you to check easily whether you are
getting the data you want in RDF or not.


[1] https://cocoon.apache.org/2.1/userdocs/concepts/views.html


Jamie

Irene Vagionakis

Aug 13, 2020, 3:56:31 AM
to EFES users
OK, I'll have a look to check what is wrong there, thanks a lot!

Irene

Irene Vagionakis

May 13, 2021, 10:01:50 AM
to EFES users
Dear all,
following Jamie's advice, I have added some JavaScript in order to highlight a searched querystring in the inscription page that is accessed from the results page, but I have not been able to write an effective RegEx to ignore all the brackets, underdots etc. deriving from the markup:

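$(document).ready(function highlight() {
        const params = new URLSearchParams(window.location.search);
        var q = params.get("q");
        if (!q) { return; } // no searched term in the querystring
        // Remove quotes and wildcards, then escape regex metacharacters.
        var qq = q.replace(/"|\*/g, '').replace(/[.+?^${}()|[\]\\]/g, '\\$&');
        if (!qq) { return; }
        var edition = document.getElementById("inscription_full_edition");
        // The lookahead skips matches inside tags; write < and > as &lt; and
        // &gt; if this script lives inside an XML template.
        var re = new RegExp('(' + qq + '(?![^<>]*>))', 'gi');
        edition.innerHTML = edition.innerHTML.replace(re, '<span class="highlight">$1</span>');
        });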

Does any of you know how to improve it? Or would it be easier to follow the other approach suggested by Jamie, that is to modify the map:match of the inscriptions?

Thanks!
Irene

Pietro Liuzzo

May 13, 2021, 5:14:51 PM
to efes-...@googlegroups.com
Dear Irene,

I do not know EFES at all, but it sounds to me like map:match is the way to go.

For a similar purpose with jQuery in a different environment, I clean the query term and use :contains, like $('span:contains("' + query + '")').toggleClass('highlight'), which will highlight the parent node, not the match or a matching word e.g. 



If I understand what you are trying to achieve, I think this is hard to do with a regex, but perhaps you can build enough variants in your js building each with replacements on the query string and adding them as alternatives?

I have also tried, for example, cleaning the query (I tested with κοσμοις on the HTML of your seg_64_798) with q.normalize("NFD").replace(/[\u0300-\u036f\[\]]/g, "") and using the result in the regex to catch a 'best match', e.g. new RegExp(/([κοσμοις\[\]\{\}]{4,})/, 'gi'), which matches a run of four or more of the listed characters (the letters of the normalized query plus brackets). In the sample text it matches κό̣[σ]μ̣οι[ς] and κόσμοις]
However, I did not manage in my JS to pass this construct as the first parameter of RegExp()...
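
One way to build such a pattern from the query at run time is to assemble it as a string and allow the editorial marks between letters rather than in a single character class. A minimal sketch of that idea: the names IGNORABLE, stripMarks, buildTolerantRegExp and highlightText are hypothetical, the list of ignorable characters is an assumption to be extended for your own markup, and the text is assumed to be plain text (for instance the content of a text node), not innerHTML with tags.

```javascript
// Characters allowed to interrupt a word: brackets, braces, parentheses and
// combining diacritics (accents, underdots). This list is an assumption.
const IGNORABLE = "[\\[\\](){}\\u0300-\\u036f]*";

// Decompose the text and drop the same marks, leaving only base letters.
function stripMarks(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f\[\](){}]/g, "");
}

// Build a RegExp matching the query letters in order, with any number of
// ignorable marks between and after them. Returns null for an empty query.
function buildTolerantRegExp(query) {
  const letters = Array.from(stripMarks(query));
  if (letters.length === 0) {
    return null;
  }
  const body = letters
    .map((ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape metacharacters
    .join(IGNORABLE);
  return new RegExp(body + IGNORABLE, "giu");
}

// Wrap every tolerant match in a highlight span. The text is decomposed
// (NFD) first so that accents and underdots are separate characters.
function highlightText(text, query) {
  const re = buildTolerantRegExp(query);
  if (!re) {
    return text;
  }
  return text
    .normalize("NFD")
    .replace(re, (match) => '<span class="highlight">' + match + "</span>");
}
```

Because the pattern is built as a string, the query can be passed straight to new RegExp(). The match starts at the first letter, so an opening bracket before the first letter stays outside the highlighted span.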

I hope this can help you somehow, and I would also be interested in reading about other ideas.


all best
Pietro

 

this is the kind of thing which I would leave it possible to the i
Pietro Maria Liuzzo (egli/lui,he/him,er/ihn)

Irene Vagionakis

May 14, 2021, 3:27:43 AM
to efes-...@googlegroups.com
Thanks Pietro! I'll try the fuzzy match approach.

