ES Query issues in 20.11.1


Falzone, Chris

Jul 9, 2021, 12:38:22 PM
to dot...@googlegroups.com
I am working on an upgrade from 5.1.1, and somewhere along the way a number of changes to how ES analyzes fields were introduced. So far I have been able to adjust for them, but this one has me stumped.

In this query I have a "must_not" filter on identifier:

{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {"term": {"contenttype": "news"}},
            {"term": {"languageid": "1"}},
            {"term": {"conhost": "2e0f4385-743d-4cdf-a7e9-b04013403fdb"}},
            {"range": {"news.postdate": {"lte": "2021-07-09"}}}
          ],
          "must_not": [
            {"term": {"identifier": "645931b3-fa4d-42ac-95df-cd41c75c3f4a"}}
          ]
        }
      }
    }
  },
  "sort": [{"news.postdate": {"order": "desc"}}]
}


But the very first result is that identifier:
[Screenshot: Screen Shot 2021-07-09 at 12.22.41 PM.png]

What's going on there? We follow this pattern in a few places on our sites: we pull the latest "Featured" item for the page, then exclude it by identifier in the main pull. It feels like the identifier field is no longer indexed as a whole value but tokenized into chunks or something.
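
To illustrate that suspicion, here is a minimal Python sketch of why a "term" clause could stop matching if the field were tokenized. It is not Elasticsearch code: `analyze` and `term_matches` are hypothetical helpers, and the splitting rule (lowercase, break on non-alphanumeric characters) is only an assumed stand-in for a standard-style analyzer, not necessarily what this version actually configures for identifier.

```python
import re


def analyze(value):
    """Assumed stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return {tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok}


def term_matches(indexed_tokens, term):
    """A term clause is not analyzed; it must equal one indexed token exactly."""
    return term in indexed_tokens


identifier = "645931b3-fa4d-42ac-95df-cd41c75c3f4a"

tokenized_index = analyze(identifier)  # field indexed as analyzed text (assumption)
keyword_index = {identifier}           # field indexed whole, as a keyword

full_term_hits_text = term_matches(tokenized_index, identifier)
chunk_hits_text = term_matches(tokenized_index, "fa4d")
full_term_hits_keyword = term_matches(keyword_index, identifier)

print(sorted(tokenized_index))
print(full_term_hits_text, chunk_hits_text, full_term_hits_keyword)
```

Under that assumption, the full identifier never equals any single indexed chunk, so a "must_not" term clause on it would exclude nothing, while the same clause against a keyword-style field would match.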

The changes to the way fields are analyzed in ES have set us way back on our upgrade. We've had to switch a ton of queries to use _dotraw and/or use esCustomMapping to map the affected field as a keyword.

--

Christopher Falzone

DevOps Engineer

A Q U E N T  /  VITAMIN T

Falzone, Chris

Jul 9, 2021, 12:43:43 PM
to dot...@googlegroups.com
So, it appears that identifier_dotraw is a thing and might work here, but why would the identifier and inode fields in the index be using a tokenized analyzer?
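
As a rough sketch of what that variant might look like, the Python below rebuilds the same query with the exclusion pointed at identifier_dotraw. The helper `build_news_query` and its parameters are hypothetical, and whether identifier_dotraw actually holds the untokenized value in this index is an untested assumption.

```python
import json


def build_news_query(exclude_identifier, exclude_field="identifier_dotraw"):
    """Build the news query, excluding one item via the given field.

    exclude_field defaults to identifier_dotraw, which is assumed (not
    verified) to hold the whole, untokenized identifier.
    """
    return {
        "query": {
            "bool": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"contenttype": "news"}},
                            {"term": {"languageid": "1"}},
                            {"term": {"conhost": "2e0f4385-743d-4cdf-a7e9-b04013403fdb"}},
                            {"range": {"news.postdate": {"lte": "2021-07-09"}}},
                        ],
                        "must_not": [
                            {"term": {exclude_field: exclude_identifier}},
                        ],
                    }
                }
            }
        },
        "sort": [{"news.postdate": {"order": "desc"}}],
    }


query = build_news_query("645931b3-fa4d-42ac-95df-cd41c75c3f4a")
print(json.dumps(query, indent=2))
```

Keeping the field name as a parameter makes it easy to switch back to identifier if the mapping is later changed to a keyword field.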