Hi group
NOTE: In a text index, reducing a word to its root by removing letters from the end is a language-specific process called "stemming".
Why, when using a text index in Spanish, is the stem of "filología" "filolog", while the stem of "filologia" (without the accent mark) is "filologi"?
Words ending in -ía are very common in Spanish, and searching without accent marks is very common too.
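To make the expectation concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB's code and it does not explain what the stemmer does internally. `foldAccents` is a hypothetical helper introduced only for illustration. It shows that the two search strings are identical once the accent marks are removed, which is why I would expect them to produce the same terms:

```javascript
// Hypothetical helper (not part of MongoDB): removes accent marks from a string.
function foldAccents(text) {
  // NFD splits "í" into "i" + U+0301; the regex then drops combining marks U+0300..U+036F.
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

const accented = "filología clásica";
const plain = "filologia clasica";

// Compare the accent-folded form of the first string with the second string.
console.log(foldAccents(accented) === plain);
```

Note that this folding would also turn "ñ" into "n", so it is only meant to illustrate the comparison, not to serve as a general preprocessing step for Spanish text.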
> db.Series.find({ "$text": { "$search": 'filologia clásica', "$language": "es" } }, {indexlanguage: 1}).explain("executionStats")
...
    "terms" : [
        "filologi",
        "clasic"
    ],
...
> db.Series.find({ "$text": { "$search": 'filología clásica', "$language": "es" } }, {indexlanguage: 1}).explain("executionStats")
...
    "terms" : [
        "filolog",
        "clasic"
    ],
...
Is this a bug in the stemming process, or a problem with my text index configuration? Here is the index:
> db.User.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_fts" : "text",
            "_ftsx" : 1
        },
        "name" : "$**_text",
        "ns" : "test.User",
        "weights" : {
            "$**" : 1
        },
        "default_language" : "english",
        "language_override" : "language",
        "textIndexVersion" : 3
    }
]
Thank you.