Tokenizer doesn't include the underscore character


Sean Bannister

Nov 8, 2015, 9:07:25 AM
to mongodb-dev
While performing a full-text search, I noticed that the tokenizer doesn't include the underscore character among its delimiters, and I confirmed this in the source code:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/tokenizer.cpp

Is there any reason for this? To me it makes sense to include the underscore.

The only other character not included is the single quote ', which makes sense because of contractions like "don't".
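
To illustrate the behavior in question, here is a minimal Python sketch. It is not the MongoDB code: the `tokenize` function and both delimiter sets are made up for illustration, and it assumes that underscore and single quote are the only ASCII punctuation characters not treated as delimiters.

```python
import string


def tokenize(text, delimiters):
    """Split text on whitespace and on any character in `delimiters`."""
    tokens, current = [], []
    for ch in text:
        if ch.isspace() or ch in delimiters:
            # A delimiter or whitespace ends the token being built, if any.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# Assumption for illustration: every ASCII punctuation character is a
# delimiter except the underscore and the single quote.
CURRENT_DELIMS = set(string.punctuation) - {"_", "'"}
# Hypothetical alternative: also treat the underscore as a delimiter.
PROPOSED_DELIMS = CURRENT_DELIMS | {"_"}

text = "don't drop user_name"
print(tokenize(text, CURRENT_DELIMS))   # ["don't", 'drop', 'user_name']
print(tokenize(text, PROPOSED_DELIMS))  # ["don't", 'drop', 'user', 'name']
```

With the first set, a search for "name" would not match "user_name" because the whole string is one token; with the second set it would.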


Asya Kamsky

Nov 20, 2015, 10:18:14 PM
to mongo...@googlegroups.com
Sean:

It would probably be best to file an enhancement request at jira.mongodb.org for the Text Search component of the SERVER project.

After a very brief search, I couldn't find an existing issue for this.

Asya


--
Asya Kamsky
Lead Product Manager
MongoDB