How to search by an exact text field value?


Yuriy Mann

Mar 18, 2020, 7:02:07 AM
to RediSearch Discussion Forum
How can I specify a search by an exact field value?
For example:
> FT.CREATE idx1 NOHL NOFREQS SCHEMA f1 TEXT NOSTEM
> FT.ADD idx1 doc1 1 FIELDS f1 aa.bb
> FT.ADD idx1 doc2 1 FIELDS f1 aa
> FT.ADD idx1 doc3 1 FIELDS f1 cc.aa

This search returns all 3 documents because "aa" is recognised as a term in all of them.
> FT.SEARCH idx1 @f1:aa

But in my case I may have thousands of "aa.bb" and a few "aa" mixed in between them. How can I query only those where f1=="aa"?

Note that some of the SQL mappings here are incorrect in this sense. E.g. @x:foo and WHERE x='foo' have different semantics.
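To make the difference concrete, here is a minimal Python sketch of why the term query matches all three documents while an equality test matches only one. The `tokenize` function is a simplified stand-in (lowercase, then split on anything that is not a letter, digit, or underscore); it is an assumption for illustration, not RediSearch's actual tokenizer, and the in-memory `docs` dict only mirrors the three documents above.

```python
import re

# Simplified stand-in for TEXT-field tokenization (an assumption, not the
# real RediSearch tokenizer): lowercase, then split on every character
# that is not a letter, digit, or underscore.
def tokenize(value):
    return [t for t in re.split(r"[^0-9a-z_]+", value.lower()) if t]

# The three documents from the example above, keyed by document id.
docs = {"doc1": "aa.bb", "doc2": "aa", "doc3": "cc.aa"}

def text_search(term):
    """Ids of documents whose field contains `term` as one of its tokens."""
    return sorted(d for d, value in docs.items() if term in tokenize(value))

def exact_search(value):
    """Ids of documents whose whole field value equals `value`."""
    return sorted(d for d, v in docs.items() if v == value)

print(text_search("aa"))   # ['doc1', 'doc2', 'doc3']
print(exact_search("aa"))  # ['doc2']
```

The first call behaves like @f1:aa, the second like WHERE f1='aa'.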



Guy Korland

Mar 18, 2020, 11:59:06 AM
to redis...@googlegroups.com
You can use a TAG field:

127.0.0.1:6379> FT.CREATE idx1 NOHL NOFREQS SCHEMA f1 TAG
OK
127.0.0.1:6379> FT.ADD idx1 doc1 1 FIELDS f1 aa.bb
OK
127.0.0.1:6379> FT.ADD idx1 doc2 1 FIELDS f1 aa
OK
127.0.0.1:6379> FT.ADD idx1 doc3 1 FIELDS f1 cc.aa
OK
127.0.0.1:6379> FT.SEARCH idx1 @f1:{aa}
1) (integer) 1
2) "doc2"
3) 1) "f1"
   2) "aa"
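
A tag value that contains punctuation, such as aa.bb, generally has to be backslash-escaped inside the {...} part of the query. Below is a minimal sketch of a helper that builds such a query string. The names `escape_tag` and `tag_query` are hypothetical, and escaping every non-word character is a deliberately conservative assumption rather than the exact list RediSearch requires.

```python
import re

def escape_tag(value):
    # Conservative assumption: backslash-escape every character that is not
    # a letter, digit, or underscore, instead of tracking the exact set of
    # characters the query parser treats as special.
    return re.sub(r"([^\w])", r"\\\1", value)

def tag_query(field, value):
    """Build a tag query of the form @field:{value} for a single tag."""
    return "@%s:{%s}" % (field, escape_tag(value))

print(tag_query("f1", "aa"))     # @f1:{aa}
print(tag_query("f1", "aa.bb"))  # @f1:{aa\.bb}
```

The resulting string is what would be passed as the query argument of FT.SEARCH.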

Yuriy Mann

Mar 18, 2020, 12:45:12 PM
to RediSearch Discussion Forum
With a tag field, I guess I won't be able to find "aa.bb" (but not "bb.aa") unless I add it as a separate "compound" tag?

Do you mean there is no other way? What is the best way to request that it be added to the plan: create a GitHub issue?

The point of migrating our Redis search from custom indices to RediSearch was to greatly simplify the implementation of a generic search through our task queue datasets. RediSearch is a pretty good fit for this purpose. If I have to use tag fields to enable certain search features on fields which otherwise don't have tag semantics, I will be implementing low-level indexing again, which defeats the purpose.

E.g. some fields in our existing data contain something like "aa/bb@cc:dd". This works perfectly with the normal text search in RediSearch. The only thing we're missing is the ability to say "now give me records where the field value is exactly bb@cc".

Thanks

Yuriy Mann

Mar 19, 2020, 5:24:03 AM
to RediSearch Discussion Forum
Created a GitHub issue for tracking: https://github.com/RediSearch/RediSearch/issues/1130

Michael Masouras

Mar 20, 2020, 2:34:52 PM
to Yuriy Mann, RediSearch Discussion Forum
You could also try duplicating the field, and setting one of them to TEXT and the other to TAG. 

Since you will know from the input if the user expects exact or partial match, you can alter your search query to perform a text or a tag search.
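
A rough sketch of that approach in Python, building the command arguments only (nothing here talks to a server). The tag field name `f1_exact` and the helper names are hypothetical. The exact branch escapes punctuation conservatively, and the partial branch assumes the value is a single plain term.

```python
import re

TEXT_FIELD = "f1"        # field name from the thread
TAG_FIELD = "f1_exact"   # hypothetical name for the duplicated TAG field

def create_index_args(index):
    # One schema holding the same value twice: once as TEXT, once as TAG.
    return ["FT.CREATE", index, "NOHL", "NOFREQS", "SCHEMA",
            TEXT_FIELD, "TEXT", "NOSTEM", TAG_FIELD, "TAG"]

def add_doc_args(index, doc_id, value):
    # The same value is written to both fields.
    return ["FT.ADD", index, doc_id, "1", "FIELDS",
            TEXT_FIELD, value, TAG_FIELD, value]

def search_args(index, value, exact):
    if exact:
        # Exact match goes to the TAG field; escape punctuation (assumption:
        # escaping every non-word character is safe).
        query = "@%s:{%s}" % (TAG_FIELD, re.sub(r"([^\w])", r"\\\1", value))
    else:
        # Partial match goes to the TEXT field; assumes a single plain term.
        query = "@%s:%s" % (TEXT_FIELD, value)
    return ["FT.SEARCH", index, query]

print(search_args("idx1", "aa", exact=True))   # ['FT.SEARCH', 'idx1', '@f1_exact:{aa}']
print(search_args("idx1", "aa", exact=False))  # ['FT.SEARCH', 'idx1', '@f1:aa']
```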

This might get expensive if you have millions of distinct text values. One improvement is to hash the text down to a coarser resolution (a limited number of buckets), index the bucket instead of the full value, and post-filter the matches you get back in your own code.
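
A minimal sketch of the hash-and-post-filter idea, using an in-memory dict as a stand-in for the tag index. The bucket width `BUCKET_HEX_CHARS`, the choice of SHA-1, and the helper names are all assumptions made for illustration.

```python
import hashlib

BUCKET_HEX_CHARS = 4  # hypothetical resolution: at most 16**4 distinct tags

def bucket(value):
    """Coarse hash of the full field value; this is what gets indexed as a tag."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()[:BUCKET_HEX_CHARS]

# Stand-in data: the three documents from the thread.
docs = {"doc1": "aa.bb", "doc2": "aa", "doc3": "cc.aa"}

# Stand-in for the tag index: bucket -> ids of documents in that bucket.
index = {}
for doc_id, value in docs.items():
    index.setdefault(bucket(value), []).append(doc_id)

def exact_lookup(value):
    # Step 1: fetch candidates by bucket (what the tag query would return).
    candidates = index.get(bucket(value), [])
    # Step 2: post-filter in application code, since buckets can collide.
    return sorted(d for d in candidates if docs[d] == value)

print(exact_lookup("aa"))     # ['doc2']
print(exact_lookup("aa.bb"))  # ['doc1']
```

A smaller bucket width means fewer distinct tags in the index but more candidates to discard in the post-filter step.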

Michael


--
You received this message because you are subscribed to the Google Groups "RediSearch Discussion Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redisearch+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/redisearch/54c9d76e-f81a-473b-9e20-daa0729b6fb8%40googlegroups.com.

Mark Nunberg

Mar 20, 2020, 9:46:06 PM
to Yuriy Mann, RediSearch Discussion Forum
You could probably do it through the FILTER feature in FT.AGGREGATE (which can easily be ported to FT.SEARCH); however, this filtering takes place at query time. I can think of some creative ways to get 'exact matching' semantics at index time. For example, we could keep track of how many terms are in each field (this would require you to remove the NOHL keyword, since we store this data for highlighting) and then compare the number of terms in the field with the number of terms searched for; if the numbers are not equal, the document is excluded. As you can see, this involves something of a memory cost, and perhaps a new field attribute (maybe call it exclusive?).
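
Both ideas can be sketched in a few lines of Python. `aggregate_args` only builds the arguments for a query-time FILTER and assumes a simple value with no quotes in it. `exclusive_match` models the term-count comparison with a simplified tokenizer that is an assumption, not RediSearch's own. Both helper names are hypothetical, and the "exclusive" behaviour does not exist in RediSearch.

```python
import re

def aggregate_args(index, field, value):
    # Query-time approach: match the term, load the field, then keep only
    # rows whose full value equals the searched value.
    # Assumes `value` is a simple term containing no quote characters.
    return ["FT.AGGREGATE", index, "@%s:%s" % (field, value),
            "LOAD", "1", "@" + field,
            "FILTER", "@%s == '%s'" % (field, value)]

def tokenize(value):
    # Simplified stand-in tokenizer (assumption): split on non-word characters.
    return [t for t in re.split(r"[^0-9a-z_]+", value.lower()) if t]

def exclusive_match(field_value, query):
    """Term-count idea: every query term occurs in the field, and the field
    holds no more terms than the query does."""
    field_terms, query_terms = tokenize(field_value), tokenize(query)
    return (all(t in field_terms for t in query_terms)
            and len(field_terms) == len(query_terms))

print(exclusive_match("aa", "aa"))        # True
print(exclusive_match("aa.bb", "aa"))     # False: the field has an extra term
print(exclusive_match("bb.aa", "aa.bb"))  # True: term order is not checked
print(exclusive_match("aa/bb", "aa.bb"))  # True: separators are ignored
```

The last two calls show where counting terms falls short of a true equality test.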

However, getting a perfectly exact match with full-text search is problematic since we ignore word separators, stopwords, etc., so perhaps a tag field might be better suited for this, though again at the cost of adding extra data to the document. We also have a C API (but not a command API) that allows indexing a field as both TAG and TEXT (but again, you pay for this by indexing the data twice).

GitHub would be a good place to discuss this further.
