Aggregation based upon text query


Bohdan Szymanik

May 15, 2012, 8:07:34 PM
to rav...@googlegroups.com
I'm a bit confused about how to do this and am wondering what the correct approach is.

I can represent the problem in the blogs/comments scenario.

I want to count all comments that contain the word xyz, where the match can be found through wildcards or, ideally, a regular expression.

When I went to try this I could easily create a LINQ index that projects just the comments, but how do I do a wildcard search in LINQ for comments satisfying a search criterion, or alternatively perform an aggregation in association with Lucene? I'm a little confused...

Itamar Syn-Hershko

May 16, 2012, 5:18:43 AM
to rav...@googlegroups.com
Querying using regular expressions is not supported yet, but it will be once Lucene.NET has them (Java Lucene already has it, at least in a beta version).

To count all instances with a certain word, just query a full-text-search-enabled index for that word. The query statistics object will contain the total count of docs satisfying that query.
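
To make the counting idea concrete, here is a minimal sketch in plain Python. It is not the RavenDB or Lucene API; the `tokenize` and `count_matching` helpers and the sample comments are hypothetical. It only illustrates what an analyzed (tokenized) field plus a wildcard term query amounts to: count the documents that have at least one token matching the pattern.

```python
import fnmatch
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_matching(comments, pattern):
    """Count comments with at least one token matching a wildcard pattern."""
    pattern = pattern.lower()
    return sum(
        1
        for text in comments
        if any(fnmatch.fnmatchcase(token, pattern) for token in tokenize(text))
    )


# Hypothetical sample data standing in for the comments of a blog post.
comments = [
    "The xyz feature works well",
    "No mention here",
    "XYZ again, and xyzzy too",
]

exact_count = count_matching(comments, "xyz")      # whole-word match
prefix_count = count_matching(comments, "xyz*")    # prefix wildcard
suffix_count = count_matching(comments, "*zzy")    # suffix wildcard
```

Each comment is counted once no matter how many of its tokens match, which mirrors a "total results" figure for a query rather than a count of term occurrences.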

Bohdan Szymanik

May 17, 2012, 8:24:06 PM
to rav...@googlegroups.com
Hmm, my example wasn't so good. I was really thinking of a different scenario, which I should've used in the problem: sum/avg/min/max. It has come about from having a list of transactions that are tagged with additional information: could be merchant, could be customer-entered data, could be a code for the type of transaction, e.g. groceries/travel/accommodation. The transactions could be loaded into a document identifiable by customer/year/month, and I wondered if I could identify the tag and calculate the aggregation with a map/reduce index. Maybe this would be better done with a field in each document that gets dynamically updated as transactions get added.
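
As a rough illustration of the map/reduce shape described above, here is a sketch in plain Python. It is not a RavenDB index definition; the field names (`customer`, `month`, `tag`, `amount`) and the sample transactions are assumptions made up for the example. The map step emits one partial aggregate per transaction keyed by customer/month/tag, and the reduce step merges two partial aggregates, so results can be combined incrementally as transactions are added.

```python
# Hypothetical transactions; in the scenario above these would live inside
# documents identified by customer/year/month.
transactions = [
    {"customer": "c1", "month": "2012-05", "tag": "groceries", "amount": 40.0},
    {"customer": "c1", "month": "2012-05", "tag": "travel", "amount": 120.0},
    {"customer": "c1", "month": "2012-05", "tag": "groceries", "amount": 60.0},
]


def map_transaction(t):
    """Map step: emit a grouping key and a partial aggregate for one transaction."""
    key = (t["customer"], t["month"], t["tag"])
    value = {"count": 1, "sum": t["amount"], "min": t["amount"], "max": t["amount"]}
    return key, value


def reduce_pair(a, b):
    """Reduce step: merge two partial aggregates that share the same key."""
    return {
        "count": a["count"] + b["count"],
        "sum": a["sum"] + b["sum"],
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
    }


def aggregate(items):
    """Run map over every transaction and fold the results per key."""
    results = {}
    for t in items:
        key, value = map_transaction(t)
        results[key] = reduce_pair(results[key], value) if key in results else value
    return results


def average(entry):
    """Derive the average from the stored sum and count."""
    return entry["sum"] / entry["count"]


totals = aggregate(transactions)
groceries = totals[("c1", "2012-05", "groceries")]
```

Storing `sum` and `count` rather than the average itself is what keeps the reduce step mergeable: two averages cannot be combined correctly without their counts.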



Matt Warren

May 18, 2012, 7:02:24 AM
to rav...@googlegroups.com
If you have additional info that is "dynamic", i.e. without a fixed schema (merchant, customer-entered data, type, etc.), you should take a look at dynamic fields; see http://ravendb.net/docs/client-api/advanced/dynamic-fields