Regex queries and HBase filter performance


Siddartha Guthikonda

Jul 24, 2013, 5:13:31 PM
to open...@googlegroups.com
Hi,

I was looking into OpenTSDB queries, and since we need regex matching on tags, I looked at the HBase filters that take a regex comparator. Has anyone already looked into this, and is using HBase filters a good idea for building OpenTSDB queries from a performance standpoint? Do HBase filters scan every row key and then return the results, or are there filters that do this much more efficiently? (By performance I mean at large scale: more than about 100 million data points.)

Thanks
Sid.

ManOLamancha

Jul 25, 2013, 11:25:27 AM
to open...@googlegroups.com
On Wednesday, July 24, 2013 5:13:31 PM UTC-4, Siddartha Guthikonda wrote:
I was looking into OpenTSDB queries, and since we need regex matching on tags, I looked at the HBase filters that take a regex comparator. Has anyone already looked into this, and is using HBase filters a good idea for building OpenTSDB queries from a performance standpoint? Do HBase filters scan every row key and then return the results, or are there filters that do this much more efficiently? (By performance I mean at large scale: more than about 100 million data points.)

A scanner will scan every row between its start and stop keys regardless of what your filters are set to. To implement regex on tags, you would first have to scan the entire 'id' column family of the 'tsdb-uid' table and use your regex as the filter. That should return the UIDs for the tags that matched your regex. Then you would create a binary regex filter for the actual query scanner that matches the time series whose row keys contain those tag UIDs.
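
To make the two steps concrete, here is a rough sketch in plain Python that does not touch HBase at all. It assumes the default OpenTSDB row key layout as I understand it (3-byte metric UID, 4-byte base timestamp, then 6-byte tagk/tagv UID pairs). The TAGV_UIDS dictionary and both function names are made up for illustration. In a real deployment step 1 would be a filtered scan of the UID table, and step 2 would be handed to the data scanner as something like a RowFilter with a RegexStringComparator.

```python
import re

# Hypothetical stand-in for the tag value name -> UID mappings in the UID table.
TAGV_UIDS = {
    "web01": b"\x00\x00\x01",
    "web02": b"\x00\x00\x02",
    "db01": b"\x00\x00\x03",
}


def matching_uids(name_regex, uid_map):
    """Step 1: check every name against the regex and collect the matching UIDs."""
    pattern = re.compile(name_regex)
    return [uid for name, uid in sorted(uid_map.items()) if pattern.fullmatch(name)]


def row_key_regex(tagk_uid, tagv_uids, metric_width=3, ts_width=4, pair_width=6):
    """Step 2: build a binary regex that accepts row keys containing
    tagk_uid followed by any one of tagv_uids, aligned on a tag pair boundary.

    Returns None when there are no UIDs, since no row could match.
    The widths are assumptions about the key layout, not read from a schema.
    """
    if not tagv_uids:
        return None
    alternatives = b"|".join(re.escape(tagk_uid + v) for v in tagv_uids)
    prefix = b".{%d}" % (metric_width + ts_width)  # skip metric UID and timestamp
    any_pairs = b"(?:.{%d})*" % pair_width         # skip whole tag pairs only
    # DOTALL so that "." also matches a 0x0A byte inside a UID or timestamp.
    return re.compile(prefix + any_pairs + b"(?:" + alternatives + b")" + any_pairs,
                      re.DOTALL)


# Usage: find hosts matching web\d+ and test two simulated row keys.
host_tagk = b"\x00\x00\x01"  # hypothetical UID for the "host" tag key
uids = matching_uids(r"web\d+", TAGV_UIDS)
key_filter = row_key_regex(host_tagk, uids)

metric_and_time = b"\x00\x00\x01" + b"\x51\xf0\x0a\x00"
web_row = metric_and_time + host_tagk + TAGV_UIDS["web02"]
db_row = metric_and_time + host_tagk + TAGV_UIDS["db01"]
web_matches = key_filter.fullmatch(web_row) is not None
db_matches = key_filter.fullmatch(db_row) is not None
```

Note that the regex only decides which rows are returned. The scanner still reads every row between the start and stop keys, so the cost grows with the size of the scanned range, not with the number of matches.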