Hi Ding,
On Thu, May 16, 2013 at 9:00 PM, Haifeng Ding <hank...@gmail.com> wrote:
> 8287176 points retrieved, 38111 points plotted in 89273ms.
That's roughly 93k points per second. Not impressive by any
standards. How fast does the query return if you don't filter on any
tags? Or if you just scan the underlying HBase table for the key
range appropriate to your query?
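
To make that concrete, here is a rough Python sketch of the throughput arithmetic and of the row key range a raw scan would cover. It assumes the usual OpenTSDB data table layout (3-byte metric UID, then a 4-byte big-endian base timestamp aligned to one hour, then the tag UIDs) and takes the metric UID and the timestamps from the log below. The helper names are made up for illustration, and the real query code may pad the range differently.

```python
import struct

# Assumption: each row covers one hour, the OpenTSDB default.
MAX_TIMESPAN = 3600

def points_per_second(points, elapsed_ms):
    """Throughput of a query given its point count and wall time in ms."""
    return points / (elapsed_ms / 1000.0)

def row_key_range(metric_uid, start_time, end_time):
    """Return (start_key, stop_key) for one metric; stop_key is exclusive."""
    start_base = start_time - (start_time % MAX_TIMESPAN)
    # Go one row past the row that contains end_time.
    stop_base = end_time - (end_time % MAX_TIMESPAN) + MAX_TIMESPAN
    start_key = metric_uid + struct.pack(">I", start_base)
    stop_key = metric_uid + struct.pack(">I", stop_base)
    return start_key, stop_key

uid = bytes([0, 0, 3])  # metric=[0, 0, 3] (test.metric) in the query log
start_key, stop_key = row_key_range(uid, 1368676800, 1368718200)
rate = points_per_second(8287176, 89273)

print("start row:", start_key.hex())
print("stop row: ", stop_key.hex())
print("points/s: ", round(rate))
```

The two keys could then be used as the start and stop rows of a plain scan on the data table (named tsdb by default), timed separately from the TSD.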
> Query logs:
> 2013-05-17 11:13:51,470 INFO [New I/O worker #3] TsdbQuery:
> TsdbQuery(start_time=1368676800, end_time=1368718200, metric=[0, 0, 3]
> (test.metric), tags={}, rate=false, aggregator=sum, group_bys=()) matched
> 557462 rows in 68253 spans
> 2013-05-17 11:15:09,772 INFO [Gnuplot #7] Plot: Wrote Gnuplot script to
> /home/data1/build/tmp/tsd/bd6214b7.gnuplot
> 2013-05-17 11:15:09,857 INFO [Gnuplot #7] HttpQuery: [id: 0xa893a6c0,
> /172.21.206.53:55282 => /10.42.230.49:8402] HTTP
> /q?start=2013/05/16-12:00:00&end=2013/05/16-23:30:00&m=sum:10m-avg:test.metric&o=&yrange=%5B0:%5D&wxh=1328x484&json
> done in 89273ms
Hmm, this is particularly disappointing because you're only querying
one metric. If you were querying multiple metrics at a time, just
bear in mind that right now each metric gets handled sequentially
(even though in theory they could be handled in parallel), which can
contribute to slower response times than would be possible under
optimal circumstances.
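
To illustrate the sequential versus parallel point, here is a toy Python sketch. It is not how the TSD is implemented: fetch_metric is a hypothetical stand-in that just sleeps to mimic one blocking per-metric scan, and the metric names other than test.metric are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_metric(name, delay=0.05):
    """Hypothetical stand-in for one blocking per-metric scan."""
    time.sleep(delay)
    return name, delay

def run_sequential(metrics):
    # One metric after another: total time is the sum of the scans.
    return [fetch_metric(m) for m in metrics]

def run_parallel(metrics):
    # All metrics at once: total time approaches the slowest scan.
    with ThreadPoolExecutor(max_workers=len(metrics)) as pool:
        return list(pool.map(fetch_metric, metrics))

def timed(fn, metrics):
    t0 = time.perf_counter()
    result = fn(metrics)
    return result, time.perf_counter() - t0

metrics = ["test.metric", "other.metric", "third.metric"]
seq_result, seq_time = timed(run_sequential, metrics)
par_result, par_time = timed(run_parallel, metrics)
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

Both variants return the same results in the same order; only the wall time differs.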
> I also made several CPU profiling with the query process. I found that HBase
> was responding fast enough, while most of the time was spent on generating
I still find it dubious that we don't see the HBase access at all in
the screenshot you shared. It cannot possibly be so fast as to be
invisible to the profiler, can it? And since this code is still
written in a blocking fashion, the profiler should be able to see it
wait for HBase.
> My questions are:
> 1. Is it reasonable with the query performance and profiling results above?
Not really.
> 2. Is there any suggestion or best practice to improve query performance of
> OpenTSDB? For example, is it feasible to reach 10 seconds for the above
> query execution?
I would like to say that the answer is yes, it's possible, but we
first need to determine exactly why it's so slow right now.
Can you share your test data set maybe?
--
Benoit "tsuna" Sigoure