On how OpenTSDB stores data in HBase


sabyasachim...@gmail.com

Jun 20, 2013, 4:55:21 AM
to open...@googlegroups.com
Hello, I am curious about how OpenTSDB stores its data in HBase. To find out, I created a new metric named "shell.random.number2" and inserted two values into it, as shown:

hduser@desktop:/usr/share/opentsdb/bin$ sudo ./tsdb query 2013/06/20 sum shell.random.number2
[sudo] password for hduser:
shell.random.number2 1371713110 12421 {host=master}
shell.random.number2 1371715456 21308 {host=master}


 
This is the output of scanning 'tsdb-uid':

hbase(main):019:0> scan 'tsdb-uid'
ROW                                         COLUMN+CELL                                                                                                                
 \x00                                       column=id:metrics, timestamp=1371712342083, value=\x00\x00\x00\x00\x00\x00\x00\x02                                         
 \x00                                       column=id:tagk, timestamp=1371625769655, value=\x00\x00\x00\x00\x00\x00\x00\x01                                            
 \x00                                       column=id:tagv, timestamp=1371625769662, value=\x00\x00\x00\x00\x00\x00\x00\x01                                            
 \x00\x00\x01                               column=name:metrics, timestamp=1371625735697, value=shell.random.number                                                    
 \x00\x00\x01                               column=name:tagk, timestamp=1371625769658, value=host                                                                      
 \x00\x00\x01                               column=name:tagv, timestamp=1371625769664, value=master                                                                    
 \x00\x00\x02                               column=name:metrics, timestamp=1371712342148, value=shell.random.number2                                                   
 host                                       column=id:tagk, timestamp=1371625769660, value=\x00\x00\x01                                                                
 master                                     column=id:tagv, timestamp=1371625769667, value=\x00\x00\x01                                                                
 shell.random.number                        column=id:metrics, timestamp=1371625735700, value=\x00\x00\x01                                                             
 shell.random.number2                       column=id:metrics, timestamp=1371712342152, value=\x00\x00\x02              

Now, when I do a scan 'tsdb' from the HBase shell, I get the following:

 \x00\x00\x02Q\xC2\xA8p\x00\x00\x01\x00\x00\x01 column=t:^g, timestamp=1371713112030, value=\x00\x00\x00\x00\x00\x000\x85
 \x00\x00\x02Q\xC2\xB6\x80\x00\x00\x01\x00\x00\x01 column=t:\x10\x07, timestamp=1371715457928, value=\x00\x00\x00\x00\x00\x00S<

From what I can tell:

\x00\x00\x02 = shell.random.number2
Q\xC2\xA8p or Q\xC2\xB6\x80 = ?
\x00\x00\x01 = host
\x00\x00\x01 = master

The cell timestamp is when the value was written to HBase. The second part of the four-part row key above would be the data point's timestamp, but I can't work out how it is encoded.

I also don't understand how the values are stored, or what the second part of the row key and the column qualifier signify and how they work. Can anyone help me out? Thanks in advance, and my apologies if I am asking something I shouldn't.

Regards,
Sabyasachi

Kevin Ortman

Jun 20, 2013, 7:31:35 AM
to sabyasachim...@gmail.com, open...@googlegroups.com
This should help you out.



sabyasachim...@gmail.com

Jun 20, 2013, 8:17:45 AM
to open...@googlegroups.com, sabyasachim...@gmail.com
 Um, I am more interested in using HBase, rather than using OpenTSDB itself. Thanks anyway though. :)

ManOLamancha

Jun 20, 2013, 8:34:08 AM
to open...@googlegroups.com, sabyasachim...@gmail.com
On Thursday, June 20, 2013 8:17:45 AM UTC-4, sabyasachim...@gmail.com wrote:
 Um, I am more interested in using HBase, rather than using OpenTSDB itself. Thanks anyway though. :)

Kevin's link describes the schema OpenTSDB uses so you can understand how it stores the data in HBase. HBase in Action (http://www.manning.com/dimidukkhurana/) has a whole chapter on OpenTSDB and why the schema works well with HBase. Give those a read and let us know if you still have questions about the schema.
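
To make that concrete for the two rows in the original post, here is a rough Python sketch that decodes them. It is not taken from the OpenTSDB source; it assumes the commonly documented layout: a row key of 3-byte metric UID, 4-byte big-endian base timestamp (aligned to the hour), then 3-byte tagk/tagv UID pairs, and a 2-byte column qualifier whose top 12 bits are the offset in seconds from the base timestamp and whose low 4 bits are flags (one float bit, three bits for value length minus one). The function names are made up for illustration. Note that the HBase shell prints printable bytes as plain characters, so `Q` is 0x51 and `p` is 0x70.

```python
import struct

# Bytes copied from the scan 'tsdb' output in the original post.
ROW_1 = (b"\x00\x00\x02Q\xC2\xA8p\x00\x00\x01\x00\x00\x01",
         b"^g",
         b"\x00\x00\x00\x00\x00\x000\x85")
ROW_2 = (b"\x00\x00\x02Q\xC2\xB6\x80\x00\x00\x01\x00\x00\x01",
         b"\x10\x07",
         b"\x00\x00\x00\x00\x00\x00S<")

UID_WIDTH = 3  # assumption: 3-byte UIDs, as the tsdb-uid scan suggests


def decode_row_key(key):
    """Split a row key into metric UID, base timestamp, and tag UID pairs."""
    metric_uid = key[:UID_WIDTH]
    # 4-byte big-endian Unix timestamp, expected to be aligned to the hour
    base_time = struct.unpack(">I", key[UID_WIDTH:UID_WIDTH + 4])[0]
    rest = key[UID_WIDTH + 4:]
    pair = 2 * UID_WIDTH
    tags = [(rest[i:i + UID_WIDTH], rest[i + UID_WIDTH:i + pair])
            for i in range(0, len(rest), pair)]
    return metric_uid, base_time, tags


def decode_qualifier(qualifier):
    """Split a 2-byte qualifier into (delta seconds, is_float, value length)."""
    raw = struct.unpack(">H", qualifier)[0]
    delta = raw >> 4              # top 12 bits: seconds past the base hour
    flags = raw & 0xF             # low 4 bits
    is_float = bool(flags & 0x8)  # highest flag bit marks a float
    length = (flags & 0x7) + 1    # remaining 3 bits: value length minus one
    return delta, is_float, length


def decode_cell(row_key, qualifier, value):
    """Return (metric UID, data point timestamp, value, tag UID pairs)."""
    metric_uid, base_time, tags = decode_row_key(row_key)
    delta, is_float, length = decode_qualifier(qualifier)
    if is_float:
        # Simplified float handling; only integers appear in this thread.
        number = struct.unpack(">f" if length == 4 else ">d", value[-length:])[0]
    else:
        number = int.from_bytes(value, "big", signed=True)
    return metric_uid, base_time + delta, number, tags


for row in (ROW_1, ROW_2):
    print(decode_cell(*row))
```

Under those assumptions the decoded timestamps and values come out as 1371713110 / 12421 and 1371715456 / 21308, which matches the `tsdb query` output in the first message. The two rows differ in the second part of the key because the two points fall into different hours.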