Trying to implement an adapter for KairosDB


kunal mahale

May 9, 2014, 10:25:40 AM

Hi Julian,

I am trying to create an adapter for KairosDB. KairosDB uses JSON for both input and output.

I have used the MongoDB adapter as a reference.

The relations pushed down to KairosDB are:

1. Filter relation

2. Join relation

Filter relation: Handled in much the same way as in the MongoDB adapter.

Join relation: In KairosDB the data for a column family can be huge, so it is not a good idea to bring all of it into memory to handle joins. So I tried a solution like this:

Keys are required to handle a join. In KairosDB we can get the keys from a URL, something like: http://[host]:[port]/api/v1/metricnames
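As a rough sketch of that key lookup, the following builds the metric-names URL and fetches the raw JSON response. The class and method names are hypothetical and not taken from the attached files, it assumes Java 11 or later for java.net.http, and parsing of the response body is deliberately left out.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper for illustration; not part of the attached files. */
final class KairosMetricNames {
    private KairosMetricNames() {}

    /** Builds the metric-names URL for the given host and port. */
    static URI metricNamesUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/api/v1/metricnames");
    }

    /** Issues a GET request and returns the raw JSON body, unparsed. */
    static String fetchMetricNamesJson(String host, int port)
            throws IOException, InterruptedException {
        HttpRequest request =
                HttpRequest.newBuilder(metricNamesUri(host, port)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

For example, KairosMetricNames.metricNamesUri("localhost", 8080) yields http://localhost:8080/api/v1/metricnames; the host and port here are placeholders.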

So the flow to implement joins is:

1. Identify the keys to compare from the other, non-KairosDB table (which may come from another data source).

2. Push these keys to the kairos-enumerator. (I am using reflection for this. In the linq4j project, the EnumerableDefaults class has join_() and lookup_() methods. We can get an instance of the kairos-enumerator there, and from there call a utility method, setKeys(), defined in the kairos-enumerator.)

3. Get all keys from KairosDB.

4. Compare the two sets and take their intersection.

5. Send a request to KairosDB for the data of these keys.

6. Create an ArrayList of result objects and pass it on to the enumerator.
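The key comparison in steps 1, 3 and 4 can be sketched roughly as below, assuming the goal is the set of keys present on both sides of the join. JoinKeyFilter and commonKeys() are hypothetical names used only for illustration; they are not part of linq4j, Optiq, or the attached files.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of linq4j or Optiq. */
final class JoinKeyFilter {
    private JoinKeyFilter() {}

    /**
     * Returns the keys that occur on both sides of the join,
     * in the order in which they appear in kairosKeys.
     */
    static List<String> commonKeys(Collection<String> otherSideKeys,
                                   Collection<String> kairosKeys) {
        // Copy so that the caller's collection is not modified.
        Set<String> common = new LinkedHashSet<>(kairosKeys);
        // Keep only the keys that the other table also has.
        common.retainAll(new HashSet<>(otherSideKeys));
        return new ArrayList<>(common);
    }
}
```

Only the keys returned by commonKeys() would then need to be requested from KairosDB in step 5, which is what keeps the amount of data pulled into memory small.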


Note: Attached are two files:

1. linq4j EnumerableDefaults:

a. Two methods: toLookup2_(), a newly added method (the corresponding changes in the superclasses are not attached), and join_(), with some changes to pass keys at runtime.

2. KairosEnumerator: setKey(), the method called from the join_() method mentioned above. It is used to set the keys.

So my question is: is this the right approach to implementing an adapter for KairosDB, or is there a better way to do this?

And will this be a reliable approach for newer versions of Optiq core and linq4j?

Thank you,

Julian Hyde

May 9, 2014, 5:45:40 PM

I'd recommend that you start simple -- get scans working first (the equivalent to MongoTableScan), then get filters working, and check that you can push down basic expressions to KairosDB, in particular time ranges. Write a test suite based on a small data set in KairosDB.
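To make the time-range case concrete, here is a minimal sketch of what a pushed-down time filter might turn into: the JSON body of a KairosDB datapoints query with absolute start and end times in milliseconds. The class and method names are hypothetical, the sketch covers a single metric with no tags or aggregators, and a real adapter would likely use a JSON library rather than string formatting.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Optiq or KairosDB. */
final class KairosQuerySketch {
    private KairosQuerySketch() {}

    /** Builds a query body for one metric over [startMillis, endMillis]. */
    static String timeRangeQuery(String metric, long startMillis, long endMillis) {
        if (startMillis > endMillis) {
            throw new IllegalArgumentException("start must not be after end");
        }
        // Escape backslashes and double quotes so the name is a valid JSON string.
        String name = metric.replace("\\", "\\\\").replace("\"", "\\\"");
        return String.format(Locale.ROOT,
                "{\"start_absolute\":%d,\"end_absolute\":%d,"
                        + "\"metrics\":[{\"name\":\"%s\"}]}",
                startMillis, endMillis, name);
    }
}
```

A test suite over a small data set could then check that a SQL predicate on the timestamp column produces the expected start_absolute and end_absolute values.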

Then get joins working in Optiq (using EnumerableJoinRel). Only then turn your attention to pushing down joins.

Distributed joins are very hard. The biggest problem is preventing huge amounts of data flying across the network. That said, your approach (forming the set of intersecting keys) seems to be a reasonable one.

I haven't looked at your proposed changes to EnumerableDefaults in detail. Maybe you could fork linq4j and submit them as a github pull request. That said, you don't need linq4j to be modified -- you could create a copy of one or more of the linq4j classes, and your join operator could generate code that references that class.
