How to write Cubert UDFs?


Santhosh Swaminathan

Jul 7, 2015, 10:57:47 AM
to cubert...@googlegroups.com
Here is my problem at hand:

Data A: Key1, Key2, Timestamp (t1)
Data B: Key1, Key3, Timestamp (t2)

I have co-partitioned both datasets on Key1. Now, for each Key2 in Data A, I want to find the nearest Key3 in Data B (by nearest I mean temporally). One option is a hash join on Key1, computing t1 - t2 for every joined pair and then selecting the smallest non-zero difference, but that is very costly. The alternative is to write a UDF that is passed the block of Data B and finds the nearest record. For this I am looking for examples of how to write UDFs that take a block and return the nearest Key3. Could you also suggest any other way of solving this?
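
To make the lookup concrete, here is a rough sketch in plain Java of the per-partition logic I have in mind. It does not use the Cubert UDF or block API, since that is exactly the part I am unsure about; the class and method names (NearestInTime, indexByTimestamp, nearestKey3) are made up for illustration. It assumes the Data B records for one Key1 fit in memory and that timestamps are longs.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, NOT part of Cubert: nearest-in-time lookup within one Key1 partition.
class NearestInTime {

    // Index the Data B records of one Key1 partition by timestamp t2.
    // Assumption: if two records share a timestamp, the later one overwrites the earlier.
    static TreeMap<Long, String> indexByTimestamp(long[] t2, String[] key3) {
        TreeMap<Long, String> index = new TreeMap<>();
        for (int i = 0; i < t2.length; i++) {
            index.put(t2[i], key3[i]);
        }
        return index;
    }

    // Return the Key3 whose timestamp is closest to t1 with a non-zero difference,
    // or null if no such record exists. Ties go to the earlier record.
    static String nearestKey3(TreeMap<Long, String> index, long t1) {
        Map.Entry<Long, String> before = index.lowerEntry(t1);  // strictly earlier than t1
        Map.Entry<Long, String> after = index.higherEntry(t1);  // strictly later than t1
        if (before == null && after == null) {
            return null;
        }
        if (before == null) {
            return after.getValue();
        }
        if (after == null) {
            return before.getValue();
        }
        long gapBefore = t1 - before.getKey();
        long gapAfter = after.getKey() - t1;
        return gapBefore <= gapAfter ? before.getValue() : after.getValue();
    }
}
```

Each lookup is a logarithmic-time search in the sorted index, instead of computing t1 - t2 for every joined pair. If both blocks were already sorted by timestamp within Key1, a single merge-style pass over the two would presumably work as well.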

Thanks,
Santhosh


