Hash joins


David Kincaid

Jul 14, 2016, 8:26:00 AM
to cascalog-user
I was looking through operations.clj again today and came across the hash-join* function. That is exactly what I need right now. Is it possible to use this in a Cascalog query somehow? The first argument is "flows", and I'm not sure how to supply that from a query. Is anyone using hash joins in Cascalog?

Thanks,

Dave

Sam Ritchie

Jul 14, 2016, 8:51:33 PM
to cascal...@googlegroups.com
Hey, I don't have an example handy right now, but you can actually pass a Cascalog query in to any place that asks for a flow. Check the tests out for how to use that function, and just swap in your Cascalog query for the flow example.

Then, you can use the result as a Cascalog query in the next part of your job.


David Kincaid

Jul 15, 2016, 10:49:21 PM
to cascalog-user
I see. That makes sense. So create two Cascalog queries, one for each side of the join, use hash-join* to join them, and use the result in another query (or sink the result). I'll give it a shot!

Thanks,

Dave

Igor Postelnik

Aug 3, 2016, 9:50:12 AM
to cascalog-user
How did this work out for you?

David Kincaid

Aug 6, 2016, 10:33:26 PM
to cascalog-user
Works great! I was able to get it working and it is really amazing. The queries I've used it in run so much faster.

- Dave
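
For readers wondering why the hash join helped: a hash join builds an in-memory index from one side of the join and streams the other side past it once, so it can skip the sort-and-group step of a regular join. The sketch below shows that idea in plain Clojure on in-memory collections. It is only an illustration and does not call Cascalog's hash-join*, whose exact arguments are not shown in this thread; the names hash-join, small, large, key-fn, users, and visits are hypothetical.

```clojure
;; Illustrative only: plain Clojure, NOT Cascalog's hash-join*.
(defn hash-join
  "Inner-joins two seqs of maps on (key-fn row). `small` is indexed in
  memory; `large` is walked once and probed against that index."
  [small large key-fn]
  (let [index (group-by key-fn small)]    ; build phase: key -> rows of `small`
    (for [row   large                     ; probe phase: one pass over `large`
          match (get index (key-fn row))] ; no match -> nil -> row is dropped
      (merge match row))))

;; Hypothetical sample data.
(def users  [{:id 1 :name "ann"} {:id 2 :name "bob"}])
(def visits [{:id 1 :page "/a"} {:id 1 :page "/b"} {:id 3 :page "/c"}])

(hash-join users visits :id)
;; => ({:id 1, :name "ann", :page "/a"} {:id 1, :name "ann", :page "/b"})
```

The trade-off is that the indexed side has to fit in memory, so this approach suits joins where one side is much smaller than the other.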