load-hbase ?


b...@benmabey.com

Jun 1, 2016, 10:27:09 PM
to PigPen Support
Hi all,
How would I use Pig's built-in HBaseStorage from PigPen? For example, you can call this in a regular Pig script:

user_links = load 'hbase://users'
    using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'user_info:name, links:*', '-loadKey true -gt 10000')
    as (id, name:chararray, links:map[]);

I see the wiki pages on 'Custom Loaders' and 'Custom Storage' but I'm not sure what the correct implementations for PigPenLocalLoader and PigPenLocalStorage would be. Would anyone who has used HBaseStorage with PigPen share the code they used to get it to work?

Thanks,
Ben

Matt Bossenbroek

Jun 2, 2016, 11:56:38 AM
to PigPen Support, b...@benmabey.com
I don’t have any guidance on what an appropriate local version of that would be, but it is entirely optional to implement.

For storage that is very difficult to reproduce locally, I’ll test with something else (like pig/return or pig/load-clj) and then swap in the real version when generating a script. Just be careful that all the types match between the two (int vs long, array handling, etc).

You could also take a look at the local implementation of the parquet storage [1], which just uses the hadoop stuff directly. You might be able to just drop in the HBase formats there & see what happens.


HTH


Ben Mabey

Jun 2, 2016, 1:07:59 PM
to PigPen Support
Ah, I didn't realize that the local version was optional (I see that now after reading the wiki more carefully). I have been doing all my local testing with pig/return and just want to hook it up to HBase for staging and production runs. So to do that, all I need to provide is the multimethod for pigpen.pig.script/storage->script and the load-hbase/store-hbase functions, correct?


-Ben

Matt Bossenbroek

Jun 2, 2016, 1:10:49 PM
to PigPen Support, Ben Mabey
Yep, that sounds right to me
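
For reference, the Pig text that the script-generation side would ultimately need to emit for a load has the same shape as the HBaseStorage statement in the first message. Below is a minimal sketch of assembling that text. It is written in Python purely for illustration and does not use PigPen at all; hbase_load_statement and its parameters are hypothetical names, not part of PigPen or Pig. The actual multimethod would be Clojure and is not shown here.

```python
# Fully qualified class name, copied from the Pig script in the first message.
HBASE_STORAGE = "org.apache.pig.backend.hadoop.hbase.HBaseStorage"


def hbase_load_statement(alias, table, columns, options, schema):
    """Build the text of a Pig LOAD statement that reads via HBaseStorage.

    All arguments are plain strings and are inserted verbatim; this
    hypothetical helper does no validation or escaping.
    """
    return (
        f"{alias} = load 'hbase://{table}'\n"
        f"    using {HBASE_STORAGE}(\n"
        f"        '{columns}', '{options}')\n"
        f"    as ({schema});"
    )


# Reproduce the statement from the original question.
stmt = hbase_load_statement(
    alias="user_links",
    table="users",
    columns="user_info:name, links:*",
    options="-loadKey true -gt 10000",
    schema="id, name:chararray, links:map[]",
)
print(stmt)
```

Whatever types the schema declares here are the ones a local stand-in such as pig/return would need to match (int vs long, array handling, etc).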

-Matt
