How to specify a different storage (like mongodb-hadoop) in pigpen targeting pig scripts?


Rafik NACCACHE

Nov 22, 2015, 6:20:05 AM
to pigpen-...@googlegroups.com
Hi,

I wanted to know if there is a way to specify a storage other than local/HDFS when spitting Pig scripts.

I am interested in using pigpen (much easier for me thanks to Clojure), but I am stuck with the store.XXX functions, which only spit to filesystems (local or HDFS, depending on how you run pig script.pig).

I want to spit the results of my relations into a MongoDB collection using https://github.com/mongodb/mongo-hadoop/wiki/Pig-Usage, so that my computation results are written to this DB via MongoDB insert queries.

Can you help me please, or show me how I can write a custom storage for pigpen?

Thank you

Matt Bossenbroek

Nov 23, 2015, 1:06:57 PM
to Rafik NACCACHE, pigpen-...@googlegroups.com
Hi, it's definitely possible, but I don't have docs for custom storage yet. It would be very similar to a custom loader as seen here:


The pattern is that you define operators that create a raw 'store' command, with a specified storage type (a keyword). Then you extend various multimethods that dispatch based on that storage type & provide the details for that platform.
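
To make the pattern above concrete, here is a minimal sketch in plain Clojure. It does not use PigPen's actual namespaces or function names: `store-command`, `store-mongo`, `store->pig`, `store->local`, and `fake-db` are all hypothetical stand-ins, and the storage class name is taken from the mongo-hadoop wiki page linked earlier in the thread and is not verified here. It only shows the shape of the idea: an operator emits a raw store command tagged with a storage keyword, and each platform provides a multimethod that dispatches on that keyword.

```clojure
(ns example.custom-storage)

;; Hypothetical stand-in for a raw 'store' command: a plain map that
;; records the storage type (a keyword), the location, and any options.
(defn store-command
  [storage location opts relation]
  {:type      :store
   :storage   storage
   :location  location
   :opts      opts
   :ancestors [relation]})

;; Hypothetical user-facing operator that tags the command with :mongo.
;; The class name is an assumption based on the mongo-hadoop wiki.
(defn store-mongo
  [uri relation]
  (store-command :mongo uri
                 {:storage-class "com.mongodb.hadoop.pig.MongoInsertStorage"}
                 relation))

;; Pig platform: turn a store command into a line of Pig script,
;; dispatching on the storage keyword.
(defmulti store->pig :storage)

(defmethod store->pig :mongo
  [{:keys [location opts ancestors]}]
  (format "STORE %s INTO '%s' USING %s();"
          (:id (first ancestors))
          location
          (:storage-class opts)))

;; Local (REPL) platform: same dispatch key, different behavior.
;; An atom stands in for the database so the sketch stays self-contained.
(def fake-db (atom {}))

(defmulti store->local (fn [command _rows] (:storage command)))

(defmethod store->local :mongo
  [{:keys [location]} rows]
  (swap! fake-db update location (fnil into []) rows)
  rows)
```

With these definitions, `(store->pig (store-mongo "mongodb://localhost:27017/db.results" {:id "r1"}))` builds a Pig `STORE` line for the relation `r1`, while `store->local` on the same command appends rows to `fake-db`. Supporting another storage type would mean adding one more operator and one `defmethod` per platform, without touching the existing ones.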



The built-in storage operators use the same pattern as custom ones though, so they should make good examples. Here's the primary storage we use:


And then you'll need to register various multimethods to set up the different platforms:

For local (repl):


For pig:


For cascading:



Let me know if that's not clear & I can come up with a better example.


-Matt


Matt Bossenbroek

Dec 10, 2015, 5:45:38 PM
to Rafik NACCACHE, pigpen-...@googlegroups.com
Finally got back to this - sorry about the lengthy delay; conferences & holidays got in the way for a bit there…



Let me know if that covers your use case, or if you have any other questions.

-Matt

On Monday, November 23, 2015 at 10:27 AM, Rafik NACCACHE wrote:

Hi Matt,

Thank you for your response.

I more or less see the process of defining a custom output storage, but a little example is definitely welcome (if you don't mind, of course)!

Can you elaborate a bit further on this?

Thank you

Rafik

Rafik NACCACHE

Dec 11, 2015, 4:53:26 PM
to Matt Bossenbroek, pigpen-...@googlegroups.com

Hi Matt,
Thank you very much,
Cheers
Rafik
