question related to pig/map command


Demidov, Igor V

Jun 30, 2016, 9:07:21 AM
to pigpen-...@googlegroups.com

Hi guys,

I am trying to replace my PIG scripts with pigpen, but I have run into a problem.

Example: my CSV file contains

1,Projects.Key1,1,1234124
2,Projects.Key1,1,1234124
3,Projects.Key1,2,1234124

Then I try to execute something like this:

 

(let [bag0_ProjectsKey1 (pig/map (fn [{:keys [a b c d]}]
                                   {:a a
                                    :b b
                                    :c c
                                    :d d})
                                 (pig/load-csv "/home/docker/clojure/1.8/workspace/immutables/subv_test5/Projects.Key1.csv"))]
  (pig/dump bag0_ProjectsKey1))

 

The result is:

({:a nil, :b nil, :c nil, :d nil} {:a nil, :b nil, :c nil, :d nil} {:a nil, :b nil, :c nil, :d nil})

pig/load-csv on its own works fine:

(pig/dump (pig/load-csv "/home/docker/clojure/1.8/workspace/immutables/subv_test5/Projects.Key1.csv"))
=> (["1" "Projects.Key1" "1" "1234124"] ["2" "Projects.Key1" "1" "1234124"] ["3" "Projects.Key1" "2" "1234124"])

Why doesn’t the mapping return the real values? Am I passing {:keys [a b c d]} incorrectly?

 

 

Best Regards

 

Igor Demidov | Optum Technology

Big Data Engineer, CODA

 

13625 Technology Drive Eden Prairie, MN 55344

W : 952-917-8533

igor.d...@optum.com

www.optum.com

 

 

 

From: Matt Bossenbroek [mailto:mbosse...@netflix.com]
Sent: Wednesday, June 29, 2016 3:53 PM
To: pigpen-...@googlegroups.com; Demidov, Igor V
Subject: Re: question about the PIG queue command

 

You can set pig options in a script using the following command: http://netflix.github.io/PigPen/pigpen.pig.html#var-set-options

 

There’s a slight difference in behavior between pig’s dump & store commands and pigpen’s. In pig, either one triggers an evaluation of the script. In pigpen, you can have multiple store commands, but each one returns only a query; you still need to call pig/dump to execute it.

 

 

For example:

 

;; return data in the repl
(->> data
  (pig/dump))

;; write to a file & then return data in the repl
(->> data
  (pig/store-csv "out.csv")
  (pig/dump))

 

 

This is so you can do the following:

 

(->> data
  (pig/store-csv "out.csv")
  (pigpen.pig/write-script "script.pig"))

 

 

If the store command returned data, we wouldn’t be able to write the script.

 

HTH

 

-Matt

 

 

On June 29, 2016 at 1:30:08 PM, Demidov, Igor V (igor.d...@optum.com) wrote:

Hello,

I have been working with pigpen for a couple of weeks on a MapR Hadoop environment. For some reason I cannot run pigpen in mapreduce mode, but all the mocking in local mode works fine.

1) Do you know how to run a PIG command like

 SET mapred.job.queue.name 'coda_q1'

Without this, MapR security doesn’t allow the job to run. When I generate the PIG script from my pigpen code and add this SET queue command, everything works.

2) The second question: when I use the DUMP command, like (pigpen/dump a), everything works (as it does in local mode). As soon as I replace it with something like (pigpen/store-csv), it stops working.

If you need more details, I can send you additional info.

 

Best Regards

 


This e-mail, including attachments, may include confidential and/or
proprietary information, and may be used only by the person or entity
to which it is addressed. If the reader of this e-mail is not the intended
recipient or his or her authorized agent, the reader is hereby notified
that any dissemination, distribution or copying of this e-mail is
prohibited. If you have received this e-mail in error, please notify the
sender by replying to this message and delete this e-mail immediately.


Matt Bossenbroek

Jun 30, 2016, 11:46:31 AM
to pigpen-...@googlegroups.com, Demidov, Igor V
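The load-csv command returns each row as a vector of strings, so you can’t use :keys destructuring on it. :keys looks up keyword keys in a map, finds none in a vector, and binds every name to nil.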

Try this instead:

(->>
  (pig/load-csv "/home/docker/clojure/1.8/workspace/immutables/subv_test5/Projects.Key1.csv")
  (pig/map (fn [[a b c d]] {:a a, :b b, :c c, :d d}))
  (pig/dump))
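
To see the difference without running PigPen at all, here is a minimal sketch in plain Clojure. It only assumes that a row has the shape shown in the load-csv dump earlier in the thread; the names row, by-keys and by-position are made up for illustration and are not part of PigPen.

```clojure
;; One row, shaped like the vectors that load-csv returned in the dump above.
(def row ["1" "Projects.Key1" "1" "1234124"])

;; Map destructuring: looks up the keys :a :b :c :d on the argument.
;; A vector has no such keys, so every binding is nil.
(defn by-keys [{:keys [a b c d]}]
  {:a a, :b b, :c c, :d d})

;; Sequential destructuring: binds the names by position in the vector.
(defn by-position [[a b c d]]
  {:a a, :b b, :c c, :d d})

(by-keys row)      ;; => {:a nil, :b nil, :c nil, :d nil}
(by-position row)  ;; => {:a "1", :b "Projects.Key1", :c "1", :d "1234124"}
```

The first result matches the all-nil maps from the original question, and the second is the shape the positional version passes on to pig/map. Note that the values stay strings, since nothing here parses them.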


-Matt
