(newb) command line mapping

Caleb Foong

Jan 22, 2013, 3:47:06 AM
to pic...@googlegroups.com
Hi guys,

Suppose I have a lot of files in my volume, my-vol, and a command-line program, f, that processes them. How can I run it on them in parallel?
I figure it might be something like this:

picloud mapexec -v my-vol -n w=f1,f2,f3,.... f {w}

But the question is, how do I pass the list of files to the -n parameter?

Thank you.

Aaron Staley

Jan 22, 2013, 4:04:31 PM
to pic...@googlegroups.com
Hi Caleb,

You have the right idea. 

Running the command:
picloud mapexec -v my-vol -n w=f1,f2,f3 f {w}

will create three jobs on PiCloud.

One will call:
f f1

another will call:
f f2

and the third will call:
f f3
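
To make the mapping concrete, here is a minimal local sketch of the expansion described above. It only prints the three commands; it does not talk to PiCloud and is not how mapexec is implemented. The `expand` helper is hypothetical, written just for this illustration.

```shell
#!/bin/sh
# Local simulation only: prints the commands that the mapping would produce.
# "expand" is a hypothetical helper, not part of the picloud CLI.
expand() {
    # $1 = comma-separated values, $2 = command template containing {w}
    echo "$1" | tr ',' '\n' | while IFS= read -r value; do
        # Substitute the current value for the {w} placeholder.
        echo "$2" | sed "s|{w}|$value|g"
    done
}

expand "f1,f2,f3" "f {w}"
```

Each printed line corresponds to one job. Note that this simple substitution assumes the values contain no `|` or other characters special to sed.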

Was there something else you wanted to know?

Best,
Aaron Staley
PiCloud, Inc.

Caleb Foong

Jan 23, 2013, 2:01:45 AM
to pic...@googlegroups.com, aa...@picloud.com
Hi Aaron,

The problem is that I have several thousand files that I want to process. I can't type them all in manually, can I?

If I use a for-loop, then I lose the parallelism.

Maybe I can do it in Python, collecting all the filenames in a list and calling the program from inside Python. But I would like to do it on the command line, because there is some stdout output that I want to capture.

Thanks for your help.

Caleb

Aaron Staley

Jan 23, 2013, 3:02:20 AM
to Caleb Foong, pic...@googlegroups.com
Hi Caleb,

If you wish to generate a comma-delimited list of the files in the current directory, you can use the following command (note that there must be no space after the '='):

FILELIST=`ls -1 | paste -sd "," -`

$FILELIST will then be set to something like:
f1,f2
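
Putting the pieces together, a minimal sketch of how the list could be fed to the mapexec command from earlier in this thread. The temporary directory and the file names f1, f2, f3 are stand-ins for illustration, and the final command is only printed, so the sketch runs without picloud installed.

```shell
#!/bin/sh
# Stand-in data: a scratch directory with three hypothetical input files.
workdir=$(mktemp -d)
touch "$workdir/f1" "$workdir/f2" "$workdir/f3"

# ls -1 prints one name per line; paste -s joins the lines using the -d delimiter.
# The trailing "-" makes paste read stdin explicitly (needed on BSD/macOS).
FILELIST=$(cd "$workdir" && ls -1 | paste -sd "," -)

# Print the command rather than running it.
echo "picloud mapexec -v my-vol -n w=$FILELIST f {w}"
# prints: picloud mapexec -v my-vol -n w=f1,f2,f3 f {w}

rm -rf "$workdir"
```

This assumes the file names contain no commas, spaces, or newlines; names with those characters would need different handling.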

Regards,
Aaron

Caleb Foong

Jan 23, 2013, 9:44:28 AM
to pic...@googlegroups.com, Caleb Foong, aa...@picloud.com
Thanks Aaron, this is just what I need.

By the way, my jobs create files. Is there a way to store them in the cloud without first downloading them to my local machine?

Ken Elkabany

Jan 23, 2013, 12:01:21 PM
to pic...@googlegroups.com, Caleb Foong, Aaron Staley
Hi Caleb,

If you're simply looking to store a single file per job as its result, then you can use: http://docs.picloud.com/cli.html#using-a-file-as-a-job-s-result

Otherwise, if you want to store an arbitrary number of files and have them accessible through one of our two data storage methods (or your own), please see: http://docs.picloud.com/storage.html

Since using one of our storage methods requires a multi-line mapexec command (running your commands, and then adding the data to our storage), you'll need to use a script: http://docs.picloud.com/cli.html#using-a-script

Best,
Ken