Ken Elkabany <k...@picloud.com> Nov 27 03:26PM -0800
Hi Eric,
Yes, you're correct. Each job should be responsible for saving its own data
to cloud.files.
An efficient way to do what you want is to generate the CSV in memory using
the Python csv module and then upload it using cloud.files.putf. This
avoids writing the file to disk, and then redundantly reading the file from
disk to upload it. Here's a quick example:
import csv
from cStringIO import StringIO
import cloud
# create a file-like object that resides purely in memory
f = StringIO()
# create a csv writer object
w = csv.writer(f)
# you can call this function repeatedly to write as many rows as you want
# this writes a row with values from 0 to 9
w.writerow(range(10))
# (optional) you can see that the csv writer is writing to the StringIO obj
print f.getvalue()
# rewind the buffer so the upload reads from the start rather than the end
f.seek(0)
# save your csv to cloud.files
cloud.files.putf(f, 'name_for_data')
# you can retrieve the data at a later time
# get will save the csv to a file of the same name
cloud.files.get('name_for_data')
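If a job batches several runs, the same idea extends to writing one row per run. The following is only a rough sketch: the helper name rows_to_csv_buffer, the sample results, and the file name 'batch_results.csv' are made up for illustration, and the upload line is left commented out so the snippet runs without the cloud library.

```python
import csv

try:
    from cStringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def rows_to_csv_buffer(rows):
    """Write rows to an in-memory CSV and return the rewound buffer.

    Hypothetical helper, not part of any library.
    """
    buf = StringIO()
    writer = csv.writer(buf)
    writer.writerows(rows)
    buf.seek(0)  # rewind so the next reader starts at the beginning
    return buf


# made-up results from three batched runs: (run_id, value)
results = [(0, 0.5), (1, 0.25), (2, 0.125)]
buf = rows_to_csv_buffer(results)

# parsing the buffer back shows what an upload would send
parsed = list(csv.reader(buf))
buf.seek(0)  # rewind again before handing the buffer to an uploader
# cloud.files.putf(buf, 'batch_results.csv')  # upload step, as in the example above
```

Note that csv.reader returns every field as a string, so numeric columns need converting when you analyze the file locally.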
Helpful links:
http://docs.python.org/library/stringio.html
http://docs.python.org/library/csv.html
http://docs.picloud.com/moduledoc.html#module-cloud.files
How large is the input data for each function call? Are the functions
taking in a CSV and outputting a CSV? How long does each run take when you
haven't batched jobs together?
Ken
> the end of the day, it would be nice to have the collected data - or
> chunks of it - in a nice convenient CSV file for later analysis on my
> local machine.
Batching any number of runs so I don't have to call 10,000 jobs means