Bug in addFile() ?


Sébastien Maret

Oct 7, 2007, 7:22:22 PM
to py...@googlegroups.com
Hi,

I am trying to run a code on an Xgrid using PyXG. The code uses an
input file as an argument. I have several input files named
input_001.ini, input_002.ini, etc. I want to create a batch with a
different task for each input file. For this I wrote the following
script:

import xg
import glob
import os.path

controller = xg.Controller(xg.Connection(hostname='localhost'))
grid = controller.grid(0)

jobspec = xg.JobSpecification()

for inputfile in glob.glob("input_*.ini"):
    jobspec.addTask(cmd = "/full/path/to/my/code",
                    args = inputfile)
    jobspec.addFile(filepath=".", filename=inputfile, isExecutable=0)

jobspec.files()
job = grid.batch(jobspec)

When I run this I obtain:

>>> xgrid -h localhost -grid list
xgrid -h localhost -gid 0 -job batch /tmp/tmpGX4rYL
Traceback (most recent call last):
  File "/tmp/py4979Riq", line 22, in <module>
    job = grid.batch(jobspec)
  File "/sw/lib/python2.5/site-packages/xg.py", line 859, in batch
    id = j.batch(specification, self.gridID)
  File "/sw/lib/python2.5/site-packages/xg.py", line 1152, in batch
    jobinfo = xgridParse(cmd)
  File "/sw/lib/python2.5/site-packages/xg.py", line 237, in xgridParse
    raise XgridError("xgrid command error: %s" % result[0])
XgridError: Xgrid Error: xgrid command error: 256

Note that I have set PYXGRID_DEBUG to True. I have also fixed the
small bug that was reported on this list:
http://groups.google.com/group/pyxg/t/52ec114d08eb1a8a

It seems that /tmp/tmpGX4rYL is empty.

The same piece of code works fine if I comment out the jobspec.addFile line:

xgrid -h localhost -grid list
xgrid -h localhost -gid 0 -job batch /tmp/tmp9i02Ck
Job submitted with id: 178

but then of course the code returns immediately because it expects an
input file.

Any idea on how I could fix this?

I am using Python 2.5.1 on MacOSX 10.4.10 and PyObjC 1.4.

Sébastien

Brian Granger

Oct 7, 2007, 9:12:32 PM
to py...@googlegroups.com
Thanks for the report, I will try to have a look at this. Will get
back to you as I find out more.

Brian


--
Brian E. Granger, Ph.D.
Research Scientist
Tech-X Corporation
phone: 720-974-1850
bgra...@txcorp.com
elli...@gmail.com

Francis E Reyes

Oct 7, 2007, 10:24:50 PM
to py...@googlegroups.com
Hmm, I think the issue is adding the task before adding the file. Here's my script, which does the same thing: it creates a job specification for lots of files.

It's not the cleanest, but I've used it to submit 12,000 files before. 

FR



#!/usr/bin/python


###################################
#
#
# xgridsubmit
#
# Written by: Francis E. Reyes
# Name: grid.py
#
#
#
# This script takes the command to be run as an argument.
# For now we restrict the command syntax to be
#
#       grid  <command> <args> <files>
#
# where <command> must exist in the current working directory or in $PATH
# and <files> are the files on which to run the command. As a default, we will upload
# all files in the local directory as <command> may need files outside those specified
# in the command line.


# Step - Modules

import os
import sys
import glob
import xg

# Functions

def printUsage():
        print """
Usage: grid 'command' 'arguments' 'files'

        command                 command to be submitted to cluster
        arguments               arguments for the command
        files                   files to operate command on (wildcards and absolute paths OK)

grid.py is a Python script that makes xml batch property lists for xgrid.  There must be exactly three arguments to this script enclosed by '' .
        """
        sys.exit()

# Step - Variables

path = os.getenv('PATH').split(':')
command = ''
arguments = ''
files = []
# Step - Process parameters

if len(sys.argv) != 4:
        printUsage()
else :
        # We've met syntax requirements, now process arguments


        # sys.argv[1] contains the full path of command on remote host
        if sys.argv[1][0] != '/' :
                print 'ERROR: You must include the full path of the executable as it appears on the remote host!'
                print ''
                printUsage()

        command = sys.argv[1]


        #sys.argv[2] contains our command arguments

        arguments = sys.argv[2].split()

        #sys.argv[3] contains the glob pattern for the files to apply the command to
        files = glob.glob(sys.argv[3])
        if not files:
                print "ERROR: No files matched in third argument"
                printUsage()
        else:
                print "Including " + str(len(files)) + " files"

        # done processing


connection = xg.Connection(hostname='foo.bar', password='foobard')
controller = xg.Controller(connection)
grid = controller.grid(0)

# append the command

specification = xg.JobSpecification()

for file in files :
        # copy the shared arguments so each task gets its own list
        tempargs = list(arguments)
        # add the file to the agent
        specification.addFile(file, os.path.basename(file))
        tempargs.append(os.path.basename(file))


        specification.addTask(command, tempargs)

job = grid.batch(specification)
#print specification.jobSpec()
#submit the job
#       job = grid.batch(specification)
#       print job.info()
#       print job.results()





---------------------------------------------
Francis Reyes M.Sc.
215 UCB
University of Colorado at Boulder

gpg --keyserver pgp.mit.edu --recv-keys 67BA8D5D

8AE2 F2F4 90F7 9640 28BC  686F 78FD 6669 67BA 8D5D



Sébastien Maret

Oct 8, 2007, 12:16:09 AM
to py...@googlegroups.com
On 10/7/07, Francis E Reyes <Franci...@colorado.edu> wrote:

> Hmm, I think the issue is adding the task before adding the file. Here's my
> code for the same script that creates a job specification for lots of
> files.

Thank you for your prompt answer. I had a look at your script and I
found the problem in mine. In fact, I misunderstood the docstring of
addFile(). The first parameter of the function is 'filepath', which is
the full path of the file on the client (local) computer. I thought
this meant a directory, not a full filename. It turns out that if you
pass a filename that does not exist (or a directory, for that matter),
PyXG does not complain, but it returns a cryptic error when one submits
the batch. Maybe addFile() could test that the file exists and that it
is readable when it is added.
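
A check along those lines could look roughly like the sketch below. It is only an illustration: check_local_file is a hypothetical helper, not part of PyXG, and it assumes that the path handed to addFile() should be a readable regular file on the local machine.

```python
import os


def check_local_file(local_path):
    """Return the absolute path of local_path, or raise ValueError.

    Hypothetical helper (not part of PyXG): it rejects anything that
    is not a readable regular file before a job is submitted.
    """
    full_path = os.path.abspath(local_path)
    if not os.path.isfile(full_path):
        # Catches missing files as well as directories such as "."
        raise ValueError("not a regular file: %s" % full_path)
    if not os.access(full_path, os.R_OK):
        raise ValueError("file is not readable: %s" % full_path)
    return full_path
```

Calling check_local_file(".") raises ValueError right away, instead of the failure only showing up when the batch is submitted.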

A side question: if I understand the doc correctly, the files added to
a job get copied to all the nodes. Is there a way to add a file only
to a given task? What I would like to do is to copy a different input
file on each of the nodes, instead of copying all the input files on
all the nodes.

Best,
Sébastien

Francis E Reyes

Oct 8, 2007, 11:19:01 AM
to py...@googlegroups.com
I don't think this is possible, since grid -> node communication is handled exclusively by the Xgrid controller. You could instead set up different grids, each containing one node, and then use controller.grid(gridid), where gridid is specific to each node. Why have a different input file for each of the nodes?




Sébastien Maret

Oct 8, 2007, 11:34:01 AM
to py...@googlegroups.com
On 10/8/07, Francis E Reyes <Franci...@colorado.edu> wrote:

> I don't think this is possible, since Grid -> node communication is handled
> exclusively by the Xgrid Controller.

So it's an Xgrid limitation, then?

> You could instead set up different
> grids, each containing one node, and then use controller.grid(gridid) where
> gridid is specific to each node. Why have a different input file for each of
> the nodes?

The input files contain the code parameters, and I want to run the
code for a range of different parameters. So I have created a bunch of
different input files, and each node runs the code for one of these
input files.

If you have a limited number of input files, then copying all the
input files to all the nodes is probably fine. But if you have several
hundred thousand input files, copying all the files to all the nodes
probably adds a significant time overhead to the computation.

Sébastien

Francis E Reyes

Oct 8, 2007, 11:48:22 AM
to py...@googlegroups.com
Someone more knowledgeable about Xgrid can probably verify this, but I imagine that the Xgrid controller will only send a single task to a single node. Hence, all files associated with that task via addTask will get sent to the node. I haven't verified this myself, as most of my tasks run too quickly for me to see the files on each node.

In your case, if you have 30,000 input files, you should have 30,000 different tasks, all of which get uploaded to the controller (all files and tasks), but are individually distributed to each node by the controller.
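
As a rough sketch of that one-task-per-input-file layout: the helper below (add_inputs is a made-up name) adds one file and one task for every file matching a pattern. It assumes the job specification object accepts addFile(local_path, name) and addTask(command, args) called positionally, as in the grid.py script earlier in this thread. I have not checked that against the PyXG source.

```python
import glob
import os


def add_inputs(spec, command, pattern):
    """Add one file and one task to spec per file matching pattern.

    spec is any object with addFile(local_path, name) and
    addTask(command, args) methods; the positional call style is an
    assumption copied from the grid.py script in this thread.
    Returns the number of tasks added.
    """
    count = 0
    for path in sorted(glob.glob(pattern)):
        name = os.path.basename(path)
        # First argument: full path of the file on the local machine.
        spec.addFile(os.path.abspath(path), name)
        # The task refers to the file by its base name.
        spec.addTask(command, [name])
        count += 1
    return count
```

With PyXG installed, usage would presumably be something like add_inputs(xg.JobSpecification(), "/full/path/to/my/code", "input_*.ini"), followed by grid.batch() on that specification.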



FR





Brian Granger

Oct 16, 2007, 4:09:48 PM
to py...@googlegroups.com
On 10/7/07, Sébastien Maret <sebasti...@gmail.com> wrote:
>
> On 10/7/07, Francis E Reyes <Franci...@colorado.edu> wrote:
>
> > Hmm, I think the issue is adding the task before adding the file. Here's my
> > code for the same script that creates a job specification for lots of
> > files.
>
> Thank you for your prompt answer. I had a look at your script and I
> found the problem in mine. In fact, I misunderstood the docstring of
> addFile(). The first parameter of the function is 'filepath', which is
> the full path of the file on the client (local) computer. I thought
> this meant a directory, not a full filename. It turns out that if you
> pass a filename that does not exist (or a directory, for that matter),
> PyXG does not complain, but it returns a cryptic error when one submits
> the batch. Maybe addFile() could test that the file exists and that it
> is readable when it is added.

I have renamed the argument to localFilePath to make this clearer and
have also put in a test to make sure the file exists.

> A side question: if I understand the doc correctly, the files added to
> a job get copied to all the nodes. Is there a way to add a file only
> to a given task? What I would like to do is to copy a different input
> file on each of the nodes, instead of copying all the input files on
> all the nodes.

I will have a look at this and get back to you.

> Best,

Brian Granger

Oct 16, 2007, 6:07:49 PM
to py...@googlegroups.com
> A side question: if I understand the doc correctly, the files added to
> a job get copied to all the nodes. Is there a way to add a file only
> to a given task? What I would like to do is to copy a different input
> file on each of the nodes, instead of copying all the input files on
> all the nodes.

I don't know of a way to copy files on a per task basis. But, there
is a bigger issue...

If you have files that tasks need there are two possibilities:

1. The files are few and small. In this case, there is not a problem
submitting them with each job/task.

2. The files are big and/or many. In this case, you _really_ want to
put the files on the agents (probably on a shared file system) before
submitting the jobs. Otherwise the overhead of moving things around
will kill you.
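
To make the two cases concrete, here is a minimal sketch of how a submit script might choose between them. The function name and both thresholds are hypothetical and would need tuning for a given grid and network. Nothing here comes from PyXG or Xgrid.

```python
import os

# Hypothetical limits: what counts as "few and small" depends on
# your grid and network.
MAX_FILES = 100
MAX_TOTAL_BYTES = 10 * 1024 * 1024


def should_ship_with_job(paths, max_files=MAX_FILES,
                         max_total_bytes=MAX_TOTAL_BYTES):
    """Return True if the inputs are few and small (case 1).

    False means case 2: stage the files on the agents, for example
    on a shared file system, before submitting the job.
    """
    if len(paths) > max_files:
        return False
    total = sum(os.path.getsize(p) for p in paths)
    return total <= max_total_bytes
```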

Brian

> Best,
