/home/centos/gc3libs.L4Kvjx/.gc3pie_shellcmd/wrapper_script.sh: line 4: exec: None: not found


Jody Weissmann

Nov 21, 2016, 5:51:42 AM
to gc3pie
Hi

I have built a SequentialTaskCollection consisting of 6 Applications.

When I run it on localhost it works fine, but on the Science Cloud I get a message like
/home/centos/gc3libs.L4Kvjx/.gc3pie_shellcmd/wrapper_script.sh: line 4: exec: None: not found
for every application in the SequentialTaskCollection.

A new instance is started in the science cloud and the various output directories as well as stdout and stderr files are created.

The instance created is a fresh CentOS 7.2 installation to which I added some packages needed for my software to run (i.e. no gc3pie or the like installed).

Does anybody know how to solve this problem?

cheers
  Jody

Jody Weissmann

Nov 21, 2016, 8:49:56 AM
to gc3pie
Update:
I created a new snapshot from the previous one after installing gc3pie and adding the line
  . /home/centos/gc3pie/bin/activate
in my .bashrc

I still get the same messages in the stderr outputs.

Jody

Riccardo Murri

Nov 22, 2016, 4:01:47 PM
to gc3...@googlegroups.com
Hi Jody,

(Jody Weissmann, Mon, Nov 21, 2016 at 02:51:42AM -0800:)
> I have built a *SequentialTaskCollection* consisting of 6 Applications.
>
> When i run it on localhost it works fine, but on the science cloud i get a
> message like
> /home/centos/gc3libs.L4Kvjx/.gc3pie_shellcmd/wrapper_script.sh: line 4: exec
> : None: not found
> for every application in the *SequentialTaskCollection*

That script is auto-generated by GC3Pie; line 4 performs the
stdin/stdout/stderr redirections. Looking at the code, I cannot
see how a Python ``None`` could end up in there, but apparently there
is a way...

Can you please post one of these failing `wrapper_script.sh` files? It might
shed some light on what is actually going wrong.

Thanks,
R

--
Riccardo Murri, Schwerzenbacherstrasse 2, CH-8606 Nänikon, Switzerland

Jody Weissmann

Nov 23, 2016, 5:08:36 AM
to gc3pie
Hi Riccardo
As those wrapper_script.sh files sit on the Science Cloud instances, they disappear when the application terminates.
I added a copy command to the terminated() method of QDF2PlinkApp.py:
    newout = os.path.join(self.final_dir, "wrapper_script.sh")
    move("./.gc3pie_shellcmd/wrapper_script.sh", newout)

But this doesn't work according to the output:
gc3.gc3libs: ERROR: Ignored error in fetching output of task 'QDF2PlinkApp.3411': IOError: [Errno 2] No such file or directory: './.gc3pie_shellcmd/wrapper_script.sh'
gc3.gc3libs: DEBUG: (Original traceback follows.)
Traceback (most recent call last):
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/core.py", line 1594, in progress
    changed_only=self.retrieve_changed_only)
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/core.py", line 592, in fetch_output
    app, download_dir, overwrite, changed_only, **extra_args)
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/core.py", line 671, in __fetch_output_application
    return Task.fetch_output(app, download_dir)
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/__init__.py", line 424, in fetch_output
    self.execution.state = Run.State.TERMINATED
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/__init__.py", line 1812, in fset
    getattr(self._ref, handler)()
  File "/home/centos/analysis/QDF2PlinkApp.py", line 59, in terminated
    move("./.gc3pie_shellcmd/wrapper_script.sh", newout)
  File "/usr/lib64/python2.7/shutil.py", line 301, in move
    copy2(src, real_dst)
  File "/usr/lib64/python2.7/shutil.py", line 130, in copy2
    copyfile(src, dst)
  File "/usr/lib64/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: './.gc3pie_shellcmd/wrapper_script.sh'

I don't know if the path is correct, or if the instance is still alive once the terminated() method is executed...

Is there a way to keep an instance alive after its application has finished?

Regards

  Jody




Riccardo Murri

Nov 23, 2016, 5:11:52 AM
to gc3...@googlegroups.com
Hi Jody,

> as those wrapper_script.sh files sit on the science cloud instances, they
> disappear when the application terminate.

Yes, but you can grab a copy via the `outputs=...` parameter::

    Application.__init__(
        # ...
        outputs=['.gc3pie_shellcmd/wrapper_script.sh'],
        # ...
    )

Ciao,

Jody Weissmann

Nov 23, 2016, 7:13:29 AM
to gc3pie
Hi
With the 'outputs=...' option it worked; I attached the file to this mail.

Looks like line 4 is the bad guy:
exec None -o /home/centos/gc3libs.rHDmwt/.gc3pie_shellcmd/resource_usage.txt -f 'WallTime=%es

Indeed, the log file was full of messages about resource_usage.txt as well:
gc3.gc3libs: ERROR: Could not open wrapper file '/home/centos/gc3libs.FIq0z7/.gc3pie_shellcmd/resource_usage.txt' for task 'GUnzipApp.3437': Could not open file '/home/centos/gc3libs.FIq0z7/.gc3pie_shellcmd/resource_usage.txt' on host '172.23.64.96': IOError: [Errno 2] No such file
gc3.gc3libs: DEBUG: Error getting status of application 'GUnzipApp.3437': InvalidValue: Could not open wrapper file '/home/centos/gc3libs.FIq0z7/.gc3pie_shellcmd/resource_usage.txt' for task 'GUnzipApp.3437': Could not open file '/home/centos/gc3libs.FIq0z7/.gc3pie_shellcmd/resource_usage.txt' on host '172.23.64.96': IOError: [Errno 2] No such file
Traceback (most recent call last):
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/core.py", line 443, in __update_application
    state = lrms.update_job_state(app)
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/backends/openstack.py", line 776, in update_job_state
    return self.subresources[app.os_instance_id].update_job_state(app)
  File "/home/centos/gc3pie/lib/python2.7/site-packages/gc3libs/backends/shellcmd.py", line 662, in update_job_state
    % (wrapper_filename, app, err), do_log=True)

Regards
  Jody


wrapper_script.sh

Riccardo Murri

Nov 23, 2016, 8:18:09 AM
to gc3...@googlegroups.com
Hi Jody,

> Looks like line 4 is the bad guy:
> exec None -o /home/centos/gc3libs.rHDmwt/.gc3pie_shellcmd/resource_usage.txt

Looks like somehow `time_cmd` is set to `None` in your configuration
file?

What version of GC3Pie are you using?

Jody Weissmann

Nov 24, 2016, 11:17:26 AM
to gc3pie
Hi
How do I find out what the gc3pie version is?

My config file has no entry for "time_cmd" - I attach the config file.

Thanks

Jody

gc3pie.conf

Riccardo Murri

Nov 24, 2016, 3:41:22 PM
to gc3...@googlegroups.com
(Jody Weissmann, Thu, Nov 24, 2016 at 08:17:25AM -0800:)
> Hi
> How do i find out what the gc3pie version is?

Run any GC3Pie utility command (e.g., `gservers`) with the `--version`
option::

    $ gservers --version
    gservers development version (SVN $Revision$)


> My config file has no entry for "time_cmd" - i attach the config file.

Can you please try adding this line after the `image_id=...` line::

    time_cmd=/usr/bin/time
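For reference, the resulting resource section might then look roughly like this (a sketch only; the section name `sciencecloud` is taken from the error messages in this thread, and the other settings stand in for whatever your gc3pie.conf already contains):

```ini
[resource/sciencecloud]
# ... existing settings unchanged ...
image_id=...
time_cmd=/usr/bin/time
```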

Riccardo Murri

Nov 24, 2016, 4:06:03 PM
to gc3...@googlegroups.com
Hi Jody,

> My config file has no entry for "time_cmd" - i attach the config file.

I have started a VM with the snapshot you're using on Science Cloud,
and the `/usr/bin/time` program, which is required by GC3Pie to run,
is indeed missing from it.

So:

1. This is nonetheless a bug in GC3Pie, which should detect that
`/usr/bin/time` is missing and report that, instead of using ``None``.

2. You can fix the issue by running the following command on a VM
instance and then making a new snapshot to use with `image_id=...`::

    sudo yum install time

Hope this helps,

Jody Weissmann

Nov 25, 2016, 5:43:23 AM
to gc3pie
Hi Riccardo
My gc3pie version:
gservers 2.4.2 version (SVN $Revision: 4328 $)

More and more problems.
I (re-)installed time (it was already there).
I added the time_cmd entry to my main instance's gc3pie.conf.

When i started my script, i now got:
gc3.gc3libs: ERROR: Could not create resource 'sciencecloud': OpenStack backend has been requested but no `python-novaclient` package was found. Please, install `python-novaclient` with`pip install python-novaclient` or `easy_install python-novaclient` and try again, or update your configuration file.. Configuration file problem?
gc3.gc3libs: WARNING: Failed creating backend for resource 'sciencecloud' of type 'openstack+shellcmd': ConfigurationError: OpenStack backend has been requested but no `python-novaclient` package was found. Please, install `python-novaclient` with`pip install python-novaclient` or `easy_install python-novaclient` and try again, or update your configuration file.

So I followed the advice and did `pip install python-novaclient`.
It is quite a mystery to me, since I used the previous main instance (from which I took the snapshot for the current main instance) for many successful runs of gqhg...

Anyway, after this install the script started, but was unable to connect to the spawned instance:
gc3.gc3libs: ERROR: Could not create ssh connection to 172.23.65.142: SSHException: not a valid EC private key file
gc3.gc3libs: ERROR: Key file /home/centos/.ssh/cloud_jw not accepted by remote host 172.23.65.142. Please check your setup.
gc3.gc3libs: ERROR: Could not create ssh connection to 172.23.65.142: SSHException: not a valid EC private key file
gc3.gc3libs: ERROR: Key file /home/centos/.ssh/cloud_jw not accepted by remote host 172.23.65.142. Please check your setup.
gc3.gc3libs: ERROR: Got error in submitting task 'GUnzipApp.240', informing scheduler: LRMSSkipSubmissionToNextIteration: Delaying submission until some of the VMs currently pending is ready. Pending VM ids: f72c4453-5e01-4df1-813c-369efda14766
This is strange, because I can manually log in (with the cloud_jw key) to 172.23.65.142 from my main instance (172.23.64.111), but I had to type the passphrase.
Do I have to start ssh-agent on my main instance? If yes, what is the correct command for this?

Regards

  Jody




Riccardo Murri

Nov 25, 2016, 5:58:05 AM
to gc3...@googlegroups.com
Hi Jody,

(Jody Weissmann, Fri, Nov 25, 2016 at 02:43:23AM -0800:)
> Hi Riccardo
> My gcrpie version:
> gservers 2.4.2 version (SVN $Revision: 4328 $)

That's rather old, but wait before upgrading. I'm not sure the "time"
issue is solved in the latest version.

> More and more problems.
> I (re-)installed time (it was already there)

No, you have to install it on the *snapshot* used for creating the
worker machines. It's not there - I checked yesterday evening.


> When i started my script, i now got:
> gc3.gc3libs: ERROR: Could not create resource 'sciencecloud': OpenStack
> backend has been requested but no `python-novaclient` package was found.
> Please, install `python-novaclient` with`pip install python-novaclient` or `easy_install
> python-novaclient` and try again, or update your configuration file..
> Configuration file problem?
> [...]
>
> So i followed the advice and did `pip install python-novaclient`.
> It is quite a mistery for me since i used the previous main instance (from
> which i took the snapshot for the current main instance) for many
> successful runs of gqhg...

Did you reinstall GC3Pie or switch to a different Python virtual environment?


> Anyway, after this install the script started, but was unable to connect to
> the spawned instance:
> gc3.gc3libs: ERROR: Could not create ssh connection to 172.23.65.142:
> SSHException: not a valid EC private key file
> gc3.gc3libs: ERROR: Key file /home/centos/.ssh/cloud_jw not accepted by
> remote host 172.23.65.142. Please check your setup.

I suspect that the worker VMs might have some restrictions on the kind
of SSH key that can be used for login (e.g. Ubuntu 16.04 won't accept
DSA keys). You can print out the SSH key type with::

    ssh-keygen -B -f $HOME/.ssh/cloud_jw

Ciao,

Riccardo Murri

Nov 28, 2016, 11:47:33 AM
to gc3...@googlegroups.com
Hi all,

for the record, I'm writing up how the issues on Jody's
computer were resolved (we met this afternoon):

> > Anyway, after this install the script started, but was unable to connect to
> > the spawned instance:
> > gc3.gc3libs: ERROR: Could not create ssh connection to 172.23.65.142:
> > SSHException: not a valid EC private key file
> > gc3.gc3libs: ERROR: Key file /home/centos/.ssh/cloud_jw not accepted by
> > remote host 172.23.65.142. Please check your setup.

This really stems from the way Paramiko (the SSH library that GC3Pie is
using) handles key files. Quick fix: use `ssh-agent` always. More
information in GC3Pie bug #589, <https://github.com/uzh/gc3pie/issues/589>
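For completeness, starting the agent usually amounts to something along these lines (a sketch; `cloud_jw` is the key file name mentioned earlier in this thread):

```shell
# Start an agent for the current shell session and load the
# passphrase-protected key once; subsequent SSH connections made by
# GC3Pie/Paramiko can then authenticate without prompting.
eval "$(ssh-agent -s)" > /dev/null
if [ -f "$HOME/.ssh/cloud_jw" ]; then
    ssh-add "$HOME/.ssh/cloud_jw"   # prompts for the passphrase once
fi
echo "ssh-agent running with PID $SSH_AGENT_PID"
```

Putting the `eval` line in your shell startup file means the agent is available in every login session.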


> > I (re-)installed time (it was already there)

The issue with the `time` command arose from an incomplete installation
(`/usr/bin/time` was there, but it had 0 length). Nonetheless, there is
a GC3Pie bug in that the message should have been more explicit and the
(sub)resource should have been shut down; see GC3Pie bug #590 for more
info, <https://github.com/uzh/gc3pie/issues/590>
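As a follow-up to point 1: until that bug is fixed, a simple sanity check on the worker snapshot would catch both the missing and the zero-length variants of the problem. This is a sketch; `check_nonempty` is a hypothetical helper, not part of GC3Pie:

```shell
# Report whether a file exists and has nonzero size; a zero-length
# /usr/bin/time is exactly what produced the confusing "exec: None" error.
check_nonempty() {
    if [ -s "$1" ]; then
        echo "$1 OK"
    else
        echo "$1 missing or empty"
        return 1
    fi
}

# /bin/sh is used here as an always-present example; on the snapshot,
# run: check_nonempty /usr/bin/time
check_nonempty /bin/sh
```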