Thanks Anubhav, this was very helpful. I now have MongoDB installed on the standalone machine, and Fireworks on both the standalone machine and the cluster login node. I'm able to run jobs on the standalone machine, and on the cluster by using qlaunch on the login node in reservation/offline mode.
Now I'd like to ask if you could clarify what the --remote_host argument to qlaunch does. It seems like this should make it possible to run qlaunch on the standalone machine and have it execute jobs on the cluster. Using that I could do all the work of adding jobs to the workflow, and launching them, from the standalone machine. Is that correct?
I ran into a couple of issues trying to use --remote_host (-rh). On the standalone machine I ran this ("tiger" is the cluster login node):
qlaunch -rh tiger -rc <path-to-config-dir-on-cluster> -ru mcahn -r singleshot
First, I find that only rapidfire will work. singleshot complains that there are extra arguments to qlaunch. Something seems to add some arguments that are specific to rapidfire:
[tiger] run: qlaunch --reserve singleshot --maxjobs_queue None --maxjobs_block None --nlaunches None
[tiger] out: qlaunch: error: unrecognized arguments: --maxjobs_queue None --maxjobs_block None --nlaunches None
With rapidfire I get a complaint from queue_adapter.py. (In this example I've modified queue_adapter.py to print self.command and kwargs.) It seems to be complaining about the -1 (which is subprocess.PIPE) in kwargs:
qlaunch -rh tiger -rc <path-to-config-dir-on-cluster> -ru mcahn -r rapidfire
[tiger] out: 2015-11-16 15:43:49,699 INFO getting queue adapter
[tiger] out: 2015-11-16 15:43:49,701 INFO Found previous block, using /path/to/queue_tests/block_2015-11-16-20-28-37-685034
[tiger] out: 2015-11-16 15:43:49,719 ERROR ----|vvv|----
[tiger] out: 2015-11-16 15:43:49,721 ERROR Error trying to get the number of jobs in the queue
[tiger] out: The error response reads: Traceback (most recent call last):
[tiger] out: File "/path/to/fireworks/fireworks/queue/queue_adapter.py", line 59, in target
[tiger] out: self.process = subprocess.Popen(self.command, **kwargs)
[tiger] out: File "/usr/lib64/python2.7/subprocess.py", line 710, in __init__
[tiger] out: errread, errwrite)
[tiger] out: File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
[tiger] out: raise child_exception
[tiger] out: TypeError: execv() arg 2 must contain only strings
[tiger] out:
[tiger] out: [u'squeue', u'-o "%u"', u'-u', 'mcahn', u'-p', None, u'-h']
[tiger] out: {u'stderr': -1, u'stdout': -1}
[tiger] out: 2015-11-16 15:43:49,722 ERROR ----|^^^|----
Any guidance on using qlaunch with --remote_host, and on what might be going wrong in queue_adapter.py, would be much appreciated.
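In case it helps narrow this down, here is a minimal standalone check for whether subprocess.Popen objects to the stdout/stderr value of -1 or to the None that appears after -p in the printed command list. This is not Fireworks code: the try_popen helper and the placeholder commands are hypothetical names of my own. It is a sketch written for a current Python rather than the 2.7 on the cluster, so the exact error message may differ from the one in the traceback:

```python
import subprocess
import sys


def try_popen(command):
    """Run `command` with piped stdout/stderr, as in the printed kwargs.

    Returns "ok" if the process could be started, or "TypeError" if
    Popen rejected the argument list.
    """
    try:
        proc = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        proc.communicate()  # drain the pipes and wait for the child to exit
        return "ok"
    except TypeError:
        return "TypeError"


# Placeholder commands (assumptions, standing in for the squeue call):
# one list of plain strings, and one containing None like [..., u'-p', None, ...].
all_strings = [sys.executable, "-c", "pass"]
with_none = [sys.executable, "-c", None]

print("subprocess.PIPE =", subprocess.PIPE)
print("all strings:", try_popen(all_strings))
print("with None:  ", try_popen(with_none))
```

If the second call is the one that fails, that would point at the None following -p (perhaps a queue name missing from my queue adapter config) rather than at the kwargs, but I have not confirmed that this is what happens inside queue_adapter.py.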
Best,
Matthew