Configuring qlaunch

Ralph Nicolai Nasara

Dec 7, 2017, 7:44:08 AM
to fireworkflows
Hello,


When submitting jobs on our local cluster, I just run $ qsub parallel.sh and the job goes into our queue. I'm having problems incorporating FireWorks into this. I have reached the tutorial about launching rockets through a queue.


This is what was returned after running $ qlaunch singleshot:

Database at /data_piglet/lansan/atomate/config/FW_config.yaml is getting selected.
Found many potential paths for LAUNCHPAD_LOC: ['/data_piglet/lansan/atomate/config/my_launchpad.yaml', '/data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests/my_launchpad.yaml']
Choosing as default: /data_piglet/lansan/atomate/config/my_launchpad.yaml
Found many potential paths for FWORKER_LOC: ['/data_piglet/lansan/atomate/config/my_fworker.yaml', '/data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests/my_fworker.yaml']
Choosing as default: /data_piglet/lansan/atomate/config/my_fworker.yaml
Found many potential paths for QUEUEADAPTER_LOC: ['/data_piglet/lansan/atomate/config/my_qadapter.yaml', '/data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests/my_qadapter.yaml']
Choosing as default: /data_piglet/lansan/atomate/config/my_qadapter.yaml
2017-12-07 20:36:10,186 INFO moving to launch_dir /data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests
/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/queue/queue_adapter.py:142: UserWarning: Key logdir has been specified in qadapter but it is not present in template, please check template (/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/user_objects/queue_adapters/SLURM_template.txt) for supported keys.
  .format(subs_key, self.template_file))
2017-12-07 20:36:10,187 INFO submitting queue script
2017-12-07 20:36:10,191 ERROR ----|vvv|----
2017-12-07 20:36:10,192 ERROR Running the command: sbatch caused an error...
2017-12-07 20:36:10,194 ERROR Traceback (most recent call last):
  File "/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/user_objects/queue_adapters/common_adapter.py", line 204, in submit_to_queue
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib64/python3.4/subprocess.py", line 856, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.4/subprocess.py", line 1460, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'sbatch'

2017-12-07 20:36:10,194 ERROR ----|^^^|----
2017-12-07 20:36:10,195 ERROR ----|vvv|----
2017-12-07 20:36:10,195 ERROR Error writing/submitting queue script!
2017-12-07 20:36:10,196 ERROR Traceback (most recent call last):
  File "/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/queue/queue_launcher.py", line 136, in launch_rocket_to_queue
    raise RuntimeError('queue script could not be submitted, check queue '
RuntimeError: queue script could not be submitted, check queue script/queue adapter/queue server status!


Seems like a lot of errors. I am fairly new to FireWorks and would really like to configure this to our cluster.

Best,
Ralph
parallel_group2.sh

Anubhav Jain

Dec 7, 2017, 8:06:30 AM
to Ralph Nicolai Nasara, fireworkflows
Hi Ralph

There are many types of queue management software (e.g., SLURM, PBS, etc.), and FireWorks needs to be told which one your cluster uses.

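Based on your parallel.sh, your cluster runs PBS (you submit with qsub). However, the error message shows FireWorks trying to execute sbatch, which means your my_qadapter.yaml file is configured for SLURM. Since sbatch does not exist on your cluster, the submission fails with "No such file or directory".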
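
If you want to double-check which scheduler is actually installed, a small script along these lines can help. It is only a sketch: the function name find_schedulers is made up for illustration, and it simply looks for the sbatch and qsub commands on your PATH, which is the same lookup that failed in your traceback.

```python
import shutil

# Submit commands for the two schedulers mentioned in this thread.
SUBMIT_COMMANDS = {"SLURM": "sbatch", "PBS": "qsub"}


def find_schedulers(which=shutil.which):
    """Return {scheduler: path} for every submit command found on PATH.

    `which` is injectable only so the function can be tested without a cluster.
    """
    found = {}
    for scheduler, command in SUBMIT_COMMANDS.items():
        path = which(command)  # None when the command is not on PATH
        if path is not None:
            found[scheduler] = path
    return found


if __name__ == "__main__":
    available = find_schedulers()
    if not available:
        print("Neither sbatch nor qsub was found on PATH.")
    for scheduler, path in available.items():
        print(f"{scheduler}: {path}")
```

Run it on the same login node and in the same environment where you run qlaunch, since PATH can differ between shells.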

You need to:
1. Locate your my_qadapter.yaml file. The message is telling you it's located at: /data_piglet/lansan/atomate/config/my_qadapter.yaml. However, the messages before that indicate that you have copies of the default config files in several places. You may want to fix that at a later point, e.g. by deleting the copies you don't plan to use.
2. Change the "_fw_q_type" parameter in this file to be "PBS" (probably currently says SLURM)
3. Modify the other parameters in my_qadapter.yaml to match your desired queue settings in terms of number of nodes, walltime, etc. The parameters you can specify depend on which queuing system you set in step 2. For example, for PBS the possible template variables are found in the code: https://github.com/materialsproject/fireworks/blob/master/fireworks/user_objects/queue_adapters/PBS_template.txt
These parameters should match the types of settings that are already in your parallel.sh file.
4. Try again
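
To make steps 2 and 3 concrete, here is a rough sketch that prints what a minimal PBS version of my_qadapter.yaml might look like. Treat it as an assumption-laden starting point: only _fw_q_type comes from the steps above, while the other key names (_fw_name, rocket_launch, nnodes, ppnode, walltime) are written from memory and should be checked against the PBS_template.txt linked in step 3. All values are placeholders that you would replace with the settings from your parallel.sh.

```python
import json

# Placeholder settings. Key names other than _fw_q_type are assumptions;
# verify them against PBS_template.txt before relying on them.
qadapter = {
    "_fw_name": "CommonAdapter",
    "_fw_q_type": "PBS",                    # was presumably SLURM before
    "rocket_launch": "rlaunch singleshot",  # command run inside the queue job
    "nnodes": 1,                            # placeholder: number of nodes
    "ppnode": 1,                            # placeholder: processors per node
    "walltime": "00:10:00",                 # placeholder: requested walltime
}


def render_yaml(settings):
    """Render a flat dict as YAML text, one 'key: value' pair per line.

    json.dumps is used for the values because JSON scalars (quoted strings,
    numbers, null) are also valid YAML scalars.
    """
    lines = [f"{key}: {json.dumps(value)}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_yaml(qadapter), end="")
```

The script only prints the text rather than overwriting your existing file, so you can compare it with the my_qadapter.yaml you already have and copy over what applies.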


--
Best,
Anubhav