custodian.custodian.ValidationError during launching calculations

Sandeep

Aug 2, 2019, 1:03:39 PM
to atomate
Hi,
I am a new atomate user. After installation, I tried to launch a calculation (the MgO band structure example) by following the instructions at https://atomate.org/running_workflows.html#prerequisites. After it had run for some time, I found the job in the "FIZZLED" state instead of running. To add the workflow to the LaunchPad and run it, I used the steps below:
lpad reset
python mgo_bandstructure.py
rlaunch singleshot

Below is the state of the fireworks as stored in MongoDB:

  _id: 5d4348845d924746e1f79dd0
  spec: Object
  fw_id: 1
  created_on: "2019-08-01T20:16:03.198678"
  updated_on: "2019-08-01T20:16:03.198679"
  name: "MgO-nscf line"
  launches: Array
  archived_launches: Array
  state: "WAITING"

  _id: 5d4348845d924746e1f79dd1
  spec: Object
  fw_id: 2
  created_on: "2019-08-01T20:16:03.198584"
  updated_on: "2019-08-01T20:16:03.198585"
  name: "MgO-nscf uniform"
  launches: Array
  archived_launches: Array
  state: "WAITING"

  _id: 5d4348845d924746e1f79dd2
  spec: Object
  fw_id: 3
  created_on: "2019-08-01T20:16:03.198475"
  updated_on: "2019-08-01T20:16:03.198476"
  name: "MgO-static"
  launches: Array
  archived_launches: Array
  state: "WAITING"

  _id: 5d4348845d924746e1f79dd3
  spec: Object
  fw_id: 4
  created_on: "2019-08-01T20:16:03.198222"
  updated_on: "2019-08-01T20:17:49.993173"
  name: "MgO-structure optimization"
  launches: Array
  archived_launches: Array
    0: 1
  state: "FIZZLED"


Finally, I got the error below when I ran "rlaunch singleshot":

2019-08-01 22:16:21,937 INFO Hostname/IP lookup (this will take a few seconds)
2019-08-01 22:16:21,938 INFO Launching Rocket
2019-08-01 22:16:26,720 INFO RUNNING fw_id: 4 in directory: /mnt/lscratch/users/skumar/atomate_opt/MgO
2019-08-01 22:16:27,945 INFO Task started: FileWriteTask.
2019-08-01 22:16:27,946 INFO Task completed: FileWriteTask 
2019-08-01 22:16:28,191 INFO Task started: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}.
2019-08-01 22:16:28,212 INFO Task completed: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}} 
2019-08-01 22:16:28,455 INFO Task started: {{atomate.vasp.firetasks.run_calc.RunVaspCustodian}}.
Validation failed: <custodian.vasp.validators.VasprunXMLValidator object at 0x7fd530602400>
Traceback (most recent call last):
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/atomate/vasp/firetasks/run_calc.py", line 205, in run_task
    c.run()
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/custodian/custodian.py", line 328, in run
    self._run_job(job_n, job)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/custodian/custodian.py", line 452, in _run_job
    raise ValidationError(s, True, v)
custodian.custodian.ValidationError: Validation failed: <custodian.vasp.validators.VasprunXMLValidator object at 0x7fd530602400>
2019-08-01 22:17:50,731 INFO Rocket finished

Please suggest what I should do to resolve this error. I would be thankful.


Many thanks and regards

Sandeep

Anubhav Jain

Aug 6, 2019, 12:42:36 PM
to atomate
Hi Sandeep

If you run "rlaunch singleshot", it will run the workflow on your local system. This means that VASP will also be executed on your local system. I'd first make sure VASP is properly installed on your system.

If you feel that is the case, can you provide more details? For example, does the VASP run start at all? If so, what are the contents of the OUTCAR? Also, what does FW_job.error and FW_job.out say?
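To make those checks concrete, here is a minimal Python sketch (not part of atomate) that looks for a VASP executable on the PATH and prints the last lines of FW_job.error, FW_job.out and OUTCAR from the current directory. The executable name vasp_std and the helper names are assumptions; substitute whatever your installation provides.

```python
import shutil
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file, or an empty list if it is missing."""
    p = Path(path)
    if not p.is_file():
        return []
    return p.read_text(errors="replace").splitlines()[-n:]


def check_vasp(executable="vasp_std"):
    """Return the full path of the executable, or None if it is not on PATH.

    The default name "vasp_std" is an assumption about the local installation.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    exe = check_vasp()
    print("VASP executable:", exe or "not found on PATH")
    # Run this from the launch directory of the failed firework.
    for name in ("FW_job.error", "FW_job.out", "OUTCAR"):
        lines = tail(name)
        status = "last lines" if lines else "missing or empty"
        print(f"--- {name} ({status}) ---")
        print("\n".join(lines))
```

If the executable is only available after loading an environment module, run the script in the same shell or job script where that module has been loaded.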

Best,
Anubhav

Sandeep

Aug 6, 2019, 1:28:35 PM
to atomate
Dear Anubhav,

Thank you very much for your reply. Actually, VASP runs through a job scheduler (SLURM), so I think I should use qlaunch instead of rlaunch. For this, I set up the my_qadapter.yaml file; its contents are below:

my_qadapter.yaml

fw_name: CommonAdapter
_fw_q_type: SLURM
_fw_template_file: /scratch/users/skumar/atomate_opt/job_ht.sh
rocket_launch: qlaunch -c /scratch/users/skumar/atomate_opt/config rapidfire
nodes: 2
walltime: 24:00:00
queue: batch
account: null
job_name: null
pre_rocket: null
post_rocket: null
logdir: /scratch/users/skumar/atomate_opt/logs

db.json

{
    "host": "mongodb+srv://cluster0-aagd1.mongodb.net/",
    "port": 27017,
    "database": "test",
    "collection": "test",
    "admin_user": "test",
    "admin_password": "12345",
    "authsource": "admin"
    "aliases": {}
}


But when I ran qlaunch singleshot, I got the error below:

Traceback (most recent call last):
  File "/home/users/skumar/anaconda3/envs/atomate_opt/bin/qlaunch", line 10, in <module>
    sys.exit(qlaunch())
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/scripts/qlaunch_run.py", line 222, in qlaunch
    do_launch(args)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/scripts/qlaunch_run.py", line 61, in do_launch
    queueadapter = load_object_from_file(args.queueadapter_file)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/utilities/fw_serializers.py", line 394, in load_object_from_file
    return load_object(m_dict)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/utilities/fw_serializers.py", line 323, in load_object
    fw_name = FW_NAME_UPDATES.get(obj_dict['_fw_name'], obj_dict['_fw_name'])
KeyError: '_fw_name'


However, when I ran rlaunch singleshot, the VASP input files (INCAR, POTCAR, KPOINTS and POSCAR) were written, but the run ended with an error, shown below:

2019-08-06 19:07:47,701 INFO Hostname/IP lookup (this will take a few seconds)
2019-08-06 19:07:47,702 INFO Launching Rocket
2019-08-06 19:07:52,654 INFO RUNNING fw_id: 4 in directory: /mnt/lscratch/users/skumar/atomate_opt/MgO
2019-08-06 19:07:53,886 INFO Task started: FileWriteTask.
2019-08-06 19:07:53,891 INFO Task completed: FileWriteTask 
2019-08-06 19:07:54,137 INFO Task started: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}.
2019-08-06 19:07:54,211 INFO Task completed: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}} 
2019-08-06 19:07:54,456 INFO Task started: {{atomate.vasp.firetasks.run_calc.RunVaspCustodian}}.
Validation failed: <custodian.vasp.validators.VasprunXMLValidator object at 0x7f58d1ededa0>
Traceback (most recent call last):
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/atomate/vasp/firetasks/run_calc.py", line 205, in run_task
    c.run()
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/custodian/custodian.py", line 328, in run
    self._run_job(job_n, job)
  File "/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/custodian/custodian.py", line 452, in _run_job
    raise ValidationError(s, True, v)
custodian.custodian.ValidationError: Validation failed: <custodian.vasp.validators.VasprunXMLValidator object at 0x7f58d1ededa0>
2019-08-06 19:08:16,703 INFO Rocket finished


Could you please suggest what I should do? I need your valuable suggestion to resolve this error.


Thanks and regards

Sandeep

Anubhav Jain

Aug 9, 2019, 6:54:22 PM
to atomate
Your my_qadapter.yaml should say "_fw_name: CommonAdapter" not "fw_name: CommonAdapter"
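
As a quick sanity check, a small Python sketch like the one below can flag the missing key before qlaunch does. It is only an illustration: the helper names are hypothetical, and it reads simple top-level "key: value" lines rather than parsing YAML in full.

```python
# Key whose absence caused the KeyError reported in this thread.
REQUIRED_KEYS = ("_fw_name",)


def top_level_keys(yaml_text):
    """Collect top-level keys from simple 'key: value' lines (not a full YAML parser)."""
    keys = []
    for line in yaml_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if line[0].isspace() or ":" not in line:
            continue  # skip nested or continuation lines
        keys.append(line.split(":", 1)[0].strip())
    return keys


def missing_keys(yaml_text, required=REQUIRED_KEYS):
    """Return the required keys that do not appear at the top level."""
    keys = top_level_keys(yaml_text)
    return [key for key in required if key not in keys]


if __name__ == "__main__":
    broken = "fw_name: CommonAdapter\n_fw_q_type: SLURM\n"
    fixed = "_fw_name: CommonAdapter\n_fw_q_type: SLURM\n"
    print("broken file is missing:", missing_keys(broken))
    print("fixed file is missing:", missing_keys(fixed))
```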

Sandeep

Aug 10, 2019, 6:16:58 AM
to atomate
Dear Anubhav,

Thanks a lot! Now qlaunch is working and the relaxation step completes, but the workflow stops after the optimization. When I ran qlaunch rapidfire, I also got the warnings below:


/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/queue/queue_adapter.py:143: UserWarning: Key rocket_launch has been specified in qadapter but it is not present in template, please check template (/scratch/users/skumar/atomate_opt/config/job_ht.sh) for supported keys.
  .format(subs_key, self.template_file))
/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/queue/queue_adapter.py:143: UserWarning: Key nodes has been specified in qadapter but it is not present in template, please check template (/scratch/users/skumar/atomate_opt/config/job_ht.sh) for supported keys.
  .format(subs_key, self.template_file))
/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/queue/queue_adapter.py:143: UserWarning: Key ntasks_per_node has been specified in qadapter but it is not present in template, please check template (/scratch/users/skumar/atomate_opt/config/job_ht.sh) for supported keys.
  .format(subs_key, self.template_file))
/home/users/skumar/anaconda3/envs/atomate_opt/lib/python3.7/site-packages/fireworks/queue/queue_adapter.py:143: UserWarning: Key walltime has been specified in qadapter but it is not present in template, please check template (/scratch/users/skumar/atomate_opt/config/job_ht.sh) for supported keys.
  .format(subs_key, self.template_file))


################### slurm.out##############################

== Starting run at Sat Aug 10 11:46:06 CEST 2019
== Job ID: 682351
== Node list: iris-131
== Submit dir. : /mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357
 Error reading item 'VCAIMAGES' from file INCAR.
 [the line above appears 28 times in slurm.out]
2019-08-10 11:46:09,017 INFO Hostname/IP lookup (this will take a few seconds)
2019-08-10 11:46:12,788 INFO Created new dir /mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357/launcher_2019-08-10-09-46-12-788389
2019-08-10 11:46:12,789 INFO Launching Rocket
2019-08-10 11:46:14,744 INFO RUNNING fw_id: 4 in directory: /mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357/launcher_2019-08-10-09-46-12-788389
2019-08-10 11:46:15,970 INFO Task started: FileWriteTask.
2019-08-10 11:46:15,971 INFO Task completed: FileWriteTask 
2019-08-10 11:46:16,214 INFO Task started: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}.
2019-08-10 11:46:16,297 INFO Task completed: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}} 
2019-08-10 11:46:16,539 INFO Task started: {{atomate.vasp.firetasks.run_calc.RunVaspCustodian}}.
2019-08-10 11:46:48,138 INFO Task completed: {{atomate.vasp.firetasks.run_calc.RunVaspCustodian}} 
2019-08-10 11:46:48,386 INFO Task started: {{atomate.common.firetasks.glue_tasks.PassCalcLocs}}.
2019-08-10 11:46:48,388 INFO Task completed: {{atomate.common.firetasks.glue_tasks.PassCalcLocs}} 
2019-08-10 11:46:48,633 INFO Task started: {{atomate.vasp.firetasks.parse_outputs.VaspToDb}}.
2019-08-10 11:46:48,633 INFO atomate.vasp.firetasks.parse_outputs PARSING DIRECTORY: /mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357/launcher_2019-08-10-09-46-12-788389
2019-08-10 11:46:48,633 INFO atomate.vasp.drones Getting task doc for base dir :/mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357/launcher_2019-08-10-09-46-12-788389
2019-08-10 11:46:48,942 INFO atomate.vasp.drones Post-processing dir:/mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357/launcher_2019-08-10-09-46-12-788389
2019-08-10 11:46:48,943 WARNING atomate.vasp.drones Transformations file does not exist.
2019-08-10 11:46:48,984 INFO atomate.vasp.drones Post-processed /mnt/lscratch/users/skumar/atomate_opt/MgO/block_2019-08-10-09-45-44-609779/launcher_2019-08-10-09-45-48-345357/launcher_2019-08-10-09-46-12-788389
2019-08-10 11:46:51,067 INFO Rocket finished
Traceback (most recent call last):
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/site-packages/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/site-packages/atomate/vasp/firetasks/parse_outputs.py", line 113, in run_task
    mmdb = VaspCalcDb.from_db_file(db_file, admin=True)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/site-packages/atomate/utils/database.py", line 114, in from_db_file
    creds = loadfn(db_file)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/site-packages/monty/serialization.py", line 83, in loadfn
    return json.load(fp, *args, **kwargs)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/site-packages/monty/json.py", line 255, in decode
    d = json.JSONDecoder.decode(self, s)
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/users/skumar/anaconda3/envs/my_env/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 9 column 3 (char 219)


######################################################
For your information, my my_qadapter.yaml and job_ht.sh files are below:


my_qadapter.yaml


_fw_name: CommonAdapter
_fw_q_type: SLURM
_fw_template_file: /scratch/users/skumar/atomate_opt/config/job_ht.sh
rocket_launch: rlaunch -c /scratch/users/skumar/atomate_opt/config rapidfire
nodes: 1
ntasks_per_node: 28
walltime: 05:00:00
queue: null
account: null
job_name: null
pre_rocket: null
post_rocket: null
logdir: /scratch/users/skumar/atomate_opt/logs


job_ht.sh

#!/bin/bash -l
# Submission script for Iris
#SBATCH -J HTS
#SBATCH --mail-type=end,fail
#SBATCH --mail-user=sandee...@gmail.com
#SBATCH -N 1
#SBATCH --ntasks-per-node=28
#SBATCH -c 1
#SBATCH --time=0-05:00:00
#SBATCH -p batch
#SBATCH --qos=qos-batch
####SBATCH --qos=qos-besteffort

#print

echo "== Starting run at $(date)"
echo "== Job ID: ${SLURM_JOBID}"
echo "== Node list: ${SLURM_NODELIST}"
echo "== Submit dir. : ${SLURM_SUBMIT_DIR}"

## Set up the python environment


# Set up the linux environment, i.e. load the modules you need to run VASP
#module load swenv/default-env/devel

module load phys/VASP/5.4.4-intel-2018a

####srun -n $SLURM_NTASKS vasp_ncl
srun -n $SLURM_NTASKS vasp_std
####srun -n $SLURM_NTASKS vasp_ncl


# Run the rlaunch command 
rlaunch -c /scratch/users/skumar/atomate_opt/config rapidfire




Could you please tell me where I made a mistake? I would be very thankful.


Thanks and regards

Sandeep




Anubhav Jain

Aug 22, 2019, 4:03:12 PM
to atomate
hi Sandeep

As for the warnings, it looks like you did not specify the correct queue adapter file. For some reason your queue adapter file is being specified as /scratch/users/skumar/atomate_opt/config/job_ht.sh. It should instead be your path/to/my_qadapter.yaml,

i.e.

qlaunch -q path/to/my_qadapter.yaml rapidfire

Since I don't know the exact command you are typing or your exact setup, I don't know why /scratch/users/skumar/atomate_opt/config/job_ht.sh is being read in as the my_qadapter.yaml file.
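
One thing that may be worth checking: the warnings name job_ht.sh as the template, and the job_ht.sh posted above contains no substitutable fields. The sketch below assumes that FireWorks templates mark such fields as $${key}, and that keys set to null, keys starting with an underscore, and logdir are not reported; the function name is hypothetical. It lists the qadapter keys that have no matching field in a template.

```python
import re

# Assumption: substitutable fields in a FireWorks queue template look like $${key}.
PLACEHOLDER = re.compile(r"\$\$\{(\w+)\}")


def keys_missing_from_template(qadapter, template_text):
    """Return qadapter keys that have a value but no $${key} field in the template.

    Keys with a None value, keys starting with "_" and "logdir" are skipped;
    that filter is an assumption chosen to match the warnings in this thread.
    """
    present = set(PLACEHOLDER.findall(template_text))
    return sorted(
        key
        for key, value in qadapter.items()
        if value is not None
        and not key.startswith("_")
        and key != "logdir"
        and key not in present
    )


if __name__ == "__main__":
    # Values copied from the my_qadapter.yaml posted in this thread.
    qadapter = {
        "_fw_name": "CommonAdapter",
        "_fw_q_type": "SLURM",
        "rocket_launch": "rlaunch -c /scratch/users/skumar/atomate_opt/config rapidfire",
        "nodes": 1,
        "ntasks_per_node": 28,
        "walltime": "05:00:00",
        "queue": None,
        "logdir": "/scratch/users/skumar/atomate_opt/logs",
    }
    # A template with hard-coded values and no $${...} fields, like job_ht.sh.
    template = "#!/bin/bash -l\n#SBATCH -N 1\n#SBATCH --time=0-05:00:00\n"
    print(keys_missing_from_template(qadapter, template))
```

If keys are reported, either remove them from my_qadapter.yaml or add the corresponding $${key} fields to the template so that the values are actually used.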

As for the error below that, I'd first fix the warnings above. Then let's see whether you still get the second error.
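
For reference, the JSONDecodeError in the traceback is raised while VaspCalcDb.from_db_file loads the db file, and the db.json posted earlier in this thread has no comma after the "authsource" entry, which would be consistent with "Expecting ',' delimiter". A minimal sketch for checking a JSON file's syntax with Python's standard json module (the function name is hypothetical):

```python
import json


def json_error(text):
    """Return None if text is valid JSON, else a short description of the first syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"{exc.msg} at line {exc.lineno}, column {exc.colno}"
    return None


if __name__ == "__main__":
    # Shortened stand-ins for db.json, without and with the comma after "authsource".
    without_comma = '{\n    "authsource": "admin"\n    "aliases": {}\n}'
    with_comma = '{\n    "authsource": "admin",\n    "aliases": {}\n}'
    print("without comma:", json_error(without_comma))
    print("with comma:", json_error(with_comma))
```

To check the real file, pass its contents to the function, for example json_error(open("db.json").read()).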


Sandeep

Sep 17, 2019, 5:11:42 AM
to atomate
Dear Anubhav,

Thank you for your response. The first problem is now solved, but I am getting the VASP output files in gzipped format and my job has fizzled:

  _id: 5d7f5d4cff9ad99e63961a80
  spec: Object
  fw_id: 4
  created_on: "2019-09-16T10:00:42.524125"
  updated_on: "2019-09-17T07:54:32.558292"
  launches: Array
  state: "FIZZLED"
  name: "MgO-structure optimization"
  archived_launches: Array

Please suggest what I should do.

Thanks

Sandeep


Anubhav Jain

Sep 20, 2019, 6:54:46 PM
to atomate
Hi Sandeep

The gzipped output is expected. There are multiple ways to turn it off if you'd like, but it is unrelated to your error.

As for why the job failed, can you send the output directory? There are dozens of possible failures (VASP not found, VASP POTCARs not found, some setting is inconsistent, etc.) and it's not possible to guess without seeing all your logs.
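
Since the outputs are gzipped, here is a minimal sketch (standard library only, hypothetical helper names) for reading the last lines of the log files in a launch directory, whether or not they have been compressed. The file names are the ones mentioned earlier in this thread; other logs may exist in your directory.

```python
import gzip
from pathlib import Path


def tail_any(path, n=30):
    """Return the last n lines of a plain or gzip-compressed text file."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", errors="replace") as fh:
        return fh.read().splitlines()[-n:]


def collect_logs(launch_dir, names=("FW_job.error", "FW_job.out", "OUTCAR")):
    """Map each log file found (plain or .gz) to its last lines."""
    found = {}
    for name in names:
        for candidate in (Path(launch_dir) / name, Path(launch_dir) / (name + ".gz")):
            if candidate.is_file():
                found[candidate.name] = tail_any(candidate)
    return found


if __name__ == "__main__":
    # Run from inside the launcher_... directory of the fizzled firework.
    for filename, lines in collect_logs(".").items():
        print(f"--- {filename} ---")
        print("\n".join(lines))
```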

Sandeep

Sep 23, 2019, 9:04:28 AM
to atomate
Hi,
Thanks for the reply. Please find the output directory attached.

Thanks and waiting for your response eagerly.

best regards,

Sandeep
Mgo.tar.gz