Problem of local submission in USPEX-10.2


Bin Li

Feb 10, 2019, 1:21:03 PM
to USPEX
Dear all,

I encountered a problem when using the new version, USPEX-10.2.

If I set "0 : whichCluster", USPEX works well.

Then I set "1 : whichCluster" and modified Submission/submitJob_local.py for PBS. Jobs are submitted fine, but USPEX does not seem to get the correct jobID (it is recorded as 0) and keeps looping on step 1. The file "results1/OUTPUT.txt" collects nothing.

Please give me some suggestions. Thank you in advance.

#######  CalcFoldTemp/Jobs.history  ######

$ cat CalcFoldTemp/Jobs.history 
Generation 1 Step 1 of Structure 1 at Calcfold1 : JOBID 0  Submitted Feb11-01:36:18
Generation 1 Step 1 of Structure 2 at Calcfold2 : JOBID 0  Submitted Feb11-01:36:18
Generation 1 Step 1 of Structure 3 at Calcfold3 : JOBID 0  Submitted Feb11-01:36:18
Generation 1 Step 1 of Structure 4 at Calcfold4 : JOBID 0  Submitted Feb11-01:36:18
Generation 1 Step 1 of Structure 5 at Calcfold5 : JOBID 0  Submitted Feb11-01:36:19
Generation 1 Step 1 of Structure 6 at Calcfold6 : JOBID 0  Submitted Feb11-01:36:19
Generation 1 Step 1 of Structure 7 at Calcfold7 : JOBID 0  Submitted Feb11-01:36:19
Generation 1 Step 1 of Structure 8 at Calcfold8 : JOBID 0  Submitted Feb11-01:36:19
Generation 1 Step 1 of Structure 9 at Calcfold9 : JOBID 0  Submitted Feb11-01:36:19
Generation 1 Step 1 of Structure 10 at Calcfold10 : JOBID 0  Submitted Feb11-01:36:19
Generation 1 Step 1 of Structure 1 at Calcfold1 : JOBID 0  Submitted Feb11-01:36:56
Generation 1 Step 1 of Structure 2 at Calcfold2 : JOBID 0  Submitted Feb11-01:36:56
Generation 1 Step 1 of Structure 3 at Calcfold3 : JOBID 0  Submitted Feb11-01:36:56
Generation 1 Step 1 of Structure 4 at Calcfold4 : JOBID 0  Submitted Feb11-01:36:56
Generation 1 Step 1 of Structure 5 at Calcfold5 : JOBID 0  Submitted Feb11-01:36:56
Generation 1 Step 1 of Structure 6 at Calcfold6 : JOBID 0  Submitted Feb11-01:36:56
......

###### Submission/submitJob_local.py ########

$ cat Submission/submitJob_local.py
from subprocess import check_output
import re

def submitJob_local():
    """
    This routine is to submit job
    One needs to do a little edit based on your own case.
    Step 1: to prepare the job script which is required by your supercomputer
    Step 2: to submit the job with the command like qsub, bsub, llsubmit, .etc.
    Step 3: to get the jobID from the screen message
    :return:
    """

    # Step 1
    myrun_content = ''
    myrun_content += '#!/bin/sh\n'
    #myrun_content += '#SBATCH -o out\n'
    #myrun_content += '#SBATCH -p cpu\n'
    #myrun_content += '#SBATCH -J USPEX\n'
    #myrun_content += '#SBATCH -t 06:00:00\n'
    #myrun_content += '#SBATCH -N 1\n'
    #myrun_content += '#SBATCH -n 8\n'
    myrun_content += '#PBS -l nodes=1:ppn=1,walltime=2:30:00 -q batch\n'
    myrun_content += '#PBS -N USPEX\n'
    myrun_content += '#PBS -j oe\n'
    myrun_content += '#PBS -V \n'
    myrun_content += 'cd ${PBS_O_WORKDIR}\n' 
    # myrun_content += 'cd ${PBS_O_WORKDIR}\n' check this, must have /cephfs suffix with SBATCH in my case
    myrun_content += 'mpirun -np 1 vasp > log\n'
    with open('myrun', 'w') as fp:
        fp.write(myrun_content)
    # Step 2-3
    # It will output some message on the screen like '2350873.nano.cfn.bnl.local'
    #output = str(check_output('sbatch myrun', shell=True))
    output = str(check_output('qsub myrun', shell=True))
    jobNumber = int(re.findall(r'\d+', output)[0])
    return jobNumber


if __name__ == '__main__':
    with open('TEMPORARY_FILE', 'w') as fp:
        fp.write('I HAVE BEEN HERE')
    
    print('CALLBACK ')
    N = submitJob_local()
    print(str(N))


######### checkStatus_local.py #############

$ cat Submission/checkStatus_local.py 
import os
from subprocess import check_output
import glob
_author_ = 'etikhonov'


def checkStatus_local(jobID):
    """
    This function is to check if the submitted job is done or not
    One needs to do a little edit based on your own case.
    1   : whichCluster (0: no-job-script, 1: local submission, 2: remote submission)
    Step1: the command to check job by ID. 
    Step2: to find the keywords from screen message to determine if the job is done
    Below is just a sample:
    -------------------------------------------------------------------------------
    Job id                    Name             User            Time Use S Queue
    ------------------------- ---------------- --------------- -------- - -----
    2455453.nano              USPEX            qzhu            02:28:42 R cfn_gen04 
    -------------------------------------------------------------------------------
    If the job is still running, it will show as above.
    
    If there is no key words like 'R/Q Cfn_gen04', it indicates the job is done.
    :param jobID: 
    :return: doneOr
    """

    # Step 1
    output = str(check_output('qstat {}'.format(jobID), shell=True))
    # Step 2
    doneOr = True
    if ' R ' in output or ' Q ' in output:
        doneOr = False
    if doneOr:
        for file in glob.glob('USPEX*'):
            os.remove(file)  # to remove the log file
    return doneOr

Зэд Икс

Feb 10, 2019, 1:47:28 PM
to USPEX
Of course USPEX will not read the jobs correctly if you don't modify checkStatus_local.py according to your system.

Bin Li

Feb 11, 2019, 7:40:51 AM
to USPEX
Hi, Зэд Икс,

Thank you!
Could you please provide submitJob_local.py and checkStatus_local.py templates for PBS? I tried to modify these files, but USPEX still gets stuck at step 1, no matter how many times I run "USPEX -r".
Even if I delete Submission/checkStatus_local.py, "USPEX -r" returns no error. It seems that USPEX ignores checkStatus_local.py and just submits the jobs.

As a test in CalcFold1, I saved the jobID in a file named 'jobID' and ran "python ../Submission/checkStatus_local.py". The script itself seems to run correctly, but the problem is still there.
######
$ python ../Submission/checkStatus_local.py 
CALLBACK 
1062133
CALLBACK 1
#####
 
### Submission/checkStatus_local.py ###
$ cat Submission/checkStatus_local.py 
import os
import argparse
import glob
from subprocess import check_output
_author_ = 'etikhonov'

print('CALLBACK ')
def checkStatus_local(jobID):
         
    #    with open('jobID', 'r') as fp:
    #    jobNumber = fp.read()
    # Step 1
    #output = str(check_output('qstat {}'.format(jobID), shell=True))
    output = str(check_output('qstat ' + str(jobID), shell=True))
    # Step 2
    doneOr = True
    #if ' R ' in output or ' Q ' in output:
    if 'R ' in output or 'Q ' in output:
        doneOr = False
    if doneOr:
        for file in glob.glob('USPEX*'):
            os.remove(file)  # to remove the log file
    return doneOr

#        parser = argparse.ArgumentPaser()
#        parser.add_argument('-j', dest='jobID', type=int)
#        args = parser.parser_args()
#        isDone = checkStatus_local(jobID=args.jobID)
#        print('CALLBACK ' + str(int(isDone)))
if __name__ == '__main__':
   with open('jobID', 'r') as fp:
        jobNumber = fp.read()
        isDone = checkStatus_local(jobNumber)
print(str(int(jobNumber)))
print('CALLBACK ' + str(int(isDone)))

##########
On Monday, February 11, 2019 at 2:47:28 AM UTC+8, Зэд Икс wrote:

klx...@gmail.com

Feb 13, 2019, 1:29:45 AM
to USPEX
Hi, Bin Li

I have the same problem using a TORQUE (PBS) system.
Our submitJob_local.py can obtain the jobID and return it,
but USPEX (or MATLAB) does not seem to receive it.
In fact, if you check Current_POP.mat, you can see that JobID is 0 for every structure, even though the jobs were submitted.

I will try a little more.

On Monday, February 11, 2019 at 9:40:51 PM UTC+9, Bin Li wrote:

Зэд Икс

Feb 13, 2019, 2:40:02 PM
to USPEX
Please send me the following:

1- An example job file for your system.
2- The command for submitting a job (e.g. sbatch, bsub, qsub).
3- The command for checking the queue system, with jobs in the queue (e.g. squeue, bjobs, qstat).
4- The output of the command in step 3.

Bin Li

Feb 14, 2019, 5:01:15 AM
to USPEX
Hi, Зэд Икс,

Here are the job file and commands. Thank you!

1. job file:

$ cat myrun 
#!/bin/sh
#PBS -l nodes=1:ppn=1,walltime=2:30:00 -q batch
#PBS -N USPEX
#PBS -j oe
#PBS -V 
cd ${PBS_O_WORKDIR}
mpirun vasp> log


2. Command for submitting a job

$ qsub myrun 

3. Command for Checking the queue system, with Jobs in the queue

$ qstat

4. The output of the command in step 3. 

$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
725.R410                   USPEX            bli                    0 R batch   
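
A minimal sketch, assuming the default qstat column layout shown above (job ID in the first column, state letter in the second-to-last column), of how a check script could decide whether a job is still active. The function name `job_is_active` and the choice of "R" and "Q" as active states are illustrative assumptions, not part of the USPEX distribution.

```python
def job_is_active(qstat_text, job_id, active_states=("R", "Q")):
    """Return True if job_id is listed in qstat_text with an active state.

    Assumes the default qstat layout:
    Job ID, Name, User, Time Use, S (state), Queue.
    """
    for line in qstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or truncated lines
        # The first column looks like '725.R410'; compare only the numeric part.
        if fields[0].split(".")[0] != str(job_id):
            continue
        return fields[-2] in active_states
    return False  # job is no longer listed: treat it as finished


sample = """Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
725.R410                   USPEX            bli                    0 R batch
"""
print(job_is_active(sample, 725))
print(job_is_active(sample, 726))
```

Matching on columns rather than on substrings such as 'R ' avoids accidental hits in host names, user names or job names.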


On Thursday, February 14, 2019 at 3:40:02 AM UTC+8, Зэд Икс wrote:

Konstantin Rushchanskii

Feb 15, 2019, 2:07:02 PM
to USPEX
Dear developers,

I confirm the problem. I ran extensive tests over the last two days, and nothing works for me.

It seems that USPEX is not receiving the jobID from the Python script, and checkStatus_local.py is not called at all.

I noticed that the code for these subroutines in the distribution differs from that given in the manual.
Moreover, the version of submitJob_local.py given in the manual seems to use Python 3 syntax,
whereas the rest of the code is written for Python 2.7.

Therefore, it is difficult to guess which conventions the new version of USPEX uses to pass information between the Python parts and the MATLAB code.
Could we have the MATLAB source code for the job submission subroutine, and for the part where calculations are checked, so that we can correct the Python scripts? Or is the problem more involved?

Best,
Konstantin

USPEX

Feb 15, 2019, 6:26:54 PM
to USPEX
Dear All,

Thank you for your active contributions and useful comments. I'm about to update the scripts for local/remote submission, since they do not seem to work for many users.

So I have a request for those who have struggled with local submission and have not managed to solve the problem (the same applies to remote submission, if anyone is interested).

I have attached the new Python scripts for local/remote submission. Please test them and let me know whether there are any problems or everything works. That way I can update the package so that everyone can use the code without problems.

These scripts are again written for sbatch, but it is easy to adapt them to PBS, BSUB and other systems.
It would also be good to share your scripts for different systems here (after making sure they work), so that other users with a similar queue system can use them.

*** Important: before testing these scripts you must:
-copy the uspex.x from the attached zip file to "the installation path of your USPEX"/application/archive/src/uspex.x
-specify commandExecutable in INPUT.txt.

-For remote submission, do not forget to add the "working directory" on the remote computer to INPUT.txt (as described in the manual under the keyword remoteFolder).


Best regards
USPEX-team
NewSubmissionFiles.zip

Bin Li

Feb 16, 2019, 5:41:33 AM
to USPEX
Hi,

I replaced uspex.x and edited submitJob_local.py and INPUT.txt. But this time USPEX does not seem to be able to submit a job. It returns a MATLAB error: "Cell contents reference from a non-cell array object.".

I tested submitJob_local.py by running "python ../Submission/submitJob_local.py" in the CalcFold1 folder, and it works well (for this test I set jobNumber = submitJob_local(index=1, commandExecutable="vasp") in submitJob_local.py). So the problem may be in uspex.x.

=============
$ python ../Submission/submitJob_local.py
CALLBACK 732
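
To check a modified script by hand before running USPEX, its output can be parsed the way a caller might parse it. This is a minimal sketch that assumes only the convention visible in this thread, namely the marker CALLBACK followed by the job ID. How uspex.x actually parses the output is not shown here, and `extract_job_id` is a hypothetical helper.

```python
import re


def extract_job_id(script_output):
    """Return the job ID that follows a 'CALLBACK' marker, or None.

    Accepts the ID on the same line ('CALLBACK 732') or on the
    following line ('CALLBACK' and then '732').
    """
    match = re.search(r"CALLBACK\s+(\d+)", script_output)
    return int(match.group(1)) if match else None


print(extract_job_id("CALLBACK 732\n"))
print(extract_job_id("CALLBACK \n1062133\n"))
print(extract_job_id("qsub: command not found\n"))
```

A result of None from such a check would suggest that the script printed something other than the expected marker and job ID.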

====output=====
$ USPEX -r
/bin/bash: synclient: command not found
*********************************************************
*                                                       *
  _|    _|     _|_|_|   _|_|_|     _|_|_|_|   _|      _| 
  _|    _|   _|         _|    _|   _|           _|  _|   
  _|    _|     _|_|     _|_|_|     _|_|_|         _|     
  _|    _|         _|   _|         _|           _|  _|   
    _|_|     _|_|_|     _|         _|_|_|_|   _|      _| 
*                                                       *
** USPEX v.10.2                          Oganov's Lab! **
*********************************************************
  
Structure 1 built with the symmetry group 190 (P-62c) , composition 2  0
Structure 2 built with the symmetry group 35 (Cmm2) , composition 0  2
Structure 3 built with the symmetry group 189 (P-62m) , composition 2  0
Structure 4 built with the topology 746 , composition 0  3
Structure 5 built with the topology 1880 , composition 3  0
Structure 6 built with the topology 845 , composition 0  2
 
 
Read Seeds ... 
 
Read AntiSeeds ...
Cell contents reference from a non-cell array object.
Error in submitJob (line 23)


Error in SubmitJobs (line 33)


Error in LocalRelaxation (line 22)


Error in EA_301 (line 12)


Error in Start (line 52)


Error in USPEX (line 39)
MATLAB:cellRefFromNonCell
===============

===============
$ ls -l /apps/USPEX-10.2/application/archive/src/
total 3836
-rwxr-xr-x. 1 root root     114 Nov 18 22:09 clean
drwxr-xr-x. 5 root root    4096 Feb 11 15:20 FunctionFolder
-rwxr-xr-x. 1 root root      36 Nov 18 22:09 job
drwxr-xr-x. 2 root root    4096 Feb 16 11:31 Submission
drwxr-xr-x. 2 root root    4096 Feb 11 15:20 Submission.bak
-rwxr-xr-x. 1 root root 1299468 Jan 31 15:32 uspex.old
-rwxr-xr-x. 1 root root 1299395 Feb 16 16:46 uspex.x
======================

==========
$ ls -lrt CalcFold1/
total 172
-rwxr-xr-x. 1 bli bli    292 Feb 16 16:55 getStuff
-rw-rw-r--. 1 bli bli    164 Feb 16 16:55 INCAR
-rw-rw-r--. 1 bli bli 156808 Feb 16 16:55 POTCAR
-rw-rw-r--. 1 bli bli    321 Feb 16 16:55 POSCAR
-rw-rw-r--. 1 bli bli     26 Feb 16 16:55 KPOINTS


=====submitJob_local.py=======

$ cat Submission/submitJob_local.py
from __future__ import with_statement
from __future__ import absolute_import
from subprocess import check_output
import re
import sys
from io import open


def submitJob_local(index, commandExecutable):
    """
    This routine is to submit job locally
    One needs to do a little edit based on your own case.

    Step 1: to prepare the job script which is required by your supercomputer
    Step 2: to submit the job with the command like qsub, bsub, llsubmit, .etc.
    Step 3: to get the jobID from the screen message
    :return: job ID
    """

    # Step 1
    myrun_content = ''
    myrun_content += '#!/bin/sh\n'
    myrun_content += '#PBS -l nodes=1:ppn=1,walltime=3:30:00 -q batch\n'
    myrun_content += '#PBS -N USPEX-' + unicode(index) + '\n'
    myrun_content += '#PBS -j oe\n'
    myrun_content += '#PBS -V \n'
    myrun_content += 'cd ${PBS_O_WORKDIR}\n'
    # myrun_content += 'cd ${PBS_O_WORKDIR}\n' check this, must have /cephfs suffix with SBATCH in my case
    myrun_content += commandExecutable + '\n'
    #myrun_content += 'mpirun -np 1 vasp > log\n'

    with open('myrun', 'w') as fp:
        fp.write(myrun_content)

    # Step 2
    # It will output some message on the screen like '2350873.nano.cfn.bnl.local'
    output = unicode(check_output('qsub myrun', shell=True))

    # Step 3
    # Here we parse job ID from the output of previous command
    #jobNumber = int(re.findall(r'\d+', output)[0])
    jobNumber = re.findall(r'\d+', output)[0]
    return jobNumber


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', dest='index', type=int)
    #parser.add_argument('-c', dest='commandExecutable', type=unicode)
    parser.add_argument('-c', dest='commandExecutable', type=unicode)
    args = parser.parse_args()

    jobNumber = submitJob_local(index=args.index, commandExecutable=args.commandExecutable)
    #jobNumber = submitJob_local(index=1, commandExecutable="vasp")
    #jobNumber = submitJob_local(index=args.index, commandExecutable="vasp")
    print 'CALLBACK ' + unicode(jobNumber)

====================================


====== INPUT.txt======
$ cat INPUT.txt 
******************************************
*      TYPE OF RUN AND SYSTEM            *
******************************************
% PARAMETERS EVOLUTIONARY ALGORITHM
USPEX : calculationMethod (USPEX, VCNEB, META)
301   : calculationType (dimension: 0-3; molecule: 0/1; varcomp: 0/1)

% optType
enthalpy
% EndOptType

% atomType
Si C
% EndAtomType

% numSpecies
1 0
0 1
% EndNumSpecies

4     : populationSize 
6     : initialPopSize
3     : numGenerations 
3     : stopCrit

2     : minAt
3     : maxAt

0.40  : fracGene 
0.20  : fracRand 
0.20  : fracAtomsMut 
0.20  : fracTrans
0.00  : fracLatMut

abinitioCode 
1 1 
ENDabinit

% KresolStart
0.20 0.20 
% Kresolend

1     : numParallelCalcs 
1     : whichCluster 


% commandExecutable 
mpirun -np 1 vasp > output
% EndExecutable
=======================

Зэд Икс

Feb 16, 2019, 11:34:50 AM
to USPEX
Dear Bin Li,

Thank you for testing and reporting the bug.

Dear All,
I have attached the new files (Submission_new.zip). I checked these files myself and ran a local submission test without any problem. I'm sure they will work for everyone.

If anyone can test the remote submission, that would be great.

Please don't forget to adapt the Python files to your system, if needed,
and copy the files (uspex.x and the Submission folder) to "the installation path of your USPEX"/application/archive/src/

I'll wait for a few responses to make sure everything works, and will then update the USPEX package.

Submission_new.zip

Konstantin Rushchanskii

Feb 18, 2019, 4:58:25 AM
to USPEX
Dear Zahed,

I tried the latest update. Submission now works fine, and USPEX receives the jobID.
During the check, the script checkStatus_local.py is called and answers

CALLRESULT
0

or
CALLRESULT
1

However, USPEX stops with the following error:

----------------------------------------------------------
Structure1 step1 at CalcFold1
JobID=6785191
Operands to the || and && operators must be convertible to logical scalar values.
Error in checkStatusC (line 169)
Error in META_ReadJobs (line 13)
Error in META_LocalRelaxation (line 5)
Error in META (line 8)
Error in META_Start (line 32)
Error in USPEX (line 47)
MATLAB:nonLogicalConditional
----------------------------------------------------------

Best,
Konstantin

Зэд Икс

Feb 19, 2019, 2:46:34 PM
to USPEX
Dear Konstantin,

I think you are not using the correct files, because you should not receive "CALLRESULT"; it should be "<CALLRESULT>".
(In fact, you should not see either of these, but if one does appear it should be the latter.)

Please refer to my last reply in this thread and use the files in the Submission_new.zip attached there.

Then, if the problem is still there, please let me know.

Konstantin Rushchanskii

Feb 19, 2019, 3:00:57 PM
to USPEX
Dear Zahed,

Sorry, I made a mistake in my previous message.
In fact the script writes

<CALLRESULT>
0

or
<CALLRESULT>
1

The problem is still there...

Also, I noticed that when the job has finished and is no longer in the queue list, the answer can look like

slurm_load_jobs error: Invalid job id specified

<CALLRESULT>

1


It would be nice if USPEX could filter out those extra messages.


Thank you in advance,

Konstantin


Зэд Икс

Feb 19, 2019, 3:18:41 PM
to USPEX
Please attach your checkStatus_local.py and submitJob_local.py

You can filter out messages like "slurm_load_jobs error: Invalid job id specified" by adding a few lines to your checkStatus_local.py. It is open source and you are free to modify it according to your system.
These messages are produced while checkStatus_local.py runs, not by the USPEX files, so that script is the only place where they can be suppressed.
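
A minimal sketch of such filtering, assuming the unwanted text is written to the scheduler command's standard error. The helper name `run_quiet` is hypothetical. It discards stderr and ignores the exit status, returning only standard output.

```python
import os
import subprocess
import sys


def run_quiet(command):
    """Run command (a list of arguments) and return its stdout as text.

    stderr is discarded and a non-zero exit status is ignored, so
    scheduler complaints about unknown job IDs never reach the screen.
    """
    with open(os.devnull, "w") as devnull:
        process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=devnull)
        stdout, _ = process.communicate()
    return stdout.decode("utf-8", "replace")


# Stand-in for a scheduler call that prints an error message and fails.
fake_scheduler = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('Invalid job id specified\\n'); "
    "print('no jobs'); sys.exit(1)",
]
print(run_quiet(fake_scheduler).strip())
```

In checkStatus_local.py the argument list would be the real queue query, for example ['squeue', '-j', str(jobID)] on Slurm or ['qstat', str(jobID)] on PBS.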

Konstantin Rushchanskii

Feb 19, 2019, 3:51:42 PM
to USPEX
The Python files are enclosed.

Submission works well, and USPEX receives the jobID.
checkStatus_local.py is properly called and answers

<CALLRESULT>
0

At this point no error message from Slurm was generated, because the job was still in the queue.
Then USPEX stops with the error:
checkStatus_local.py
submitJob_local.py

Tomasz Pawlak

Feb 21, 2019, 4:00:31 AM
to USPEX

Dear all,

I see that you have a problem with executing the 'checkStatus_local.py' script properly. However, I am one step before that. I can submit jobs, but then 'checkStatus_local.py' is not called at all. I do not get any error; the run just finishes. In the nohup.out file I see something like the output below.


How can I force USPEX to run the check-status script?


====nohup.out=========
 
Individual : 1 -- JobID :2053

Individual : 2 -- JobID :2054

Individual : 3 -- JobID :2055

Individual : 4 -- JobID :2056

Submission dir copied to the current directory.

Vishank Kumar

Feb 25, 2019, 9:43:12 AM
to USPEX
Dear USPEX Developers,

I am trying to run example EX01 with USPEX_v10.2 and the ABINIT interface. I set "abinitiocode: 18" and "whichcluster: 0" for local submission, and I get the following error message:

Undefined function or variable 'listCommand'.
Error in createORG_AbinitCode (line 104)


Error in createORGStruc (line 83)


Error in Start (line 37)


Error in USPEX (line 39)
MATLAB:UndefinedFunction
 
I have modified the "Specific" folder to include the ABINIT files: abinit_1, abinit_2, abinit_3, run.files, Si.psp (as provided by Zahed earlier). I think the error is somehow related to linking the correct filenames. Would you please let me know how I can link the files correctly?

Thanks,
Vishank

Vishank Kumar

Feb 25, 2019, 9:45:34 AM
to USPEX
I forgot to attach the input file, here is the INPUT.txt file I used for the example.


On Sunday, 10 February 2019 19:21:03 UTC+1, Bin Li wrote:
INPUT.txt

lxf...@gmail.com

Feb 25, 2019, 10:07:00 AM
to USPEX
Dear Bin Li
       I want to run USPEX v10.2 on a single local machine (which I access through MobaXterm). Which value should I set for whichCluster in INPUT.txt, 0 or 1? I first set "0 : whichCluster", and it did not work. Then I set "1 : whichCluster" and modified submitJob_local.py as follows:

 # Step 1
    myrun_content = ''
    myrun_content += '#!/bin/sh\n'
    #myrun_content += '#SBATCH -o out\n'
    #myrun_content += '#SBATCH -p cpu\n'
    #myrun_content += '#SBATCH -J USPEX-' + str(index) + '\n'
    #myrun_content += '#SBATCH -t 06:00:00\n'
    #myrun_content += '#SBATCH -N 1\n'
    #myrun_content += '#SBATCH -n 8\n'
    #myrun_content += 'cd ${PBS_O_WORKDIR}\n' check this, must have /cephfs suffix with SBATCH in my case
    myrun_content += 'mpirun vasp_std.base > log\n'
    with open('myrun', 'w') as fp:
        fp.write(myrun_content)
......
That does not work either. You said that it works well for you with "0 : whichCluster", so I would like to ask how you did it. Thank you.

On Monday, February 11, 2019 at 2:21:03 AM UTC+8, Bin Li wrote:

Зэд Икс

Feb 25, 2019, 2:13:46 PM
to USPEX
Dear Vishank,

You didn't specify the command for running your ab initio code. Please add it to INPUT.txt (for more information, see the manual).

For example: 
% commandExecutable 
abinit < input >& out
% EndExecutable


Tomasz Pawlak

Feb 26, 2019, 3:55:19 AM
to USPEX
Hi,

Let me repeat my question in slightly different words. I can submit jobs, but then the script 'checkStatus_local.py' is not called at all. I do not get any error; the run just finishes without any error or comment. In the "nohup.out" file I see the same kind of output as in my previous message. Please also find attached my INPUT and Python files. What am I doing wrong?


I would appreciate any comments,
Tomasz
checkStatus_local.py
INPUT.txt
submitJob_local.py

Зэд Икс

Feb 26, 2019, 8:27:57 AM
to USPEX
Dear Tomasz,

First, please remove all the "print" statements that you added to the Python files, and keep only those that are needed (the ones originally written by the developers).
Second, if you receive the job ID and USPEX then quits without any error, that is simply how USPEX works when whichCluster is not 0. Users should therefore run USPEX from a bash script that calls it every 3-5 minutes.
Please see the manual for details.

You can try running USPEX -r again; if the jobs have finished, you should see USPEX call checkStatus_local.py and then submit new structures.
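
The periodic re-run can be sketched as follows. This is only an illustration, written in Python rather than as the bash script mentioned above. The stop-file name USPEX_IS_DONE, the 5-minute interval and the helper `rerun_until_done` are assumptions, so check the manual for the recommended script.

```python
import os
import subprocess
import time


def rerun_until_done(run_once, is_done, interval_s=300, sleep=time.sleep):
    """Call run_once() every interval_s seconds until is_done() is true.

    Returns the number of times run_once() was called.
    """
    runs = 0
    while not is_done():
        run_once()
        runs += 1
        sleep(interval_s)
    return runs


def run_uspex():
    # 'USPEX -r' is the command used throughout this thread.
    subprocess.call("USPEX -r", shell=True)


def uspex_finished():
    # Assumption: a file named USPEX_IS_DONE marks the end of the whole run.
    return os.path.exists("USPEX_IS_DONE")
```

To start the loop, call rerun_until_done(run_uspex, uspex_finished) from the calculation folder, for example inside a nohup or screen session.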

Tomasz Pawlak

Feb 26, 2019, 10:22:07 AM
to USPEX


Thank you!
Honestly, I did not find in the manual (and I thought I had read it carefully) that I have to re-run USPEX -r repeatedly myself. Now it works without any error, checkStatus_local.py is called properly, etc.

Thank you for the short but very helpful comment ;)

1611...@st.vju.ac.vn

Feb 27, 2019, 1:11:45 AM
to USPEX
Hello,
Although I copied these files as instructed, I still get an error like this:
==========

Submission dir copied to the current directory.
Failed to connect to X Server.
*********************************************************
*                                                       *
  _|    _|     _|_|_|   _|_|_|     _|_|_|_|   _|      _| 
  _|    _|   _|         _|    _|   _|           _|  _|   
  _|    _|     _|_|     _|_|_|     _|_|_|         _|     
  _|    _|         _|   _|         _|           _|  _|   
    _|_|     _|_|_|     _|         _|_|_|_|   _|      _| 
*                                                       *
** USPEX v.10.2                          Oganov's Lab! **
*********************************************************

Structure 1 built with space group 3 (P2)
Structure 2 built with space group 112 (P-42c)
Structure 3 built with space group 89 (P422)
Structure 4 built with space group 8 (Am)
Structure 5 built with space group 177 (P622)
Structure 6 built with space group 104 (P4nc)
Structure 7 built with space group 17 (P222_1)
Structure 8 built with space group 137 (P4_2/nmc)
Structure 9 built with space group 149 (P312)
Structure 10 built with space group 210 (F4_132)
Structure 11 built with space group 34 (Pnn2)
Structure 12 built with space group 46 (Ima2)
Structure 13 built with space group 188 (P-6c2)
Structure 14 built with topology 1524
Structure 15 built with topology 932
Structure 16 built with topology 1497
Structure 17 built with topology 1845
Structure 18 built with topology 1506
Structure 19 built with topology 872
Structure 20 built with topology 1472

Read Seeds ... 

Read AntiSeeds ...
Cell contents reference from a non-cell array object.
Error in submitJob (line 23)
Error in SubmitJobs (line 33)
Error in LocalRelaxation (line 22)
Error in EA_300 (line 13)
Error in Start (line 52)
Error in USPEX (line 39)
MATLAB:cellRefFromNonCell

==========
Can anyone help me fix this? Thank you so much!


On Saturday, February 16, 2019 at 8:26:54 AM UTC+9, USPEX wrote:
Error.rtf

Celine Dupont

Feb 27, 2019, 3:09:01 AM
to USPEX
Dear Developers,

I am also having trouble with local submission in this new version of USPEX.

I can submit my job; USPEX creates the different structures properly and launches VASP.
However, when a VASP calculation is done, the next one is not launched.

Please find enclosed:
- the submitJob and checkStatus files
- the error message I get
- what qstat shows on my cluster

Thanks in advance for your help
Céline
checkStatus_local.py
error
qstat
submitJob_local.py

Celine Dupont

Feb 27, 2019, 4:02:38 AM
to USPEX
There was a small typo in the file "checkStatus_local.py" I sent previously.
Here are the correct versions of the files I used.

Thanks
Celine
checkStatus_local.py
error
qstat
submitJob_local.py

klx...@gmail.com

Feb 27, 2019, 4:22:14 AM
to USPEX
Hi Celine

The command passed to "check_output" must return exit status 0; otherwise check_output raises an error.
When you pipe qstat through grep and the job has finished, grep cannot find the job number and returns a non-zero status.

So I am using the following command:

output = unicode(check_output(u'qstat | grep {}; true'.format(jobID), shell=True))

This is not an elegant solution, but it returns zero.
If anyone knows a smarter way, please let me know.
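
One possible alternative, sketched here under the assumption that a non-zero exit status simply means the job is no longer listed: catch subprocess.CalledProcessError instead of appending "; true". The helper name `safe_output` is hypothetical.

```python
import subprocess
import sys


def safe_output(command):
    """Return the command's stdout as text, or '' if it exits non-zero."""
    try:
        raw = subprocess.check_output(command, shell=True)
    except subprocess.CalledProcessError:
        # e.g. grep found no matching line because the job has finished
        return ""
    return raw.decode("utf-8", "replace")


# Demonstration with stand-in commands instead of 'qstat | grep <jobID>'.
ok_cmd = '"{}" -c "print(725)"'.format(sys.executable)
fail_cmd = '"{}" -c "import sys; sys.exit(1)"'.format(sys.executable)
print(repr(safe_output(ok_cmd).strip()))
print(repr(safe_output(fail_cmd)))
```

In checkStatus_local.py this would replace the call shown above, for example output = safe_output('qstat | grep {}'.format(jobID)). An empty string then means that no matching job line was found.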

On Wednesday, February 27, 2019 at 6:02:38 PM UTC+9, Celine Dupont wrote:

Celine Dupont

Feb 27, 2019, 4:52:03 AM
to USPEX
Hi

Thanks a lot for your fast answer. I will give it a try.

Best
Céline

Xabier Méndez Aretxabaleta

Feb 27, 2019, 9:27:13 AM
to USPEX
I get the same error.

Jing Meng

Jul 17, 2019, 9:45:01 AM
to USPEX
Dear Зэд Икс,
I am trying to use the scripts you last uploaded to the forum (with "1 : whichCluster"). The user guide says that one needs to tell USPEX how to submit the job and how to check whether the job has completed, but I don't know which parts I should change. Could you explain in detail how to modify the scripts, and add explanatory comments after each command line? This would help a lot of users. Thank you very much!
On Sunday, February 17, 2019 at 12:34:50 AM UTC+8, Зэд Икс wrote:
submitJob_local.py
checkStatus_local.py