Trying to run USPEX in parallel locally

remi marchal

Mar 22, 2018, 6:37:49 AM
to USPEX
Dear USPEX developers,

I am quite new to the USPEX community and am trying to run USPEX in parallel locally.

I have a 36-core machine and want to run USPEX with several parallel jobs on this machine locally, using Octave.

I first tried to run the example EX01.

I changed the submitJob_local.m function so that it launches VASP through the nohup command and writes the PID to a file.

Then, I changed the checkStatus_local.m function so that it checks whether the calculation is still running, using a ps -ef | grep PID_NUMBER command.

I include both files in this post and also attach them.

Then, I set
1     : whichCluster 
2     : numParallelCalcs

and launched USPEX with the USPEX -o -r command.
The calculation starts, launches the 2 parallel jobs, and then stops with messages like:
sh: 2: .JOB-Gen1Ind1Step1Fold1: not found

I also include the log at the end and attach it as well.

Can anybody help me, please?

Best regards.

Rémi


HERE IS THE LOG
GNU Octave, version 3.8.2
Copyright (C) 2014 John W. Eaton and others.
This is free software; see the source code for copying conditions.
There is ABSOLUTELY NO WARRANTY; not even for MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  For details, type 'warranty'.

Octave was configured for "x86_64-pc-linux-gnu".

Additional information about Octave is available at http://www.octave.org.

Please contribute if you find this software useful.
For more information, visit http://www.octave.org/get-involved.html

Read http://www.octave.org/bugs.html to learn how to submit bug reports.
For information about changes from previous versions, type 'news'.

  

  Initial space group: Pba2 (32)
  Actual  space group: Pba2 (32) (determined with tolerance=0.1)
Structure 1 built with the symmetry group 32 (Pba2)
  Initial space group: Fd-3 (203)
  Actual  space group: Fd-3m (227) (determined with tolerance=0.1)
Structure 2 built with the symmetry group 203 (Fd-3)
  Initial space group: I432 (211)
  Actual  space group: Im-3m (229) (determined with tolerance=0.1)
Structure 3 built with the symmetry group 211 (I432)
  Initial space group: P-3c1 (165)
  Actual  space group: P-3m1 (164) (determined with tolerance=0.1)
Structure 4 built with the symmetry group 165 (P-3c1)
  Initial space group: F23 (196)
  Actual  space group: F-43m (216) (determined with tolerance=0.1)
Structure 5 built with the symmetry group 196 (F23)
  Initial space group: Fm-3c (226)
  Actual  space group: Pm-3m (221) (determined with tolerance=0.1)
Structure 6 built with the symmetry group 226 (Fm-3c)
  Initial space group: P-4c2 (116)
  Actual  space group: P4mm (99) (determined with tolerance=0.1)
Structure 7 built with the symmetry group 116 (P-4c2)
  Initial space group: Pmna (53)
  Actual  space group: Pmna (53) (determined with tolerance=0.1)
Structure 8 built with the symmetry group 53 (Pmna)
  Initial space group: C2cm (40)
  Actual  space group: P1 (1) (determined with tolerance=0.1)
Structure 9 built with the symmetry group 40 (C2cm)
  Initial space group: P6/m (175)
  Actual  space group: P6/m (175) (determined with tolerance=0.1)
Structure 10 built with the symmetry group 175 (P6/m)
  Initial space group: Imma (74)
  Actual  space group: P-1 (2) (determined with tolerance=0.1)
Structure 11 built with the symmetry group 74 (Imma)
  Initial space group: P4_2cm (101)
  Actual  space group: P4_2cm (101) (determined with tolerance=0.1)
Structure 12 built with the symmetry group 101 (P4_2cm)
  Initial space group: Pn-3n (222)
  Actual  space group: Pm-3m (221) (determined with tolerance=0.1)
Structure 13 built with the symmetry group 222 (Pn-3n)
  Initial space group: C2cb (41)
  Actual  space group: C2cb (41) (determined with tolerance=0.1)
Structure 14 built with the symmetry group 41 (C2cb)
  Initial space group: P6_3mc (186)
  Actual  space group: P6_3mc (186) (determined with tolerance=0.1)
Structure 15 built with the symmetry group 186 (P6_3mc)
  Initial space group: P4bm (100)
  Actual  space group: P4bm (100) (determined with tolerance=0.1)
Structure 16 built with the symmetry group 100 (P4bm)
  Initial space group: P4nc (104)
  Actual  space group: P4/mnc (128) (determined with tolerance=0.1)
Structure 17 built with the symmetry group 104 (P4nc)
  Initial space group: Fmmm (69)
  Actual  space group: A2/m (12) (determined with tolerance=0.1)
Structure 18 built with the symmetry group 69 (Fmmm)
  Initial space group: P4_2 (77)
  Actual  space group: P4_2 (77) (determined with tolerance=0.1)
Structure 19 built with the symmetry group 77 (P4_2)
  Initial space group: P4mm (99)
  Actual  space group: P4/mmm (123) (determined with tolerance=0.1)
Structure 20 built with the symmetry group 99 (P4mm)

 

ATTENTION! In 12 / 20 cases actually generated symmetry was different.

 

 

Read Seeds ... 

 

Read AntiSeeds ...
a = 0
b = 
a = 0
b = 104282



104282

sh: 2: .JOB-Gen1Ind1Step1Fold1: not found
a = 0
b = 
a = 0
b = 104309



104309

sh: 2: .JOB-Gen1Ind2Step1Fold2: not found
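
A note on the "sh: 2: ... not found" lines above: the "2" is the line number inside the command string handed to sh, so one possible reading (an assumption, not a confirmed diagnosis) is that the job ID still carries the trailing newline from cat run.pid and splits a later command into two lines. The sketch below reproduces that effect in plain shell. The way the command is concatenated here is hypothetical and is not taken from the USPEX sources.

```shell
#!/bin/sh
# A PID captured verbatim, including the newline that `cat run.pid` prints.
pid_with_newline='104282
'

# Hypothetical concatenation: job ID followed directly by a file name.
cmd="echo ${pid_with_newline}.JOB-Gen1Ind1Step1Fold1"

# sh now sees two lines: "echo 104282" and ".JOB-Gen1Ind1Step1Fold1".
# The second line is treated as a command, which does not exist.
status=0
out=$(sh -c "$cmd" 2>&1) || status=$?
echo "exit status: $status"
echo "$out"

# Same concatenation after removing the newline: a single command again.
pid_clean=$(printf '%s' "$pid_with_newline" | tr -d '\n')
out_clean=$(sh -c "echo ${pid_clean}.JOB-Gen1Ind1Step1Fold1" 2>&1)
echo "$out_clean"
```

If this reading is right, the analogous step in Octave would be to strip the whitespace before returning the ID, for example with strtrim(b).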

---------------------------------------------------------------------------

HERE IS THE sumitJob_local.m file
function jobNumber = submitJob_local()
%-------------------------------------------------------------
%This routine submits the job and returns its job ID (here, the PID written to run.pid)
%One needs to do a little edit based on your own case.

%1   : whichCluster (default 0, 1: local submission, 2: remote submission)
%-------------------------------------------------------------
[a,b] = unix(['nohup /cluster_cti/utils/openmpi/openmpi-1.10.2/bin/mpirun -np 2 /cluster_cti/bin/VASP/bin/vasp_std > /dev/null 2>&1 & echo $! > run.pid'])
[a,b] = unix(['cat run.pid'])
unix(['echo ' a]);
unix(['echo ' b]);
jobNumber = b;
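
For reference, the launch-and-record step that the unix() calls above perform can be sketched in plain shell as follows. This is a minimal sketch: sleep 10 is only a placeholder for the real mpirun/VASP command line, and job_cmd and job_id are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Placeholder for the real mpirun ... vasp_std command line (assumption).
job_cmd="sleep 10"

# Start the job detached from the terminal, discard its output,
# and record the PID of the background process in run.pid.
nohup $job_cmd > /dev/null 2>&1 &
echo $! > run.pid

# Read the PID back. Command substitution strips the trailing newline,
# so job_id holds digits only.
job_id=$(cat run.pid)
echo "submitted job with PID ${job_id}"
```

Unlike $(...) in the shell, the second output of Octave's unix() keeps the newline that cat prints, which is worth keeping in mind when the PID is reused in later commands.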

---------------------------------------------------------------------------

HERE IS THE checkStatus_local.m file
function doneOr = checkStatus_local(jobID)
%--------------------------------------------------------------------
%This routine is to check if the submitted job is done or not
%One needs to do a little edit based on your own case.
%1   : whichCluster (0: no-job-script, 1: local submission, 2: remote submission)
%--------------------------------------------------------------------

[a,b] = unix([' ps -ef | grep "mpirun" | grep ' jobID ' | wc -l'])
isOK=str2num( b );
if isOK == 0
    doneOr = 1;
else
    doneOr = 0;
end
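
The ps -ef | grep check above matches the job ID anywhere in a line, so a shorter PID such as 1042 would also match an unrelated process 104282. A stricter liveness test can be sketched in shell with kill -0, which sends no signal and only reports whether the PID exists. This is a sketch under two assumptions: the job runs under the same user as the check, and is_done is a hypothetical helper name. Note that it follows the shell convention (exit status 0 means "done"), whereas checkStatus_local returns doneOr = 1 for a finished job.

```shell
#!/bin/sh
# Succeeds (status 0) when no process with the given PID exists,
# fails (status 1) while the process is still there.
is_done() {
    if kill -0 "$1" 2>/dev/null; then
        return 1   # process still exists
    else
        return 0   # no such process: treat the job as finished
    fi
}

# Short demonstration with a placeholder job.
sleep 2 &
pid=$!
if is_done "$pid"; then echo "done"; else echo "running"; fi
wait "$pid"
if is_done "$pid"; then echo "done"; else echo "running"; fi
```

The first check runs while the placeholder job is alive, and the second runs after wait has collected it.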

---------------------------------------------------------------------------
checkStatus_local.m
log
sumitJob_local.m