Interfacing AE Net with a Monte Carlo Code


troydl...@gmail.com

Oct 15, 2018, 12:53:58 AM
to aenet
Hi,

I'm developing an interface between AENet and a Monte Carlo code in order to perform some property calculations.  I had some success getting AENet to return the energy to the MC code correctly, but I have been having trouble with segmentation faults originating from the AENet code.  The MC code is also written in Fortran.

In particular, with optimization on (Intel Fortran Compiler), it segfaults on line 353 of symmfunc.f90 with the error

forrtl: severe (408): fort: (2): Subscript #1 of the array SF_NG_TYPE has value 1212789451 which is greater than the upper bound of 2

With the ifort debug flags fully enabled, I also get an error without optimization on line 137 of lclist.f90, since an unallocated array (Cvec) is passed to translation_vectors.

I was wondering if someone might be able to help me resolve these issues.  In this case, I am calling the AENet functions directly as an external library, with a wrapper that calls the get_energy and initialize functions.  Initialize works without much trouble; the segmentation faults begin when I go to compute the energy.

Thanks

-Troy

Nongnuch Artrith

Oct 15, 2018, 5:30:55 PM
to troydl...@gmail.com, aenet
Dear Troy,

It would be helpful if you could send us a minimal code example that showcases these issues.  Without that, I can only comment in general:

> In particular, with optimization on (Intel Fortran Compiler), it segfaults on line 353 of symmfunc.f90 with the error
This module has been stable and unchanged for many years, and line 353 looks very innocent.  Which level of optimization are you using with ifort?  We found that "-O3" breaks the code with some versions of ifort, while "-O2" gives an efficient and stable binary with all versions we tested.

> With the ifort debug flags fully enabled, I also get an error without optimization on line 137 of lclist.f90, since an unallocated array (Cvec) is passed to translation_vectors.
The first call to 'translation_vectors' only counts the translation vectors; only in the second call are they stored in the cell vector array 'Cvec'.  So it is fine to call the subroutine with 'Cvec' unallocated.  We could make 'Cvec' an optional parameter to get rid of the compiler warning, but this is definitely not a bug.
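The count-then-store pattern with an optional array argument can be sketched as follows. This is only an illustration under stated assumptions, not the aenet routine: the module `translation_demo` and the subroutine `cubic_translations` are hypothetical names, and the cell is assumed to be cubic to keep the example short.

```fortran
module translation_demo
  implicit none
  private
  public :: cubic_translations

contains

  ! Hypothetical stand-in, NOT aenet's translation_vectors:
  ! finds the integer translations (i,j,k) of a cubic cell with edge `a`
  ! whose length does not exceed `rcut`.  The vectors are written only
  ! when the optional argument `tvec` is supplied by the caller.
  subroutine cubic_translations(a, rcut, nvec, tvec)
    double precision, intent(in)   :: a, rcut
    integer, intent(out)           :: nvec
    integer, intent(out), optional :: tvec(:,:)   ! shape (3, >= nvec)
    integer :: i, j, k, nmax

    nmax = int(rcut / a)
    nvec = 0
    do i = -nmax, nmax
       do j = -nmax, nmax
          do k = -nmax, nmax
             if (a*a*dble(i*i + j*j + k*k) > rcut*rcut) cycle
             nvec = nvec + 1
             ! Never touch the array unless the caller passed one.
             if (present(tvec)) tvec(:, nvec) = [i, j, k]
          end do
       end do
    end do
  end subroutine cubic_translations

end module translation_demo
```

A caller would first invoke `cubic_translations(a, rcut, nvec)` to obtain the count, allocate an array of shape `(3, nvec)`, and then call the routine again with that array as the fourth argument. Because the first call omits the array entirely, runtime checks have no unallocated actual argument to complain about.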

With a minimal example code we might be able to give more specific advice. 

Best,
Nong

troydl...@gmail.com

Oct 15, 2018, 6:49:19 PM
to aenet
Hi,

I can provide the wrapper code in full since it is a self-contained module, but most of the code deals with converting the MC code's atomistic information into a format that get_energy in predict.f90 can understand.

      call initialize_lib(str1, str2, indata)


I call the "initialize" function from AENet in the constructor function, which runs during simulation initialization, so it is called once at the start of the simulation. The only major changes I have made to the AENet code are making it begin reading from the second input file on the command line instead of the first, to avoid having it read the MC code's input script, and adding the _lib extension to routine names to avoid a namespace conflict. However, this loads just fine, since I was able to get the same energy as the predict executable for a single configuration.  The actual energy calculation lines are similar to what is found in predict.f90.

      boxrecp(:,:) = geo_recip_lattice(box)

      tempcoords(1:3,1:nCurAtoms) = matmul(boxrecp, tempcoords)/(2.0E0_dp*pi)

      call get_energy(box, nCurAtoms, tempcoords(1:3, 1:nCurAtoms), atomTypes(1:nCurAtoms), pbc, Ecoh, E_T)
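The Cartesian-to-fractional conversion in the lines above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the wrapper's or aenet's code: the module `frac_coords_demo` and the routine `cart_to_frac` are hypothetical names, and the lattice vectors are assumed to be the columns of `box` (aenet's storage convention may differ). Multiplying by the inverse lattice matrix is equivalent to multiplying by reciprocal vectors that carry a factor of 2π and then dividing by 2π. The sketch also uses the same `1:natoms` slice on both sides of the assignment, because Fortran requires both sides of an array assignment to have the same shape; if a work array were allocated larger than the current atom count, an unsliced right-hand side would not conform.

```fortran
module frac_coords_demo
  implicit none
  private
  public :: cart_to_frac
  integer, parameter :: dp = kind(1.0d0)

contains

  ! Illustrative only.  ASSUMPTION: lattice vectors are the columns of
  ! `box`, so that r_cart = matmul(box, s_frac).
  subroutine cart_to_frac(box, natoms, coords)
    real(dp), intent(in)    :: box(3,3)
    integer,  intent(in)    :: natoms
    real(dp), intent(inout) :: coords(:,:)   ! shape (3, >= natoms)
    real(dp) :: inv(3,3), vol

    ! Rows of the inverse are cross products of lattice vectors / volume.
    inv(1,:) = cross(box(:,2), box(:,3))
    inv(2,:) = cross(box(:,3), box(:,1))
    inv(3,:) = cross(box(:,1), box(:,2))
    vol = dot_product(box(:,1), inv(1,:))
    inv = inv / vol

    ! Same slice on both sides, so the shapes conform even when
    ! coords holds more columns than there are atoms.
    coords(1:3, 1:natoms) = matmul(inv, coords(1:3, 1:natoms))
  end subroutine cart_to_frac

  pure function cross(u, v) result(w)
    real(dp), intent(in) :: u(3), v(3)
    real(dp) :: w(3)
    w(1) = u(2)*v(3) - u(3)*v(2)
    w(2) = u(3)*v(1) - u(1)*v(3)
    w(3) = u(1)*v(2) - u(2)*v(1)
  end function cross

end module frac_coords_demo
```

Columns beyond `natoms` are left untouched, which is convenient when the atom count changes between Monte Carlo moves and the work array is sized for the maximum.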


Outside of that, the remaining code deals mainly with the Monte Carlo code's formatting, and I have double-checked that the information being passed is consistent with what predict.f90 would receive.  The problem occurs when an actual simulation run is attempted.

I did use -O3 -xHost, so I can try with -O2.  In my experience, an error like the one at line 353 occurs when Intel Fortran encounters a segmentation fault earlier in the code and it modifies the next array or item in memory.  GFortran is less likely to do this.  Intel Fortran can be picky about how unallocated arrays are passed.

I checked that I wasn't missing a function call, but it seemed like most of the other function calls in predict.f90 were related to I/O or parallelization.

Any insight would be great, thanks.

-Troy
FF_AENet.f90

troydl...@gmail.com

Oct 15, 2018, 6:52:40 PM
to aenet
Oops, I accidentally did a copy-paste when copying the code file and a few lines were messed up.  Here's the correct wrapper code.
FF_AENet.f90