Dear all,
I have a run with
###########################################
integer, parameter :: ncpus=1600,nprocx=4,nprocy=20,nprocz=20
integer, parameter :: nxgrid=400,nygrid=400,nzgrid=400
integer, parameter :: npar=nxgrid*nygrid*nzgrid*10, mpar_loc=npar/160, npar_mig=204800
integer, parameter :: npar_stalk=1.0e4
###########################################
and run it on a computer where I am the only user!
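For reference, the particle budget implied by these parameters can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes integer division for `npar/160` (as Fortran does for integers) and an even spread of particles over all processes, which a run with clumping particles need not satisfy.

```python
# Particle budget implied by the parameter block quoted above.
ncpus = 1600
nxgrid = nygrid = nzgrid = 400

npar = nxgrid * nygrid * nzgrid * 10  # total number of particles
mpar_loc = npar // 160                # per-process particle array size (integer division)

# Assumption: particles are spread evenly over all processes.
mean_par_per_proc = npar // ncpus
headroom = mpar_loc / mean_par_per_proc

print(f"npar             = {npar:,}")
print(f"mpar_loc         = {mpar_loc:,}")
print(f"mean per process = {mean_par_per_proc:,}")
print(f"headroom factor  = {headroom:.1f}")
```

Under that even-spread assumption, `mpar_loc` leaves room for ten times the mean particle count per process.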
But after the first run (max_walltime=16000 ! hard walltime = 4.5 hr) it does not continue; the restart fails with this error message:
###########################################
Subcommand: run
forrtl: severe (67): input statement requires too much data, unit 88, file /isaac/u/aschreib/pc_projects/scaling_tests/a_3d_max_scaling/1600_42020/data/proc1077/var.dat
Image PC Routine Line Source
run.x 0000000000411FFC Unknown Unknown Unknown
run.x 00000000004460DE Unknown Unknown Unknown
run.x 00000000004437FB Unknown Unknown Unknown
run.x 000000000073BE5D Unknown Unknown Unknown
run.x 000000000073A3B9 Unknown Unknown Unknown
run.x 00000000005092F5 Unknown Unknown Unknown
run.x 00000000004039DE Unknown Unknown Unknown
libc-2.22.so 00002B7801FED6E5 __libc_start_main Unknown Unknown
run.x 00000000004038E9 Unknown Unknown Unknown
forrtl: severe (67): input statement requires too much data, unit 88, file /isaac/u/aschreib/pc_projects/scaling_tests/a_3d_max_scaling/1600_42020/data/proc983/var.dat
Image PC Routine Line Source
run.x 0000000000411FFC Unknown Unknown Unknown
run.x 00000000004460DE Unknown Unknown Unknown
run.x 00000000004437FB Unknown Unknown Unknown
run.x 000000000073BE5D Unknown Unknown Unknown
run.x 000000000073A3B9 Unknown Unknown Unknown
run.x 00000000005092F5 Unknown Unknown Unknown
run.x 00000000004039DE Unknown Unknown Unknown
libc-2.22.so 00002B3F6BAFB6E5 __libc_start_main Unknown Unknown
run.x 00000000004038E9 Unknown Unknown Unknown
forrtl: severe (67): input statement requires too much data, unit 88, file /isaac/u/aschreib/pc_projects/scaling_tests/a_3d_max_scaling/1600_42020/data/proc505/var.dat
###########################################
Furthermore, the timestamps become inconsistent:
###########################################
...
SVN: ------- v. ( ) $Id$
The verbose level is ip= 14 (ldebug= F )
This is a 3-D run
nxgrid, nygrid, nzgrid= 400 400 400
Lx, Ly, Lz= 2.000000000000000E-002 2.000000000000000E-002
2.000000000000000E-002
Vbox= 8.000000000000001E-006
Timestamps in snapshot INCONSISTENT. Using (max) t= 8.307890670577762E-002
with ireset_tstart= 2 .
Timestamps in snapshot INCONSISTENT. Using t= 8.307890670577762E-002 .
###########################################
The first run ends with
###########################################
7500 0.080 1.065E-05 1.542E+04 1.279E-08 5.282E-08 1.209E-06 2.246E-05 7.209E-07 7.209E-07 4.866E-03 1.184E-05 9.470E-11 9.999E-01 1.000E+00 1.000E+00 0.000E+00 7.854E+00 3.600E+01 -2.715E-03 2.356E+00 6.467E+00 -1.943E-22 1.467E-08 9.485E-08 3.049E-07 3.077E-05 4.661E-05
7600 0.081 1.065E-05 1.564E+04 1.294E-08 5.291E-08 1.227E-06 2.255E-05 7.292E-07 7.292E-07 4.877E-03 1.189E-05 9.513E-11 9.999E-01 1.000E+00 1.000E+00 0.000E+00 7.854E+00 3.200E+01 -2.376E-02 2.356E+00 6.571E+00 -2.776E-22 1.486E-08 9.480E-08 3.113E-07 3.072E-05 4.655E-05
Maximum walltime exceeded
Simulation finished after 7684 time-steps
Writing final snapshot at time t = 8.201366450196135E-002
Wall clock time [hours] = 4.44 (+/- 2.78E-10)
Wall clock time/timestep/(meshpoint+particle) [microsec] = 2.958E-03
Fri Feb 17 19:18:30 2017
Running
pc_deprecated_slice_links
###########################################
So the first run itself appears to finish cleanly.
Makefile.local is:
###########################################
MPICOMM = mpicomm
HYDRO = hydro
DENSITY = density
ENTROPY = noentropy
MAGNETIC = nomagnetic
RADIATION = noradiation
PSCALAR = nopscalar
GRAVITY = nogravity
FORCING = noforcing
SHEAR = shear
SHOCK = shock_highorder
PARTICLES = particles_dust
PARTICLES_MAP = particles_map
FOURIER = fourier_fftpack
REAL_PRECISION = double
SELFGRAVITY = selfgravity
POISSON = poisson
PARTICLES_SELFGRAVITY = particles_selfgravity
PARTICLES_STALKER = particles_stalker
###########################################
The computer is brand new, so maybe I made a mistake in the compiler setup?
My guess is that I am somehow allocating too much storage for the particles and the snapshots are not written completely. Or maybe the output gets too large and is not read back consistently, because not enough storage is allocated when resuming the run.
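One way to test the incomplete-snapshot guess would be to compare the sizes of the per-process snapshot files. The sketch below is a hypothetical helper, not part of the simulation code. It assumes the `data/proc*/var.dat` layout seen in the error message, and it assumes that every process should write a file of the same size, which may not hold if the subdomains differ. The function name `find_odd_snapshots` is made up for illustration.

```python
import os
from collections import Counter
from glob import glob


def find_odd_snapshots(datadir, filename="var.dat"):
    """Compare snapshot file sizes across the proc* directories.

    Returns (common_size, odd), where common_size is the most frequent
    file size and odd maps each deviating file path to its size.
    Returns (None, {}) if no matching files are found.
    """
    sizes = {}
    for path in glob(os.path.join(datadir, "proc*", filename)):
        sizes[path] = os.path.getsize(path)
    if not sizes:
        return None, {}

    # Assumption: the most frequent size is the "correct" one.
    common_size, _ = Counter(sizes.values()).most_common(1)[0]
    odd = {p: s for p, s in sizes.items() if s != common_size}
    return common_size, odd
```

Calling `find_odd_snapshots("data")` from the run directory would then list any `var.dat` whose size deviates from the majority. If proc1077, proc983 and proc505 show up there, a truncated write looks more likely than a read-side problem.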
Does anyone have an idea what's going on?
Best,
Andreas