Problem with restarting

Nadeem Malik

Aug 14, 2020, 1:56:28 PM

to Nek5000

Hello Neks,

I tried to continue my previous run from a restart file, but it began running from the start again, overwriting all my previous files.

Obviously I am missing something -- greatly appreciate if this could be sorted out asap.

(1) I ran e3q with the following parameters, writing out some velocity and vorticity files every 5000 steps:

#
# nek parameter file
#
[GENERAL]
stopAt = numsteps
numsteps = 5000
dt = 5.0e-04
timeStepper = bdf3
targetCfl = 0.5
TIMESTEP
writeInterval = 5000
.....

This ran OK, and at the end of 5000 timesteps my directory contains the files:

build.log
e3q0.f00001
e3q.box
e3q.f
e3q.ma2
e3q.par
e3q.re2
e3q.usr
jobscript
makefile
nek5000
obj
SESSION.NAME
SIZE
V_circulation.dat
vele3q0.f00001
vele3q0.f00002
vele3q0.f00003
vele3q0.f00004
vele3q0.f00005
vele3q0.f00006
vele3q.nek5000
vore3q0.f00001
vore3q0.f00002
vore3q0.f00003
vore3q0.f00004
vore3q0.f00005
vore3q0.f00006
vore3q.nek5000

(2) Now I want to continue from the above for another 1000 timesteps (say), so I submitted exactly the same job, except for the new *.par entries:

#
# nek parameter file
#
[GENERAL]
stopAt = numsteps
numsteps = 6000 ! THIS IS THE ONLY DIFFERENCE
dt = 5.0e-04
timeStepper = bdf3
targetCfl = 0.5
TIMESTEP
writeInterval = 5000
.....

As you can see, the only difference from the initial run is numsteps=6000.

When I submit this, it starts from the beginning at istep=1, simply overwriting the previous files, so no progress is made.

Obviously, there must be a flag for restarting and reading the dump file (and which file is the dump file anyway?).

Or do I have to explicitly write out the dump file for restart inside the *.usr file in my first run?

Thanks

Nadeem

YuHsiang Lan

Aug 14, 2020, 2:13:01 PM

to Nek5000

Hi Nadeem,

Nek5000 always starts from istep=1, but if you restart from a file, it continues from the simulation "time" stored in that solution.

Yes, Nek numbers its output files starting from f00001 again, so you have to move the existing files somewhere else to avoid overwriting them.

Here is my workflow:

mkdir run1

mv *0.f0* run1

cp run1/e3q0.f00006 r1.fld

Note:

- e3q0.f00006 is the latest one; make sure the simulation didn't stop in the middle of dumping files, so that the file is complete.

- I like to change the extension to ".fld" so that I can mv *0.f0* without moving the restart file.

- A more space-efficient way is to link the file rather than "cp" it. Personally, I find "cp" more robust.

Then set the restart file in the par file by adding

  startFrom = r1.fld

under [GENERAL]

ref: https://nek5000.github.io/NekDoc/problem_setup/case_files.html?highlight=restart
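
The archive-and-copy steps above could also be scripted. Below is a minimal Python sketch, not part of Nek5000: the function name archive_and_stage_restart and its arguments are made up for illustration, and it assumes the restart should come from the highest-numbered <case>0.fNNNNN file. It does not check that the last dump is complete, so that still has to be verified by hand.

```python
import re
import shutil
from pathlib import Path


def archive_and_stage_restart(case_dir, case, run_name="run1", restart_name="r1.fld"):
    """Move the *0.f0* output files into run_name/ and copy the
    highest-numbered <case>0.fNNNNN file back as restart_name.

    All names here are illustrative; adapt them to your own case.
    """
    case_dir = Path(case_dir)
    run_dir = case_dir / run_name
    run_dir.mkdir(exist_ok=True)

    # Same pattern as `mv *0.f0* run1`
    for path in sorted(case_dir.glob("*0.f0*")):
        if path.is_file():
            shutil.move(str(path), str(run_dir / path.name))

    # Pick the latest field file of the case itself, e.g. e3q0.f00006
    pattern = re.compile(re.escape(case) + r"0\.f(\d{5})$")
    numbered = []
    for path in run_dir.iterdir():
        match = pattern.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    if not numbered:
        raise FileNotFoundError(f"no {case}0.fNNNNN files found in {run_dir}")

    latest = max(numbered, key=lambda item: item[0])[1]
    # Same as `cp run1/<latest> r1.fld`
    shutil.copy2(str(latest), str(case_dir / restart_name))
    return latest
```

For the case in this thread one would call archive_and_stage_restart(".", "e3q") from the run directory and then add startFrom = r1.fld to the par file as described above.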

Hope this helps,

Yu-Hsiang

Philipp Schlatter

Aug 14, 2020, 2:17:32 PM

to nek...@googlegroups.com

Hi,

This is probably important for your case (as I assume you want to do high-fidelity simulations): make sure to do a "full" restart based on 3 consecutive velocity fields so as not to reduce the temporal order. I am not sure whether there is an example that supports that; if not, we can help you set it up.

Best,

Philipp

--
You received this message because you are subscribed to the Google Groups "Nek5000" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nek5000+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nek5000/e7ad923f-03b6-4dc4-a6a9-023ad768f9c3o%40googlegroups.com.

Stefan K.

Aug 14, 2020, 2:21:31 PM

to Philipp Schlatter, Nek5000

https://github.com/Nek5000/NekExamples/blob/master/tgv/tgv.usr#L85

nadeem...@cantab.net

Aug 14, 2020, 3:08:08 PM

to Philipp Schlatter, nek...@googlegroups.com

Yes, this is important for me.

How do I adjust the *.usr file (attached) to do this?

Thanks

Nadeem

e3q.usr

Nadeem Malik

Aug 18, 2020, 9:23:17 PM

to Nek5000

I have looked at the link (see the code at the end, below). It is not quite clear to me.

In run 1, I first need to dump the last 3 time steps, say istep=998, 999, and 1000. How do I do this?

If I assume that these last 3 dump files are labelled run1.f00008, run1.f00009, and run1.f00010 respectively, what do I do next?

Following Yu-Hsiang's workflow example, should I do:

mkdir run2
mv *0.f0* run2
cp run2/run1.f00008 r1.fld
cp run2/run1.f00009 r2.fld
cp run2/run1.f00010 r3.fld

Next, should I update the *.par file to:

[GENERAL]
startFrom = r1.fld
stopAt = numsteps
numsteps = 1000
dt = 2.50e-04
timeStepper = bdf3
targetCfl = 0.5
TIMESTEP
writeInterval = 1000

... but what about r2.fld and r3.fld?

Next, how do I modify userchk? Should I un-comment the commented lines?

      subroutine userchk()
      include 'SIZE'
      include 'TOTAL'

      common /SCRNS/ w1 (lx1*ly1*lz1*lelv),
     &               w2 (lx1*ly1*lz1*lelv),
     &               omg(lx1*ly1*lz1*lelv,ldim)

      character*80 fnames(3)

      n = nx1*ny1*nz1*nelv

c     if (.false.) then
c        call blank(fnames,size(fnames)*80)
c        fnames(1) ='rs6tgv0.f00001'
c        fnames(2) ='rs6tgv0.f00002'
c        fnames(3) ='rs6tgv0.f00003'
c        call full_restart(fnames,3) ! replace istep=0,1,..
c     endif
c
c     iostep_full = iostep
c     call full_restart_save(iostep_full)

      if (mod(istep,50).ne.0) return

      sum_e1 = 0.
      sum_e2 = 0.
      call curl(omg,vx,vy,vz,.false.,w1,w2)
      do i = 1,n
         vv = vx(i,1,1,1)**2 + vy(i,1,1,1)**2 + vz(i,1,1,1)**2
         oo = omg(i,1)**2 + omg(i,2)**2 + omg(i,3)**2
         sum_e1 = sum_e1 + vv*bm1(i,1,1,1)
         sum_e2 = sum_e2 + oo*bm1(i,1,1,1)
      enddo
      e1 = 0.5 * glsum(sum_e1,1) / volvm1
      e2 = 0.5 * glsum(sum_e2,1) / volvm1
      if (nid.eq.0) write(6,2) time, e1, e2
    2 format(1p3e13.4,' monitor')

      return
      end

Thanks

Nadeem

YuHsiang Lan

Aug 19, 2020, 12:06:25 AM

to Nek5000

Hi Nadeem,

There are a lot of details. I recommend testing it before applying it to large computations.

- First, you need to dump the full-restart files.

They are a little different from the "normal" Nek field files: they are in double precision, and the pressure is sometimes interpolated onto the PN grid (if needed, i.e. if you are using PnPn-2).

This can be achieved by simply putting the following into userchk:

      iostep_full = iostep
      call full_restart_save(iostep_full)

To save space, Nek writes the restart files alternately into two sets, [1-3] and [4-6].

The outcome is that you will get 3 or 6 files, named rs6<case>0.f0000[1-6] (assuming you are using bdf2+ext3 or bdf3+ext3).

Let's suppose you have the files rs6e3q0.f0000[1-3], which are dumped at timesteps iostep, iostep+1, and iostep+2.

You can double-check the time/timestep of the field files by reading the header: `head -1 rs6*0.f0*`

I recommend backing up those 3 or 6 files for every job.

- The second and all successive runs should include the following at the beginning of userchk:

      character*80 fnames(3)

      call blank(fnames,size(fnames)*80)
      fnames(1) ='rs6tgv0.f00001'
      fnames(2) ='rs6tgv0.f00002'
      fnames(3) ='rs6tgv0.f00003'
      call full_restart(fnames,3) ! replace istep=0,1,..

Replace the fnames entries with the full-restart files you got from the previous run.

At the zeroth, first, and second timesteps (again, assuming you have 3 saved states), userchk will read the files named in fnames(1) to fnames(3).

That does the trick of replacing the history needed by the bdf+ext timestepping. You should see the solution recovered after the third or fourth timestep.

- Some notes:

  - You probably won't recover all 16 digits, although it is achievable under certain conditions.

    Reasons: some noise from MPI communication, some extra history (up to 20+ states) used in the residual projection, or maybe a bad implementation that introduces more lag in time.

  - How to test this?

    It is simple, but it is easy to make mistakes.

    - Reference job: just run steps 1 to 20.

    - Test job:

      - run1: run steps 1 to 20, but set iostep=10; it will dump full-restart files at steps 10, 11, and 12.

      - run2: full-restart from run1 and run 10 steps.

        run1   istep=10   istep=11   istep=12
        run2   istep=0    istep=1    istep=2

    Then compare the results of the reference job (20th step) with those of run2 (10th step).
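
The bookkeeping in the run1/run2 table can be written down explicitly. This is only a sketch of the arithmetic implied by the table (files dumped at steps iostep, iostep+1, iostep+2 and read back at istep = 0, 1, 2); the function names are hypothetical and nothing here calls Nek5000.

```python
def reference_step(dump_start_step, restart_istep):
    """Step of the uninterrupted run that corresponds to restart_istep
    of the restarted run.

    Assumes the three full-restart files were written at dump_start_step,
    dump_start_step + 1 and dump_start_step + 2, and are read back at
    istep = 0, 1 and 2 of the restarted run.
    """
    if restart_istep < 0:
        raise ValueError("restart_istep must be non-negative")
    return dump_start_step + restart_istep


def steps_to_reach(dump_start_step, target_step):
    """Number of steps the restarted run needs to land on target_step."""
    if target_step < dump_start_step + 2:
        raise ValueError("target lies before the end of the restart window")
    return target_step - dump_start_step


# The test case described above: dump at steps 10, 11, 12, then run 10 steps.
print(reference_step(10, 10))   # step of the reference job to compare with
print(steps_to_reach(10, 20))   # steps needed in run2 to reach step 20
```

Under these assumptions the restarted run effectively resumes at the first dumped step, so the steps covered by the second and third files are recomputed from the stored history rather than skipped.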

YuHsiang Lan

Aug 19, 2020, 12:12:27 AM

to Nek5000

Some related discussion can be found here:

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!searchin/nek5000/full$20restart|sort:date/nek5000/6fYhZAgh17k/Lj-AEXatBQAJ

Yu-Hsiang

Philipp Schlatter

Aug 19, 2020, 10:22:13 AM

to Nek5000

For completeness I wanted to mention another alternative. You could go via our KTH Framework. The ext_cyl_DNS example under https://github.com/KTH-Nek5000/KTH_Examples uses our full restart, which can be controlled fully via the input par file. The documentation for all of that is here (doxygen):

https://kth-nek5000.github.io/KTH_Framework/index.html
More specifically:
https://kth-nek5000.github.io/KTH_Framework/group__chkpoint.html
and
https://kth-nek5000.github.io/KTH_Framework/group__chkpoint__mstep.html

Philipp

nadeem...@cantab.net

Aug 19, 2020, 5:20:04 PM

to Philipp Schlatter, Nek5000

Thanks. I will test these out .. and get back if needed.

Regards

Nadeem

Nadeem Malik

Aug 22, 2020, 11:15:49 AM

to Nek5000

Hi neks,

I have tried a couple of restart options, but no luck so far.

First, my *.par file looks like:

[GENERAL]
startFrom = r1.fld
stopAt = numsteps
numsteps = 50
dt = 2.50e-04
timeStepper = bdf3
targetCfl = 0.5
userparam01 = 25
userparam02 = 50
userparam03 = 5

Second, the relevant portion of my *.usr file is:

      character*80 fnames(3)

      ifield = 1 ! for outpost
      n = nx1*ny1*nz1*nelv
      n2 = nx2*ny2*nz2*nelv
      visc = param(2)
      irate1 = uparam(1)
      irate2 = uparam(2)
      iostep = uparam(3)

      call blank(fnames,size(fnames)*80)
      fnames(1) ='rs1.fld'
      fnames(2) ='rs2.fld'
      fnames(3) ='rs3.fld'
      call full_restart(fnames,3) ! replace istep=0,1,..
      ....

However, when I run, the error message is:

call userchk
Reading checkpoint data
/work/06396/tg856952/stampede2/WorkNew/e3q_0/24/rs1.fld
byte_read() :: fopen failure2!
ERROR: Error reading restart header in mfi_prepare ierr= 1
an error occured: dying ...
TACC: MPI job exited with code: 1
TACC: Shutdown complete. Exiting.

Can anyone help please.

Thanks

Nadeem

nadeem...@cantab.net

Aug 22, 2020, 11:23:38 AM

to Nadeem Malik, Nek5000

Sorry, that should be ..

[GENERAL]

startFrom = rs1.fld   <- correction

stopAt = numsteps

...

Same error message.
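
The "fopen failure" line in the log indicates that the file printed just above it could not be opened. The thread does not state the cause; one plausible reason (an assumption here) is that rs1.fld, rs2.fld, and rs3.fld are not present in the run directory under exactly those names. A small pre-flight check, sketched in Python with a made-up helper name, could catch that before the job is submitted:

```python
from pathlib import Path


def missing_restart_files(case_dir, names):
    """Return the names that are absent (or empty) in case_dir.

    Hypothetical helper for illustration; it only looks at the file
    system and knows nothing about the Nek5000 file format.
    """
    case_dir = Path(case_dir)
    missing = []
    for name in names:
        path = case_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling missing_restart_files(".", ["rs1.fld", "rs2.fld", "rs3.fld"]) in the run directory should return an empty list before a full restart is attempted.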

Dr Nadeem Malik

Aug 22, 2020, 3:15:29 PM

to Nadeem Malik, Nek5000

In fact, I think that I have now sorted most of the issues. Just a couple of fine points.

First, I have been able to dump 3 files and restart by the following workflow:

(1) My *.usr contains the following:

integer e,eg,ex,ey,ez,f,irate1,irate2,irest
character*80 fnames(3)

irate1 = uparam(1)
irate2 = uparam(2)

irest = uparam(3)
iostep = uparam(4)

if (irest.eq.0) then
iostep_full = iostep
call full_restart_save(iostep_full)
else

call blank(fnames,size(fnames)*80)
fnames(1) ='rs1.fld'
fnames(2) ='rs2.fld'
fnames(3) ='rs3.fld'
call full_restart(fnames,3) ! replace istep=0,1,..

end if
...

(2) My first run dumps steps 10, 11, and 12, so my *.par contains:

[GENERAL]
#startFrom = rs1.fld
stopAt = numsteps
numsteps = 20

dt = 2.50e-04
timeStepper = bdf3
targetCfl = 0.5

userparam01 = 10
userparam02 = 20
userparam03 = 0
userparam04 = 10

(3) I see 3 files named rs_e3q0.f00001, *02, and *03; I rename these to rs1.fld, rs2.fld, and rs3.fld.

Then I rerun for a further 10 steps starting from step 10 (or is it 11?), with the new *.par entries:

[GENERAL]
startFrom = rs1.fld
stopAt = numsteps
numsteps = 10

dt = 2.50e-04
timeStepper = bdf3
targetCfl = 0.5

userparam01 = 10
userparam02 = 10
userparam03 = 1
userparam04 = 8

This seems to restart and run, and produce new files. Some questions remain:

(i) How accurate is the restart? When I compare the final output files, which contain some computed quantities, I see a small difference between the straight 20-step run and the restarted run (10 + 10 steps):

Straight runs, 20 steps:

321 5.00000000E-03 9.50012809E-03 2.45792784E+04 2.60201618E+02 3.93944173E+03
325 5.00000000E-03 9.93029522E-03 2.56922315E+04 2.25725041E+02 3.93972916E+03
329 5.00000000E-03 1.02087679E-02 2.64127121E+04 2.01700075E+02 3.94021456E+03
333 5.00000000E-03 1.02852992E-02 2.66107184E+04 1.94992163E+02 3.94018812E+03
337 5.00000000E-03 1.06277492E-02 2.74967247E+04 1.63251932E+02 3.94010784E+03
341 5.00000000E-03 1.10478591E-02 2.85836572E+04 1.24098509E+02 3.94037922E+03
345 5.00000000E-03 1.12491924E-02 2.91045584E+04 1.03449011E+02 3.94075114E+03
349 5.00000000E-03 1.13701112E-02 2.94174063E+04 9.24367589E+01 3.94059821E+03
353 5.00000000E-03 1.17605194E-02 3.04274928E+04 5.21455888E+01 3.94052500E+03
357 5.00000000E-03 1.21509276E-02 3.14375794E+04 1.42264930E+01 3.94074582E+03

Restarted run (restart after 10 steps, 20 steps in total):

321 5.00000000E-03 9.50012809E-03 2.45792784E+04 2.60545154E+02 3.93943715E+03
325 5.00000000E-03 9.93029522E-03 2.56922315E+04 2.26024128E+02 3.93972472E+03
329 5.00000000E-03 1.02087679E-02 2.64127121E+04 2.01921705E+02 3.94020690E+03
333 5.00000000E-03 1.02852992E-02 2.66107184E+04 1.95225458E+02 3.94018301E+03
337 5.00000000E-03 1.06277492E-02 2.74967247E+04 1.63469569E+02 3.94010255E+03
341 5.00000000E-03 1.10478591E-02 2.85836572E+04 1.24259106E+02 3.94037416E+03
345 5.00000000E-03 1.12491924E-02 2.91045584E+04 1.03580570E+02 3.94074187E+03
349 5.00000000E-03 1.13701112E-02 2.94174063E+04 9.25536050E+01 3.94059328E+03
353 5.00000000E-03 1.17605194E-02 3.04274928E+04 5.22158129E+01 3.94051886E+03
357 5.00000000E-03 1.21509276E-02 3.14375794E+04 1.42361019E+01 3.94074075E+03

In the last 2 columns there is a small difference; I would like this to be smaller.

(ii) Following on from the above, it is important that I dump double-precision files; otherwise there is no point in dumping 3 consecutive files. Is this the default, or does it need to be specified, and if so how?

(iii) Suppose I want a sequence of restarts, i.e. in the 2nd and subsequent runs I want both to restart and then to dump files at the end in preparation for the next run. How can I do this?

(iv) If the last time step is 100, say, and I want to dump at the end, should I set iostep=98? (Will this dump steps 98, 99, and 100?)

But then on restart, will it continue from 98 or from 100?

(v) After 2 runs, how can I concatenate the 2 results files so that their time step numbers and times are continuous?

Thanks

Nadeem

-------------------------

Nadeem A. Malik

YuHsiang Lan

Aug 22, 2020, 5:08:13 PM

to Nek5000

Hi Nadeem,

Some notes:

For (1)

a. iostep is a Nek global variable; you can set it in the par file via the key "writeInterval" under [GENERAL].

I guess what you want is an independent parameter to control the full-restart dump. In that case, you are free to use iostep_full = uparam(4).

b. uparam is of real type and irest is of integer type. The line `irest = uparam(3)` actually assigns `irest = int(uparam(3))`, which is fine since you test (irest.eq.0) later.

c. The switching logic should go around the restart part only.

We can dump files in every run, but we can only restart from the second run onwards. I suggest the following, which makes it easy to switch between the 2 sets.

iostep_full = iostep
call full_restart_save(iostep_full)

if (irest.eq.1) then

call blank(fnames,size(fnames)*80)
fnames(1) ='rs_e3q0.f00001'
fnames(2) ='rs_e3q0.f00002'
fnames(3) ='rs_e3q0.f00003'

call full_restart(fnames,3) ! replace istep=0,1,..

elseif (irest.eq.2) then

call blank(fnames,size(fnames)*80)
fnames(1) ='rs_e3q0.f00004'
fnames(2) ='rs_e3q0.f00005'
fnames(3) ='rs_e3q0.f00006'

call full_restart(fnames,3) ! replace istep=0,1,..

endif

For (3)

d. You can see from the logfile when a file is read or written.

Search for the filename in the logfile:

Reading checkpoint data
FILE:<my_path_to_the_case>/<file_name>
0 1.0000E-03 done :: Read checkpoint data

Combine this with the "header" of the files:

head -1 rs_eddy_fullrestart_run10.f00001

#std 8 8 8 1 256 256 0.1000000000000E-02 10 0 1 XUP 0.0000000E+00 T

The first "8" indicates that the file is in double precision.

The other entries are listed here:

https://nek5000.github.io/NekDoc/problem_setup/case_files.html#restart-output-files-f-05d

This should help you figure out the timing.

rs_files   file1     file2      file3
dump at    iostep    iostep+1   iostep+2
read at    istep=0   istep=1    istep=2
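
As a cross-check, the header can also be read programmatically. The sketch below is an illustration in Python, not a Nek5000 tool: it assumes the 132-character ASCII header starting with #std, with its fields in the order given in the documentation linked above (word size, nx, ny, nz, elements in file, total elements, time, istep, file id, number of files, field codes). The names FieldHeader and parse_field_header are invented here.

```python
from collections import namedtuple

FieldHeader = namedtuple(
    "FieldHeader",
    "wdsize nx ny nz nelo nelgt time istep fid nfiles rdcode",
)


def parse_field_header(raw):
    """Parse the leading ASCII header of a field file.

    `raw` holds the first bytes of the file; only the first 132 are used,
    so any binary data that follows the header is ignored.
    """
    text = raw[:132].decode("ascii", errors="replace")
    tokens = text.split()
    if len(tokens) < 12 or tokens[0] != "#std":
        raise ValueError("not a recognisable #std header")
    ints = [int(tok) for tok in tokens[1:7]]
    return FieldHeader(
        *ints,
        time=float(tokens[7]),
        istep=int(tokens[8]),
        fid=int(tokens[9]),
        nfiles=int(tokens[10]),
        rdcode=tokens[11],
    )
```

A file would be read with open(name, "rb") and its first 132 bytes passed to parse_field_header; wdsize == 8 then means double precision, and istep and time show which step the file was dumped at.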

(3) (i): what are the numbers you print?

(3) (ii): inside full_restart_save, p63=1 is set, so you should find that the files are already in double precision (check the header).

(3) (iii): see c. above.

(3) (iv): This is up to you. In my case, I don't need my files dumped at a specific timestep. I usually control the wall time of the job instead of the number of timesteps.

Usually it is hard to predict when the simulation will finish, so I choose iostep small enough that my last save is close to the end of the simulation, but large enough to keep the cost of dumping files down.

That is also why we need two sets of full_restart files: one of them is a backup in case the simulation stops in the middle of dumping files.

(3) (v): I don't quite understand this. You should have the correct time as long as you choose the right files to restart from. Timesteps, by default, are reset to 0.

Yu-Hsiang

nadeem...@cantab.net

Aug 22, 2020, 5:47:44 PM

to YuHsiang Lan, Nek5000

Thanks.

Double precision ok.

I have modified the *.usr :

      integer e,eg,ex,ey,ez,f,irate1,irate2,irest
      character*80 fnames(3)

      irest = nint(uparam(3))
      iostep = nint(uparam(4))
      iostep_full = iostep

      if (irest.ge.1) call full_restart_save(iostep_full)

      if (irest.eq.2) then
         call blank(fnames,size(fnames)*80)
         fnames(1) ='rs1.fld'
         fnames(2) ='rs2.fld'
         fnames(3) ='rs3.fld'
         call full_restart(fnames,3) ! replace istep=0,1,..
      end if

Furthermore, in *.par I already have:

TIMESTEP

writeInterval = 10000

Is this the same? Does this then write 3 dump files every 10000 steps?

So after 20000 steps, does it overwrite the previous 3 or create new files?

About continuous time, I mean that if I dump files at steps 10, 11, 12, does the restart then start calculating from step 11 or 13? It makes a difference since I may lose 2 time steps. So if I want to stop-start in groups of 10 time steps, should I dump with iostep=8?

The numbers are simply some quantities (not important here) that I calculate, and they should be identical if the restart is perfect.

To simplify, the first line is from a single straight run of 20 timesteps; the 2nd line is from a stop-restart run of 10 time steps each:

357 5.00000000E-03 1.21509276E-02 3.14375794E+04 1.42264930E+01 3.94074582E+03

357 5.00000000E-03 1.21509276E-02 3.14375794E+04 1.42361019E+01 3.94074075E+03

The last 2 columns differ a bit; really they should be equal. This suggests that either a restart is never perfect, or I have missed a time step or two in the restart workflow?
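
To put a number on "differ a bit", one can estimate how many significant digits the two lines share in each column. The following is a small Python sketch using the two rows quoted above; the helper matching_digits is a hypothetical name, and the measure (minus log10 of the relative difference) is just one reasonable choice.

```python
import math


def matching_digits(a, b):
    """Approximate number of significant decimal digits on which a and b agree."""
    if a == b:
        return math.inf
    scale = max(abs(a), abs(b))
    return -math.log10(abs(a - b) / scale)


# Row 357 of the straight run and of the stop-restart run, as posted above
straight = [357, 5.00000000e-03, 1.21509276e-02, 3.14375794e+04,
            1.42264930e+01, 3.94074582e+03]
restarted = [357, 5.00000000e-03, 1.21509276e-02, 3.14375794e+04,
             1.42361019e+01, 3.94074075e+03]

digits = [matching_digits(x, y) for x, y in zip(straight, restarted)]
for column, d in enumerate(digits, start=1):
    print(f"column {column}: {d:.1f} digits")
```

For these two rows the first four columns are identical as printed, the fifth agrees to about 3 digits and the sixth to a little under 6.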

YuHsiang Lan

Aug 24, 2020, 3:06:07 PM

to Nek5000

Hi Nadeem,

> Then this writes 3-dump files every 10000 steps?

Yes.

> So after 20000 steps does it overwrite the previous 3 or creates new files?

At the 20000th step, it will start to dump the second set, rs6<case>0.f0000[4-6].

At the 30000th step, it will start to overwrite the first set, rs6<case>0.f0000[1-3].
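
The alternation between the two sets can be summarised in a few lines. This Python sketch encodes only the behaviour described in this thread (three consecutive dumps starting at every multiple of iostep, odd multiples going to files 1-3 and even multiples to files 4-6); it is not taken from the Nek5000 source, and the function name is made up.

```python
def full_restart_file_index(istep, iostep):
    """Index (1-6) of the rs6<case>0.f0000N file written at istep,
    or None if istep is not one of the dump steps.

    Assumption: dumps happen at k*iostep, k*iostep + 1 and k*iostep + 2
    for k = 1, 2, 3, ...
    """
    if iostep < 3:
        raise ValueError("iostep must be at least 3 for three consecutive dumps")
    cycle, offset = divmod(istep, iostep)
    if cycle < 1 or offset > 2:
        return None  # not one of the three consecutive dump steps
    first = 1 if cycle % 2 == 1 else 4  # odd cycles -> files 1-3, even -> 4-6
    return first + offset
```

With iostep = 10000, steps 10000 to 10002 map to files 1 to 3, steps 20000 to 20002 to files 4 to 6, and step 30000 starts overwriting file 1 again.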

> The last 2 columns differ a bit...

Right, so my actual question is: why do the 3rd and 4th columns match while the last two agree to at most 6 digits?

I also looked at your user file, but I cannot figure out why fbax and fbay behave differently.

i,time,zbar(i,1),fbax(i,1),fbay(i,1),fbaz(i,1)

Have you checked whether vx, vy and vz restart accurately (14 - 16 digits)?

Or do you have some calculation or pre-processing that depends on the timestep number?

Yu-Hsiang

Nadeem Malik

Aug 24, 2020, 3:36:45 PM

to Nek5000

> Right, so my actual question is, why the 3rd and 4th column are matched but the last two only have 6 digit.

> I also look at your user file, but I cannot figure out why fbax and fbay behaves different.

> i,time,zbar(i,1),fbax(i,1),fbay(i,1),fbaz(i,1)

This is a problem which I am trying to sort out. I have modified *.usr so that I am now printing out just the x- and z-circulations (fbax,fbaz) and the x- and z-plane coordinates (xbar,zbar):

write(57,3) i,time,xbar(i,1),zbar(i,1),fbax(i,1),fbaz(i,1)

However, x and z should both be in the range [-pi,pi], but I am getting the following type of output:

1 5.00000000E-03 -1.53929693E+06 -1.20944950E+02 -2.07214900E+03 -8.09433203E+00

5 5.00000000E-03 -1.53929693E+06 -1.20944927E+02 -2.07247789E+03 -8.09561677E+00

9 5.00000000E-03 -1.53929693E+06 -1.20944934E+02 -2.07261202E+03 -8.09614069E+00

As you can see, the xbar values are about -1e06 and the zbar values about -1e02, both very large in magnitude. I have tried some variations, but no luck so far. For example, for the x-coordinate and x-circulation I try (the lines starting with "!" are previous attempts):

! x-component of circulation in x-planes: Gx = Int_{x-planes} [Vor_x (dS)x]
      eg = lglel(e) ! Which processor has which elements
      call get_exyz(ex,ey,ez,eg,nelyz,1,nelx)
      do k=1,lx1
      do i=1,ny1*nz1
         fbax(k,ex) = fbax(k,ex)+area(i,1,fx,e)*abs(vort(i,1,k,e,1)) ! x-coord of vorticity
!        xbar(k,ex) = xbar(k,ex)+area(i,1,fx,e)*xm1(i,1,k,e) ! x-coord on mesh 1
!        xbar(k,ex) = xm1(i,1,k,e) ! x-coord on mesh 1
         xbar(k,ex) = xm1(1,1,1,e) ! x-coord on mesh 1
         wghx(k,ex) = wghx(k,ex)+area(i,1,fx,e)
      enddo
      call gop(fbax,worx,'+ ',lx1*nelv)
!     call gop(xbar,worx,'+ ',lx1*nelv)
      call gop(wghx,worx,'+ ',lx1*nelv)
      do i=1,lx1*nelx
         fxav(i,1)=fbax(i,1)/wghx(i,1) ! If you want the average
!        xbar(i,1)=xbar(i,1)/wghx(i,1)
      enddo

[Similar for z- ]

      write(57,'(A,1pe14.7,1x,i7,1x,i7)') '# time,irate1,irate2 = ',
     $   time,irate1,irate2
      do i = 1,lz1*nelz,4
         write(57,3) i,time,xbar(i,1),zbar(i,1),fbax(i,1),fbaz(i,1)
    3 format(i5,1x,5(1pe16.8,1x))

(Note, lx1=lz1=15, and nelx=nelz=24.)

The problem appears to be in "xbar(k,ex) = xm1(1,1,1,e)", by which I mean that xbar contains the x-coordinate of the (1,1,1) grid point in element e. This should be in the [-pi,pi] range? (Same for the z-coordinate.)

> Have you checked if vx, vy and vz restart accurately (14 - 16 digits)?

> Or do you have some calculation or pre-processing depended on the timesteps?

No. I am relying on the circulation calculation to tell me that -- but I am now worried that this may not be correct?

-nadeem

Nadeem Malik

Aug 24, 2020, 3:43:32 PM

to Nek5000

p.s. The last 2 columns should be positive, because I am calculating the absolute value of the circulation:

fbax(k,ex) = fbax(k,ex)+area(i,1,fx,e)*abs(vort(i,1,k,e,1)) ! x-coord of vorticity

So there is clearly something wrong, maybe an index exceeding its limit?

-nadeem

Nadeem Malik

Aug 26, 2020, 1:22:44 PM

to Nek5000

Thanks, the restart problem is now solved.
