On Tue, 23 Sep 2008, Matei Radulescu wrote:
>
> Dear James,
>
> In a calculation of a detonation interacting with a half-cylinder, in
> a large domain and high resolution, the calculation eventually hangs
> after a number of time steps.
>
> An example is the script run_circle_detonation I uploaded today.
> After the 39th output (a few hours on a single-processor machine), the
> calculation hangs. Running "top" to list the running processes shows
> that the Amrita process has stopped, and only a Perl process remains,
> which takes 100% of my CPU but consumes practically no RAM.

As described in the VKI notes, AMRITA has a front end and a back end.
The Perl job is the front end, the AMRITA interpreter that parses your
script; the back end is the mesh refinement engine that carries out the
actual simulation. The two ends run as separate UNIX processes
that communicate via named pipes. Change directory to
/tmp/amrita-xyz/amrita/pipe, where xyz is your login name,
and you'll see a series of islin/islout files that are used
for the communication. Thus what you're observing is the back end
dying, leaving the front end hanging. Note that amrkill can be used to
clean up a hanging front end.
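
For readers unfamiliar with named pipes, here is a minimal Python sketch
of two parties talking through one. It is a generic POSIX illustration,
not AMRITA's actual protocol: the pipe name demo_pipe and the function
fifo_roundtrip are made up for the example, and a thread stands in for
the second process.

```python
import os
import tempfile
import threading


def fifo_roundtrip(message: str) -> str:
    """Send `message` from a writer thread to a reader through a named pipe."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_pipe")  # hypothetical pipe name
        os.mkfifo(path)  # POSIX only

        def writer() -> None:
            # Opening for writing blocks until a reader opens the other end.
            with open(path, "w") as pipe:
                pipe.write(message)

        thread = threading.Thread(target=writer)
        thread.start()
        # Opening for reading blocks until a writer appears; read() returns
        # once the writer has closed its end.
        with open(path) as pipe:
            received = pipe.read()
        thread.join()
        return received
```

Each side depends on the other end being alive, which is why the loss of
one process leaves its partner with nothing useful to do.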

At this end your script runs fine; see the attached file sch46.pdf, which
shows a time shot around iteration 150. Therefore, without access
to your machine so that I can check what compiler you're using and examine
your shell environment, there's no real way for me to pinpoint the
problem.

Gary's issue, when running in parallel, is slightly different.
There is a known problem with MPI whereby AMRITA's error messages
get lost owing to the way MPI buffers stderr from Fortran programs,
but this issue is fixed in my development version.
You should, however, take up his suggestion of check-pointing
intermediate solutions so that you can resume an aborted run.

There is also a problem with your script that suggests you are
not using Amrita's folding editor. Specifically, there is a closing
brace missing in the fold where ArraySizes is called; I inserted
a line 23 containing: " }".
Without the added line, the entire script is mangled when viewed with
amrgi, and as a matter of etiquette you can't really expect A.N. Other
to wade through such a listing.

You should also get into the habit of removing detritus from a script.
In the example you sent me, you do not actually plot vorticity
but all the custom code remains in the script. And it may well
turn out that it is the vorticity stuff that is causing your problem
rather than the base AMRITA installation; Brian has already alluded
to one problem you had in an earlier message. Thus as a first step
you should whittle the script down to its bare essentials and
then see if the spurious hanging remains.

James

The discrepancy arose because I misread your original e-mail. You said
that the problem occurred after the 39th output, but I took that to mean
the 39th iteration. Hence when I ran the script I reduced the march down
from 20 steps to 5 so that I would have enough checkpointed solutions to
look at what was going on.

Now, given that you were able to run the simulation out to late time,
I'm fairly sure I know what the problem is. Amr_sol, as shipped, has an
internal limit on the number of mesh patches set at 2^10 i.e. 4096.
This limit is irrespective of the number fed into ArraySizes and
arises from the way the mesh interconnectivity was shoehorned
into an 8MByte machine back in 1991. Given the memory available
today I have revisited the data storage and v3.03 will allow for
2^20 meshes, which should be good for the next 10-15 years.

Normally the mesh limit does not come into play, but
you are using five grid levels with a refinement ratio of 2
and so you end up with an unnecessarily large number of small
patches (there are 2000+ at the early time I ran). I would
probably only have run 2 levels with a ratio of 4 and adjusted
the base grid to suit. Note there is not much to be gained
by using extra grid levels as the work on the finest mesh
dominates. Also, the more levels you use the less frequently
the time step can be adjusted to suit the CFL condition,
which is not good for highly dynamic flows.

Anyhow, I suggest you do the following. Modify your
script to output checkpointed solutions with the flowout command.
Run the simulation until it fails, then read in the last
available solution and run the following:

    set npatches = 0
    datastructure
    do l=0,5
       set npatches += $$nga'l
    end do
    echo $npatches

and I'm fairly sure that you'll find the number of patches
is close to the 2^10 limit.

If this proves to be the case, I'll ship you a prerelease
of 3.03 with its enlarged storage tables to see if that
fixes the problem.

James
On Wed, 24 Sep 2008, Matei Radulescu wrote:
>
>
> > internal limit on the number of mesh patches set at 2^10 i.e. 4096.
>
> you mean 2^12=4096?
Sorry for the brain fade, but I'm suffering with a bad head cold
at the moment. If you take a look in the file:
$AMRITA/include/f77/AMR_SOL/AMRITA
you will see a variable MASK2 which is three nibbles wide i.e. 0xFFF.
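
The arithmetic behind the limit can be checked in a few lines of Python.
This is only a sketch: the value 0xFFF is the one quoted above, while the
assumption that every value of the masked field is usable as a patch
index (none reserved) is mine.

```python
MASK = 0xFFF  # three nibbles, the value quoted for MASK2

bits = MASK.bit_length()     # width of the field in bits
patch_ids = MASK + 1         # distinct values the field can hold
planned_ids = 2 ** 20        # capacity announced for v3.03

print(bits, patch_ids, planned_ids)
```

So the field is 12 bits wide and holds 4096 distinct values, against
1048576 for the planned 20-bit storage.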
>
> > This limit is irrespective of the number fed into ArraySizes and
> > arises from the way the mesh interconnectivity was shoehorned
> > into an 8MByte machine back in 1991. Given the memory available
> > today I have revisited the data storage and v3.03 will allow for
> > 2^20 meshes, which should be good for the next 10-15 years.
> >
> > Normally the mesh limit does not come into play, but
> > you are using five grid levels with a refinement ratio of 2
> > and so you end up with an unnecessarily large number of small
> > patches (there are 2000+ at the early time I ran). I would
> > probably only have run 2 levels with a ratio of 4 and adjusted
> > the base grid to suit. Note there is not much to be gained
> > by using extra grid levels as the work on the finest mesh
> > dominates.
>
> The reason here was to be able to do an appropriate resolution study
> and not have to compare apples with oranges.
>
> > Also, the more levels you use the less frequently
> > the time step can be adjusted to suit the CFL condition,
> > which is not good for highly dynamic flows.
>
> Isn't the time step always adjusted based on your most resolved Dx?
> In that sense, should it not matter how many levels "above" that there
> are?

The time step is decided by looking at the most restrictive case
from all the grid levels. But because of the way the temporal refinement
is orchestrated, the stable time step can only be selected when the
coarse grid is integrated. Thus with lmax=5 and r=2, there will
be 2^5 steps on the fine mesh with a fixed dt.
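
The step count can be reproduced with a small recursive Python sketch of
temporal refinement. It assumes levels are numbered 0 (coarsest) to lmax
(finest) and that each step on one level spawns r sub-steps on the next;
the order in which AMRITA actually visits the levels is not modelled, and
the function names are made up for the example.

```python
from collections import Counter


def advance(level: int, lmax: int, r: int, steps: Counter) -> None:
    """Take one step on `level`, then r sub-steps on the next finer level."""
    steps[level] += 1
    if level < lmax:
        for _ in range(r):
            advance(level + 1, lmax, r, steps)


def steps_per_coarse_step(lmax: int, r: int) -> Counter:
    """Count the steps taken on every level during one coarse-grid step."""
    steps: Counter = Counter()
    advance(0, lmax, r, steps)
    return steps


counts = steps_per_coarse_step(lmax=5, r=2)
print(counts[5])  # steps on the finest level per coarse step
```

With lmax=5 and r=2 the finest level takes 32 steps between two
opportunities to reselect dt.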
>
> >
> > Anyhow, I suggest you do the following. Modify your
> > script to output checkpointed solutions with the flowout command.
> > Run the simulation until it fails then read in the last
> > available solution and run the following:
> >
> > set npatches = 0
> > datastructure
> > do l=0,5
> > set npatches += $$nga'l
> > end do
> > echo $npatches
> >
> > and I'm fairly sure that you'll find the number of patches
> > is close to the 2^10 limit.
>
> If indeed the max no. of patches is 2^12=4096, then indeed, at a
> time slightly before the breaking point, there are 4081 patches,
> and the number is likely to increase as the run advances due to the
> physical situation.
Good. That's what I figured.
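
To put the reported count in perspective, a short Python sketch of the
remaining headroom. The figures 4081 and 4096 come from this thread; the
helper name headroom is hypothetical.

```python
PATCH_LIMIT = 2 ** 12  # 4096, from the three-nibble mask


def headroom(npatches: int, limit: int = PATCH_LIMIT) -> int:
    """Number of patches that can still be created before hitting the limit."""
    return limit - npatches


remaining = headroom(4081)
fraction_used = 4081 / PATCH_LIMIT
print(remaining, round(fraction_used, 4))
```

Only 15 patches were left at that checkpoint, so a modest amount of extra
refinement would be enough to exhaust the table.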
>
> It thus seems that you identified the problem. I suspected something
> along those lines, since the breaking point didn't correlate with a
> numerical difficulty per se, but rather with the size of the domain
> and grid levels I was using (see my original post).
> >
> > If this proves to be the case, I'll ship you a prerelease
> > of 3.03 with its enlarged storage tables to see if that
> > fixes the problem.
> >
> I'd very much appreciate that. It's been a few weeks now that I've been
> trying to work around this problem!
> I can give you access to my machine, or I can download it via ftp from
> where_ever.
> Thanks for your help.

I'll put something together for you, but it will probably be
towards the end of next week. I have the serial case working
with the upped storage limits, but I want to get the parallel
version working before I ship it, just to make sure that
I haven't missed any snafus.

James
>
> matei
>
> >
>
> > > > Also, the more levels you use the less frequently
> > > > the time step can be adjusted to suit the CFL condition,
> > > > which is not good for highly dynamic flows.
> >
> > > Isn't the time step always adjusted based on your most resolved Dx?
> > > In that sense, should it not matter how many levels "above" that there
> > > are?
> >
> > The time step is decided by looking at the most restrictive case
> > from all the grid levels. But because of the way the temporal refinement
> > is orchestrated, the stable time step can only be selected when the
> > coarse grid is integrated. Thus with lmax=5 and r=2, there will
> > be 2^5 steps on the fine mesh with a fixed dt.
>
> Does that mean that I could run into the possibility that the CFL
> condition may not be satisfied during the time in which dt is fixed?
> Say, all of a sudden, something ignites and gives large characteristic
> velocities?
Yes, thermal runaway processes could present a problem.
> Do you have a flag somewhere to check that the CFL condition is not
> broken during the fixed time steps on the finest grids necessary to
> make up one big step on the coarsest grid?
No. But there is nothing to prevent the savvy user from building
one into his or her patch-integrator.
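
As an illustration of what such a check might look like, here is a
minimal Python sketch. It is not AMRITA's patch-integrator interface: it
assumes a 1D ideal gas with gamma = 1.4, primitive variables
(rho, u, p) per cell, and a user-chosen threshold cfl_max. All names are
hypothetical.

```python
import math


def cfl_number(rho: float, u: float, p: float,
               dt: float, dx: float, gamma: float = 1.4) -> float:
    """Local CFL number (|u| + c) * dt / dx for an ideal gas."""
    c = math.sqrt(gamma * p / rho)  # speed of sound
    return (abs(u) + c) * dt / dx


def cfl_violated(cells, dt: float, dx: float,
                 cfl_max: float = 1.0, gamma: float = 1.4) -> bool:
    """True if any cell exceeds cfl_max for the currently fixed dt."""
    return any(cfl_number(rho, u, p, dt, dx, gamma) > cfl_max
               for rho, u, p in cells)


quiet = [(1.0, 0.0, 1.0)]
ignited = [(1.0, 0.0, 1.0), (1.0, 3.0, 10.0)]
print(cfl_violated(quiet, dt=0.5, dx=1.0),
      cfl_violated(ignited, dt=0.5, dx=1.0))
```

Called on every fine-grid sub-step, a test of this kind would flag the
case where the flow speeds up while dt is still frozen.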
> > > I'd very much appreciate that. It's been a few weeks now that I've been
> > > trying to work around this problem!
> > > I can give you access to my machine, or I can download it via ftp from
> > > where_ever.
> > > Thanks for your help.
> >
> > I'll put something together for you, but it will probably be
> > towards the end of next week. I have the serial case working
> > with the upped storage limits, but I want to get the parallel
> > version working before I ship it, just to make sure that
> > I haven't missed any snafus.
> >
> Thanks, let me know when this becomes available.
I'm currently running your case in serial mode and have just got
to phase 32. Therefore I should know later today whether or not
v3.03 fixes the problem.
>
> Speaking of which, if you release a new version, you could also fix a
> minor bug in the 1step chemistry routine:
> $HOME/AMRITA/AMRITAv3.00/stdlib/equations/lib/ReactiveEulerEquations/ComputeZndProfile.amr
> on line 85, it reads:
> DT = 1.0/1000
> I had to change this to something like
> C DT = 0.25/FLOAT(Npts) #mir
> but I haven't tested it sufficiently.
> I ran into this when I was studying the small scale flowfield of the
> triple point structure and needed a very fine initial discretization
> of the znd profile, with up to 1000 pts/half reaction length.
> The original line 85 restricted my znd profile to a max of ~256 pts
> per half reaction length or so.

I will take a look at it to see how DT can be parameterized.
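
One possible parameterization, sketched in Python purely to show the
arithmetic. It takes Matei's 0.25/Npts rule at face value; the function
names and the default factor are assumptions, and this is not the code in
ComputeZndProfile.amr.

```python
def znd_step(npts: int, factor: float = 0.25) -> float:
    """Integration step that scales inversely with the requested resolution."""
    if npts <= 0:
        raise ValueError("npts must be positive")
    return factor / float(npts)


def max_points_for_step(dt: float, factor: float = 0.25) -> float:
    """Resolution at which a fixed step dt equals the scaled step."""
    return factor / dt


print(znd_step(1000))
print(max_points_for_step(1.0 / 1000))
```

Under this rule the hard-wired DT = 1.0/1000 corresponds to about 250
points, the same order as the ceiling of roughly 256 points that Matei
reported.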
James

Using AMRITAv3.03, I've been able to run your script out to 1000
iterations; see the attached tmp.pdf. And so it looks like the upgrade I
mentioned has done the trick. Of course, I would not be surprised if,
were you to run a full-blown parameter study for a detonation diffracting
around a cylinder, you sooner or later ran into problems, for the
Roe-Glaister integrator is not positivity preserving, i.e. it can result
in negative pressures.

James
On Mon, 29 Sep 2008, Matei Radulescu wrote:
>
> James,
> Thanks, I'm glad the problem is resolved.
> Regarding the possibility of negative pressures, that was originally
> my concern. However, looking at the profiles, I did not encounter
> this problem, yet. What part of the flow is likely to suffer from
> this type of problem, the steady expansion originating from the
> throat, which reduces the pressure to low values?
Yes. Any region of strongly expanding flow is a candidate.
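
A simple way to spot the problem in saved profiles is to recompute the
pressure from the conserved variables. The Python sketch below assumes a
1D ideal gas with gamma = 1.4 and no chemical energy term, so for the
reactive Euler equations the energy would need adjusting; the function
names are hypothetical.

```python
def pressure(rho: float, mom: float, energy: float,
             gamma: float = 1.4) -> float:
    """Ideal-gas pressure from density, momentum and total energy per volume."""
    kinetic = 0.5 * mom * mom / rho
    return (gamma - 1.0) * (energy - kinetic)


def is_admissible(rho: float, mom: float, energy: float,
                  gamma: float = 1.4) -> bool:
    """True if both density and pressure are strictly positive."""
    return rho > 0.0 and pressure(rho, mom, energy, gamma) > 0.0


print(is_admissible(1.0, 0.0, 2.5))  # gas at rest
print(is_admissible(1.0, 3.0, 4.0))  # kinetic energy exceeds total energy
```

In a strong expansion the internal energy is a small difference between
two large numbers, which is why such regions are the first to fail this
test.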
>
> Can you recommend another scheme that is positivity preserving, and
> would be appropriate for this type of calculation?
It's not as simple as switching to a more robust scheme, for you'll
likely find its resolution is poor. A better way forward is to
employ two schemes and switch between them dynamically. But BCG
is not set up to deliver such a scheme.
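
For what it is worth, one common way to organise such a switch is an
a posteriori fallback: try the accurate scheme, and redo the update with
the robust one if the result is inadmissible. The Python sketch below
shows only that control flow, with toy stand-ins for the two schemes; it
is one possible criterion among several and says nothing about how it
would be wired into AMRITA.

```python
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def hybrid_step(state: State,
                accurate: Callable[[State], State],
                robust: Callable[[State], State],
                admissible: Callable[[State], bool]) -> Tuple[State, str]:
    """Use the accurate update unless it yields an inadmissible state."""
    candidate = accurate(state)
    if admissible(candidate):
        return candidate, "accurate"
    # Fall back to the robust scheme, restarting from the old state.
    return robust(state), "robust"


# Toy stand-ins: the "state" is a single positive number such as a pressure.
def sharp(p: float) -> float:
    return p - 2.0   # may overshoot below zero


def safe(p: float) -> float:
    return 0.5 * p   # always stays positive


def positive(p: float) -> bool:
    return p > 0.0


print(hybrid_step(3.0, sharp, safe, positive))
print(hybrid_step(1.0, sharp, safe, positive))
```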
James