Just wanted to introduce myself and tell you a little about what I
plan to work on. As I see it, there are at least three different
approaches to this problem:
1. A modification of the current FDS approach, which utilizes a direct
solve for the pressure on each mesh (here I mean that each mesh is
assigned to its own processor), and then uses some approximations for
the patch boundary conditions that yield reasonable results. In the
end, an error is committed at the patch boundary, but I believe the
error will be small enough for our purposes and that we can
characterize the order of the error.
2. On a shared memory machine, simply recode the direct solver
(FISHPACK) using OpenMP. With the trend in CPUs being to increase
the number of processors on a single board, this will likely be an
optimal solution for many of our users who do not have large
distributed memory machines.
3. Solve the Poisson equation on a distributed memory machine using
MPI.
I am currently working on option #1. My understanding is that you are
currently working on option #3. This is great, because I believe in
the long run there will be different situations in which the various
methods listed above will be optimal.
My status on solving the problem using option #1 is that I have a
scheme working for incompressible flow that is stable and guarantees
mass conservation between meshes. I have designed my method in
Matlab. It remains for us to implement the scheme in FDS and to
characterize the order of the patch boundary errors.
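To make the patch-boundary error concrete, here is a minimal 1D sketch
in Python/NumPy (just an illustration of the issue, not the Matlab
scheme itself; the grid, forcing, and the zero-valued boundary guess
are all invented for the example): two meshes each do their own direct
Poisson solve with a crude guess at the shared patch boundary, and the
result is compared against a single-mesh solve.

    import numpy as np

    def poisson_dirichlet(rhs, h, left, right):
        # Solve u'' = rhs at the interior nodes of a uniform grid with
        # spacing h, given Dirichlet values at the two end points.
        m = rhs.size
        A = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
             + np.diag(np.ones(m - 1), -1)) / h**2
        b = rhs.copy()
        b[0] -= left / h**2
        b[-1] -= right / h**2
        return np.linalg.solve(A, b)

    N = 63                              # interior nodes; node 31 sits at x = 0.5
    h = 1.0 / (N + 1)
    x = (np.arange(N) + 1) * h
    f = np.sin(np.pi * x)               # exact solution is -sin(pi*x)/pi**2

    u_single = poisson_dirichlet(f, h, 0.0, 0.0)             # one global mesh

    mid = N // 2                        # interface node at x = 0.5
    guess = 0.0                         # crude value at the patch boundary
    u_left = poisson_dirichlet(f[:mid], h, 0.0, guess)       # mesh 1 on (0, 0.5)
    u_right = poisson_dirichlet(f[mid + 1:], h, guess, 0.0)  # mesh 2 on (0.5, 1)

    err = max(abs(u_left - u_single[:mid]).max(),
              abs(u_right - u_single[mid + 1:]).max())
    print(err)   # the patch-boundary error we need to characterize and reduce
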
Once this is done (which still may take some time), my plan is to go
to work on option #2. Hopefully, I will be able to find a code that
is already available. If you (or anyone) know(s) of one, please let
me know. Otherwise we will just have to modify the existing code or
more likely start from scratch (I am not a big fan of modifying other
people's codes -- by the time I figure out the code, I could have
written my own which I would understand much more thoroughly).
Once that is done, my interests will actually shift in a different
direction: adaptive mesh refinement. Here things may take a whole new
twist. I will likely pick a framework (like SAMRAI out of Lawrence,
Livermore) and begin building an AMR code from there. Of course, fast
global elliptic solvers will be an important piece of this code as
well. And I will be interested to discuss the options with you at
that point.
So, I think this illustrates that our work is really quite
complementary, and I think it is great that you are taking such an
interest in the global solver. I am sure you have much more expertise
in this area than I do.
Take care,
Randy
Many thanks for your comprehensive message!
I think that the optimization of the pressure solver is a very
important topic in order to improve the overall efficiency of FDS,
especially if you think of applications with millions of unknowns and
a very fine grid resolution. Nice to have somebody to exchange and
discuss some ideas now!
As I mentioned before in the thread on 'parallel scaling tests',
message 38, it is not enough to look only at the parallel efficiency of
an algorithm. An algorithm with a parallel efficiency of only 60%-70%
and good numerical efficiency may be superior to an algorithm with
nearly 100% parallel efficiency and poor numerical efficiency. The more
complex the problem is, the more you will need some kind of global data
transfer, particularly if you want to use a large number of subgrids.
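As a made-up back-of-the-envelope illustration of this point (the
numbers are invented purely for the example, using the crude model
time ~ iterations / (processors * parallel efficiency)):

    P = 64
    time_A = 20 / (P * 0.65)    # good numerics, only 65% parallel efficiency
    time_B = 200 / (P * 1.00)   # poor numerics, perfect parallel efficiency
    print(time_A, time_B)       # ~0.48 vs ~3.13 arbitrary work units
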
If I understand it correctly, you will keep the existing way of solving
the Poisson problem with local FFT solvers based on crayfishpack on a
MIMD machine. After the local solves, the global solution is achieved
by some kind of data exchange at inner boundaries. Does this kind of
postprocessing include global data transfer? As I saw, there already
exists an experimental code CORRECT_PRESSURE in MAIN_MPI. Are you just
working on improving this code? What is the basic idea behind it? How
can the mass conservation be guaranteed? You said that you have already
designed a corresponding code in Matlab? Are there any articles
describing this algorithm? I would be very interested in them!!
You are right: I am working on a strategy of your type #3. It is based
on coupling the local solves by a surrounding global method of
multigrid type, which may use local FFT solves or whatever to calculate
the local solutions. Just at the moment I am implementing a separate
master solver. My hope is that this master can also be used in other
parts of the code in order to guarantee a better coupling of the single
subgrids. But this is future work...
This kind of Poisson solver would also allow for local grid
refinements! You could use different grid resolutions on different
subgrids (macros), also including local adaptivity. If you use adaptive
mesh refinement, there is a big need for a very strong and robust
elliptic solver!! In a first step, I would like to begin with some kind
of 'macrowise' adaptivity. Do you already have specific adaptivity
concepts in mind?
I would be very happy to discuss all these topics in detail with you
in the future!!
So, have a nice time in Italy!
Susan
However, there are other scenarios/geometries where the current solver
does not work well, tunnels especially. In such cases, we need a
global solution. The first idea is to simply integrate the Poisson
equation over each mesh and solve this trivially small system of
linear equations to get a volume flux at each mesh boundary, and then
do another local FFT solve to ensure that the volume flux (the
integral of u · dS over each mesh boundary) is consistent mesh to mesh.
This is VOLUME conservation. Mass conservation could be ensured by
sharing densities and species concentrations appropriately.
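To sketch that first idea in a simple setting (a 1D row of meshes in
Python/NumPy; this is only an illustration of the principle with
made-up numbers, not the FDS code): integrating the Poisson equation
over mesh i and applying the divergence theorem gives
F_{i+1/2} - F_{i-1/2} = R_i, where F is the flux through a mesh
interface and R_i the integrated right-hand side. Writing F in terms of
mesh-averaged values H_i yields a tiny M x M system whose solution
fixes the volume flux at every interface.

    import numpy as np

    M = 4                                  # number of meshes in a row
    L = 1.0 / M                            # width of each mesh
    R = np.array([1.0, -0.5, -0.5, 0.0])   # integrated RHS per mesh (made up)
    R = R - R.mean()                       # compatibility: net source must vanish

    # Row i encodes F_{i+1/2} - F_{i-1/2} = R_i with F = (H_{i+1} - H_i)/L
    # and zero flux at the two outer ends of the domain.
    A = np.zeros((M, M))
    for i in range(M):
        if i > 0:
            A[i, i] -= 1.0 / L
            A[i, i - 1] += 1.0 / L
        if i < M - 1:
            A[i, i] -= 1.0 / L
            A[i, i + 1] += 1.0 / L
    A[0, :] = 1.0                          # pin the (otherwise arbitrary) mean of H
    b = R.copy()
    b[0] = 0.0

    H = np.linalg.solve(A, b)              # one mesh-averaged value per mesh
    flux = (H[1:] - H[:-1]) / L            # consistent flux at each interface
    print(flux)                            # what the second local solve must honor
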
I have coded this in the serial version of FDS (avoiding MPI details
for now). It is invoked by PRESSURE_CORRECTION, but this is not a
feature we recommend. It is unstable in most situations. It only works
for a very narrow range of conditions. Randy is currently testing
ideas in MatLab and we are going to transfer these ideas to FDS when
they are ready.
Our short term goal is NOT to use adaptive gridding. I believe this
would overly complicate FDS to the point where we could not maintain
it. If we improve the robustness of the current multi-mesh scheme,
then, as shown by Christian's 64 mesh scaling test, we will have a
very powerful solver.
Besides, I will try to study your strategies to preserve volume and
mass conservation in the pressure solver. How do you store the data
concerning the coarse grid and where do you solve the corresponding
coarse grid problem in your pressure correction scheme? I suppose that
this is additionally done on one of the processors related to the
single submeshes, for example on submesh-processor 1? In this case,
you will probably have considerable waiting times on the other
processors?! Or is the coarse grid problem solved in a completely
distributed way over all processors, each processor only dealing with
a very small amount of data and using frequent communications?
I agree that the use of adaptive grid refinement strategies would
complicate the current development
of FDS substantially. If I understand it correctly, Randall considers
adaptive gridding as a long-term goal. But I could imagine some kind
of macro-wise grid adaptivity, where you may have very different local
grid resolutions on single submeshes, coupled by a strong global
solver.
In order to examine the quality of the pressure solver, I would like
to suggest a parallel benchmark test. For this reason, I have designed
a geometry file (see the file bench64.fds) which is based on the
'scaling test' example from the 'Parallel scaling tests' thread. As
before, it uses the regular 4x4x4 subdivision of the unit cube, but now
with an additional fire scenario, ventilation from the left and an
outflow on the right. I have inserted various measuring points, nearly
all positioned in the middle of the single subdomains (intentionally
not at the subdomain borders) for a detailed comparison of the serial
and parallel versions. To avoid too much symmetry within the problem,
the ventilation and the outflow are positioned sideways.
I had the opportunity to run the parallel version for bench64.fds on
32 dual AMD Opteron DP 250, 2.4 GHz processors (which are single-core
processors!), as I did before for the original scaling_test example.
The serial version is still running, so I don't have comparative data
yet. For the original scaling_test example I got completely consistent
execution times for all subdomains. But in this case, the local
execution times differ quite a lot. See the file
bench64_execution_times.out, where I have sorted the macros by
increasing execution time.
I would be very interested in the execution times on the JUMP in
Jülich. If everybody agrees to the bench64 configuration, I would like
to ask Christian Rogsch for a test run on the JUMP in Jülich.
Christian, would this be possible? Thanks a lot in advance!
What I want to point out with this example is the following: it does
not suffice to look only at the parallel efficiency of a problem. For
this more complex example it is clear from the problem itself that the
execution times cannot be the same on all subdomains and that there
are more or less local waiting times depending on the computational
complexity on the different macros. This fact will always deteriorate
the overall parallel efficiency. It lies in the nature of problems
coming from CFD that you don't know in advance how the problem will
evolve. The center of computational load may change frequently during
the whole process, so that it is not always possible to balance the
grid decomposition exactly before starting the calculation. One could
think of dynamic load balancing, which is a huge topic in itself.
Anyway, when rating a parallel algorithm, the consistency with the
serial version must be checked first! The subdivision into single
macros always breaks up the global physical connectivity. This is
especially the case for the Poisson problem, which is based on the
diffusion operator. You cannot avoid introducing strategies for
recoupling the local problems, which normally require more or less
global communication, as discussed before. In my opinion the parallel
efficiency should only be considered in a second step, once the
consistency with the serial algorithm has been verified and the
unavoidable methods to guarantee that consistency have been properly
inserted into the code.
On Oct 15, 1:34 pm, fds4hhpberlin <s.kil...@hhpberlin.de> wrote:
> Indeed, the 64-mesh example works very well! In this case the local
> FFT methods seem to be strong enough to solve the overall problem.
> Nevertheless, the underlying problem is very easy and I would suggest
> using a more complex test scenario with fire and transversal
> ventilation in order to avoid too much symmetry. Then, the progression
> of temperature at selected measuring points could be compared for the
> serial and parallel versions. At the moment I am just designing a
> corresponding data file, which I will send you shortly.
I will test your new case on the JUMP in Jülich.
Have you also made some tests with 1, 2, 4, 8, 16, 32 meshes?
I will post the results...
Many thanks for your prompt reply! No, I didn't run the tests for a
smaller number of meshes, but I will gladly run all the different
tests. So, once the correct form of the geometry file is fixed, I will
start the tests so that we can compare all the results.
Many thanks!
Susanne
> Besides, I will try to study your strategies to preserve volume and
> mass conservation in the pressure solver. How do you store the data
> concerning the coarse grid and where do you solve the corresponding
> coarse grid problem in your pressure correction scheme? I suppose that
> this is additionally done on one of the processors related to the
> single submeshes, for example on submesh-processor 1? In this case,
> you will probably have considerable waiting times on the other
> processors?! Or is the coarse grid problem solved in a completely
> distributed way over all processors, each processor only dealing with
> a very small amount of data and using frequent communications?
No, my pressure correction scheme is very crude now. I use node 0 to
solve the system of linear equations that arises when you write the PDE
using just the meshes themselves as nodes. I just use a simple GJ
algorithm to do it. Also, this only works in the serial version. The
linear system is solved in main.f90, not main_mpi.f90. I have been
working with the serial version just to make things easier. I'll worry
about MPI issues later. First I need to make sure this works. The
finite differencing of the linear system is done in pres.f90, and the
control of the procedure is done from main.f90. But this is not well
documented yet. I need more time to go back and look at things. Randy
McD says he has a working algorithm in Matlab, but it's going to take
time to convert to FDS. This is what we plan to do in the next few
months.
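(By "GJ" I mean Gauss-Jordan elimination.) Just to illustrate how small
an operation this is, here is a bare-bones Python sketch of such a
solve for an M x M system; it is only an illustration, not the routine
that sits in main.f90/pres.f90.

    import numpy as np

    def gauss_jordan(A, b):
        # Reduce the augmented matrix [A | b] until A becomes the identity.
        M = A.shape[0]
        aug = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
        for col in range(M):
            pivot = col + np.argmax(np.abs(aug[col:, col]))   # partial pivoting
            aug[[col, pivot]] = aug[[pivot, col]]             # swap rows
            aug[col] /= aug[col, col]                         # scale pivot row to 1
            for row in range(M):
                if row != col:
                    aug[row] -= aug[row, col] * aug[col]      # clear the column
        return aug[:, -1]

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(gauss_jordan(A, b), np.linalg.solve(A, b))          # should agree
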
>
> I agree that the use of adaptive grid refinement strategies would
> complicate the current development
> of FDS substantially. If I understand it correctly, Randall considers
> adaptive gridding as a long-term goal. But I could imagine some kind
> of macro-wise grid adaptivity, where you may have very different local
> grid resolutions on single submeshes, coupled by a strong global
> solver.
>
Yes, for now, the only adaptive gridding we might consider is for
doing a global solve more efficiently. I still do not want to abandon
the FFT-based local solve.
Nice to hear from you again! I will try to understand the different
steps of your pressure correction scheme. When will Randall come back?
Has he already started with the conversion of the Matlab code into
Fortran 90?
Have you already taken a look at the parallel benchmark geometry
bench64.fds? It could probably be better to use a small modification of
it (see bench64_v2.fds). In order to avoid the burner crossing the
submesh boundaries, I placed the burner completely within a submesh
adjacent to the border of the room. Besides, I inserted some additional
vector slices. Please give me feedback on the new geometry, so that
Christian and I can start with our comparison tests.
Have a nice day, Susan
Sorry. I am not as diligent at checking the discussion board as
Kevin. I am back and now working on other issues related to stability
of low speed variable density schemes.
You are correct that I envision AMR as a long-term activity.
Cheers,
Randy
No problem! I hope you had a nice time in Italy ...
I am just working on the introduction of an additional master process
into FDS which will be responsible for the global coupling in the
pressure solver. Up to now, this master-slave structure is already
working, and I am just implementing the corresponding coarse grid
structure. Concerning the boundary conditions that go into it, it would
be very nice to discuss the details with you. So, let's keep in touch,
okay?
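To picture the master-slave structure, here is a minimal mpi4py sketch
of the pattern (purely illustrative, with invented data and placeholder
computations; it is not the actual master solver): rank 0 plays the
master, gathers interface data from the mesh processes, performs a
stand-in 'global coupling' step, and scatters the result back.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    interface_data = np.array([float(rank)])         # stand-in for local boundary data
    gathered = comm.gather(interface_data, root=0)   # slaves -> master

    if rank == 0:
        coarse_rhs = np.concatenate(gathered)
        correction = coarse_rhs - coarse_rhs.mean()  # placeholder 'coarse solve'
        chunks = [correction[i:i + 1] for i in range(size)]
    else:
        chunks = None

    my_correction = comm.scatter(chunks, root=0)     # master -> slaves
    print(rank, my_correction)
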
Have a nice time,
Susan, hhpberlin
Sorry to hear that you were ill. I hope the holiday somewhat made up for it!
I wanted to make a comment regarding any tests that you perform on the
new code. We should be careful not to judge the formulation based on
the tests alone. I am also interested in your analytical assessment
of the algorithm. Do you think it should be "embarrassingly parallel"
up to the point at which the coarse linear solver starts to slow down?
(I can't see this happening until many thousands of meshes, which we
will not get to in practice for some time.) My guess is that it should
scale very well and that any poor performance we see may point to
other problems with the code (or architecture) which we are just now
in a position to diagnose.
I look forward to hearing from you.
Cheers,
Randy
On Mon, Mar 31, 2008 at 5:43 AM, Susanne Kilian <s.ki...@hhpberlin.de> wrote:
> Hello Randy,
>
> Many thanks for your mail! And sorry that my answer comes so late (I was on a short holiday and then ill for some time).
>
> Naturally, I will read your new explanations concerning the "Multiple mesh considerations" and "Domain decomposition strategy"; I am very interested in them. We intend to perform some test calculations in the near future, and I will report on the results.
>
> What will be the main topics of your future research?
>
> Have a nice day
>
> Susan
>
> _____________________________________
> From: Randy McDermott [randy.m...@gmail.com]
> Sent: Wednesday, 26 March 2008, 19:36
> To: Susanne Kilian
> Cc: Kevin McGrattan
> Subject: Re: Pressure Solvers
>
> Hi Susan,
>
> At long last I have at least partially written up the new domain
> decomposition strategy used in FDS 5.1.4. See the current FDS 5 Tech
> Guide under "Multiple Mesh Considerations (On-Going Research)" and the
> appendix "Domain Decomposition Strategy". I hope this helps answer
> your questions about how we are doing things in parallel. I would be
> appreciative of any feedback you have as I will soon be trying to put
> a paper together on this approach.
>
> It is clear that we have some scaling issues to deal with in
> practice. But my guess is that this is due to the legacy of the
> serial FDS code... in principle, the algorithm should be
> embarrassingly parallel. At the moment I think we are stopping to do
> mesh exchanges more often than will ultimately be required, but this
> may not be the biggest problem.
>
> Cheers,
> Randy
All good points. We actually perform the coarse solve redundantly on
each mesh! So, all information needs to be 'gathered' but not 'sent'
after the solve.
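A small mpi4py sketch of that 'gather but never send' pattern
(illustrative only, with made-up data; not the FDS implementation):
every rank gathers the full coarse right-hand side and then solves the
small system redundantly, so nothing has to be distributed afterwards.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    M = comm.Get_size()                        # pretend: one mesh per MPI rank

    my_rhs = float(comm.Get_rank())            # made-up coarse RHS entry for this mesh
    A = (2.0 * np.eye(M) - np.eye(M, k=1)      # made-up M x M coarse matrix,
         - np.eye(M, k=-1))                    # identical on every rank

    all_rhs = np.array(comm.allgather(my_rhs)) # 'gather': full RHS on every rank
    coarse = np.linalg.solve(A, all_rhs)       # redundant solve on every rank
    # Nothing to send back: each rank already holds the whole coarse solution.
    print(comm.Get_rank(), coarse)
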
Regarding different coarse mesh sizes, yes, this is probably our
biggest issue and will become even more of an issue with AMR. We do
not do any real 'load balancing' at the moment. It is up to the user
to specify the meshes in a logical way so that one mesh is not
performing more work than the others. Getting around this may require
some major rethinking in how we do things.
Regarding mesh stretching, Kevin can correct me, but I think FDS
allows stretching in two of the three directions (in X and Z). There
is some history there that I do not know the details of. But
basically Kevin hacked into the FFT solver and added the stretching
factors.
The sooner I can get to AMR the better. But my guess is that I will
not get to think about it seriously until this fall. There are a few
more important issues to deal with. The other day Kevin and I played
around with embedding a mesh within another and things seemed to work,
but there is no information transferred from the fine mesh back to the
coarse mesh. So there is currently only a one-way coupling. For fun,
just try it some time... embed a fine grid inside a coarse grid and
see what happens. So, this is the beginning.
Ok, hope this answered your questions. Talk to you soon.
Randy
> ________________________________________
> From: Randy McDermott [randy.m...@gmail.com]
> Sent: Monday, 7 April 2008, 16:28
> To: Susanne Kilian
> Cc: fds...@googlegroups.com; Kevin McGrattan
> Subject: Re: Embarrassingly parallel
>
> Hi Susan,
>
> Good to hear from you. Sorry I had not checked the discussion group.
> You are correct that the coarse solve is not embarrassingly parallel.
> I mentioned this in my post on March 31, but perhaps I should be more
> clear in the Tech Guide. The coarse solve is M x M, where M = number
> of meshes, and is symmetric positive-definite. So, I would guess that
> CG would be optimal. Right now we simply use a direct LU because we
> are only considering runs up to 256 processors, in which case even a
> direct coarse solve is trivial. If we ever get to the point where this
> part of the algorithm is slowing us down, we will be very happy! Most
> FDS users only use O(10) processors.
>
> Regarding anisotropy of the grid... anisotropic (i.e., pancake-shaped)
> grids should be avoided in large-eddy simulation. However, the direct
> FFT solver has no problem with such grids; the Poisson equation has no
> cross derivatives and the grid is a structured, Cartesian grid with
> little or no stretching. On adjacent meshes I would not use a
> refinement ratio of more than 4:1, and these mesh interfaces should be
> positioned away from regions where accuracy is important. Once we
> implement adaptive mesh refinement, this will be done automatically.
>
> Good luck with the Linux cluster. I look forward to chatting more
> once you have had a chance to study the scheme in detail.
>
> Cheers,
> Randy
>
> On Mon, Apr 7, 2008 at 7:55 AM, Susanne Kilian <s.ki...@hhpberlin.de> wrote:
> >
> > Hi, Randy,
> >
> > Sorry, I forgot to send my last discussion-group posting to your
> > email address as well, so I am not sure whether you have already
> > read it. I would like to attach it again below.
> >
> > This week I will be busy with the maintenance of our Linux cluster
> > (we have expanded …), but I hope to find the time to read the
> > description of your new pressure solver next week.
> >
> > So, I am looking forward to new interesting discussions with you.
> >
> > Have a good time
> >
> > Susan
> >
> > My posting:
> >
> > Hi, Randy,
> >
> > I am a little bit confused about what exactly is meant by
> > "embarrassingly parallel". In the new Technical Reference Guide the
> > new algorithm is described as embarrassingly parallel once the
> > boundary conditions are defined. But this holds true only for the
> > stretches within a single time step between one mesh exchange and
> > the next, or between one coarse grid solve and the next. The coarse
> > grid solve itself is decidedly not "embarrassingly parallel". The
> > original definition of this term says that "there is no essential
> > dependency (or communication) between those parallel tasks"
> > (Wikipedia). Whether "embarrassingly parallel" is an attribute of
> > single parts of an algorithm or of the whole algorithm, I am not
> > really sure.
> >
> > The new algorithm is some kind of additive Schwarz method
> > (originally these use an overlap ...) with a coarse grid correction,
> > by which one can get rid of the troublesome dependency on the number
> > of subgrids. In my experience this method works very well if the
> > underlying grid structure is more or less equidistant (or not too
> > anisotropic), especially if the subgrid sizes don't differ too much.
> > Is there already experience in the FDS community with subgrids of
> > very different sizes, or with different grid resolutions on adjacent
> > subgrids? Many years ago I worked a lot with Schwarz-like methods
> > (additive, multiplicative, hybrid, with and without coarse grid
> > correction) and I often saw a strong dependency on the degree of
> > anisotropy of the (coarse) grid used.