Parallel meshes in MOOSE


MGC Nestola

Mar 1, 2016, 4:27:24 PM
to moose-users
Dear all,
What is the easiest way to manage a distributed mesh from an external code for transfers? We have coupled a MOOSE application with an external one. The external application manages local meshes (i.e. each process has access only to a portion of the global mesh).
We need to transfer data between the two meshes, and to do this we are using our own transfer, which has been added to MOOSE. However, I need to figure out how to manage these local meshes in MOOSE. I know that MOOSE can use partitioner methods (which are implemented in libMesh) and Nemesis. What are the main differences between the two, and what is the best way to distribute the mesh among the different processes?


Best,

Maria

Cody Permann

Mar 1, 2016, 4:43:29 PM
to moose-users
I'm not entirely sure what you mean here. In MOOSE each processor only works on a subset of the Mesh during the transfer by default, so essentially transfers are already "distributed". The difference between what libMesh calls "serial mesh" and "parallel mesh" is how much of the mesh data structure is available to you for nodes/elements that you do NOT own. That's what the "nemesis" format is for: each processor just reads/writes the part of the mesh it owns individually. It's otherwise equivalent to Exodus.

If you already have a working transfer you may be closer than you think. Each process in your parallel run is only looping over a subset of the mesh; it's up to you to figure out what to do with each part separately. It is possible to write your own partitioner, but that's rather advanced and not something we normally do in MOOSE, nor something you'd normally need to do.
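
As a rough illustration of that "subset" point, a minimal libMesh-level sketch (the function name is made up, and the exact iterator API depends on your libMesh version):

#include "libmesh/mesh_base.h"
#include "libmesh/elem.h"

using namespace libMesh;

// Sum the volume of the locally owned active elements, as a stand-in for real
// per-element transfer work. Each MPI rank only visits the elements it owns;
// nothing here touches elements that live on another processor.
Real sum_local_volume(const MeshBase & mesh)
{
  Real local_volume = 0;
  for (const auto & elem : mesh.active_local_element_ptr_range())
    local_volume += elem->volume();   // elem->processor_id() == mesh.processor_id()
  return local_volume;
}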

Cody


MGC Nestola

Mar 1, 2016, 4:52:39 PM
to moose-users
When I use a partitioner I guess that each node holds the entire mesh, although the computation on an individual compute node is restricted to the desired subset of the domain. Am I right?

Cody Permann

Mar 1, 2016, 4:57:32 PM
to moose...@googlegroups.com
On Tue, Mar 1, 2016 at 2:52 PM MGC Nestola <mgcne...@gmail.com> wrote:
When I use a partitioner I guess that each node holds the entire mesh, although the computation on an individual compute node is restricted to the desired subset of the domain. Am I right?

Yes!



MGC Nestola

Mar 2, 2016, 5:45:27 AM
to moose-users
Can I ask whether this is a MOOSE or a libMesh limitation?

best

Maria



Cody Permann

Mar 2, 2016, 8:49:59 AM
to moose-users
How else would you scale? If you want the whole solution everywhere to work with, you can get it. But your memory use will increase dramatically, and if you process every element on every processor your compute time will stay constant no matter how many processors you add. Perhaps I'm not understanding what you are asking?

MGC Nestola

Mar 2, 2016, 9:20:29 AM
to moose-users
I have another question.
Is the solution global or local when I run a parallel simulation? I mean, does the solution refer to the local mesh or to the global one?

Best,

Maria


Cody Permann

Mar 2, 2016, 9:54:43 AM
to moose-users
There should be no difference between the two from your abstracted view. You access the solution through the DofMap which gives you access to the solution based on the variables and mesh entities you are currently inspecting. If you are working on your local piece of the mesh you should be obtaining indices into your local piece of the solution vector. If you try to access outside of that, you'll receive errors unless you've localized the global solution vector to every processor first (this is not something you want to do).
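
Roughly, at the libMesh level that access pattern looks like the sketch below (untested; the helper name is made up and the exact range API depends on your libMesh version):

#include <vector>
#include "libmesh/system.h"
#include "libmesh/dof_map.h"
#include "libmesh/numeric_vector.h"
#include "libmesh/mesh_base.h"
#include "libmesh/elem.h"

using namespace libMesh;

// Gather the solution values of one variable on the elements this processor
// owns. current_local_solution holds the locally owned plus ghosted entries,
// so these reads stay on-processor.
std::vector<Number> local_values(const System & sys, unsigned int var_num)
{
  const MeshBase & mesh = sys.get_mesh();
  const DofMap & dof_map = sys.get_dof_map();

  std::vector<Number> values;
  std::vector<dof_id_type> dof_indices;

  for (const auto & elem : mesh.active_local_element_ptr_range())
  {
    dof_map.dof_indices(elem, dof_indices, var_num);   // indices for this element/variable
    for (const auto dof : dof_indices)
      values.push_back((*sys.current_local_solution)(dof));
  }
  return values;
}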


Daniel Schwen

Mar 2, 2016, 10:12:30 AM
to moose-users

We should make it absolutely clear that in the case of the "serial" mesh ONLY the actual mesh (i.e. node locations) is kept on all processors. The bin linear variables are not sheeted globally! Parallel mesh is ONLY necessary when you have such a huge number of elements that you cannot fit the mesh into memory on a single compute node. Almost NOBODY needs this capability. We need a big fat FAQ for this.


Daniel Schwen

Mar 2, 2016, 10:13:25 AM
to moose-users

argh, auto correct...
The non linear variables are not shared globally

Derek Gaston

Mar 2, 2016, 10:51:49 AM
to moose-users
Yes: the first rule of Parallel Mesh is that you don't need to use it. The second rule of Parallel Mesh is that you really don't need to use it. The third rule is that if you still think you need to use it: you really don't :-)

Derek

MGC Nestola

Mar 2, 2016, 10:58:47 AM
to moose-users
I am trying to run some Nemesis examples in MOOSE, but I get the following error message:

Error opening ExodusII mesh file: cylinder/cylinder.e


Best

Maria


Cody Permann

Mar 2, 2016, 11:06:31 AM
to moose-users
I think we should rename it to 

ParallelDistributedAndSlowerDoNotUseMesh




Cody Permann

Mar 2, 2016, 11:12:44 AM
to moose-users
Are you running that particular example serially? If that's the nemesis test, it's not designed to run serially. Do this instead:

./run_tests --re=nemesis_test -v

That'll run the tests with the necessary arguments. Just look at the output and you'll see the exact command line.

MGC Nestola

Mar 2, 2016, 11:39:18 AM
to moose-users
I was running in parallel, but using mpirun. So what is the right way to run the example?



Cody Permann

Mar 2, 2016, 11:48:56 AM
to moose-users


MGC Nestola

Mar 2, 2016, 11:59:14 AM
to moose-users
However, I get the same error with mpiexec as well.



Wang (Non-US), Yaqi

Mar 2, 2016, 12:03:28 PM
to moose-users
A large serial mesh (>1M elements) makes the memory usage non-scalable and adds too much memory overhead when using domain decomposition. We should not discourage people from using parallel mesh. The fact that parallel mesh does not work well now does not mean it cannot work well in the future. Once the mesh is distributed it can be hard to rebalance it, but that is definitely not an unsolvable issue.

Cody Permann

Mar 2, 2016, 12:54:09 PM
to moose-users
What is the output of the run_tests command I gave above?

MGC Nestola

Mar 2, 2016, 4:38:14 PM
to moose-users
The output is

Ran 1 tests in 1.5 seconds
1 passed, 0 skipped, 0 pending, 0 failed


Cody Permann

Mar 2, 2016, 5:03:51 PM
to moose-users
So the test harness is running it correctly, which means it's working on your system. Scroll up and look at the output: the exact command the test harness is using to run the test should be printed. That's what you need to use to run the test in parallel.


alexlin...@gmail.com

Feb 17, 2017, 12:45:00 PM
to moose-users
Alright, having recently been at a meeting where I couldn't stop hearing about the benefits of Parallel Mesh... what are some new rules for when to use Parallel Mesh? A million elements? In cases where Parallel Mesh _is_ beneficial, where do the speed-ups occur? I do think that a FAQ entry / more documentation on Parallel Mesh would be good.

Derek Gaston

Feb 17, 2017, 3:22:30 PM
to moose-users
1 million elements is now my new recommendation for the point where DistributedMesh makes sense.  The memory and startup time savings are real.

As for when it's faster... other than startup it's not often faster.  But the startup speed increase can be huge... like going from ~10 minutes down to seconds for really big meshes.

Derek

Cody Permann

Feb 17, 2017, 3:50:18 PM
to moose-users
Note: The distributed startup isn't available in MOOSE yet. You can still use distributed mesh starting from a serial format though.
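
For what it's worth, at the libMesh level that "distributed mesh from a serial file" workflow looks roughly like the sketch below (untested; the file name is just the one from earlier in the thread, and in a MOOSE run you would request the distributed mesh through the normal Mesh setup rather than writing this by hand):

#include "libmesh/libmesh.h"
#include "libmesh/distributed_mesh.h"
#include "libmesh/exodusII_io.h"

// Start from a single serial Exodus file but hold the mesh in a
// DistributedMesh, so after setup each rank keeps only its own partition
// (plus ghost elements) in memory.
int main(int argc, char ** argv)
{
  libMesh::LibMeshInit init(argc, argv);

  libMesh::DistributedMesh mesh(init.comm());
  libMesh::ExodusII_IO(mesh).read("cylinder.e");   // file name borrowed from this thread
  mesh.prepare_for_use();                          // partition; remote elements are then dropped

  return 0;
}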

Wang, Yaqi

Feb 17, 2017, 4:02:42 PM
to moose-users
I was wondering whether this is really an issue. Since no solution vectors, Jacobian, or MOOSE objects (kernels, materials, etc.) have been constructed yet, the memory spike created by loading the serial mesh and then distributing it should presumably be lower than the final memory usage of the simulation. Is there any other reason we have to use the distributed startup?


Derek Gaston

Feb 17, 2017, 4:12:04 PM
to moose-users
The memory spike will not be lower than all of those other things.  It can go up to many GB per process for 10-100M elements and easily kill the simulation if you're running full MPI procs per node.  Reading in a distributed mesh typically means that your mesh never exceeds ~100 MB or so (when following our normal recommendations of ~5-10k DoFs per proc).

Then there's the setup time... which I demonstrated while we were in College Station.  "find_neighbors()" is done for every locally available element during startup, which can take many minutes (up to a half hour or more for huge meshes) for a serially read-in mesh.  With a distributed mesh it pretty much always reads and initializes within seconds (<30 s).

Finally: there is the disk I/O.  Reading in a distributed mesh reads a LOT less from disk (and each processor can attach to a separate file which is good for parallel filesystems like /scratch in the INL HPC).  Distributed meshes read in WAY faster.
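
For reference, a rough libMesh-level sketch of that pre-split read path (untested; it assumes the usual Nemesis naming of base.e.<nprocs>.<rank> pieces and just prints the local element count):

#include "libmesh/libmesh.h"
#include "libmesh/distributed_mesh.h"
#include "libmesh/nemesis_io.h"

// Read a pre-split Nemesis file set, so each rank opens only its own piece
// (e.g. cylinder.e.4.0 ... cylinder.e.4.3 on four ranks) and never holds the
// whole mesh in memory.
int main(int argc, char ** argv)
{
  libMesh::LibMeshInit init(argc, argv);

  libMesh::DistributedMesh mesh(init.comm());
  libMesh::Nemesis_IO(mesh).read("cylinder.e");   // base name; the per-rank suffix is appended

  // Each rank now holds only its own partition.
  libMesh::out << "local elements on rank " << mesh.processor_id()
               << ": " << mesh.n_local_elem() << "\n";

  return 0;
}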

Derek


Cody Permann

Feb 17, 2017, 4:25:25 PM
to moose-users
Yeah! Let's get that ticket closed! Still some odd problems to resolve.

Wang, Yaqi

Feb 17, 2017, 4:33:45 PM
to moose-users
Is 'find_neighbors()' only called after the mesh has been distributed? I agree on the other two points: the number of processors can be much larger than the ratio between the memory consumed by everything else and the memory consumed by the local mesh, and having every processor read the mesh is costly. Maybe we should have another recommendation, that the distributed startup be used for meshes with more than 10M elements? Thanks for working on this ticket!


Derek Gaston

Feb 17, 2017, 5:02:57 PM
to moose-users
`find_neighbors()` is called before the mesh is distributed (has to be... you have to know the connectivity before you can do a domain decomposition).

Derek
