MOOSE DG is much slower than OpenFOAM


Victoria Korchagova

Jun 19, 2020, 11:41:45 AM
to moose-users
Hello everyone!

I've found some strangely slow behaviour in MOOSE.
I solve the 1D scalar transport equation in MOOSE with the DG method (test/tests/dgkernels/1d_advection_dg.i). The case parameters were: family = MONOMIAL, order = CONSTANT, 100 cells in the 1D mesh, delta_t = 1e-4, 10000 time steps, ExplicitEuler scheme. The total computation time is 36 s.
I solve the same case with scalarTransportFoam using the same settings. The computation time there is only 5 seconds!

What could be the reason for the slow computation in the MOOSE case, and how can I accelerate it?

--
Best regards, Victoria

Alexander Lindsay

Jun 19, 2020, 5:49:54 PM
to moose...@googlegroups.com
Try using ActuallyExplicitEuler as the time integrator. Also, are you setting the `implicit = false` flags on non-time objects for the explicit solve?
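In input-file terms that looks roughly like the sketch below (names are assumed to follow the 1d_advection_dg.i test; every non-time Kernel/DGKernel/BC would get the same `implicit = false` flag):

[Executioner]
  type = Transient
  dt = 1e-4
  num_steps = 10000
  [TimeIntegrator]
    type = ActuallyExplicitEuler
    solve_type = lumped   # lump the mass matrix so no linear solve is needed
  []
[]

[DGKernels]
  [dg_advection_u]
    type = DGConvection   # DG kernel name assumed from the test input
    variable = u
    velocity = '1 0 0'
    implicit = false      # evaluate this residual contribution explicitly
  []
[]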

Note that MOOSE DG will be quite a bit slower than a pure FV implementation because the latter can loop just over faces for problems like this while we loop over elements and then query faces because we are a very general finite *element* code.

That being said, we have had a couple of folks working on a finite volume implementation in MOOSE that will hopefully end up being comparable in performance to OpenFOAM.


Victoria Korchagova

Jun 19, 2020, 6:14:16 PM
to moose-users
Good evening, Alexander!

> Try using ActuallyExplicitEuler as the time integrator.
The result has not changed. The computation time is approximately 36 seconds again.

> Also are you setting the `implicit = false` flags on non-time objects for the explicit solve?
Yes, I am.

> Note that MOOSE DG will be quite a bit slower than a pure FV implementation because the latter can loop just over faces for problems like this while we loop over elements and then query faces because we are a very general finite *element* code.
I have also tried to implement and use a DG method for compressible flows (in the Navier-Stokes module), and I saw an extreme time difference there too. For the Sod problem, MOOSE was 20 times slower than a pure DG code that also loops over faces and elements.

--
Best regards, Victoria


Victoria Korchagova

Jun 22, 2020, 11:07:55 AM
to moose-users
The classical finite element method (LAGRANGE, FIRST) in MOOSE gives me 21 s for the 1D advection equation - less than DG in MOOSE, but still too slow in comparison with OpenFOAM.

BTW, the 1D DG method and the 1D FVM give the same numerical scheme on a uniform mesh, so the computation time should be the same too... Which operations in MOOSE could be so heavy? Where should we look for the slowdown? Is it the assembly of the global matrix, or something else?

--
Best regards, Victoria


Alexander Lindsay

Jun 22, 2020, 11:21:11 AM
to moose...@googlegroups.com
We would need to profile it to know exactly. If your application is online we can profile it for you, otherwise you can look into profiling here: https://mooseframework.org/application_development/profiling.html.

Some things that might make MOOSE slow:

1) "too many" quadrature points, although for a constant monomial our default quadrature rule should only produce one quadrature point, so that hopefully isn't it
2) We form the global matrix (or for a "lumped" setting, diagonal vector) at every time step. We are planning to add an option to only compute the matrix once if the matrix is supposed to be constant. Is that the case for you?
3) Hopefully we are being smart and not reinit'ing the finite element basis every time we move to a new element for a constant monomial, but I can't guarantee that that is the case. A profile would show us this.
4) Auxiliary variables can slow things down; that may or may not be the case for you.

Anyway there is a lot of room for speculation but a profile would answer these questions. We have a similar thread about explicit time integration performance over here: https://groups.google.com/forum/#!topic/moose-users/TkZnHfrYXds. I think we would really like to improve our explicit performance as it seems our user base is exploring those options a lot more recently; we have generally been focused on implicit simulations in the past.

Thanks for your posts!

Alex



Sebastian Schunert

Jun 22, 2020, 12:15:10 PM
to moose...@googlegroups.com
Victoria,

> BTW, 1D DG method and 1D FVM method give the same numerical schemes for the uniform mesh, therefore, the time of computation should be the same too

Why would you think this is generally true? DG generalizes to different shape-function orders; FVM does not. I would argue your statement is generally wrong unless you are really clever about laying out your code so that it streamlines to FVM when the expansion order is 0.

That being said, careful benchmarking of DG is in order. OpenFOAM has always focused on fluids and the related methods (FVM), while MOOSE's focus has always been CFEM.

Victoria Korchagova

Jun 22, 2020, 1:37:22 PM
to moose-users
Many thanks for the detailed answer!

How do I put my application online, and how do I find gperftools in MOOSE? I tried to profile the application just now, but I couldn't get any .prof files and I couldn't find gperftools in the git repo.



Victoria Korchagova

Jun 22, 2020, 1:44:28 PM
to moose-users
Not generally - only for a constant approximation. We can regard the FV method as a particular case of the DG method, can't we? When we use only one basis function (phi = 1) in the DG method, we get a piecewise-constant approximation of the solution... like in the FV method.
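Concretely, on an element $K$ with the single basis/test function $\phi \equiv 1$, the DG weak form reduces to

  $|K| \, \frac{d\bar{u}_K}{dt} + \sum_{f \in \partial K} \hat{F}(\bar{u}_K, \bar{u}_{K_f}) \cdot \mathbf{n}_f \, |f| = 0,$

which, with an upwind numerical flux $\hat{F}$, is exactly the first-order finite-volume update on the same mesh.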


Alexander Lindsay

Jun 22, 2020, 1:59:52 PM
to moose...@googlegroups.com
You can create an empty repository on github and then push up your local repository.

Your OS may have a package available for gperftools. Ubuntu, for example, has the `libgoogle-perftools-dev` package that you can install with `sudo apt install libgoogle-perftools-dev`. Otherwise you could build it by hand from https://github.com/gperftools/gperftools. You would probably then also want to `go get pprof`, which is the binary you would use to inspect the profile generated while running MOOSE. So yeah, it's kind of complicated to get set up... :-( We used to include it in our MOOSE environment package, but since our transition to conda we haven't come up with a viable means to easily distribute it to our users. Hopefully we can get that done in the not too distant future.

Alex


Sebastian Schunert

Jun 22, 2020, 2:16:22 PM
to moose...@googlegroups.com
> Not generally - only for a constant approximation. We can look at the FV method as the particular case of DG method, can't we? When we use only one basis function (phi = 1) in DG method, we get a piecewise constant approximation of solution... like in FV method. 

I do not disagree that FVM is a particular case of DG. What I disagree with is the claim that because the two numerical methods are equal (or, more exactly, reduce to the same method for constant shape functions), the runtimes of the two implementations in question should be the same as well.

An example: FVM does not use a quadrature rule to compute element integrals; it simply computes the average times the volume (or, if you will, a very low-order quadrature). DG may still use a different quadrature, e.g. if you define an AuxVariable in MOOSE that is linear Lagrange, you trigger a higher-order quadrature even in the DG residual loop.
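(If that turns out to be the culprit, the quadrature order can also be pinned down explicitly; a minimal sketch, assuming the Quadrature sub-block is added inside the existing [Executioner] block:)

[Executioner]
  # ... existing Transient settings ...
  [Quadrature]
    type = GAUSS
    order = CONSTANT   # one point per element/face, which is all DG0 needs
  []
[]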

Another example: in 2D/3D, MOOSE supports hanging nodes, and that requires some plumbing to compute numerical fluxes across "split" faces. That plumbing turns out to be pretty expensive. For FVM the problem is much more trivial: basically the average on one side times the area of that portion of the face is your numerical flux. However, MOOSE does not implement the case of constant shape functions separately and goes through the same rigor as if you were using higher-order shape functions.

Just because the numerical methods coincide in a special case, the implementations may still be very different, and therefore the runtimes don't have to be the same.

Derek Gaston

Jun 22, 2020, 4:52:05 PM
to MOOSE
(I wrote this the other day - but apparently never hit send!  It's fairly comprehensive, so even though others may have answered some of this now I'm sending it anyway in case it fills in some gaps!)

Victoria,

There are many things going on here - we'll need your help to get to the bottom of it.  Firstly, are you using a DG method in OpenFOAM - or a finite volume method?  DG in MOOSE is quite a bit different from a traditional finite-volume implementation: it still uses quadrature rules and evaluates shape functions (yes, even in 1D with CONSTANT MONOMIALs) - two things you don't need to do in a low-order finite-volume code.

In fact, recognizing this, we recently added a "true", low-order finite-volume capability to MOOSE.  You can read a bit about it here: https://mooseframework.org/syntax/FVKernels/index.html .  Now, that system is very new (just merged within the last couple of months!), so there is still a lot of optimization to do, and I wouldn't expect it to keep up with a mature FV code at this point.
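A rough sketch of what that system looks like for this advection problem (object names taken from the FVKernels documentation linked above; the system is brand new, so treat the exact syntax as illustrative):

[Variables]
  [v]
    type = MooseVariableFVReal   # cell-centered, piecewise-constant FV variable
  []
[]

[FVKernels]
  [time]
    type = FVTimeKernel
    variable = v
  []
  [advection]
    type = FVAdvection
    variable = v
    velocity = '1 0 0'
  []
[]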

Next, "ExplicitEuler" is not a "traditional" explicit implementation.  It is still forming a matrix and doing a linear solve for each timestep.  It is more of an "academically correct" implementation that technically performs explicit time integration - but it's not how most explicit codes are coded.

To get closer to what is probably in OpenFOAM you'll want to use ActuallyExplicitEuler: https://www.mooseframework.org/source/timeintegrators/ActuallyExplicitEuler.html ... more information can be found in the base class for it here: https://www.mooseframework.org/source/timeintegrators/ExplicitTimeIntegrator.html .  With that TimeIntegrator you can choose to use a "lumped" matrix to avoid the need to do a linear solve.  One capability not yet in that time integrator is the ability to hold the matrix constant (i.e., only compute it once), though we have looked recently at adding that.

Finally, let me just say a few "high-level" things about MOOSE performance:

1. MOOSE was originally targeted at solving _implicit_ multiphysics problems.  Explicit is still fairly new to us.
2. The point of MOOSE is multiphysics.  We never really intend to compete on speed for single-physics problems (although we should ideally be close!).  We aim to be fast at _development_ of complex multiphysics applications - and utilize state-of-the-art solving techniques for getting answers to them.
3. MOOSE is designed to be flexible:
  - Extremely modular architecture with high amounts of code reuse.
  - We do CG, DG and FV in 1D, 2D and 3D with coupling between fields solved with different discretizations.
  - CG and DG (and eventually FV) can utilize MANY types of shape functions and quadrature rules.
  - Completely generic mesh adaptivity
  - Parallelism is inherent - with both shared and distributed memory parallelism working without the users thinking about it.

All of this flexibility will show up as a cost _somewhere_.  Compared to a hard-coded, purpose-built application that only solves one physical phenomenon, a MOOSE-based code doing the same thing may use more memory or take somewhat more compute time.  The tradeoff is that you actually _can_ create these applications and make them work in a reasonable amount of time :-)

So: even if OpenFOAM may be faster at solving a pure advection problem in 1D... it still might not be the right choice for your project if you're eventually going to be trying to solve something quite a bit more complicated.

I hope that helps!

Derek



Victoria Korchagova

Jun 24, 2020, 9:55:51 AM
to moose-users
Good afternoon!

I'm very happy to read all the answers! Yes, I understand that different implementations give different computation times. My main goal is to implement a DG scheme for compressible flows. When I found the extreme slowdown of my implementation in MOOSE, I tried to solve a simple problem (advection) that can be set up without any additional code. I saw that even this simple problem is much slower than my in-house DG code or OpenFOAM, for example. Therefore, I would like to find the slow parts of the MOOSE code, because I want to accelerate my DG implementation for compressible flows, and I see that the slow part may be somewhere in the MOOSE core.
 
I've built gperftools from the GitHub sources. Unfortunately, the "png" command did not work on my HPC cluster, so I attach the profiling results in text format.
The application for the advection problem was generated automatically with the MOOSE script. I used only built-in MOOSE tools for this problem.
Three results of profiling are attached here:
* 1D advection, classical finite element;
* 1D advection, constant monomial;
* 2D advection, constant monomial.
And the other two files are the input data for 1D advection and 2D advection problems.

Also, I've tried to compile the latest version of MOOSE, but I get some compilation errors. Should I rebuild libMesh and PETSc too? My computations were performed with the May 1st commit from GitHub (commit 450966f5394eb778b52e1021d34136d4bacb4a0e).

--
Best regards, Victoria


prof_2d_advection_CONSTANT_MONOMIAL
prof_1d_advection_FIRST_LAGRANGE
prof_1d_advection_CONSTANT_MONOMAIL
2d_advection_dg.i
1d_advection_dg (1).i

princessandsuperman

Jun 26, 2020, 11:17:03 AM
to moose-users
Do you need higher-order DG?  If not, then you could try the FV code/support in MOOSE.  Those profiles are not surprising - a lot of time is spent initializing shape functions that are just unity for constant-monomial DG - all of which the FV implementation skips.  Also, you should definitely expect an in-house, purpose-built code to achieve better (single-machine) performance than MOOSE.  The cost of generalizing things and supporting coupling of arbitrary physics together, etc., is not negligible.  But with MOOSE you can scale your problem up on big machines with confidence.

Victoria Korchagova

Jun 29, 2020, 12:41:02 PM
to moose-users
Good afternoon,

I would like to implement a high-order DG method based on the MOOSE framework. I have a 2D implementation in my in-house code; my idea is to extend these ideas to 3D problems on top of a popular international framework. I have a first version of the 3D implementation; the next step is to understand which functions in MOOSE can be slow, and whether it is possible (or easy) to change these functions to accelerate the overall algorithm. I understand that all this generality has some cost; how can I find out what percentage of the time each generalizing function costs, what their names are and where they live in the code, and whether it is possible to deal with these functions in my application without changing the general MOOSE or even libMesh code?

--
Best regards, Victoria


Yaqi Wang

Jun 29, 2020, 1:20:04 PM
to moose...@googlegroups.com
We need to start a new thread on this. The residual evaluation with DG can be a little slower than without DG kernels, but the solve time should be fine. MOOSE even allows you to write your own solver (executioner).

I guess I forgot to send out my earlier email: is the problem size too small for a performance study? Because it is so small, other minor overhead, like doing output at each time step, can show up and make MOOSE run slower. If you make the problem 10,000 cells, what do you see?
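(A quick way to rule out the output overhead is to write results only every N steps; a sketch, assuming the `interval` parameter of the Outputs system:)

[Outputs]
  [out]
    type = Exodus
    interval = 100   # write every 100th time step instead of every step
  []
[]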


Victoria Korchagova

Jun 30, 2020, 6:31:34 AM
to moose-users
Hello, Yaqi,

Thanks for the good idea!
I've run the advection problem with 10,000 cells and 1,000 time steps.
The total computation time for MOOSE DG is still very large:
  • 1D problem, FIRST LAGRANGE elements -- 61 s
  • 1D problem, CONSTANT MONOMIAL elements -- 195 s
  • 2D problem, CONSTANT MONOMIAL elements -- 315 s
  • OpenFOAM -- 8 s
  • 2D in-house DG -- 50 s
Results of profiling are attached here (1D FEM, 1D const monomial and 2D const monomial). 

--
Best regards, Victoria



prof_1d_advection_CONSTANT_MONOMAIL_10k
prof_2d_advection_CONSTANT_MONOMIAL_10k
prof_1d_advection_FIRST_LAGRANGE_10k

Daniel Schwen

Jun 30, 2020, 10:30:48 AM
to moose-users
Victoria,
The number of time steps is not a good way to compare. You'll have to compare runs to a specific simulation time while staying below an error bound.
Daniel


Victoria Korchagova

Jun 30, 2020, 10:42:01 AM
to moose-users
Hello, Daniel,

Do you mean using the same t_end in the tests? I have that: the same time-step size and the same number of steps in all cases produce the same value of t_end.
--
Best regards, Victoria


Alexander Lindsay

Jun 30, 2020, 11:22:04 AM
to moose...@googlegroups.com
That's good that t_end is consistent.

How about 1,000,000 cells for 1-10 time steps?


Alexander Lindsay

Jun 30, 2020, 11:30:00 AM
to moose...@googlegroups.com
MOOSE is designed with large scale problems in mind, so if we compare poorly at the large scale, then we have legitimate concern.

For large problems we recommend roughly 20,000 degrees of freedom per process in order to get the most bang for your buck in terms of parallel communication. So if you were to run a 1,000,000 cell problem (with just scalar transport and the original one variable problem you shared, so 1 dof per cell), then we'd recommend approximately 50 processes.

Also as Daniel said, and I'm sure you've considered this, make sure that your solve tolerance is the same across the different cases you're running. MOOSE by default solves pretty tightly (1e-8).
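For reference, these are the Executioner parameters to check or loosen when matching tolerances with the OpenFOAM case (the values below are placeholders, not recommendations):

[Executioner]
  type = Transient
  nl_rel_tol = 1e-6    # nonlinear relative tolerance (MOOSE default is 1e-8)
  nl_abs_tol = 1e-10   # nonlinear absolute tolerance
  l_tol = 1e-4         # linear (Krylov) relative tolerance
[]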

Yaqi Wang

Jun 30, 2020, 11:35:27 AM
to moose...@googlegroups.com
With DG, you cannot use the FIRST LAGRANGE shape functions, because they are enforced to be C0. More than 20 times slower is indeed a problem. Can you attach your input here? I'd like to take a quick look.
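(For a genuinely discontinuous higher-order variable you would use an elemental family instead; a minimal sketch:)

[Variables]
  [u]
    family = MONOMIAL   # or L2_LAGRANGE; both are discontinuous, so valid with DGKernels
    order = FIRST
  []
[]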

Victoria Korchagova

Jun 30, 2020, 11:50:01 AM
to moose-users
Yes, you can find the input files attached here.
Note that we do not use DG kernels in the classic FEM cases.

1d_advection_fem.i
1d_advection_dg.i

Yaqi Wang

Jun 30, 2020, 2:39:51 PM
to moose...@googlegroups.com
The 1D input looks fine to me. You do not have BCs, but I guess for this particular piecewise function we never hit a point where having BCs matters. I am not sure whether we have all the optimizations for constant monomials in place. I am afraid that even if further optimization is possible, it will require some nontrivial effort. If you can make contributions, that would be great, since everyone else could potentially benefit from them. On the other hand, you can use more processors to reduce the wall time. This is arguable, because other codes can do the same. You will have to think about why you want to use MOOSE. MOOSE's strength is multiphysics coupling, which typically means certain optimizations for single physics are either inapplicable or not worthwhile in a multiphysics scenario. MOOSE can also reduce development time and is a good tool for collaboration. Others may have more suggestions for you.

Best,
Yaqi


Victoria Korchagova

Jun 30, 2020, 3:06:44 PM
to moose-users
Yes, it's interesting :)
Results for 1e6 cells and 10 steps (1 core) are the following:
  • 1d advection, FIRST LAGRANGE -- 74 s
  • 1d advection, CONSTANT MONOMIAL -- 218 s
  • 2d advection, CONSTANT MONOMIAL -- 355 s 
  • OpenFOAM -- 40 s
  • 2d in-house DG -- 115 s
Profiling results for MOOSE cases are attached here too.

prof_1d_advection_CONSTANT_MONOMAIL_1kk.txt
prof_2d_advection_CONSTANT_MONOMIAL_1kk.txt
prof_1d_advection_FIRST_LAGRANGE_1kk.txt

Aleksander Dubas

Jun 30, 2020, 3:09:15 PM
to moose-users
Could you share the fvSchemes and fvSolution file you are using to compare in OpenFOAM?

Many Thanks in Advance,

Aleks


Yaqi Wang

Jun 30, 2020, 3:38:51 PM
to moose...@googlegroups.com
Oh, by the way: with 10,000 elements and 1,000 time steps, it takes about 100 s on my laptop and 36 s with 4 processors.

Victoria Korchagova

Jun 30, 2020, 4:37:56 PM
to moose-users
Good evening, Aleks,
Yes, of course - fvSchemes and fvSolution are attached.

Yaqi, you may have a new, fast machine. I used an HPC cluster for my computations.

--
Best regards, Victoria


fvSchemes.txt
fvSolution.txt

Jed Brown

Jun 30, 2020, 5:00:47 PM
to Yaqi Wang, moose...@googlegroups.com
Yaqi Wang <yaqi...@gmail.com> writes:

> Oh, btw, 10,000 elements with 1000 time steps, it takes about 100s on my
> laptop and 36s with 4 processors.

This is for 1D advection, comparing DG with 2nd order (limited) FV? The
above takes about 1 second in PETSc src/ts/tutorials/ex9.c (using
minmod, MC, or a similar limiter). The naive code in that tutorial did
not auto-vectorize, and could be made several times faster with some
coaxing.

For high-order unstructured 3D problems (such as diffusion or
elasticity, explicit or cost per Krylov iteration), we can do 1000 time
steps in 1 second with 1-2M dofs on one multicore node (or one GPU).
You can see some performance profiles here (scroll down to the figure
and click to select)

https://nbviewer.jupyter.org/github/CEED/libCEED/blob/altair/benchmarks/postprocess_altair.ipynb

This is in no way competing with MOOSE, just demonstrating that both
MOOSE and OpenFOAM have a lot of headroom if you're running
stripped-down performance experiments.

Yaqi Wang

Jun 30, 2020, 5:34:52 PM
to Jed Brown, moose...@googlegroups.com
This is a simple DG0-with-upwinding test. I used Instruments and saw that the hot spot is FEProblemBase::reinitNeighbor (76% of the time), which should not be that high and can be optimized away. We knew that function is not optimized for general DG and planned to do something about it, if I remember correctly. Element init can also be done faster for DG0. We could probably make it 5 times faster for DG0 fairly easily; for high-order DG it will likely take much more work. With MOOSE we have to do a residual evaluation per time step (which contains all the things I mentioned), and that takes 96% of the total CPU time; for this simple problem we actually do not need all of that. So this is not really a fault of MOOSE.

Aleksander Dubas

Jul 3, 2020, 9:42:09 AM
to moose-users
Thank you for providing the files. I find this comparison interesting, as it is challenging to get a like-for-like comparison between two FV codes, let alone across FV/DG-FE.  I believe you are getting a more accurate result with MOOSE compared with your OpenFOAM setup; it would be interesting to see this comparison scaled up to 3D with a more realistic geometry.

Which OpenFOAM and MOOSE versions are you using, and how did you install them?

Many Thanks,

Aleks

Victoria Korchagova

Jul 3, 2020, 12:11:25 PM
to moose-users
Hello, Aleks,

I have OpenFOAM v1912 installed from source and the MOOSE May 1st commit from GitHub (commit 450966f5394eb778b52e1021d34136d4bacb4a0e, branch master). I've tried to update the branch, but I get compilation errors (maybe I should recompile libMesh and PETSc too?).

--
Best regards, Victoria


Daniel Schwen

Jul 3, 2020, 12:50:13 PM
to moose-users
Yeah, you need to update at least libmesh.
