Hi there,

I ran into a problem with the MOOSE app that I am using on a Linux cluster.

Background information:
- I am simulating deep geothermal reservoirs: Darcy flow coupled with heat transport (fault zones as discrete features, different permeability zones, production and injection wells)
- MOOSE app: Golem, developed by the guys at GFZ Potsdam, Germany
- I generated my mesh with MeshIT (also from GFZ)
- I tested and simulated several different scenarios with this mesh file and Golem successfully on my workstation
- I managed to install MOOSE and Golem on a Linux cluster (got some help from the support guys there)

I then took a simulation that I had successfully finished on my workstation and copied it over to the cluster to test the scaling. I ran about twenty jobs with exactly the same simulation and changed only the number of compute nodes/number of cores.

Here I discovered that from time to time the simulation failed with the following output:

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 5334 RUNNING AT mpp2r04c05s02
=   EXIT CODE: 9
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

I noticed that when I just resubmitted the job without changing the settings, the simulation finished without the error maybe one or two tries later.

I read that this error could indicate that I am running out of memory, but I am wondering why I get it only sometimes. The error also happened regardless of the number of nodes (28 cores each), and it occurred at two different points in time: at the beginning of the simulation, or at the end of a simulation (after the solve converged) before the output was written to the terminal.
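For reading the "EXIT CODE: 9" line, here is a minimal Python sketch. It assumes the MPI launcher reports the number of the terminating signal as the exit code, which is a common reading of this banner but is not confirmed for this cluster. The helper name describe_exit_code is hypothetical and not part of MOOSE or Golem. On Linux, signal 9 is SIGKILL, which is also what the kernel's out-of-memory killer sends, so the result is consistent with the memory suspicion but does not prove it.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map an exit code to a signal name where that interpretation is possible.

    Assumption: the code is either a raw signal number (as some MPI
    launchers print it) or the shell convention 128 + signal number.
    """
    # Shells report "killed by signal n" as 128 + n; undo that offset.
    signum = code - 128 if code > 128 else code
    try:
        return signal.Signals(signum).name
    except ValueError:
        # Not a valid signal number on this platform.
        return f"exit status {code} (not a signal number)"


if __name__ == "__main__":
    print(describe_exit_code(9))    # SIGKILL on Linux
    print(describe_exit_code(137))  # 128 + 9, also SIGKILL on Linux
```

If the code does map to SIGKILL, the cluster's job accounting or the kernel log on the affected node would be the place to check whether the process was killed for exceeding memory.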
To me this sounds like quite a serious and annoying bug. I added a note to the GitHub issue.
From: moose...@googlegroups.com <moose...@googlegroups.com> on behalf of Tim Struppi <heroes....@gmail.com>
Sent: Friday, 13 October 2017 10:24 PM
To: moose-users
Subject: Re: Simulation occasionally crashes on cluster (EXIT CODE: 9)
Hi there,

I was able to figure out that my error comes from having a postprocessor in my input file.

I was playing around with a new, pretty simple model (one that just solves for Darcy flow in a porous medium). I am sure that this model does not need much memory, yet it failed every time I ran it on the cluster, independent of the number of cores/nodes that I was using. After removing my postprocessor completely, it worked nicely, even when utilizing many nodes on the cluster.

Here is my postprocessor block:
[Postprocessors]
  [./FoerderDruck]
    type = PointValue
    point = '14200 15000 0'
    variable = pore_pressure
    enable = true
  [../]
[]
I found the following report on GitHub, where it looks like this might be a bug related to the fact that there is no block number specified in this postprocessor: https://github.com/idaholab/moose/issues/9889

My problem is that I rely heavily on this postprocessor. I need the pressure evolution at this point as a final result in the form of a .csv file.

So my question is whether there is a workaround, or whether someone has an idea how to fix this.

If needed, I can provide my input file and my mesh.
Thanks for any help!
Greetings from Munich
Florian
--
You received this message because you are subscribed to the Google Groups "moose-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to moose-users...@googlegroups.com.
Visit this group at https://groups.google.com/group/moose-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/moose-users/fadcb735-e1f2-42c8-adeb-ac693718fb1d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Is there any news on this? I am still struggling with the problem. On GitHub I noticed that something happened, but I'm not sure whether this problem got fixed. Do I need to compile MOOSE from the devel branch?
On Monday, 3 July 2017 at 17:56:41 UTC+2, Tim Struppi wrote: