PFlotran Example Problems


Stephanie Labasan

Oct 24, 2012, 3:17:53 PM
to sc...@googlegroups.com

I can run the 5-spot and CO2 problems in shortcourse/examples on one node with no problems. When I increase the number of nodes without altering the input file, the example runs to completion, but the output is interspersed with:


(...similar errors removed….)

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5Dio.c line 228 in H5Dwrite(): not a dataset

    major: Invalid arguments to routine

    minor: Inappropriate type

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5D.c line 391 in H5Dclose(): not a dataset

    major: Invalid arguments to routine

    minor: Inappropriate type

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5D.c line 141 in H5Dcreate2(): not a location ID

    major: Invalid arguments to routine

    minor: Inappropriate type

  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID

    major: Invalid arguments to routine

    minor: Bad value

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5Dio.c line 228 in H5Dwrite(): not a dataset

    major: Invalid arguments to routine

    minor: Inappropriate type

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5D.c line 391 in H5Dclose(): not a dataset

    major: Invalid arguments to routine

    minor: Inappropriate type

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5G.c line 766 in H5Gclose(): not a group

    major: Invalid arguments to routine

    minor: Inappropriate type

HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:

  #000: H5F.c line 1991 in H5Fclose(): decrementing file ID failed

    major: Object atom

    minor: Unable to close file

  #001: H5I.c line 1450 in H5I_dec_app_ref(): can't decrement ID ref count

    major: Object atom

    minor: Unable to decrement reference count

  #002: H5F.c line 1767 in H5F_close(): can't close file, there are objects still open

    major: File accessability

    minor: Unable to close file

HDF5: infinite loop closing library

      D,G,S,T,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F

  1 fnrm: 3.32E-12 xnrm: 1.59E+09 pnrm: 2.47E+02 inrmr: 1.80E-12 inrmu: 1.48E+02 rsn:   0

  2 fnrm: 9.59E-14 xnrm: 1.59E+09 pnrm: 2.33E+01 inrmr: 6.76E-14 inrmu: 2.32E+01 rsn:   0

  3 fnrm: 1.34E-16 xnrm: 1.59E+09 pnrm: 1.94E-01 inrmr: 7.55E-18 inrmu: 1.71E-01 rsn: stol


If I run 100_100_100 in pflotran-dev/example_problems on one node, I have no problems, but if I increase the number of nodes to two, I get this:

srun: First task exited 30s ago

srun: tasks 16-19,21-22,24-29,31: running

srun: tasks 0-15,20,23,30: exited

srun: Terminating job step 1002177.0

slurmd[venus2]: *** STEP 1002177.0 KILLED AT 2012-10-24T12:07:54 WITH SIGNAL 9 ***

srun: Job step aborted: Waiting up to 2 seconds for job step to finish.

slurmd[venus2]: *** STEP 1002177.0 KILLED AT 2012-10-24T12:07:54 WITH SIGNAL 9 ***


I'm not sure what the difference is between the problems in shortcourse and those in example_problems. I believe we alter the NXYZ line when we want to change the grid resolution, but how do we know which values to use without running into errors for some grid sizes?


Sincerely, 
Stephanie Labasan
University of the Pacific

Richard Tran Mills

Oct 29, 2012, 12:40:18 PM
to sc...@googlegroups.com
Hi Stephanie,

I am not sure what the problem is, but I suspect that the problem size is
too small to be divided across your cluster nodes. The input file contains
a line like

NXYZ 30 30 1

This means your grid is 30 x 30 x 1. Increase the first two values to get a
larger grid. When you do this, I also recommend commenting out or
deleting the line

OPERATOR_SPLIT

which will turn off operator-split timestepping, which is subject to
limitations in the time step for stability reasons. With that line turned
off, fully implicit timestepping -- which does not have said stability
concerns -- will be used. I am thinking that for the purposes of the
competition, we will probably be employing fully implicit.
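
As a rough illustration of the two edits described above, here is a minimal Python sketch, not part of PFLOTRAN itself. It assumes the input file can be treated as plain text with one keyword per line. The helper names `adjust_input` and `cells_per_rank`, the sample text, and the count of 32 MPI ranks are all hypothetical and chosen only for illustration.

```python
def adjust_input(text, nx, ny, nz):
    """Return the input text with NXYZ replaced and OPERATOR_SPLIT lines dropped.

    Hypothetical helper: assumes each keyword sits on its own line.
    """
    out = []
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "OPERATOR_SPLIT":
            # Deleting this line leaves fully implicit timestepping in effect.
            continue
        if tokens and tokens[0] == "NXYZ":
            # Preserve the original indentation, replace only the grid sizes.
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}NXYZ {nx} {ny} {nz}"
        out.append(line)
    return "\n".join(out) + "\n"


def cells_per_rank(nx, ny, nz, ranks):
    """Average number of grid cells each MPI rank would own."""
    return nx * ny * nz / ranks


if __name__ == "__main__":
    # Hypothetical two-line fragment using the keywords quoted above.
    sample = "NXYZ 30 30 1\nOPERATOR_SPLIT\n"
    print(adjust_input(sample, 120, 120, 1), end="")
    # A 30 x 30 x 1 grid spread over an assumed 32 ranks is very small per rank.
    print(cells_per_rank(30, 30, 1, 32))
```

The second function only reports how thinly a grid is spread across processes. What counts as too few cells per rank is not stated in this thread, so the sketch applies no threshold.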

I need to talk to the SCC organizers to get a better idea of what kinds of
problems are appropriate for the competition. I will also be putting up a
web page in a few days with some more information on PFLOTRAN. I
apologize for not having done so already, but I was brought into the SCC
competition a bit late.

Best regards,
Richard



--
Richard Tran Mills, Ph.D.
Computational Earth Scientist | Joint Assistant Professor
Hydrogeochemical Dynamics Team | EECS and Earth & Planetary Sciences
Oak Ridge National Laboratory | University of Tennessee, Knoxville
E-mail: rmi...@ornl.gov V: 865-241-3198 http://climate.ornl.gov/~rmills
