masking of unused compute nodes


martin....@io-warnemuende.de

Apr 6, 2013, 4:06:13 AM4/6/13
to mom-...@googlegroups.com
One of the strong mpp infrastructure features for ocean/ice-only applications in mom4 was the masking of
unused compute nodes. In some cases this saves up to half of the compute costs. The generation of mask lists was well supported by
the preprocessing tool.

Unfortunately, this feature is now broken or undocumented. The corresponding namelist elements in coupler_nml have been removed, although they are still described in the user's guide.

I have found some hints that masking was moved into the fms code. However, I could not find a description of how to use it.

Could someone please help with a little example?

Many thanks,
Martin


Niki Zadeh

Apr 6, 2013, 3:42:57 PM4/6/13
to mom-...@googlegroups.com
Hi Martin,

The masking is now specified in a table in the INPUT/ directory, and the table name appears as a namelist item for the components that need it. Below is an example of how to make it work for the baltic1 experiment (on 40 PEs). If you are not familiar with GFDL
xml syntax, you can get the experiment inputs from the ftp site:

wget ftp.gfdl.noaa.gov:/perm/MOM4/mom4p1_pubrel_dec2009/exp/baltic1_withMask.input.tar.gz

and compare it with:

wget ftp.gfdl.noaa.gov:/perm/MOM4/mom4p1_pubrel_dec2009/exp/baltic1.input.tar.gz
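Once you have both tarballs, a quick way to see exactly what the masking adds is to unpack them and diff the two input sets. This is only a sketch, and it assumes each tarball unpacks into a directory named after the experiment:

tar xzf baltic1.input.tar.gz
tar xzf baltic1_withMask.input.tar.gz
diff -r baltic1 baltic1_withMask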

Hope this helps,

Niki


   <experiment name="baltic1_withMask" inherit="baltic1">
      <input>
         <namelist name="ocean_model_nml">
      dt_ocean = 1200
      vertical_coordinate='zstar'
      time_tendency='twolevel'
      impose_init_from_restart=.false.
      baroclinic_split=1
      surface_height_split=1
      barotropic_split=12
      debug=.false.
      layout = 6,11
      mask_table='INPUT/mask_table'
         </namelist>


          <namelist name="ice_model_nml">
       nsteps_dyn=72
       nsteps_adv=1
       num_part=6
       spec_ice=.false.
       ice_bulk_salin=1.0e-7
       alb_sno=0.80
       t_range_melt=10.0
       heat_rough_ice=5.0e-4
       mom_rough_ice=5.0e-4
       wd_turn=0.0
       slp2ocean=.true.
       do_ice_limit=.false.
       max_ice_limit=3.0
       do_sun_angle_for_alb=.true.
       layout = 6,11
       mask_table='INPUT/mask_table'
         </namelist>

         <namelist name="land_model_nml">
      layout = 6,11
      mask_table='INPUT/mask_table'
         </namelist>
         <namelist name="atmos_model_nml">
      layout = 6,11
      mask_table='INPUT/mask_table'
         </namelist>

         <csh><![CDATA[
cd $work/INPUT
cat > mask_table << mask_EOF
 26
 6 , 11
 5,1
 6,1
 5,2
 6,2
 6,3
 6,4
 2,5
 2,6
 1,7
 2,7
 1,8
 2,8
 5,8
 6,8
 1,9
 2,9
 5,9
 6,9
 1,10
 2,10
 3,10
 6,10
 1,11
 2,11
 3,11
 6,11
mask_EOF
    ]]></csh>

      </input>

      <runtime>
         <regression name="basic">
            <run days="8" npes="40" runTimePerJob="00:40:00"/>
         </regression>
         <regression name="rts">
            <run days="4 4" npes="40" runTimePerJob="00:40:00"/>
         </regression>
         <regression name="trapnan">
            <run days="1" npes="40" runTimePerJob="02:00:00"/>
         </regression>
          <reference restart="$(REFERENCE)/1x0m8d_40pe/restart/19900910.$(TAREXT)"/>
      </runtime>
   </experiment>
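For readers new to this file: the mask_table generated above is just a plain-text list. Annotated here for illustration only (the annotations are not part of the file format):

 26        <-- number of masked domains (nmask)
 6 , 11    <-- processor layout, matching layout = 6,11 in the namelists
 5,1       <-- (i,j) index of each masked domain, one pair per line
 ...

The run then needs layout(1)*layout(2) - nmask PEs: here 6*11 - 26 = 40, which matches npes="40" in the runtime section.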







Stephen Griffies

Apr 7, 2013, 6:25:23 PM4/7/13
to mom-...@googlegroups.com
MOM users: 

The forwarded email below is related to this query...

Stephen Griffies 




Hi Martin,

   We updated the mask-domain option so that it can be used in a fully coupled model and can still be used for the sea-ice model. For the case where ocean, atmosphere, ice and land
all use the same grid, the following needs to be done:
1) Use the tool check_mask to create the mask_table. The tool is in tools/check_mask.
   The mask_table will contain the layout, the number of masked processors and the list of
   masked processors.
2) Set the variable mask_table in ocean_model_nml, atmos_model_nml, land_model_nml and
   ice_model_nml to the file created in step 1). Remember to add the prefix
   INPUT/ ahead of the file name. Also copy the mask table to INPUT.

   Normally we run a fully coupled model in concurrent mode (ocean model on a
   separate pelist). In that case, just set mask_table in ocean_model_nml and only the
   ocean model uses the mask-domain option.
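For the concurrent case, the only change needed would be something along these lines. This is just a sketch; the layout and file name are the baltic1 values from Niki's example above and will differ for your own setup:

 &ocean_model_nml
      layout = 6,11
      mask_table = 'INPUT/mask_table'
 /

The other components keep their usual layouts and simply omit mask_table.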

  Please do not hesitate to ask me if you have further questions.

Zhi


You may use the tool check_mask to create the mask_table.



Martin Schmidt

Apr 18, 2013, 2:14:05 AM4/18/13
to mom-...@googlegroups.com
Many thanks Niki and Zhi,
it works as described.
Cheers,
Martin

Mariona Claret

Jun 21, 2017, 7:30:07 PM6/21/17
to MOM Users Mailing List
Hi all,

  I am running the om3_core1 test case (ocean/ice coupled model with prescribed atmospheric forcing) using MOM5. I'm trying to save some computational time by masking unused land nodes. I've followed the instructions given in the follow-ups of this post; however, I'm missing something, since I cannot get the model running.

  I've done the following. First, I created a mask_table.34.22x16 file using the preprocessing tool check_mask (see the PS for what it looks like). Then I modified input.nml as follows:

&ice_model_nml
layout = 22,16
mask_table='INPUT/mask_table.34.22x16'

 &ocean_model_nml
layout = 22,16
mask_table='INPUT/mask_table.34.22x16'

Then, in the MOM_SIS_run.csh script, I set npes=318, which is 22x16-34. However, when I run I get the following error:

MPP_DEFINE_DOMAINS2D: incorrect number of PEs assigned for this layout and maskmap. Use      352 PEs for this domain decomposition for diamond

If I use npes=352 instead then the error is:

FATAL from PE     0: fms_io(parse_mask_table_2d): mpp_npes() .NE. layout(1)*layout(2) - nmask for Ice model

or

FATAL from PE     0: fms_io(parse_mask_table_2d): mpp_npes() .NE. layout(1)*layout(2) - nmask for ocean model


  Any help/insight would be much appreciated.
 
  Thanks in advance,
Mariona

PS: my mask_table.34.22x16 file looks like this:

34
 22 , 16
 1,1
 2,1
 3,1
 4,1
 21,1
 14,5
 14,6
 14,7
 14,8
 19,8
 19,9
 18,10
 19,10
 18,11
 19,11
 1,12
 2,12
 22,12
 1,13
 2,13
 11,13
 12,13
 21,13
 22,13
 1,14
 2,14
 3,14
 11,14
 21,14
 22,14
 1,15
 2,15
 22,15
 1,16

Russ Fiedler

Jun 21, 2017, 8:23:40 PM6/21/17
to mom-...@googlegroups.com

Hi Mariona,

For the first error message, "diamond" refers to the handling of icebergs. It looks like this part of the code doesn't support processor masking.

You'll need to set do_icebergs=.false. in ice_model_nml if you want to use masking.
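In other words, something like this in input.nml, keeping whatever layout/mask_table settings you already have (the values below are just the ones from your own excerpt):

&ice_model_nml
layout = 22,16
mask_table='INPUT/mask_table.34.22x16'
do_icebergs=.false.
/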

Cheers,
Russ

Mariona Claret

Jun 22, 2017, 2:17:45 AM6/22/17
to MOM Users Mailing List
Hi Russ,

  Thanks for your swift reply and help, it works.

Have a good day,
Mariona

Mariona Claret

Jun 22, 2017, 6:36:58 PM6/22/17
to MOM Users Mailing List
Hi again,

  After setting do_icebergs=.false. I'm running into a segmentation fault. When I backtrace the error I get the following message:

Image              PC                Routine            Line        Source            
fms_MOM_SIS.x      0000000007F74872  mpp_domains_mod_m         146  mpp_do_global_field.h
fms_MOM_SIS.x      0000000007F612A9  mpp_domains_mod_m          37  mpp_global_field.h
fms_MOM_SIS.x      0000000007F60DDF  mpp_domains_mod_m          16  mpp_global_field.h
fms_MOM_SIS.x      00000000028D3826  ocean_bihgen_fric        1569  ocean_bihgen_friction.F90
fms_MOM_SIS.x      00000000027F49DD  ocean_bihgen_fric         723  ocean_bihgen_friction.F90
fms_MOM_SIS.x      000000000278F73C  ocean_bih_frictio         219  ocean_bih_friction.F90
fms_MOM_SIS.x      00000000005587C4  ocean_model_mod_m        1305  ocean_model.F90
fms_MOM_SIS.x      000000000051D3B0  coupler_main_IP_c        1325  coupler_main.F90
fms_MOM_SIS.x      000000000050FC8E  MAIN__                    428  coupler_main.F90
fms_MOM_SIS.x      000000000040BD16  Unknown               Unknown  Unknown
libc.so.6          00002AD1F7FC6D1D  Unknown               Unknown  Unknown
fms_MOM_SIS.x      000000000040BC09  Unknown               Unknown  Unknown
forrtl: severe (408): fort: (3): Subscript #1 of the array LIST has value -3 which is less than the lower bound of 0

  The issue seems to be in bih_friction? My settings for this are:

 &ocean_bih_friction_nml
      bih_friction_scheme='general'
/

  Any help on what I'm missing would be much appreciated.

Thanks,
Mariona

Russ Fiedler

Jun 22, 2017, 10:03:45 PM6/22/17
to mom-...@googlegroups.com

Hi,

How did you set your halos for the masking? Biharmonic friction requires a halo of 2, but the default is 1.


  !--- namelist interface
  integer :: min_pe=4,max_pe = 128
  integer :: halo = 1
  logical :: show_valid_only = .false.
  namelist /check_mask_nml/ min_pe, max_pe, halo, show_valid_only
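So a re-run of the tool with a larger halo would use something like the following. This is only a sketch: how the namelist is actually supplied depends on how check_mask is built and driven at your site, and the min_pe/max_pe values here are illustrative (they just need to cover your intended PE counts):

  &check_mask_nml
     min_pe = 4
     max_pe = 512
     halo = 2
  /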

This means that you may have masked too many tiles.

Try creating the mask again, changing halo to 2 (as in the sketch above). Hopefully that will work. However...

Ah, it's in the ncar_boundary_scale_create routine. I remember that Aidan Heerdegen had problems with this a few years back. I think the solution was to create the file ahead of time and use the ncar_boundary_scaling_read=.true. option for subsequent runs.


Found the thread!


https://groups.google.com/forum/#!topic/mom-users/BSAnyDX6K8o


So,

! <SUBROUTINE NAME="ncar_boundary_scale_read">
!
! <DESCRIPTION>
!
! Read in the 3d ncar boundary scaling field and use this to
! rescale the background viscosities.
!
! To use this routine, we need to already have generated the field
! ncar_rescale using the routine ncar_boundary_scale_create.
!
! The advantage of reading ncar_rescale is that we do not need to
! introduce any global 2d arrays required for ncar_boundary_scale_create.
! So the idea is to pay the price once by running ncar_boundary_scale_create,
! save ncar_rescale, then read that field in during subsequent runs through
! ncar_boundary_scale_read.
!
! Here are the steps:
! 1/ run one time with ncar_boundary_scaling_read=.false.
! and ncar_boundary_scaling=.true.
! Be sure that the field ncar_rescale is saved in diagnostic table.
! To ensure answers agree whether reading ncar_rescale or creating it
! during initialization, it is necessary to save ncar_rescale using the
! double precision option in the diagnostic table (packing=1).
!
! 2/ extract field ncar_rescale from the diagnostics output
! and place into its own file INPUT/ncar_rescale.nc
! example extraction using ncks:
! ncks -v ncar_rescale 19900101.ocean_month.nc ncar_rescale.nc
!
! 3/ set ncar_boundary_scaling_read=.true.
! and ncar_boundary_scaling=.true., and now run the model
! reading in ncar_rescale rather than regenerating
! it during each initialization (which can be a bottleneck
! for large models on huge processor counts).
!
! 4/ As a check that all is fine, save ncar_rescale as a diagnostic
! for both the create and the read stage and make sure they agree.
! Also, all checksums should agree whether reading in ncar_rescale
! or creating it each initialization, so long as the ncar_rescale.nc
! was saved with double precision  (see step 1/ above).
!
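A bare-bones shell version of step 2/ above, using the file names from the documentation's own example (your diagnostic file name and date stamp will differ):

cd $work
ncks -v ncar_rescale 19900101.ocean_month.nc ncar_rescale.nc
mv ncar_rescale.nc INPUT/ncar_rescale.nc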


Your first dummy run will require all the processors.

Aidan, did you end up making a stand alone program that did this?

Russ

Aidan Heerdegen

Jun 22, 2017, 10:21:46 PM6/22/17
to mom-...@googlegroups.com
Hi Russ,

> Aidan, did you end up making a stand alone program that did this?

Nope.

It is possible (while testing) to set

ncar_boundary_scaling_read=.false.

and worry about pre-computing it later.

I don’t actually know what the computational penalty is for computing it every time, nor how it scales with model size.

Sorry, I should have remembered the trigger phrases for that problem.

Also, Mariona, be careful if you change the masking and/or processor layout at a later date: make sure you collate your restart files first.

Cheers

Aidan

Marshall Ward

Jun 22, 2017, 10:24:46 PM6/22/17
to mom-...@googlegroups.com
No comment on the current issue here, but...

I remember looking at this function, and that it did not scale well.
I think it's another case of point-to-point operations that could be
rewritten to use collectives. The function did not even run on some
platforms with less tolerant MPI libraries.

But it was a one-time call, so I did not worry much about it at the time.

Mariona Claret

Jun 23, 2017, 3:06:33 PM6/23/17
to MOM Users Mailing List
Thanks, all, for your time in replying. I see the way to go and will give it a try.
Have a good day,
Mariona