File size about 4 times bigger after regridding to a grid of the same size - RECOMPRESS?

Peter Willetts

Dec 21, 2015, 5:58:45 AM
to Iris
Hi

I am regridding some wind PP files from 'rho' to 'theta' height levels, and onto a slightly shifted lat-lon grid.  The output has essentially the same number of data points as the input, but the file size is ~4 times bigger.  This isn't much of an issue for the file sizes I'm showing here, but I am also doing this for a higher resolution simulation, where 200G of input becomes ~900G, and because the output from subsequent computations is also ~900G, I am running out of space! I suspect this is to do with compression. I'm basically looking for a way to reduce the output file size back to around the input file size, if possible.

Not sure what else to show here at this point.

-rw-r--r--. 1 pwille cascade  19G Nov 11 12:19 2.pp
-rw-r--r--. 1 pwille cascade  76G Dec  5 02:16 2_on_theta.pp

Input:

y_wind / (m s-1)                    (forecast_period: 6; forecast_reference_time: 84; model_level_number: 70; grid_latitude: 599; grid_longitude: 600)
     Dimension coordinates:
          forecast_period                           x                           -                       -                  -                    -
          forecast_reference_time                   -                           x                       -                  -                    -
          model_level_number                        -                           -                       x                  -                    -
          grid_latitude                             -                           -                       -                  x                    -
          grid_longitude                            -                           -                       -                  -                    x
     Auxiliary coordinates:
          time                                      x                           x                       -                  -                    -
          level_height                              -                           -                       x                  -                    -
          sigma                                     -                           -                       x                  -                    -
     Attributes:
          STASH: m01s00i003
          source: Data from Met Office Unified Model
          um_version: 8.2



Output:
y_wind / (m s-1)                    (forecast_period: 6; forecast_reference_time: 84; model_level_number: 70; grid_latitude: 600; grid_longitude: 600)
     Dimension coordinates:
          forecast_period                           x                           -                       -                  -                    -
          forecast_reference_time                   -                           x                       -                  -                    -
          model_level_number                        -                           -                       x                  -                    -
          grid_latitude                             -                           -                       -                  x                    -
          grid_longitude                            -                           -                       -                  -                    x
     Auxiliary coordinates:
          time                                      x                           x                       -                  -                    -
          level_height                              -                           -                       x                  -                    -
          sigma                                     -                           -                       x                  -                    -
     Attributes:
          STASH: m01s00i003
          source: Data from Met Office Unified Model
          um_version: 8.2
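
As a quick sanity check on the two cube summaries above, the number of data points can be compared directly. This is only a sketch: the shapes are copied from the summaries, and the variable names are made up for illustration.

```python
from math import prod  # Python 3.8+

# Shapes copied from the "Input" and "Output" cube summaries:
# (forecast_period, forecast_reference_time, model_level_number,
#  grid_latitude, grid_longitude)
input_shape = (6, 84, 70, 599, 600)
output_shape = (6, 84, 70, 600, 600)

n_in = prod(input_shape)    # total data points in the input cube
n_out = prod(output_shape)  # total data points in the output cube
ratio = n_out / n_in        # growth in point count after regridding

print(n_in, n_out, ratio)
```

The point count grows only by a factor of 600/599, so the grid itself cannot account for a ~4x increase in file size.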

marqh

Dec 30, 2015, 5:53:24 AM
to Iris
Hello Peter

PP files are commonly packed using WGDOS compression.

This compression scheme is controlled by the model when the data is written.  Iris does not currently have the capability to compress the data payload when saving to PP, so you will always get uncompressed data.

You may have some success reducing the size on disk by compressing the files with a standard tool, such as bzip2 or gzip.  However, you will need to decompress a file on disk before loading it, as Iris cannot unpack a gzip- or bzip2-compressed file as a stream during load.
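
The gzip route described above could be sketched roughly as follows, using only the Python standard library. The helper names `gzip_file` and `gunzip_file` are hypothetical, not part of Iris, and how much space this saves will depend on the data.

```python
import gzip
import shutil

CHUNK = 1024 * 1024  # copy in 1 MiB chunks so large files are never held in memory


def gzip_file(src_path, dst_path):
    """Compress src_path into a gzip file at dst_path (hypothetical helper)."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, CHUNK)


def gunzip_file(src_path, dst_path):
    """Decompress a gzip file back to a plain file on disk (hypothetical helper)."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, CHUNK)


# Example usage (commented out because it needs the real files):
# gzip_file("2_on_theta.pp", "2_on_theta.pp.gz")    # archive the regridded output
# gunzip_file("2_on_theta.pp.gz", "2_on_theta.pp")  # restore it before loading
# cubes = iris.load("2_on_theta.pp")                # then load with Iris as usual
```

The decompressed copy has to exist on disk while it is being used, so this helps with archived files rather than with files that are actively being read.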

mark