speed of reading pp files


Luke Abraham

Jan 27, 2022, 8:34:29 AM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
Hello,

Are there any ways to reduce the read time of pp files? The eventual aim of this work is to do some post-processing on files during a UM simulation as part of the postproc task on the batch system, so we want to be able to read files in as quickly as possible.

One of our typical daily pp files contains 97 separate fields, each with 24 hourly time values. The fields are on a number of different vertical levels, both pressure-based and model levels.

I've been comparing the speed of reading this pp file using Iris and cf-python. I've tried two tests: reading in the whole file, and selecting a single field using the STASH code. I'm using Iris 3.1.0 and cf-python 3.12.0, both installed using conda. I think both Iris and cf load data lazily, and I believe that in cf the work is done in C rather than Python.

When reading in the whole file:
  • `iris.load` completed in 446.3398 s
  • `cf.read` completed in 10.0120 s
When selecting a single field from the file:
  • `iris.load` using `iris.AttributeConstraint(STASH='...')` completed in 46.7967 s
  • `cf.read` using `select='stash_code=...'` completed in 0.1097 s
Obviously there is a very big difference here. Are there any tricks I can use to make Iris faster? If not, is there a function to convert a cf data structure into an Iris cube?
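
For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of how the single-field timings might be taken. The helper names (`time_call`, `iris_load_field`, `cf_read_field`), the file name and the STASH values in the commented usage are placeholders for illustration, not the ones behind the numbers above. Since both libraries are thought to load lazily, this would mostly measure the time to read and parse field metadata rather than the data itself.

```python
import time


def time_call(func, *args, **kwargs):
    """Call func once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def iris_load_field(path, stash):
    """Load only the fields whose STASH attribute matches, using Iris."""
    import iris  # imported here so time_call is usable without Iris installed

    return iris.load(path, iris.AttributeConstraint(STASH=stash))


def cf_read_field(path, identity):
    """Read only the fields matching the given identity, using cf-python."""
    import cf  # imported here for the same reason

    return cf.read(path, select=identity)


# Hypothetical usage; the file name and STASH values are placeholders:
# cubes, t_iris = time_call(iris_load_field, "daily.pp", "m01s03i236")
# fields, t_cf = time_call(cf_read_field, "daily.pp", "stash_code=3236")
# print(f"iris.load: {t_iris:.4f} s, cf.read: {t_cf:.4f} s")
```

Timing each call once in a fresh Python session is probably the fairest comparison, as a second read of the same file may benefit from the filesystem cache.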

Many thanks and best wishes,
Luke

Luke Abraham

Feb 25, 2022, 6:22:37 AM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
As an update to this, a colleague of mine has developed some tools to convert a CF data structure into both an Iris cube and a Community Intercomparison Suite cube, as we need to interface UM output to CIS to allow for the co-location of variables along flight tracks.

This can be seen here:


Best wishes,
Luke

RuthC

Feb 25, 2022, 6:56:48 AM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
Hi Luke,

Depending on the structure of your files, you may be able to take advantage of `structured_um_loading`:
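
A minimal sketch of what that might look like, assuming the fields in the file are arranged regularly enough for structured loading to apply (which is not guaranteed for every pp file). The wrapper name `load_structured` and the values in the commented usage are placeholders; `structured_um_loading` itself is the context manager from `iris.fileformats.um`.

```python
def load_structured(path, stash=None):
    """Load a pp file with Iris inside the structured UM loading context.

    If stash is given (for example "m01s03i236"), only matching fields
    are returned; otherwise the whole file is loaded.
    """
    import iris
    from iris.fileformats.um import structured_um_loading

    constraint = iris.AttributeConstraint(STASH=stash) if stash else None
    # Inside this context Iris treats the file as a regular array of
    # fields instead of merging each field individually.
    with structured_um_loading():
        return iris.load(path, constraint)


# Hypothetical usage; the file name and STASH value are placeholders:
# cubes = load_structured("daily.pp", stash="m01s03i236")
```

If the file does not have the regular structure this mode expects, the resulting cubes may differ from those of a normal `iris.load`, so it is worth comparing the two on one file first.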

HTH

Ruth
