Hi NetCDF4-Python people
We are happily using NetCDF4-Python in our production system, which provides meteorological and oceanographic forecasts. Some of our operational processing is limited by both CPU and I/O (reading and writing NetCDF files), so I would like to overlap computation and I/O to reduce wall time. I am using the multiprocessing module to split the workload across multiple cores (we have 12 cores per node). As it is now, I let the worker processes read in data at will, but with a lock/semaphore so that only one process is reading at a time (with the multiprocessing module it is simply too expensive to have the master process read/write the data and pass it to the workers, since it has to be pickled for transmission). After that, the data are processed and the next chunk is read in. But I would really like to do this more efficiently, i.e. in less wall time, which also means I would like to read larger chunks while the CPU is busy.
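For reference, the pattern described above can be sketched roughly as follows. This is only an illustration, not our actual code: the netCDF4 read is replaced by a stand-in function, and the `fork` start context is assumed (Linux) so the shared lock is simply inherited by the workers.

```python
import multiprocessing as mp

ctx = mp.get_context("fork")  # fork: children inherit module globals, no pickling of the lock
read_lock = ctx.Lock()        # only one process may do I/O at a time

def read_and_process(i):
    # Stand-in for reading one chunk from a netCDF4.Dataset variable.
    with read_lock:
        data = list(range(i, i + 4))
    # Stand-in for the CPU-bound processing of that chunk.
    return sum(data)

def process_all(n_chunks, workers=2):
    with ctx.Pool(workers) as pool:
        return pool.map(read_and_process, range(n_chunks))
```

The point of the lock is that the processing part runs in parallel while the I/O part is serialized, so the disk is never hit by several readers at once.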
What I am thinking is that I could let each process create an "I/O thread" using the threading module, let that thread read data as fast as possible, and have the worker thread in that process poll it when it needs more data. Do you have any experience doing something similar? Or maybe I should just let the master process read in data (non-blocking) and distribute it using a fast transfer method like mpi4py? Any advice?
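A minimal sketch of that producer/consumer idea, using a bounded `queue.Queue` as the hand-off between the I/O thread and the worker (the `read_fn` callable is hypothetical; in practice it would wrap the netCDF4 reads, and only the I/O thread should touch the Dataset):

```python
import queue
import threading

def prefetch(read_fn, chunk_ids, maxsize=4):
    """Read chunks on a background thread; yield them to the consumer in order."""
    buf = queue.Queue(maxsize=maxsize)  # bounded, so the reader cannot run far ahead
    _DONE = object()                    # sentinel marking end of input

    def reader():
        for cid in chunk_ids:
            buf.put(read_fn(cid))       # blocks while the buffer is full
        buf.put(_DONE)

    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            break
        yield item

# Usage: the consumer processes one chunk while the thread reads ahead.
# for chunk in prefetch(read_chunk, range(100)):
#     process(chunk)
```

The `maxsize` bound doubles as the "read larger chunks while the CPU is busy" knob: a bigger buffer lets the I/O thread get further ahead at the cost of memory.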
Best regards,
Jesper Larsen