Speeding up processes using wrf.getvar

Daniel Vecellio

Dec 6, 2017, 1:24:30 PM
to wrfpython-talk
Hello,

I'm new to the wrf-python package. I'm attempting to do some potential temperature-height analysis, but using 'getvar' to calculate theta and z for my WRF output (289 x 39 x 300 x 300) is taking nearly 5 hours. Are there any tips for speeding up code when the variables have to be calculated by getvar because they are not part of the normal WRF output?

Thanks,
Dan

Bill Ladwig

Dec 6, 2017, 2:24:18 PM
to Daniel Vecellio, wrfpython-talk
Hi Daniel,

If you are running this with ALL_TIMES as your time index, you are probably running into memory issues, since the computation itself isn't that time consuming (certainly nowhere near 5 hours).  Each one of your variables is over 4 GB (289 x 39 x 300 x 300 values x 4 bytes each) if you extract the whole thing, and the height calculation extracts three of them.  So, this is going to put you well over 16 GB when you include the results for z and theta.  Are you running this on a laptop or a supercomputer?  If it's your laptop, then this is probably taxing your system and a lot of swap memory is being used.
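
A quick sketch of that arithmetic, assuming float32 (4-byte) storage as in the estimate above. The variable names are illustrative only and are not part of wrf-python:

```python
import numpy as np

# Dimensions quoted in the thread: time x levels x south_north x west_east
shape = (289, 39, 300, 300)
bytes_per_value = np.dtype(np.float32).itemsize  # 4 bytes per float32 value

# Size of one full 4D variable, and of a single time step of it
full_bytes = int(np.prod(shape, dtype=np.int64)) * bytes_per_value
step_bytes = int(np.prod(shape[1:], dtype=np.int64)) * bytes_per_value

print(f"one full 4D variable: {full_bytes / 1e9:.2f} GB")
print(f"one time step:        {step_bytes / 1e6:.2f} MB")
```

This works out to roughly 4.06 GB for one full variable against about 14 MB for a single time step, which is why working one time at a time should need far less memory.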

Without knowing your hardware setup completely, the first thing I would try is to manually loop over each time and fill in your own result array.  This should bypass wrf-python's behavior of extracting the entire dataset before performing the calculation, so it should save a lot of memory (hopefully).

Below is some code to demonstrate this.  It's completely untested, so use it as more of a conceptual guide, but I'll try to give something that works.

I'm assuming your code currently looks like:

from netCDF4 import Dataset 
from wrf import getvar, ALL_TIMES 
 
file_list = [Dataset("path_to_wrf_file"), ...]  # Or this could be a single massive WRF file
theta = getvar(file_list, "theta", ALL_TIMES)
z = getvar(file_list, "z", ALL_TIMES) 
... # Do something with this

Try this instead:
 
from netCDF4 import Dataset
from wrf import getvar
import numpy as np
 
filenames = ["path/to/wrf_file1", "path/to/wrf_file2", ...] 
 
# Allocate your computation result here.  It could either be theta and z,
# or something else that you compute from them.
# For this example, this will create theta and z (note, this is over 8 GB for both)
theta_final = np.empty((289, 39, 300, 300), np.float32) 
z_final = np.empty((289, 39, 300, 300), np.float32) 
 
 
# Note: This assumes 1 time per file, adjust if using a different WRF file configuration
for i in range(289):
    # This should add less than 100 MB to the 8 GB above
    f = Dataset(filenames[i])
    # Each file holds a single time, so the time index within the file is 0, not i
    theta = getvar(f, "theta", 0)
    theta_final[i, :] = theta[:]
    z = getvar(f, "z", 0)
    z_final[i, :] = z[:]
    f.close()
 

If this doesn't help, we'll have to get more creative.  For example, splitting the calculation into time chunks and saving intermediate results.  Projects like dask already do this kind of thing, but it'll be a while until wrf-python takes advantage of it.  The next release will include multithreaded support via OpenMP, but I still suspect the problem is with memory.  If not, I'll need to know more about the hardware you are using, and it would probably help to see your script.
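
A minimal sketch of the time-chunk idea, using plain NumPy rather than any wrf-python feature. The result array lives in an on-disk .npy file and is flushed after each chunk of time steps. The names compute_step and fill_in_chunks are hypothetical; compute_step stands in for a per-time getvar call, and the demo uses a tiny shape instead of the real 289 x 39 x 300 x 300:

```python
import os
import tempfile

import numpy as np


def compute_step(i, shape):
    """Hypothetical stand-in for a per-time calculation such as a getvar call."""
    return np.full(shape, float(i), dtype=np.float32)


def fill_in_chunks(out_path, ntimes, step_shape, chunk_size):
    """Fill an on-disk .npy array one chunk of time steps at a time."""
    out = np.lib.format.open_memmap(
        out_path, mode="w+", dtype=np.float32, shape=(ntimes,) + step_shape
    )
    for start in range(0, ntimes, chunk_size):
        stop = min(start + chunk_size, ntimes)
        for i in range(start, stop):
            out[i] = compute_step(i, step_shape)
        out.flush()  # write the finished chunk to disk before moving on
    del out  # release the memory map
    return out_path


# Small demo shape; the real case would be 289 steps of (39, 300, 300).
demo_path = os.path.join(tempfile.mkdtemp(), "theta_demo.npy")
fill_in_chunks(demo_path, ntimes=10, step_shape=(3, 4, 4), chunk_size=4)

# Reopen lazily so only the slices that are actually read get loaded.
result = np.load(demo_path, mmap_mode="r")
print(result.shape, float(result[7, 0, 0, 0]))
```

Because each chunk is written to disk before the next one starts, an interrupted run keeps the chunks already completed, and the full result never has to sit in RAM at once.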

Hope this helps,

Bill



Daniel Vecellio

Dec 6, 2017, 3:50:46 PM
to Bill Ladwig, wrfpython-talk
Bill,

Thanks for the suggestion! It has sped things up enough that I'll be able to get data in a reasonable amount of time. I think my laptop is just not powerful enough to make it go any quicker.

Best,
Dan
--

Daniel J. Vecellio

Ph.D. Student - Climate Science Lab, Department of Geography, Texas A&M University
Head Teaching Assistant - Geography 213, Texas A&M University
Education Chair - United States Permafrost Association
ExCom and United States National Representative - Permafrost Young Researchers Network
Communications Director - Students and New Professionals Group, International Society of Biometeorology 

M.S. Atmospheric Science - Texas Tech University
B.S. Meteorology - Pennsylvania State University

Office: 814-1 O&M