Increasing the size of the dimension.

Chuan-Yuan Hsu

Apr 28, 2020, 11:38:27 PM
to xarray
Hi All, 

I am converting observation data (ASCII) to netCDF files. In the past I used netCDF4, where it is easy to increase the size of a dimension. Here is an example. Say there is a variable named temperature with spatial and temporal dimensions (lon, lat, depth, time). At the next time step, I get another dataset with dimensions (lon, lat, depth). How can I append it along time using xarray?

Temperature (lon, lat, depth, time=32) -> Temperature (lon, lat, depth, time=33)

I spent most of last week trying to figure out how to increase the size of the dimension. Can someone help me with this?

CYHsu


Xylar Asay-Davis

Apr 29, 2020, 8:21:44 AM
to xar...@googlegroups.com
Hi CYHsu,

I'm hoping someone more expert than me will also respond, but here's what I would recommend.

xarray isn't designed to modify dimension sizes in-place in a Dataset or DataArray.  Instead, it has capabilities for combining 2 or more Datasets (or DataArrays) into a new Dataset with the new dimension size.  I think the function you would want for doing this is concat.  There are some examples you can follow about how to combine data sets.  Something like this:

import xarray as xr

ds = xr.open_dataset('temperature_all.nc')
ds_slice = xr.open_dataset('temperature_slice.nc')
ds = xr.concat([ds, ds_slice], dim='time')

For cases like you're talking about where you may want to combine many time slices into a single data set, I have found that making a list of data sets and then concatenating them together is often useful.  In this example, I assume I have a list of files named temperature1.nc, temperature2.nc, etc., each with a single time slice:

import xarray as xr

dsList = []
for index in range(33):
    ds_slice = xr.open_dataset('temperature{}.nc'.format(index + 1))
    dsList.append(ds_slice)
ds = xr.concat(dsList, dim='time')
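If you want to try this without any files on disk, here is a minimal in-memory sketch of the same idea. The grid sizes, the integer time labels, and the variable name are made up for illustration; only xr.concat itself comes from the suggestion above.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the existing data: 32 time steps on a small grid.
# The sizes and coordinate values are assumptions for illustration only.
temperature = xr.DataArray(
    np.zeros((4, 3, 2, 32)),
    dims=("lon", "lat", "depth", "time"),
    coords={"time": np.arange(32)},
    name="temperature",
)

# One new time step: same lon/lat/depth shape, with a length-1 time axis.
new_step = xr.DataArray(
    np.ones((4, 3, 2, 1)),
    dims=("lon", "lat", "depth", "time"),
    coords={"time": [32]},
    name="temperature",
)

# concat returns a new object; the two inputs are left unchanged.
combined = xr.concat([temperature, new_step], dim="time")
print(combined.sizes["time"])
```

Note that the original array still has 32 time steps afterwards, so you need to keep the returned object.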

I hope that helps to get you started.

Cheers,
Xylar


Deepak Cherian

Apr 29, 2020, 10:47:39 AM
to xar...@googlegroups.com

It sounds like you want to append timesteps in which case Xylar's answer is a good way to do it.

Other ways to increase the size of a dimension would be reindex, align, and pad. It really depends on what your inputs and desired outputs are.
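As a rough sketch of how those three differ, assuming a tiny 1-D array with made-up integer time labels:

```python
import numpy as np
import xarray as xr

# A small array with three time steps (values and labels are illustrative).
da = xr.DataArray(
    np.arange(3.0),
    dims="time",
    coords={"time": [0, 1, 2]},
)

# reindex: request a longer time coordinate; labels that did not exist
# before are filled with NaN.
longer = da.reindex(time=[0, 1, 2, 3])

# pad: add one element after the end of the time dimension
# (filled with NaN by default for float data).
padded = da.pad(time=(0, 1))

# align with an outer join: both arrays are expanded to the union of
# their time labels, with NaN where an array had no data.
other = xr.DataArray([30.0], dims="time", coords={"time": [3]})
da_aligned, other_aligned = xr.align(da, other, join="outer")
```

All three return new objects of length 4 along time. None of them copies in the new values for you, which is why concat is usually the simpler choice for appending.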

Deepak

Chris Barker

Apr 29, 2020, 12:27:24 PM
to xar...@googlegroups.com
On Wed, Apr 29, 2020 at 7:47 AM Deepak Cherian <dpak.c...@gmail.com> wrote:

> It sounds like you want to append timesteps in which case Xylar's answer is a good way to do it.

I haven't tested it, but concatenating repeatedly is likely to be pretty slow, since each call builds a new array. I'd profile and see if that's an issue for you, but if it is, pre-allocating the array, if possible, would be the way to go. If you don't know how big it's going to get, you can still pre-allocate a chunk at a time, and then trim to the final size when you're done.
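A rough sketch of the pre-allocation idea, using a plain NumPy buffer that is wrapped in a DataArray only at the end. The capacity, the dimension names, and the fake observations are assumptions for illustration, and I have not benchmarked this against concat.

```python
import numpy as np
import xarray as xr

n_lonlat = 2
capacity = 100  # assumed upper bound on the number of time steps

# Allocate once, filled with NaN so unused slots are easy to spot.
buffer = np.full((n_lonlat, capacity), np.nan)
n_filled = 0

for step in range(33):
    # Stand-in for one newly retrieved time step of observations.
    new_values = np.full(n_lonlat, float(step))
    buffer[:, n_filled] = new_values
    n_filled += 1

# Trim to the part that was actually filled and wrap it for xarray.
da = xr.DataArray(
    buffer[:, :n_filled],
    dims=("lonlat", "time"),
    coords={"time": np.arange(n_filled)},
    name="temperature",
)
print(da.sizes["time"])
```

If the buffer fills up, one option is to allocate a larger one and copy the filled part across before continuing.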

This does strike me as a useful feature request: netCDF-4 supports unlimited dimensions (and netCDF-3 does for one dimension). It would be nice if xarray did too.

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

Ryan Abernathey

Apr 29, 2020, 12:36:03 PM
to xar...@googlegroups.com
Hi Chuan-Yuan,

It would be useful to see exactly what you’re doing. Can you share a more complete code sample?

Best,
Ryan


Chuan-Yuan Hsu

Apr 29, 2020, 1:04:10 PM
to xar...@googlegroups.com
Hi, 

Thanks for everyone's answers. I have tried concat, align, and pad, and I think concat is currently the best fit for me.
Here is what I am working on.

1. 
I am working on a web crawler that collects regional datasets and archives them for the public. 
So, say the existing dataset looks like 

 Dimensions:  (lonlat: 2, time: 72)
 Coordinates:
   * lonlat   (lonlat) int64 0 1
   * time     (time) datetime64[ns] 2018-07-01 ... 2018-07-03T23:00:00

Once I retrieve a new dataset, I need to append it to the original dataset, so I expect the xarray result to look like 

 Dimensions:  (lonlat: 2, time: 73)
 Coordinates:
   * lonlat   (lonlat) int64 0 1
   * time     (time) datetime64[ns] 2018-07-01 ... 2018-07-03T23:30:00
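A minimal sketch of going from the first layout to the second with concat. The variable name temperature and the zero/one values are placeholders I made up; the dimension names, sizes, and time stamps follow the two summaries above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Existing record: 72 hourly steps from 2018-07-01 to 2018-07-03T23:00.
times = pd.date_range("2018-07-01", periods=72, freq="h")
ds = xr.Dataset(
    {"temperature": (("lonlat", "time"), np.zeros((2, 72)))},
    coords={"lonlat": [0, 1], "time": times},
)

# Newly retrieved step at 2018-07-03T23:30, kept as a length-1 time axis.
new_time = pd.DatetimeIndex(["2018-07-03T23:30:00"])
ds_new = xr.Dataset(
    {"temperature": (("lonlat", "time"), np.ones((2, 1)))},
    coords={"lonlat": [0, 1], "time": new_time},
)

# Append along time; the result replaces the old in-memory dataset.
ds = xr.concat([ds, ds_new], dim="time")
print(ds.sizes)
```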


2. 
I am also perturbing the particle movement at each location at each time step. In this case, after 24 time steps, the number of elements will reach x ** 24, where x is the number of perturbations. So I thought I could also use xarray/Dask to store the dataset, since Dask can reduce the memory load.  
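To get a feel for how quickly x ** 24 grows, here is a small arithmetic sketch. The float64 storage (8 bytes per element) and the sample values of x are assumptions for illustration.

```python
def element_count(x: int, n_steps: int = 24) -> int:
    """Number of elements after n_steps when each element branches x ways per step."""
    return x ** n_steps


def size_in_gib(n_elements: int, bytes_per_element: int = 8) -> float:
    """Storage needed for n_elements, assuming float64 by default."""
    return n_elements * bytes_per_element / 2**30


for x in (2, 3, 10):
    n = element_count(x)
    print(f"x={x}: {n} elements, about {size_in_gib(n):.3g} GiB")
```

With x = 2 this is about 16.8 million elements, but the count grows very quickly with x, so it is worth checking the total size before choosing a storage approach.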


CYHsu
