Hello,
I hope somebody has run into this issue before. According to the
xarray docs, it does not seem possible to append xarray.Dataset variables along an unlimited dimension in a NetCDF file:
Although xarray provides reasonable support for incremental reads of files on
disk, it does not support incremental writes, which can be a useful strategy
for dealing with datasets too big to fit into memory. Instead, xarray integrates
with dask.array (see Parallel computing with Dask), which provides a fully featured engine for
streaming computation.
Because of the huge data volume we are processing, I have to write the xarray.Dataset to file in increments. This works when I write to a Zarr store, but not when I write to a NetCDF file and try to concatenate the data along an unlimited dimension. I get an exception when I try to concatenate the data variables of the xr.Dataset along the unlimited "t" dimension:
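import xarray as xr
import numpy as np
import os

# Placeholder output path; the original snippet did not show how nc_file was defined
nc_file = "test.nc"

x = [10, 20, 30]
y = [1, 2]

# First increment: two steps along "t", filled with zeros
t1 = ['a1', 'b1']
ds1 = xr.Dataset(
    {"foo": (("t", "y", "x"), np.zeros((2, 2, 3)))},
    coords={
        "x": x,
        "y": y,
        "t": t1,
    },
)

# Second increment: two more steps along "t", filled with ones
t2 = ['a2', 'b2']
ds2 = xr.Dataset(
    {"foo": (("t", "y", "x"), np.ones((2, 2, 3)))},
    coords={
        "x": x,
        "y": y,
        "t": t2,
    },
)

# Create the file with "t" as an unlimited dimension
ds1.to_netcdf(nc_file, engine="h5netcdf", unlimited_dims=('t'))
# Attempt to append the second increment along "t"; this call raises the error
ds2.to_netcdf(nc_file, mode='a', engine="h5netcdf", unlimited_dims=('t'))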
The code above generates an error:
TypeError: %d format: a number is required, not NoneType
Is there a way in xarray to write to a NetCDF file in increments along an unlimited dimension?
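To make the goal concrete, here is a small sketch of the result the file should hold: the equivalent of concatenating the two increments along "t". It does this in memory with xr.concat, which is exactly what is not feasible at full data volume, so it only illustrates the target layout and is not a solution.

```python
import numpy as np
import xarray as xr

x = [10, 20, 30]
y = [1, 2]

# The same two increments as in the failing example
ds1 = xr.Dataset(
    {"foo": (("t", "y", "x"), np.zeros((2, 2, 3)))},
    coords={"x": x, "y": y, "t": ["a1", "b1"]},
)
ds2 = xr.Dataset(
    {"foo": (("t", "y", "x"), np.ones((2, 2, 3)))},
    coords={"x": x, "y": y, "t": ["a2", "b2"]},
)

# In-memory concatenation along "t"; the file on disk should end up equivalent to this
combined = xr.concat([ds1, ds2], dim="t")
print(combined["foo"].shape)
print(list(combined["t"].values))
```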
Many thanks in advance,
Masha