Append data to NetCDF file along unlimited dimension?

Masha Liukis

Feb 11, 2021, 7:07:49 PM
to xarray
Hello,

I hope somebody has run into this issue before. According to the xarray docs, it does not seem possible to append xarray.Dataset variables along an unlimited dimension in a NetCDF file:

Although xarray provides reasonable support for incremental reads of files on disk, it does not support incremental writes, which can be a useful strategy for dealing with datasets too big to fit into memory. Instead, xarray integrates with dask.array (see Parallel computing with Dask), which provides a fully featured engine for streaming computation.

Because of the huge data volume we are processing, I have to write the xarray.Dataset to the file in increments. This works when I write to a Zarr store, but not when I store the data in a NetCDF file and try to concatenate along an unlimited dimension. I get an exception when I try to concatenate the data variables of the xr.Dataset along the unlimited "t" dimension:

import xarray as xr
import numpy as np
import os

x=[10, 20, 30]
y=[1, 2]
t1 = ['a1', 'b1']

ds1 = xr.Dataset(
    {"foo": (( "t", "y", "x"), np.zeros((2, 2, 3)))},
    coords={
        "x": x,
        "y": y,
        "t": t1,
    },
)

t2 = ['a2', 'b2']
ds2 = xr.Dataset(
    {"foo": (( "t", "y", "x"), np.ones((2, 2, 3)))},
    coords={
        "x": x,
        "y": y,
        "t": t2,
    },
)
   
ds1.to_netcdf(nc_file, engine="h5netcdf", unlimited_dims=('t'))
ds2.to_netcdf(nc_file, mode='a', engine="h5netcdf", unlimited_dims=('t'))

The code above generates an error:
TypeError: %d format: a number is required, not NoneType

Is there a way in xarray to write in increments to a NetCDF file along an unlimited dimension?

Many thanks in advance,
Masha