Dataset.copy(deep=True) does not copy coordinates?


george trojan

Nov 12, 2018, 5:57:11 PM
to xarray
Consider the following code (Jupyter notebook):

import pandas as pd
import xarray as xr

times = pd.date_range('2000-01-01', '2000-01-01', name='time')
dx = xr.Dataset({'v': (('time',), [1])}, {'time': times})
dx
dy = dx.copy(deep=True)
dy['time'] += int(86400 * 1e9)
dy['v'] += 1
dx
dy

The result is:

<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01
Data variables:
    v        (time) int64 1
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) datetime64[ns] 2000-01-02
Data variables:
    v        (time) int64 1
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) datetime64[ns] 2000-01-02
Data variables:
    v        (time) int64 2

The time coordinate in dx has changed. This seems to contradict the Returns: section in http://xarray.pydata.org/en/stable/generated/xarray.Dataset.copy.html#xarray.Dataset.copy
Is this a bug, or am I misunderstanding the documentation?

George

Stephan Hoyer

Nov 12, 2018, 6:02:39 PM
to xar...@googlegroups.com
This definitely looks like a bug to me. (Possibly an upstream bug in pandas, but something we should work around nonetheless.)


george trojan

Nov 14, 2018, 1:45:41 PM
to xar...@googlegroups.com
Well, after going through the code, I don't consider this behaviour a bug. Xarray makes coordinates a pandas index, which is immutable. A shift operation as in my example makes sense only for numeric indices, so a deep copy makes little sense. IMHO the code works as intended; maybe the documentation should be updated to say that the deep argument has no effect on coordinates.
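
The immutability mentioned above can be checked with pandas alone. This is a minimal sketch, not taken from the xarray source: the variable names are illustrative, and it only shows that item assignment on an index is rejected while arithmetic returns a new index.

```python
import pandas as pd

idx = pd.date_range("2000-01-01", "2000-01-01", name="time")

# Item assignment on a pandas index is rejected with a TypeError.
try:
    idx[0] = pd.Timestamp("2000-01-02")
    mutated = True
except TypeError:
    mutated = False

# Arithmetic builds a new index object and leaves the original untouched.
shifted = idx + pd.Timedelta(days=1)

print(mutated, idx[0], shifted[0])
```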

I find the semantics of

dy['time'] += int(...)

confusing. It is a sort of in-place operation, but not quite. It updates data shared between dx and dy, but it does return a new index, so technically it does not violate index immutability.

The *true* in-place modification:

dy['time'].loc[:] += int(...)

is (rightly) not allowed. Finally,

dy['time'] = dy['time'] + int(...)

works as expected.

What is the rationale for this behaviour?

Sorry if I am stating the obvious; I am new to xarray and I want to find out what might bite me somewhere in a larger chunk of code.
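
For reference, the rebinding form that works as expected can be written as a self-contained check. This is a sketch under stated assumptions: it uses np.timedelta64 for the one-day shift instead of the integer nanosecond count from the original example, and it rebinds 'v' as well instead of using +=. It is not a statement about how any particular xarray version handles the += case.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2000-01-01", "2000-01-01", name="time")
dx = xr.Dataset({"v": (("time",), [1])}, {"time": times})

dy = dx.copy(deep=True)

# Rebind the coordinate to a newly computed one instead of using +=.
# np.timedelta64 is an assumption made here for readability; the original
# example adds int(86400 * 1e9) nanoseconds.
dy["time"] = dy["time"] + np.timedelta64(1, "D")
dy["v"] = dy["v"] + 1

# dx should keep its original coordinate and data.
print(dx["time"].values, dx["v"].values)
print(dy["time"].values, dy["v"].values)
```

Because the right-hand side is evaluated into a new object before assignment, nothing that dx holds is written to.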


