Performance impact of passing around reshaped memory views of the same underlying array


Adam Li

Aug 23, 2024, 4:42:21 PM
to cython-users
Hi all,

I am wondering if there are any issues (performance, design flaws, etc.) with the following setup I am thinking of.

My extension type takes a 2D array of shape (n_samples, n_features), where each row is a separate sample stored as a vectorized (flattened) 1D array. However, I want to perform operations on a reshaped version of each vectorized array because that is easier to reason about and to code.

For example, I have a vectorized array of shape (12,), and I want to then do operations on a reshaped memory view that views that array as (4, 3). In terms of Cython code, I was thinking of something like:

```
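import numpy as np

cdef class A:
    cdef double[:, ::1] X  # typed attributes must be declared at class level

    def __cinit__(self, double[:, ::1] X):
        self.X = X

    cdef inline double compute_something(self):
        # Go through NumPy for the reshape: np.asarray wraps the memoryview
        # without copying, and reshaping C-contiguous data returns a view.
        cdef double[:, :, ::1] reshaped_X = np.asarray(self.X).reshape(-1, 4, 3)
        cdef Py_ssize_t idx, n_samples = reshaped_X.shape[0]

        # compute something using the reshaped array - only requires read-access
        cdef double result = 0
        for idx in range(n_samples):
            result += reshaped_X[idx, 2, 2]
        return result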
```

Now, my understanding of memory views is that this is not a copy of the underlying array (instantiated in Python via numpy). And when you reshape the array, you are only changing the strides, so you are also not making a copy of the underlying data.

1. Does the design I posed work, or do I have to do something special to initialize the `reshaped_X` memory view as a "reshaped view" into X?
2. Does the design with `reshaped_X` make a copy of X? It should only change the strides and create a new "view" into the underlying data of X, right?
3. Would the design I posed be fine performance-wise, or is there anything I need to consider?

Thanks!

da-woods

Aug 28, 2024, 1:08:03 PM
to cython...@googlegroups.com
Hi Adam,

It basically works as you expect - doesn't make a copy, should be fairly fast (although reshape is a Python call).
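
The no-copy behaviour is easy to check from the Python side. Below is a minimal sketch, assuming a small hypothetical `X` with 12 features per sample (the array and its values are made up for illustration); it only uses NumPy, not the Cython class from the question:

```python
import numpy as np

# Hypothetical stand-in for the (n_samples, n_features) input, n_features = 12.
X = np.arange(24, dtype=np.float64).reshape(2, 12)

# Reshaping a C-contiguous array returns a view: new shape and strides,
# same underlying buffer.
reshaped_X = X.reshape(-1, 4, 3)

print(np.shares_memory(X, reshaped_X))  # True
print(X.strides, reshaped_X.strides)    # (96, 8) (96, 24, 8)

# A write through the view is visible in the original array,
# because element [0, 2, 2] is flat index 2 * 3 + 2 = 8 of row 0.
reshaped_X[0, 2, 2] = -1.0
assert X[0, 8] == -1.0
```

Since the `reshape` call itself goes through Python, it is probably worth doing it once (for example when the object is created) rather than inside a hot loop.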

Chris Barker

Sep 30, 2024, 5:59:17 PM
to cython...@googlegroups.com
hmm -- shouldn't you be able to do this without the reshape call at all?

e.g. just look at the data as a 3-D array, something like:


cdef double[:, :, ::1] reshaped_X = self.X

if it complains at run-time, maybe you'll need a cast, or point the memoryview at the underlying pointer (can't remember the syntax right now), but maybe:

cdef double[:, :, ::1] reshaped_X = self.X[0]

That being said, unless these are tiny arrays, the cost of the reshape call should be negligible, so there's no reason to make the code more error-prone.

-CHB
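
To make the "without the reshape call" idea concrete, here is a rough NumPy-level sketch of reinterpreting the same buffer with hand-written strides. It uses `numpy.lib.stride_tricks.as_strided` as a stand-in; it is not the Cython syntax Chris was trying to recall (in Cython, the closest documented mechanism is, as far as I know, casting a pointer to an array type, e.g. `<double[:n, :4, :3]> &X[0, 0]`). The array `X` is again a hypothetical example input:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Hypothetical input: 2 samples, 12 features each, C-contiguous.
X = np.arange(24, dtype=np.float64).reshape(2, 12)
itemsize = X.itemsize

# Reinterpret the same buffer as (n_samples, 4, 3) by spelling out the strides.
# Nothing checks these numbers against the real buffer size, which is why
# this route is easier to get wrong than reshape.
manual = as_strided(
    X,
    shape=(X.shape[0], 4, 3),
    strides=(12 * itemsize, 3 * itemsize, itemsize),
    writeable=False,  # the use case in this thread only needs read access
)

assert np.shares_memory(X, manual)
assert np.array_equal(manual, X.reshape(-1, 4, 3))
```

Both routes end up with the same view of the data; the manual one just moves the shape and stride bookkeeping onto you.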

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov