Is there a way to check if NumPy arrays share the same data?

--------------------------------------------------
Hire the world's top talent on demand or became one of them at Toptal: https://topt.al/25cXVn
and get $2,000 discount on your first invoice
--------------------------------------------------

Take control of your privacy with Proton's trusted, Swiss-based, secure services.
Choose what you need and safeguard your digital life:
Mail: https://go.getproton.me/SH1CU
VPN: https://go.getproton.me/SH1DI
Password Manager: https://go.getproton.me/SH1DJ
Drive: https://go.getproton.me/SH1CT

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game 3 Looping

--

Chapters
00:00 Is There A Way To Check If Numpy Arrays Share The Same Data?
00:55 Answer 1 Score 40
01:18 Accepted Answer Score 10
02:11 Answer 3 Score 7
02:34 Answer 4 Score 13
02:58 Thank you

--

Full question
https://stackoverflow.com/questions/1128...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #numpy

#avk47

ANSWER 1

Score 40

You can use the base attribute to check if an array shares the memory with another array:

>>> import numpy as np
>>> a = np.arange(27)
>>> b = a.reshape((3,3,3))
>>> b.base is a
True
>>> a.base is b
False

Not sure if that solves your problem. The base attribute will be None if the array owns its own memory. Note that an array's base will be another array, even if it is a subset:

>>> c = a[2:]
>>> c.base is a
True

ANSWER 2

Score 13

To solve the problem exactly, you can use

import numpy as np

a=np.arange(27)
b=a.reshape((3,3,3))

# Checks exactly by default
np.shares_memory(a, b)

# Checks bounds only
np.may_share_memory(a, b)

Both np.may_share_memory and np.shares_memory take an optional max_work argument that lets you decide how much effort to put in to ensure no false positives. This problem is NP-complete, so always finding the correct answer can be quite computationally expensive.

ACCEPTED ANSWER

Score 10

I think jterrace's answer is probably the best way to go, but here is another possibility.

def byte_offset(a):
    """Returns a 1-d array of the byte offset of every element in `a`.
    Note that these will not in general be in order."""
    stride_offset = np.ix_(*map(range,a.shape))
    element_offset = sum(i*s for i, s in zip(stride_offset,a.strides))
    element_offset = np.asarray(element_offset).ravel()
    return np.concatenate([element_offset + x for x in range(a.itemsize)])

def share_memory(a, b):
    """Returns the number of shared bytes between arrays `a` and `b`."""
    a_low, a_high = np.byte_bounds(a)
    b_low, b_high = np.byte_bounds(b)

    beg, end = max(a_low,b_low), min(a_high,b_high)

    if end - beg > 0:
        # memory overlaps
        amem = a_low + byte_offset(a)
        bmem = b_low + byte_offset(b)

        return np.intersect1d(amem,bmem).size
    else:
        return 0

Example:

>>> a = np.arange(10)
>>> b = a.reshape((5,2))
>>> c = a[::2]
>>> d = a[1::2]
>>> e = a[0:1]
>>> f = a[0:1]
>>> f = f.reshape(())
>>> share_memory(a,b)
80
>>> share_memory(a,c)
40
>>> share_memory(a,d)
40
>>> share_memory(c,d)
0
>>> share_memory(a,e)
8
>>> share_memory(a,f)
8

Here is a plot showing the time for each share_memory(a,a[::2]) call as a function of the number of elements in a on my computer.

share_memory function

ANSWER 4

Score 7

Just do:

a = np.arange(27)
a.__array_interface__['data']

The second line will return a tuple where the first entry is the memory address and the second is whether the array is read only. Combined with the shape and data type, you can figure out the exact span of memory address that the array covers, so you can also work out from this when one array is a subset of another.