Find span where condition is True using NumPy

Full question
https://stackoverflow.com/questions/1715...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #arrays #search #numpy

ACCEPTED ANSWER

Score 7

Here's one way. First, build a boolean array from your condition:

In [11]: a
Out[11]: array([0, 0, 0, 2, 2, 0, 2, 2, 2, 0])

In [12]: a1 = a > 1

Shift it one to the right (to get the previous state at each index) using roll:

In [13]: a1_rshifted = np.roll(a1, 1)

In [14]: starts = a1 & ~a1_rshifted  # it's True but the previous isn't

In [15]: ends = ~a1 & a1_rshifted

The non-zero entries of starts mark where each run of True begins; the non-zero entries of ends mark the index just past where each run stops:

In [16]: np.nonzero(starts)[0], np.nonzero(ends)[0]
Out[16]: (array([3, 6]), array([5, 9]))

And zipping these together:

In [17]: list(zip(np.nonzero(starts)[0].tolist(), np.nonzero(ends)[0].tolist()))
Out[17]: [(3, 5), (6, 9)]
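
For reference, the steps above can be collected into one small script. This is only a sketch and not part of the original answer: the function name find_spans_roll is made up for illustration, and it assumes a 1-D boolean mask whose last element is False, because np.roll wraps the last element around to the front.

```python
import numpy as np

def find_spans_roll(mask):
    """Return (start, stop) pairs for runs of True; stop is exclusive.

    Assumption: mask is 1-D boolean and its last element is False.
    """
    prev = np.roll(mask, 1)               # prev[i] == mask[i - 1]; prev[0] wraps to mask[-1]
    starts = np.nonzero(mask & ~prev)[0]  # True here, False just before
    ends = np.nonzero(~mask & prev)[0]    # False here, True just before
    return list(zip(starts.tolist(), ends.tolist()))

a = np.array([0, 0, 0, 2, 2, 0, 2, 2, 2, 0])
spans = find_spans_roll(a > 1)
print(spans)  # [(3, 5), (6, 9)]
```

Each pair can be used directly as a slice, e.g. a[3:5] gives the first run.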

ANSWER 2

Score 3

If you have access to the scipy library:

You can use scipy.ndimage.label to identify regions of non-zero values. It returns an array in which the value of each element is the id of the span it belongs to in the original array (0 for elements outside any span), together with the number of spans found. In recent SciPy releases the older scipy.ndimage.measurements namespace is deprecated, so import from scipy.ndimage directly.

You can then use scipy.ndimage.find_objects to return the slices you would need to extract those ranges. You can read the start / end values directly from those slices.

In your example:

import numpy
from scipy.ndimage import label, find_objects

data = numpy.array([0, 0, 0, 2, 2, 0, 2, 2, 2, 0])

labels, number_of_regions = label(data)
ranges = find_objects(labels)

for identified_range in ranges:
    print(identified_range[0].start, identified_range[0].stop)

You should see:

3 5
6 9
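
Note that label groups non-zero values, so for a general condition you would label the boolean mask rather than the raw data. A minimal sketch, assuming the condition is "greater than 1" as in the accepted answer (the sample data here is altered so that the difference shows):

```python
import numpy as np
from scipy.ndimage import label, find_objects

# Values of 1 are non-zero but do not satisfy the condition data > 1.
data = np.array([0, 1, 1, 2, 2, 0, 2, 2, 2, 1])

mask = data > 1                   # boolean mask of the condition
labels, n_regions = label(mask)   # each run of True gets its own integer id
spans = [(s[0].start, s[0].stop) for s in find_objects(labels)]
print(spans)  # [(3, 5), (6, 9)]
```

Labelling data directly would instead merge the 1s into the neighbouring spans.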

Hope this helps!

ANSWER 3

Score 0

np.roll has a serious problem here: it rolls, so the last element wraps around to the front. You're better off making two padded copies, one padded on the right and the other on the left. I needed this for an image.

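        # im is assumed to be a 2-D boolean array; spans are found along each row
        a = np.pad(im, ((0, 0), (0, 1)), constant_values=0)  # extra False column on the right
        b = np.pad(im, ((0, 0), (1, 0)), constant_values=0)  # extra False column on the left
        starts = a & ~b  # True here, False one step to the left
        ends = ~a & b    # False here, True one step to the left
        sx, sy = np.nonzero(starts)  # row and column index of each span start
        ex, ey = np.nonzero(ends)    # row and column index one past each span end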

The accepted answer has a problem: if the array ends with True, that value gets rolled to the front and corrupts the result. You really want to pad with 0 instead.

Then you find the 0 -> 1 transitions (starts) and the 1 -> 0 transitions (ends) and put them in whatever format you need.
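
The same padding idea can be sketched for the 1-D case from the question. The function name find_spans_padded is hypothetical, and the input is assumed to be (or convert cleanly to) a 1-D boolean mask:

```python
import numpy as np

def find_spans_padded(mask):
    """Return (start, stop) pairs for runs of True; stop is exclusive."""
    mask = np.asarray(mask, dtype=bool)
    a = np.pad(mask, (0, 1), constant_values=False)  # mask with a False appended
    b = np.pad(mask, (1, 0), constant_values=False)  # mask with a False prepended
    starts = np.nonzero(a & ~b)[0]  # 0 -> 1 transitions
    ends = np.nonzero(~a & b)[0]    # 1 -> 0 transitions
    return list(zip(starts.tolist(), ends.tolist()))

# A mask that both starts and ends with True, which trips up np.roll.
mask = np.array([1, 1, 0, 0, 1, 1, 1], dtype=bool)
result = find_spans_padded(mask)
print(result)  # [(0, 2), (4, 7)]
```

Because both copies are padded with False, a run that touches either edge still gets a matching start and end.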