The Python Oracle

How do I vectorize this loop in numpy?

Full question
https://stackoverflow.com/questions/3176...

Accepted answer links:
[np.where]: http://docs.scipy.org/doc/numpy/referenc...
[np.tile]: http://docs.scipy.org/doc/numpy/referenc...

Answer 2 links:
https://stackoverflow.com/a/31767220/901...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #numpy



ACCEPTED ANSWER

Score 3


For your special use case of argmax, you can use np.where and replace the masked-out values with a very large negative integer that stands in for negative infinity:

>>> inf = np.iinfo('i8').max
>>> np.where(cond, arr, -inf).argmax(axis=1)
array([1, 2])

Alternatively, you can broadcast manually with np.tile and let a masked array ignore the entries where cond is False:

>>> np.ma.array(np.tile(arr, 2).reshape(2, 3), mask=~cond).argmax(axis=1)
array([1, 2])
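
The snippets above rely on arr and cond already being defined. Here is a minimal self-contained sketch of both approaches, assuming the example arrays shown in Answer 3 on this page. The name sentinel is introduced here for illustration, and np.iinfo(arr.dtype).min is a small variation on the -inf value above, chosen so that the fill value always fits the dtype of arr.

```python
import numpy as np

# Example data (assumed; same values as in Answer 3)
arr = np.array([3, 7, 4])
cond = np.array([[False, True, True],
                 [True, False, True]])

# Approach 1: replace masked-out entries with the smallest integer
# representable in arr's dtype, so they can never win the argmax.
sentinel = np.iinfo(arr.dtype).min
where_result = np.where(cond, arr, sentinel).argmax(axis=1)

# Approach 2: repeat arr once per row of cond, then mask where cond is False.
tiled = np.tile(arr, (cond.shape[0], 1))
masked_result = np.ma.array(tiled, mask=~cond).argmax(axis=1)

print(where_result)   # [1 2]
print(masked_result)  # [1 2]
```

Both versions assume that every row of cond contains at least one True. For a row that is entirely False, the returned index does not point at a valid element.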



ANSWER 2

Score 2


So you want a vectorized version of:

In [302]: [np.ma.array(arr,mask=~c).argmax() for c in cond]
Out[302]: [1, 2]

What are the realistic dimensions of cond? If the number of rows is small compared to the number of columns (the length of arr), an iteration like this is probably not expensive.

The use of tile in the other answer (https://stackoverflow.com/a/31767220/901925) looks good. Here I change it slightly so that it adapts to the shape of cond:

In [308]: np.ma.array(np.tile(arr,(cond.shape[0],1)),mask=~cond).argmax(axis=1)
Out[308]: array([1, 2], dtype=int32)

As expected, the time for the list comprehension scales with the number of rows in cond, while the tiling approach is only a bit slower than the single-row case. But at around 92.7 µs, the masked array approach is still much slower than a plain arr.argmax(). Masking adds a lot of overhead.

The where version is quite a bit faster:
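
One way to reproduce this kind of comparison is sketched below with the standard timeit module. The function names loop_version and tile_version are introduced here for illustration, and the example arrays are assumed to be the small ones from Answer 3. Absolute timings depend on the machine and the NumPy version, so no expected numbers are shown.

```python
import timeit
import numpy as np

# Example data (assumed; same values as in Answer 3)
arr = np.array([3, 7, 4])
cond = np.array([[False, True, True],
                 [True, False, True]])

def loop_version():
    # One masked array per row of cond
    return [np.ma.array(arr, mask=~c).argmax() for c in cond]

def tile_version():
    # One masked array for all rows at once
    tiled = np.tile(arr, (cond.shape[0], 1))
    return np.ma.array(tiled, mask=~cond).argmax(axis=1)

runs = 1000
for fn in (loop_version, tile_version, arr.argmax):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} microseconds per call")
```

To see the scaling described above, enlarge cond (more rows) and rerun the measurement.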

np.where(cond, arr, -100).argmax(1)  # 20 µs

A deleted answer suggested:

(arr*cond).argmax(1)   # 8 µs

which is even faster. As proposed, it does not work if arr contains negative values, because the masked-out entries become 0 and can then beat the negative ones. But it can probably be adjusted to handle those.
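
One possible adjustment, sketched here as an assumption rather than as the deleted answer's method, is to shift arr so that every value is at least 1 before multiplying. The masked-out entries then become 0 and can no longer win. The example data and the names naive, shifted, fixed and reference are hypothetical.

```python
import numpy as np

# Hypothetical data with only negative values
arr = np.array([-5, -2, -9])
cond = np.array([[False, True, True],
                 [True, False, True]])

# Naive product: masked-out entries become 0, which is larger than
# every negative value, so argmax picks a masked-out position.
naive = (arr * cond).argmax(axis=1)

# Shift so the smallest value maps to 1; masked-out entries (0) now lose.
shifted = arr - arr.min() + 1
fixed = (shifted * cond).argmax(axis=1)

# Reference result using np.where with the dtype's minimum as fill value
reference = np.where(cond, arr, np.iinfo(arr.dtype).min).argmax(axis=1)

print(naive)      # [0 1]  (wrong: both indices are masked out)
print(fixed)      # [1 0]
print(reference)  # [1 0]
```

This sketch assumes integer data, at least one True per row of cond, and a value range small enough that the shift does not overflow the dtype.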




ANSWER 3

Score 0


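import numpy as np

arr = np.array([3, 7, 4])
cond = np.array([[False, True, True], [True, False, True]])

def multi_slice_max(bool_arr, x):
    # Mask out the entries where bool_arr is False, then take the argmax of the rest.
    return np.ma.array(x, mask=~bool_arr).argmax()

# Call the helper once per row of cond (axis 1), passing arr as the extra argument.
np.apply_along_axis(multi_slice_max, 1, cond, arr)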