Replace all elements of Python NumPy Array that are greater than some value
Full question
https://stackoverflow.com/questions/1966...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #arrays #numpy #replace #conditionalstatements
ACCEPTED ANSWER
Score 470
I think the fastest and most concise way to do this is NumPy's built-in boolean mask indexing (a form of fancy indexing). If you have an ndarray named arr, you can replace all elements greater than 255 with a value x as follows:
arr[arr > 255] = x
I ran this on my machine with a 500 x 500 random matrix, replacing all values greater than 0.5 with 5, and it took an average of 7.59 ms.
In [1]: import numpy as np
In [2]: A = np.random.rand(500, 500)
In [3]: timeit A[A > 0.5] = 5
100 loops, best of 3: 7.59 ms per loop
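To make the mechanics of that one-liner visible, here is a minimal sketch that splits it into two steps. The sample values, the replacement value x = 255, and the name mask are assumptions chosen for illustration only.

```python
import numpy as np

# Small sample array (illustrative values, not from the original answer).
arr = np.array([[10, 300, 255], [256, 0, 999]])
x = 255  # replacement value, chosen arbitrarily for this sketch

mask = arr > 255   # boolean array, True wherever the condition holds
arr[mask] = x      # assigns x at every True position, modifying arr in place

print(mask)
print(arr)
```

The comparison is strict, so the element that is exactly 255 is not selected by the mask.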
ANSWER 2
Score 63
If you want a new array result that keeps each element of arr where it is below 255 and is 255 everywhere else:
result = np.minimum(arr, 255)
More generally, for a lower and/or upper bound:
result = np.clip(arr, 0, 255)
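As a rough illustration of the difference between the two calls, the sketch below applies both to the same small array. The sample values and the names capped and clipped are assumptions made for this example.

```python
import numpy as np

# Illustrative input with values below 0 and above 255.
arr = np.array([-5, 100, 255, 300])

capped = np.minimum(arr, 255)    # upper bound only: negatives pass through
clipped = np.clip(arr, 0, 255)   # lower and upper bound applied together

print(capped)
print(clipped)
print(arr)  # the input is left untouched; both calls returned new arrays
```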
If you just want to access the values over 255, or something more complicated, @mtitan8's answer is more general, but np.clip and np.minimum (or np.maximum) are nicer and much faster for your case:
In [292]: timeit np.minimum(a, 255)
100000 loops, best of 3: 19.6 µs per loop
In [293]: %%timeit
.....: c = np.copy(a)
.....: c[a>255] = 255
.....:
10000 loops, best of 3: 86.6 µs per loop
If you want to do it in-place (i.e., modify arr instead of creating result) you can use the out parameter of np.minimum:
np.minimum(arr, 255, out=arr)
or
np.clip(arr, 0, 255, arr)
(The out= name is optional since the arguments are in the same order as in the function's definition.)
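A short sketch of the in-place variants, assuming a small integer array made up for this example. It also checks that the array handed to out is the same object that the call returns; the name returned is hypothetical.

```python
import numpy as np

arr = np.array([-5, 100, 255, 300])  # illustrative values

# Cap at 255, writing the result back into arr instead of allocating a copy.
returned = np.minimum(arr, 255, out=arr)
print(returned is arr)  # the ufunc hands back the very array passed as out

# Apply the lower bound as well, again in place.
np.clip(arr, 0, 255, out=arr)
print(arr)
```

Spelling out out= costs a few characters but makes it obvious at the call site that the input is being overwritten.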
For in-place modification, boolean indexing gets much faster (it no longer has to make a copy and then modify it separately), but it is still not as fast as minimum:
In [328]: %%timeit
.....: a = np.random.randint(0, 300, (100,100))
.....: np.minimum(a, 255, a)
.....:
100000 loops, best of 3: 303 µs per loop
In [329]: %%timeit
.....: a = np.random.randint(0, 300, (100,100))
.....: a[a>255] = 255
.....:
100000 loops, best of 3: 356 µs per loop
For comparison, if you wanted to restrict your values with a minimum as well as a maximum, without clip you would have to do this twice, with something like
np.minimum(a, 255, a)
np.maximum(a, 0, a)
or,
a[a>255] = 255
a[a<0] = 0
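As a sanity check that the three approaches agree, here is a sketch that runs each on a copy of the same random data and compares the results. The seed, the value range, and the names via_clip, via_ufuncs, and via_masks are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)               # seed chosen arbitrarily
a = rng.integers(-50, 300, size=(100, 100))  # values on both sides of [0, 255]

# One call with both bounds.
via_clip = np.clip(a, 0, 255)

# Two ufunc calls, each writing in place into a copy.
via_ufuncs = a.copy()
np.minimum(via_ufuncs, 255, out=via_ufuncs)
np.maximum(via_ufuncs, 0, out=via_ufuncs)

# Two boolean-mask assignments on another copy.
via_masks = a.copy()
via_masks[via_masks > 255] = 255
via_masks[via_masks < 0] = 0

print(np.array_equal(via_clip, via_ufuncs))
print(np.array_equal(via_clip, via_masks))
```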
ANSWER 3
Score 24
I think the quickest way to achieve this is the where function.
For example, to find the items greater than 0.2 in a NumPy array and replace them with 0:
import numpy as np
nums = np.random.rand(4,3)
print(np.where(nums > 0.2, 0, nums))
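One point worth illustrating is that np.where builds a new array and leaves its input alone, unlike the mask assignment in the accepted answer. The sketch below uses fixed values instead of random ones so the effect is easy to follow; the name result is an assumption.

```python
import numpy as np

nums = np.array([0.1, 0.5, 0.2, 0.9])  # fixed illustrative values

# Where the condition is True take 0, otherwise keep the original element.
result = np.where(nums > 0.2, 0, nums)

print(result)  # elements above 0.2 have been replaced by 0
print(nums)    # the input array is unchanged
```

To overwrite the original, assign the result back, for example nums = np.where(nums > 0.2, 0, nums).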
ANSWER 4
Score 15
You can consider using numpy.putmask (here T is your threshold):
np.putmask(arr, arr>=T, 255.0)
Here is a performance comparison with NumPy's built-in indexing:
In [1]: import numpy as np
In [2]: A = np.random.rand(500, 500)
In [3]: timeit np.putmask(A, A>0.5, 5)
1000 loops, best of 3: 1.34 ms per loop
In [4]: timeit A[A > 0.5] = 5
1000 loops, best of 3: 1.82 ms per loop
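For reference, a minimal runnable version of the putmask call. The threshold T = 100.0 and the sample values are hypothetical, picked only so the effect is visible.

```python
import numpy as np

arr = np.array([10.0, 120.0, 200.0, 90.0])  # illustrative values
T = 100.0                                   # hypothetical threshold

# putmask writes 255.0 into arr wherever the mask is True.
# It modifies arr in place and returns None.
np.putmask(arr, arr >= T, 255.0)

print(arr)
```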