Make Numpy.where stop after first true encountered
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Magical Minnie Puzzles
--
Chapters
00:00 Question
00:43 Accepted answer (Score 3)
01:32 Thank you
--
Full question
https://stackoverflow.com/questions/3196...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #optimization
#avk47
--
ACCEPTED ANSWER
Score 3
If cump is a cumulative distribution function, then it is monotonically non-decreasing and therefore sorted. Rather than scanning it linearly, which costs O(n) per lookup, you get a better performance guarantee by binary searching it, which costs O(log n) per lookup.
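To make the difference concrete, here is a minimal sketch using only the standard library. The list `cdf` and the two helper names are made up for illustration and are not part of the original question; the sketch assumes the input is sorted in non-decreasing order.

```python
import bisect

# Hypothetical sorted CDF values, for illustration only.
cdf = [0.0, 0.1, 0.4, 0.7, 1.0]

def linear_first_greater(values, x):
    """Linear scan: index of the first entry strictly greater than x, O(n)."""
    for i, v in enumerate(values):
        if x < v:
            return i
    return len(values)  # no entry is greater than x

def binary_first_greater(values, x):
    """Binary search: same index, O(log n), valid only for sorted input."""
    # bisect_right returns the insertion point after any entries equal to x,
    # which is exactly the index of the first entry strictly greater than x.
    return bisect.bisect_right(values, x)

queries = [0.0, 0.05, 0.1, 0.39, 0.95]
linear_results = [linear_first_greater(cdf, q) for q in queries]
binary_results = [binary_first_greater(cdf, q) for q in queries]
```

Both helpers return the same index for every query; only the amount of work per lookup differs.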
First we create some fake data to search over:
>>> import numpy as np
>>> cump = np.cumsum(np.random.rand(11))
>>> cump -= cump[0]
>>> cump /= cump[-1]
>>> cump
array([ 0.        ,  0.07570573,  0.1417473 ,  0.30536346,  0.36277835,
        0.47102093,  0.54456142,  0.6859625 ,  0.75270741,  0.84691162,
        1.        ])
Then we create some fake data to search for:
>>> sample = np.random.rand(5)
>>> sample
array([ 0.19597276, 0.37885803, 0.2096784 , 0.57559965, 0.72175056])
And we finally search for it, first with the original linear scan and then with np.searchsorted, which finds the same indices:
>>> [np.where(_ < cump)[0][0] for _ in sample]
[3, 5, 3, 7, 8]
>>> np.searchsorted(cump, sample)
array([3, 5, 3, 7, 8], dtype=int64)
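One detail worth checking before swapping the two: with its default side="left", np.searchsorted returns the first index i with sample <= cump[i], while the np.where expression uses a strict <. The two agree unless a sample exactly equals an entry of cump. The sketch below is an assumption-laden illustration, not the answer's own code: the helper name `first_index_above` and the fixed seed are hypothetical, and side="right" is used to mirror the strict comparison.

```python
import numpy as np

def first_index_above(cump, sample):
    """Index of the first entry of sorted `cump` strictly greater than each sample."""
    # side="right" mirrors the strict `sample < cump` test even on exact ties.
    return np.searchsorted(cump, sample, side="right")

rng = np.random.default_rng(0)  # fixed seed is an assumption, for repeatability
cump = np.cumsum(rng.random(11))
cump -= cump[0]
cump /= cump[-1]
sample = rng.random(5)

# Linear scan from the answer, one np.where call per sample.
linear = np.array([np.where(s < cump)[0][0] for s in sample])
# Vectorized binary search over all samples at once.
binary = first_index_above(cump, sample)

# Exact tie: the default side="left" and side="right" give different indices.
ties = np.array([0.0, 0.5, 1.0])
left_index = np.searchsorted(ties, 0.5)
right_index = first_index_above(ties, 0.5)
```

For continuous random samples an exact tie is very unlikely, so either side gives the same result in practice; the choice only matters when samples can coincide with CDF values.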