Make Numpy.where stop after first true encountered
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Peaceful Mind
--
Chapters
00:00 Make Numpy.Where Stop After First True Encountered
00:36 Accepted Answer Score 3
01:20 Thank you
--
Full question
https://stackoverflow.com/questions/3196...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #optimization
#avk47
ACCEPTED ANSWER
Score 3
If cump is a cumulative distribution function, then it is monotonically non-decreasing, hence sorted. Rather than scanning it linearly, you get better performance guarantees by binary searching it: O(log n) comparisons per lookup instead of O(n).
First we create some fake data to search over:
>>> import numpy as np
>>> cump = np.cumsum(np.random.rand(11))
>>> cump -= cump[0]
>>> cump /= cump[-1]
>>> cump
array([ 0.        ,  0.07570573,  0.1417473 ,  0.30536346,  0.36277835,
        0.47102093,  0.54456142,  0.6859625 ,  0.75270741,  0.84691162,
        1.        ])
Then we create some fake data to search for:
>>> sample = np.random.rand(5)
>>> sample
array([ 0.19597276, 0.37885803, 0.2096784 , 0.57559965, 0.72175056])
And finally we search for it, first with the original linear scan and then with np.searchsorted, which finds the same indices:
>>> [np.where(_ < cump)[0][0] for _ in sample]
[3, 5, 3, 7, 8]
>>> np.searchsorted(cump, sample)
array([3, 5, 3, 7, 8], dtype=int64)