Frequency counts for unique values in a NumPy array
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Droplet of life
--
Chapters
00:00 Question
00:24 Accepted answer (Score 198)
00:59 Answer 2 (Score 728)
01:34 Answer 3 (Score 157)
02:10 Answer 4 (Score 66)
03:26 Thank you
--
Full question
https://stackoverflow.com/questions/1074...
Accepted answer links:
http://docs.scipy.org/doc/numpy/referenc...
Answer 2 links:
[numpy.unique]: https://numpy.org/doc/stable/reference/g...
[scipy.stats.itemfreq]: https://docs.scipy.org/doc/scipy/referen...
Answer 3 links:
[scipy.stats.itemfreq]: http://docs.scipy.org/doc/scipy/referenc...
Answer 4 links:
[perfplot]: https://github.com/nschloe/perfplot
[image]: https://i.stack.imgur.com/mjDiR.png
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #arrays #performance #numpy
#avk47
ANSWER 1
Score 780
Use numpy.unique with return_counts=True (for NumPy 1.9+):
import numpy as np
x = np.array([1,1,1,2,2,2,5,25,1,1])
unique, counts = np.unique(x, return_counts=True)
print(np.asarray((unique, counts)).T)
# [[ 1  5]
#  [ 2  3]
#  [ 5  1]
#  [25  1]]
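If a lookup table is more convenient than a two-column array, the two arrays returned by np.unique can be zipped into a dictionary. This is a minimal sketch using the same x as above; the name freq is only illustrative and not part of the original answer.

```python
import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])

# np.unique returns the sorted unique values and, with return_counts=True,
# the number of occurrences of each one.
unique, counts = np.unique(x, return_counts=True)

# .tolist() converts NumPy scalars to plain Python ints before building the dict.
freq = dict(zip(unique.tolist(), counts.tolist()))
print(freq)  # {1: 5, 2: 3, 5: 1, 25: 1}
```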
Timing comparison with scipy.stats.itemfreq (since deprecated in SciPy):
In [4]: x = np.random.randint(0, 101, 1000000)
In [5]: %timeit unique, counts = np.unique(x, return_counts=True)
10 loops, best of 3: 31.5 ms per loop
In [6]: %timeit scipy.stats.itemfreq(x)
10 loops, best of 3: 170 ms per loop
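The numbers above come from an IPython session and will differ by machine and library version. A rough way to repeat the measurement in a plain script is sketched below, assuming the newer Generator API (np.random.default_rng) and the standard timeit module. Because itemfreq is deprecated, np.bincount is swapped in as the second candidate here; that substitution is mine, not the answer's.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)            # seed chosen arbitrarily
x = rng.integers(0, 101, size=1_000_000)  # integers in [0, 100]

runs = 10
t_unique = timeit.timeit(lambda: np.unique(x, return_counts=True), number=runs)
t_bincount = timeit.timeit(lambda: np.bincount(x), number=runs)

# Report the average time per call in milliseconds.
print(f"np.unique:   {t_unique / runs * 1e3:.1f} ms per call")
print(f"np.bincount: {t_bincount / runs * 1e3:.1f} ms per call")
```

Both calls count the same data, so only the timings should differ, not the counts.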
ACCEPTED ANSWER
Score 203
Take a look at np.bincount:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html
import numpy as np
x = np.array([1,1,1,2,2,2,5,25,1,1])
y = np.bincount(x)
ii = np.nonzero(y)[0]
And then:
list(zip(ii.tolist(), y[ii].tolist()))  # zip is lazy in Python 3, so wrap it in list()
# [(1, 5), (2, 3), (5, 1), (25, 1)]
or:
np.vstack((ii, y[ii])).T
# array([[ 1,  5],
#        [ 2,  3],
#        [ 5,  1],
#        [25,  1]])
or however you want to combine the counts and the unique values.
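One caveat: np.bincount only accepts non-negative integers, and it allocates one bin for every value from 0 up to the maximum. For data that may contain negative integers, one possible workaround is to shift the values before counting. The helper name bincount_with_offset below is hypothetical, and the sketch assumes a non-empty integer input.

```python
import numpy as np

def bincount_with_offset(values):
    """Count integer values that may be negative (hypothetical helper)."""
    values = np.asarray(values)
    offset = values.min()                  # assumes a non-empty input
    counts = np.bincount(values - offset)  # shift so the smallest value maps to bin 0
    present = np.nonzero(counts)[0]        # indices of bins hit at least once
    return present + offset, counts[present]

vals, cnts = bincount_with_offset([-3, -3, 0, 2, 2, 2])
print(vals.tolist(), cnts.tolist())  # [-3, 0, 2] [2, 1, 3]
```

If the values are spread over a very wide range, the intermediate counts array gets large, and np.unique with return_counts=True is likely the better fit.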
ANSWER 3
Score 170
Use np.unique with return_counts=True and transpose the result:
>>> import numpy as np
>>> x = [1,1,1,2,2,2,5,25,1,1]
>>> np.array(np.unique(x, return_counts=True)).T
array([[ 1,  5],
       [ 2,  3],
       [ 5,  1],
       [25,  1]])
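The same call also works for non-integer data such as strings. In that case it is probably safer to keep values and counts as two separate arrays, because stacking them into one array would coerce the counts to strings. A short sketch with made-up sample data:

```python
import numpy as np

words = np.array(["b", "a", "b", "c", "a", "b"])  # illustrative data only

values, counts = np.unique(words, return_counts=True)

# Iterate over the pairs instead of stacking, so the counts stay integers.
for value, count in zip(values.tolist(), counts.tolist()):
    print(value, count)
# a 2
# b 3
# c 1
```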
Original answer:
Use scipy.stats.itemfreq (warning: deprecated):
>>> from scipy.stats import itemfreq
>>> x = [1,1,1,2,2,2,5,25,1,1]
>>> itemfreq(x)
/usr/local/bin/python:1: DeprecationWarning: `itemfreq` is deprecated! `itemfreq` is deprecated and will be removed in a future version. Use instead `np.unique(..., return_counts=True)`
array([[  1.,   5.],
       [  2.,   3.],
       [  5.,   1.],
       [ 25.,   1.]])
ANSWER 4
Score 51
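Using the pandas module (note that the top-level pd.value_counts is deprecated in recent pandas releases; pd.Series(x).value_counts() is the equivalent method call):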
>>> import pandas as pd
>>> import numpy as np
>>> x = np.array([1,1,1,2,2,2,5,25,1,1])
>>> pd.value_counts(x)
1     5
2     3
25    1
5     1
dtype: int64
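With current pandas the same result can be obtained through the Series method. The sketch below adds sort_index() so the order is by value and does not depend on how ties are ordered; that extra step is my own choice, not part of the original answer.

```python
import numpy as np
import pandas as pd

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])

# value_counts() orders by frequency; sort_index() reorders by the value itself.
counts = pd.Series(x).value_counts().sort_index()
print(counts.to_dict())  # {1: 5, 2: 3, 5: 1, 25: 1}
```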