Pandas isin() function for continuous intervals
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Underwater World
--
Chapters
00:00 Pandas Isin() Function For Continuous Intervals
00:28 Accepted Answer Score 10
00:55 Answer 2 Score 1
01:08 Answer 3 Score 1
01:45 Thank you
--
Full question
https://stackoverflow.com/questions/3062...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Underwater World
--
Chapters
00:00 Pandas Isin() Function For Continuous Intervals
00:28 Accepted Answer Score 10
00:55 Answer 2 Score 1
01:08 Answer 3 Score 1
01:45 Thank you
--
Full question
https://stackoverflow.com/questions/3062...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
ACCEPTED ANSWER
Score 10
Series objects (including dataframe columns) have a between method:
>>> s = pd.Series(np.linspace(0, 20, 8))
>>> s
0 0.000000
1 2.857143
2 5.714286
3 8.571429
4 11.428571
5 14.285714
6 17.142857
7 20.000000
dtype: float64
>>> s.between(1, 14.5)
0 False
1 True
2 True
3 True
4 True
5 True
6 False
7 False
dtype: bool
ANSWER 2
Score 1
This works:
df['numdum'] = (df.number >= 1) & (df.number <= 10)
ANSWER 3
Score 1
You could also do the same thing with cut(). No real advantage if there are just two categories:
>>> df['numdum'] = pd.cut( df['number'], [-99,10,99], labels=[1,0] )
number numdum
0 8 1
1 9 1
2 10 1
3 11 0
4 12 0
5 13 0
6 14 0
But it's nice if you have multiple categories:
>>> df['numdum'] = pd.cut( df['number'], [-99,8,10,99], labels=[1,2,3] )
number numdum
0 8 1
1 9 2
2 10 2
3 11 3
4 12 3
5 13 3
6 14 3
Labels can be True and False if that is preferred, or you can not specify the label at all, in which case the labels will contain info on the cutoff points.