Find cells in dataframe where value is between x and y
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Life in a Drop
--
Chapters
00:00 Find Cells In Dataframe Where Value Is Between X And Y
00:58 Accepted Answer Score 4
01:37 Answer 2 Score 0
02:03 Thank you
--
Full question
https://stackoverflow.com/questions/3976...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
ACCEPTED ANSWER
Score 4
Use & with parentheses (needed because of operator precedence). The keyword and doesn't understand how to treat an array of booleans, hence the warning:
In [64]:
import pandas as pd
df = pd.DataFrame([{1:1,2:2,3:6},{1:9,2:9,3:10}])
(df > 2) & (df < 10)
Out[64]:
       1      2      3
0  False  False   True
1   True   True  False
It's possible to use between with apply, but this will be slower for a large df:
In [66]:
# pandas 1.3+ takes inclusive="neither"; older versions used inclusive=False
df.apply(lambda x: x.between(2, 10, inclusive="neither"))
Out[66]:
       1      2      3
0  False  False   True
1   True   True  False
Note that this warning (pandas raises it as a ValueError about an ambiguous truth value) appears whenever you try to combine a df or series using and, or, and not. Use &, |, and ~ respectively instead, as these bitwise operators understand how to treat arrays correctly.
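As a small illustration of the note above, the sketch below reuses the df from the answer and contrasts the keyword and with the element-wise operators. The variable names (and_error, inside, outside, not_inside) are made up for this example and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([{1: 1, 2: 2, 3: 6}, {1: 9, 2: 9, 3: 10}])

# `and` asks each DataFrame for a single truth value, which pandas refuses to give.
and_error = None
try:
    (df > 2) and (df < 10)
except ValueError as exc:
    and_error = str(exc)

# The element-wise operators combine the boolean frames cell by cell.
inside = (df > 2) & (df < 10)      # strictly between 2 and 10
outside = (df <= 2) | (df >= 10)   # everything else
not_inside = ~inside               # same cells as `outside`, via negation

print(and_error)
print(inside)
```

The parentheses matter: & binds more tightly than > and <, so df > 2 & df < 10 would not be read as two separate comparisons.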
ANSWER 2
Score 0
between is a convenient method for this. However, it is only available on series objects. We can get around this either by using apply, which operates on each row (or column) as a series, or by reshaping the dataframe into a series with stack.
Use stack, between, unstack (on pandas 1.3+ pass inclusive="neither"; older versions used inclusive=False):
df.stack().between(2, 10, inclusive="neither").unstack()
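A minimal end-to-end sketch of the stack approach, assuming pandas 1.3 or later (where between accepts inclusive="neither") and the same example df as the accepted answer. The names stacked, mask_stack, mask_direct and hits are illustrative only. The last step also lists which cells matched, which is one way to read "find cells" in the question.

```python
import pandas as pd

df = pd.DataFrame([{1: 1, 2: 2, 3: 6}, {1: 9, 2: 9, 3: 10}])

# Reshape to a Series indexed by (row label, column label).
stacked = df.stack()

# between works on the Series; unstack restores the original shape.
mask_stack = stacked.between(2, 10, inclusive="neither").unstack()

# The same mask built directly with comparison operators.
mask_direct = (df > 2) & (df < 10)

# Keep only the matching cells: the index gives their (row, column) positions.
hits = stacked[stacked.between(2, 10, inclusive="neither")]

print(mask_stack)
print(hits)
```

Both masks should agree cell for cell on this data; the direct comparison avoids the reshape and is usually the simpler choice when the bounds are plain scalars.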
