pandas: filter group by multiple conditions?
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Orient Looping
--
Chapters
00:00 Question
00:55 Accepted answer (Score 8)
01:07 Answer 2 (Score 3)
01:58 Answer 3 (Score 3)
02:14 Thank you
--
Full question
https://stackoverflow.com/questions/4380...
Answer 2 links:
[pivoting]: http://pandas.pydata.org/pandas-docs/sta...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #filter #multipleconditions
#avk47
ACCEPTED ANSWER
Score 8
d1 = df.set_index(['id', 'date']).is_local.unstack()
d1.index[d1['2016-01-01'] & ~d1['2017-01-01']].tolist()
[123]
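The accepted answer assumes a `df` that is not shown. A minimal, self-contained sketch follows, assuming the sample data from Answer 3 and date values stored as plain strings (the names `mask` and `local_then_not` are illustrative, not from the answer):

```python
import pandas as pd

# Sample data copied from Answer 3 (assumed to match the question's data).
df = pd.DataFrame([
    {'id': 123, 'date': '2016-01-01', 'is_local': True},
    {'id': 123, 'date': '2017-01-01', 'is_local': False},
    {'id': 124, 'date': '2016-01-01', 'is_local': True},
    {'id': 124, 'date': '2017-01-01', 'is_local': True},
])

# One row per id, one boolean column per date string.
d1 = df.set_index(['id', 'date'])['is_local'].unstack()

# Keep ids that are local on the first date and not local on the second.
mask = d1['2016-01-01'] & ~d1['2017-01-01']
local_then_not = d1.index[mask].tolist()
print(local_then_not)  # [123]
```

If some id lacks a row for one of the dates, `unstack` fills the gap with NaN and the boolean operators may no longer behave as expected, so this sketch only covers the complete case.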
ANSWER 2
Score 3
Another way of doing this is through pivoting:
In [24]: ids_by_dates = df.pivot(index='id', columns='date',values='is_local')
In [25]: ids_by_dates['2016-01-01'] & ~ids_by_dates['2017-01-01']
Out[25]:
id
123 True
124 False
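To go from the boolean Series above to the list of matching ids, one option is to index the mask with itself. This is a sketch under the same assumptions (sample data from Answer 3, string dates); `mask` and `matching_ids` are illustrative names:

```python
import pandas as pd

# Sample data copied from Answer 3 (assumed to match the question's data).
df = pd.DataFrame([
    {'id': 123, 'date': '2016-01-01', 'is_local': True},
    {'id': 123, 'date': '2017-01-01', 'is_local': False},
    {'id': 124, 'date': '2016-01-01', 'is_local': True},
    {'id': 124, 'date': '2017-01-01', 'is_local': True},
])

# Rows are ids, columns are dates, cells hold is_local.
ids_by_dates = df.pivot(index='id', columns='date', values='is_local')

# True where the id is local in 2016 and not local in 2017.
mask = ids_by_dates['2016-01-01'] & ~ids_by_dates['2017-01-01']

# Selecting the True entries leaves only the matching ids in the index.
matching_ids = mask[mask].index.tolist()
print(matching_ids)  # [123]
```

Note that `pivot` raises a `ValueError` when the same (id, date) pair appears more than once, so the data has to be deduplicated first in that case.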
ANSWER 3
Score 3
You can use the datetime class from the datetime module and pass multiple conditions to the dataframe
import pandas as pd
from datetime import datetime
df = pd.DataFrame([
{'id': 123, 'date': '2016-01-01', 'is_local': True },
{'id': 123, 'date': '2017-01-01', 'is_local': False },
{'id': 124, 'date': '2016-01-01', 'is_local': True },
{'id': 124, 'date': '2017-01-01', 'is_local': True }
])
df.date = df.date.astype('datetime64[ns]')
Use multiple conditions to slice out the required rows
a = df[(df.is_local==True) & (df.date<datetime(2016,12,31)) & (df.date>datetime(2015,12,31))]
b = df[(df.is_local==False) & (df.date<datetime(2017,12,31)) & (df.date>datetime(2016,12,31))]
Then concatenate the two results with pd.concat
final_df = pd.concat((a,b))
This returns rows 0, 2 and 1 (both 2016 rows are local, and row 1 is the non-local 2017 row)
        date   id  is_local
0 2016-01-01  123      True
2 2016-01-01  124      True
1 2017-01-01  123     False
Or in a single line
final_df = pd.concat((df[(df.is_local==True) & (df.date<datetime(2016,12,31)) & (df.date>datetime(2015,12,31))], df[(df.is_local==False) & (df.date<datetime(2017,12,31)) & (df.date>datetime(2016,12,31))]))
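The same row selection can be written with one combined mask instead of two slices and a concat. The sketch below swaps in a different technique: it compares calendar years via the `.dt.year` accessor rather than the explicit date bounds used above, so it also includes 31 December of each year. The names `in_2016`, `in_2017` and `combined` are illustrative:

```python
import pandas as pd

df = pd.DataFrame([
    {'id': 123, 'date': '2016-01-01', 'is_local': True},
    {'id': 123, 'date': '2017-01-01', 'is_local': False},
    {'id': 124, 'date': '2016-01-01', 'is_local': True},
    {'id': 124, 'date': '2017-01-01', 'is_local': True},
])
df['date'] = pd.to_datetime(df['date'])

# Year-based masks (assumption: whole calendar years are what is wanted).
in_2016 = df['date'].dt.year == 2016
in_2017 = df['date'].dt.year == 2017

# Local rows from 2016, or non-local rows from 2017.
combined = df[(df['is_local'] & in_2016) | (~df['is_local'] & in_2017)]
print(combined)
```

Unlike the concat version, this keeps the rows in their original order (0, 1, 2). It selects rows, not ids, so it still has to be reduced per id to answer which ids satisfy both conditions.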