The Python Oracle

pandas: filter group by multiple conditions?

Full question
https://stackoverflow.com/questions/4380...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #filter #multipleconditions




ACCEPTED ANSWER

Score 8


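# Reshape to one row per id and one column per date, holding is_local
d1 = df.set_index(['id', 'date']).is_local.unstack()
# Keep the ids that are local on 2016-01-01 and not local on 2017-01-01
d1.index[d1['2016-01-01'] & ~d1['2017-01-01']].tolist()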

[123]
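
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample frame shown in Answer 3 and that the dates are still plain strings, which is why the columns can be selected with '2016-01-01'. The name local_2016_only is just illustrative.

```python
import pandas as pd

# Sample data borrowed from Answer 3; dates are kept as strings here (assumption)
df = pd.DataFrame([
    {'id': 123, 'date': '2016-01-01', 'is_local': True},
    {'id': 123, 'date': '2017-01-01', 'is_local': False},
    {'id': 124, 'date': '2016-01-01', 'is_local': True},
    {'id': 124, 'date': '2017-01-01', 'is_local': True},
])

# One row per id, one boolean column per date
d1 = df.set_index(['id', 'date'])['is_local'].unstack()

# ids that are local on the first date and not local on the second
local_2016_only = d1.index[d1['2016-01-01'] & ~d1['2017-01-01']].tolist()
print(local_2016_only)
```

Note that unstack needs each (id, date) pair to be unique; duplicated pairs would have to be aggregated first.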



ANSWER 2

Score 3


Another way of doing this is through pivoting:

In [24]: ids_by_dates = df.pivot(index='id', columns='date',values='is_local')

In [25]: ids_by_dates['2016-01-01'] & ~ids_by_dates['2017-01-01']
Out[25]: 
id
123     True
124    False
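
The pivot result above is a boolean Series indexed by id. If a list of ids is wanted rather than the mask, one possible way is to index the mask with itself. This is only a sketch on the sample data from Answer 3 with string dates; mask and matching_ids are illustrative names.

```python
import pandas as pd

# Sample data borrowed from Answer 3; dates are kept as strings here (assumption)
df = pd.DataFrame([
    {'id': 123, 'date': '2016-01-01', 'is_local': True},
    {'id': 123, 'date': '2017-01-01', 'is_local': False},
    {'id': 124, 'date': '2016-01-01', 'is_local': True},
    {'id': 124, 'date': '2017-01-01', 'is_local': True},
])

ids_by_dates = df.pivot(index='id', columns='date', values='is_local')

# True for ids that are local in 2016 and not local in 2017
mask = ids_by_dates['2016-01-01'] & ~ids_by_dates['2017-01-01']

# Keep only the True entries and read the ids off the index
matching_ids = mask[mask].index.tolist()
print(matching_ids)
```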



ANSWER 3

Score 3


You can also use the datetime class from the datetime module and combine several conditions when indexing the DataFrame:

import pandas as pd
from datetime import datetime
df = pd.DataFrame([
  {'id': 123, 'date': '2016-01-01', 'is_local': True },
  {'id': 123, 'date': '2017-01-01', 'is_local': False },
  {'id': 124, 'date': '2016-01-01', 'is_local': True },
  {'id': 124, 'date': '2017-01-01', 'is_local': True }
])
df.date = df.date.astype('datetime64[ns]')

Combine multiple conditions to select the required rows:

a = df[(df.is_local==True) & (df.date<datetime(2016,12,31)) & (df.date>datetime(2015,12,31))]
b = df[(df.is_local==False) & (df.date<datetime(2017,12,31)) & (df.date>datetime(2016,12,31))]

Then concatenate the two results with pd.concat:

final_df = pd.concat((a,b))

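This returns the 2016 rows where is_local is True (rows 0 and 2), followed by the 2017 row where is_local is False (row 1):

    id       date  is_local
0  123 2016-01-01      True
2  124 2016-01-01      True
1  123 2017-01-01     False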

As a single statement:

final_df = pd.concat((df[(df.is_local==True) & (df.date<datetime(2016,12,31)) & (df.date>datetime(2015,12,31))], df[(df.is_local==False) & (df.date<datetime(2017,12,31)) & (df.date>datetime(2016,12,31))]))
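
The row filters in this answer select rows, not ids. If the goal is the ids that are local in 2016 and not local in 2017, one possible follow-up is to keep the ids that appear in both filtered frames. The sketch below assumes the same sample data; local_2016, non_local_2017 and ids are illustrative names and not part of the original answer.

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame([
    {'id': 123, 'date': '2016-01-01', 'is_local': True},
    {'id': 123, 'date': '2017-01-01', 'is_local': False},
    {'id': 124, 'date': '2016-01-01', 'is_local': True},
    {'id': 124, 'date': '2017-01-01', 'is_local': True},
])
df['date'] = pd.to_datetime(df['date'])

# Rows that are local during 2016
local_2016 = df[df['is_local']
                & (df['date'] > datetime(2015, 12, 31))
                & (df['date'] < datetime(2016, 12, 31))]

# Rows that are not local during 2017
non_local_2017 = df[~df['is_local']
                    & (df['date'] > datetime(2016, 12, 31))
                    & (df['date'] < datetime(2017, 12, 31))]

# ids present in both selections
ids = local_2016.loc[local_2016['id'].isin(non_local_2017['id']), 'id'].tolist()
print(ids)
```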