The Python Oracle

Group pandas dataframe in unusual way

This video explains
Group pandas dataframe in unusual way

--

Become part of the top 3% of the developers by applying to Toptal
https://topt.al/25cXVn

--

Music by Eric Matyas
https://www.soundimage.org
Track title: The Builders

--

Chapters
00:00 Question
01:05 Accepted answer (Score 9)
01:39 Answer 2 (Score 2)
02:09 Answer 3 (Score 1)
02:46 Thank you

--

Full question
https://stackoverflow.com/questions/6412...

Question links:
[image]: https://i.stack.imgur.com/8Kr8Q.png

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #pandasgroupby

#avk47



ACCEPTED ANSWER

Score 9


Let us try shift + cumsum create the groupby key: BTW I really like the way you display your expected output

s = df.groupby('ID')['value'].apply(lambda x : x.ne(x.shift()).cumsum())
d = {x : y for x ,y in df.groupby(s)}
d[2]
     ID  value
2   100   True
5   200   True
6   200   True
7   200   True
11  300   True
12  300   True
d[1]
     ID  value
0   100  False
1   100  False
4   200  False
9   300  False
10  300  False
d[3]
     ID  value
3   100  False
8   200  False
13  300  False



ANSWER 2

Score 2


Let's try following your logic:

# 1. all False up to first True
group1 = df.loc[df.groupby('ID')['value'].cumsum() == 0]

# 2. all False after last True
group2 = df.loc[df.iloc[::-1].groupby('ID')['value'].cumsum()==0]

# 3. all True
group3 = df[df['value']]

Output:

    ID      value
0   100     False
1   100     False
4   200     False
9   300     False
10  300     False

    ID      value
3   100     False
8   200     False
13  300     False

    ID      value
2   100     True
5   200     True
6   200     True
7   200     True
11  300     True
12  300     True



ANSWER 3

Score 1


This works for your example data

df['groups'] = df.groupby('ID').value.apply(lambda x: x.diff().ne(False).cumsum()).astype('int')
for _,df_groups in df.groupby('groups'):
  print(df_groups)
  print('-'*20)

Out:

     ID  value  groups
0   100  False       1
1   100  False       1
4   200  False       1
9   300  False       1
10  300  False       1
--------------------
     ID  value  groups
2   100   True       2
5   200   True       2
6   200   True       2
7   200   True       2
11  300   True       2
12  300   True       2
--------------------
     ID  value  groups
3   100  False       3
8   200  False       3
13  300  False       3
--------------------