Group pandas dataframe in unusual way
--
Chapters
00:00 Question
01:05 Accepted answer (Score 9)
01:39 Answer 2 (Score 2)
02:09 Answer 3 (Score 1)
02:46 Thank you
--
Full question
https://stackoverflow.com/questions/6412...
Question links:
[image]: https://i.stack.imgur.com/8Kr8Q.png
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #pandasgroupby
#avk47
ACCEPTED ANSWER
Score 9
Let us try shift + cumsum to create the groupby key. (BTW, I really like the way you display your expected output.)
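# Run label per ID; group_keys=False keeps the original row index (needed on newer pandas versions)
s = df.groupby('ID', group_keys=False)['value'].apply(lambda x: x.ne(x.shift()).cumsum())
# One sub-frame per run label
d = {x: y for x, y in df.groupby(s)}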
d[2]
     ID  value
2   100   True
5   200   True
6   200   True
7   200   True
11  300   True
12  300   True
d[1]
     ID  value
0   100  False
1   100  False
4   200  False
9   300  False
10  300  False
d[3]
     ID  value
3   100  False
8   200  False
13  300  False
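
To make the key construction easier to follow, here is a minimal, self-contained sketch. The question's data frame is only available as an image, so the frame below is reconstructed from the groups printed above (an assumption). The sketch also uses `transform` rather than `apply`, which is a substitution on my part so that the key keeps the frame's original index regardless of pandas version.

```python
import pandas as pd

# Sample frame reconstructed from the printed groups above (assumption:
# the original question showed it only as an image).
df = pd.DataFrame({
    "ID": [100] * 4 + [200] * 5 + [300] * 5,
    "value": [False, False, True, False,
              False, True, True, True, False,
              False, False, True, True, False],
})

# Within each ID, a row starts a new run when its value differs from the
# previous row's value (the first row always differs from the shifted NaN).
# The running count of run starts labels each run: 1, 2, 3, ...
key = df.groupby("ID")["value"].transform(lambda x: x.ne(x.shift()).cumsum())

# One sub-frame per run label, collected across all IDs.
d = {label: group for label, group in df.groupby(key)}

print(key.tolist())  # [1, 1, 2, 3, 1, 2, 2, 2, 3, 1, 1, 2, 2, 3]
print(d[2])          # the rows of every ID's second run (the True block here)
```

The labels are run counters, so label 2 means "True" only because every ID in this sample starts with a False run.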
ANSWER 2
Score 2
Let's try following your logic:
# 1. all False up to first True
group1 = df.loc[df.groupby('ID')['value'].cumsum() == 0]
# 2. all False after last True
group2 = df.loc[df.iloc[::-1].groupby('ID')['value'].cumsum()==0]
# 3. all True
group3 = df[df['value']]
Output:
     ID  value
0   100  False
1   100  False
4   200  False
9   300  False
10  300  False
     ID  value
3   100  False
8   200  False
13  300  False
     ID  value
2   100   True
5   200   True
6   200   True
7   200   True
11  300   True
12  300   True
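
The three steps above can be sketched as one small function. This is only an illustration under assumptions: the sample frame is reconstructed from the printed output, `split_around_trues` is a hypothetical helper name, and the ID 400 rows are invented to show one edge case. The values are cast to integers before the cumulative sum so that the running count of True values is explicit.

```python
import pandas as pd

# Sample frame reconstructed from the printed output above (an assumption).
df = pd.DataFrame({
    "ID": [100] * 4 + [200] * 5 + [300] * 5,
    "value": [False, False, True, False,
              False, True, True, True, False,
              False, False, True, True, False],
})


def split_around_trues(frame):
    """Hypothetical helper: return (before, after, trues) sub-frames."""
    rev = frame.iloc[::-1]  # same rows, read bottom to top
    # Running count of True values per ID, top-down and bottom-up.
    seen_before = frame["value"].astype(int).groupby(frame["ID"]).cumsum()
    seen_after = rev["value"].astype(int).groupby(rev["ID"]).cumsum().sort_index()
    before = frame[seen_before == 0]  # no True seen yet in this ID
    after = frame[seen_after == 0]    # no True from this row to the ID's end
    trues = frame[frame["value"]]
    return before, after, trues


group1, group2, group3 = split_around_trues(df)
print(group1.index.tolist())  # [0, 1, 4, 9, 10]
print(group2.index.tolist())  # [3, 8, 13]
print(group3.index.tolist())  # [2, 5, 6, 7, 11, 12]

# Hypothetical extra ID (not in the question) that never turns True:
# its rows satisfy both the "before" and the "after" condition.
extra = pd.DataFrame({"ID": [400, 400], "value": [False, False]})
g1, g2, _ = split_around_trues(pd.concat([df, extra], ignore_index=True))
overlap = g1.index.intersection(g2.index).tolist()
print(overlap)  # [14, 15]
```

For the sample data the three groups do not overlap; an ID without any True value would land in both the first and the second group, so that case needs a rule of its own if it can occur.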
ANSWER 3
Score 1
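This works for your example data:
# group_keys=False keeps the original row index so the result can be assigned as a column on newer pandas versions
df['groups'] = df.groupby('ID', group_keys=False).value.apply(lambda x: x.diff().ne(False).cumsum()).astype('int')
for _, df_groups in df.groupby('groups'):
    print(df_groups)
    print('-'*20)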
Out:
     ID  value  groups
0   100  False       1
1   100  False       1
4   200  False       1
9   300  False       1
10  300  False       1
--------------------
     ID  value  groups
2   100   True       2
5   200   True       2
6   200   True       2
7   200   True       2
11  300   True       2
12  300   True       2
--------------------
     ID  value  groups
3   100  False       3
8   200  False       3
13  300  False       3
--------------------
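
The "works for your example data" caveat is worth spelling out: the `groups` column counts runs within each ID, so the labels line up across IDs only when every ID starts with a False run. The sketch below uses invented data (not from the question) where ID 100 starts with True, and it swaps `apply` for `transform` so the key keeps the original index; both choices are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data, not from the question: ID 100 starts with a True run.
df = pd.DataFrame({
    "ID": [100, 100, 100, 200, 200, 200],
    "value": [True, False, False, False, True, False],
})

# Same key formula as in the answer: diff() marks rows whose value changed,
# the first row of each ID yields NaN, and NaN != False also counts as a start.
df["groups"] = (
    df.groupby("ID")["value"]
      .transform(lambda x: x.diff().ne(False).cumsum())
      .astype("int")
)

print(df["groups"].tolist())  # [1, 2, 2, 1, 2, 3]

# Number of distinct values per label: more than 1 means True and False
# rows ended up under the same label.
mixed = df.groupby("groups")["value"].nunique()
print(mixed.to_dict())  # {1: 2, 2: 2, 3: 1}
```

Here labels 1 and 2 each mix True and False rows, so for data shaped like this the run counter alone does not separate "before", "True" and "after".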