The Python Oracle

Pandas Groupby count on multiple columns for specific string values only

Become part of the top 3% of the developers by applying to Toptal https://topt.al/25cXVn

--

Music by Eric Matyas
https://www.soundimage.org
Track title: Dreamlands

--

Chapters
00:00 Question
00:53 Accepted answer (Score 4)
01:45 Answer 2 (Score 7)
02:02 Answer 3 (Score 5)
02:21 Answer 4 (Score 1)
02:42 Thank you

--

Full question
https://stackoverflow.com/questions/6678...

Question links:
[image]: https://i.stack.imgur.com/DZ634m.png
[image]: https://i.stack.imgur.com/JRfA9m.png

Accepted answer links:
[@QuangHoang]: https://stackoverflow.com/a/66783324/233...
[@Vaishali]: https://stackoverflow.com/a/66783293/233...
[@QuangHoang]: https://stackoverflow.com/a/66783324/233...

Answer 4 links:
[resample]: https://pandas.pydata.org/docs/reference...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas

#avk47



ANSWER 1

Score 7


True and False in your dummy df are strings, you can convert them to int and sum

dummy.replace({'TRUE':1,'FALSE':0}).groupby('date',as_index = False).sum()

    date        Action1 Action2
0   01/09/2020  2       1
1   02/09/2020  1       1
2   03/09/2020  2       1



ANSWER 2

Score 5


You can also try:

dummy.set_index(['date']).eq('TRUE').sum(level='date')

Output:

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1



ACCEPTED ANSWER

Score 4


Anyone seeing this answer should look at the answers by @QuangHoang or @Vaishali
They are much better answers. I can't control what the OP chooses, but you should go upvote those answers.

Inspired by @QuangHoang

dummy.iloc[:, 1:].eq('TRUE').groupby(dummy.date).sum()

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1

OLD ANSWER

Fix your dataframe such that it has actual True/False values

from ast import literal_eval

dummy = dummy.assign(**dummy[['Action1', 'Action2']].applymap(str.title).applymap(literal_eval))

Then use groupby

dummy.groupby('date').sum()

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1



ANSWER 4

Score 1


In [7]: dummy
Out[7]:
         date Action1 Action2
0  01/09/2020    TRUE   FALSE
1  01/09/2020    TRUE    TRUE
2  02/09/2020   FALSE    TRUE
3  02/09/2020    TRUE   FALSE
4  03/09/2020   FALSE   FALSE
5  03/09/2020    TRUE    TRUE
6  03/09/2020    TRUE   FALSE


In [9]: dummy.groupby(['date'], as_index=False).agg(lambda x: x.eq('TRUE').sum())
Out[9]:
         date  Action1  Action2
0  01/09/2020        2        1
1  02/09/2020        1        1
2  03/09/2020        2        1