The Python Oracle

Pandas Groupby count on multiple columns for specific string values only

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Dream Voyager Looping

--

Chapters
00:00 Pandas Groupby Count On Multiple Columns For Specific String Values Only
00:31 Accepted Answer Score 4
01:05 Answer 2 Score 7
01:20 Answer 3 Score 5
01:34 Answer 4 Score 1
01:48 Thank you

--

Full question
https://stackoverflow.com/questions/6678...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas

#avk47



ANSWER 1

Score 7


True and False in your dummy df are strings, you can convert them to int and sum

dummy.replace({'TRUE':1,'FALSE':0}).groupby('date',as_index = False).sum()

    date        Action1 Action2
0   01/09/2020  2       1
1   02/09/2020  1       1
2   03/09/2020  2       1



ANSWER 2

Score 5


You can also try:

dummy.set_index(['date']).eq('TRUE').sum(level='date')

Output:

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1



ACCEPTED ANSWER

Score 4


Anyone seeing this answer should look at the answers by @QuangHoang or @Vaishali
They are much better answers. I can't control what the OP chooses, but you should go upvote those answers.

Inspired by @QuangHoang

dummy.iloc[:, 1:].eq('TRUE').groupby(dummy.date).sum()

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1

OLD ANSWER

Fix your dataframe such that it has actual True/False values

from ast import literal_eval

dummy = dummy.assign(**dummy[['Action1', 'Action2']].applymap(str.title).applymap(literal_eval))

Then use groupby

dummy.groupby('date').sum()

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1



ANSWER 4

Score 1


In [7]: dummy
Out[7]:
         date Action1 Action2
0  01/09/2020    TRUE   FALSE
1  01/09/2020    TRUE    TRUE
2  02/09/2020   FALSE    TRUE
3  02/09/2020    TRUE   FALSE
4  03/09/2020   FALSE   FALSE
5  03/09/2020    TRUE    TRUE
6  03/09/2020    TRUE   FALSE


In [9]: dummy.groupby(['date'], as_index=False).agg(lambda x: x.eq('TRUE').sum())
Out[9]:
         date  Action1  Action2
0  01/09/2020        2        1
1  02/09/2020        1        1
2  03/09/2020        2        1