Getting lists of indices from pandas dataframe
--------------------------------------------------
Chapters
00:00 Getting Lists Of Indices From Pandas Dataframe
00:57 Accepted Answer Score 2
01:28 Answer 2 Score 1
01:46 Answer 3 Score 0
01:57 Thank you
--
Full question
https://stackoverflow.com/questions/4579...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #dataframe
#avk47
ACCEPTED ANSWER
Score 2
You want the index labels for each group.
These are stored in the 'groups' attribute of a DataFrameGroupBy object.
# filter for coverage == True
# group by 'name'
# access the 'groups' attribute
by_person = df[df.coverage].groupby('name').groups
This returns:
{'Jason': Int64Index([0, 5], dtype='int64'),
'Tina': Int64Index([4], dtype='int64')}
You can then look up each person as you would in a regular dictionary:
by_person['Jason']
returns:
Int64Index([0, 5], dtype='int64')
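You can iterate over this like a list, or call .tolist() on it to get a plain Python list. (pandas 2.0 and later print the same values as Index([0, 5], dtype='int64').)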
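As a rough, self-contained sketch of the approach above: the question's DataFrame is not reproduced here, so the data below is hypothetical and was chosen only to match the output shown. The name index_lists is also made up for illustration.

```python
import pandas as pd

# Hypothetical data (an assumption, not the question's actual frame).
df = pd.DataFrame({
    "name": ["Jason", "Molly", "Jake", "Amy", "Tina", "Jason"],
    "reports": [4, 24, 31, 2, 3, 5],
    "coverage": [True, False, False, False, True, True],
})

# Keep rows where coverage is True, group by name,
# and read the mapping of group key -> index labels.
by_person = df[df["coverage"]].groupby("name").groups

# Convert each pandas Index into a plain Python list.
index_lists = {name: idx.tolist() for name, idx in by_person.items()}
print(index_lists)
```

Names without any coverage == True rows (Molly, Jake, Amy in this made-up data) do not appear at all, because they are filtered out before grouping.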
ANSWER 2
Score 1
You can do this with boolean indexing followed by a groupby:
In [942]: df[df.coverage].groupby('name').agg({'reports' : lambda x: list(x.index)})
Out[942]:
       reports
name
Jason   [0, 5]
Tina       [4]
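DataFrameGroupBy.agg returns the result as a column of lists. Note that the column keeps the name 'reports', although its values are index labels rather than report counts.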
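A possible variation, sketched here under assumptions: it aggregates the index itself rather than borrowing the 'reports' column, so the result can carry a clearer name. The data is hypothetical, and the names covered, index_lists and "indices" are made up for illustration. This is not the answer's exact code.

```python
import pandas as pd

# Hypothetical data (an assumption, not the question's actual frame).
df = pd.DataFrame({
    "name": ["Jason", "Molly", "Jake", "Amy", "Tina", "Jason"],
    "reports": [4, 24, 31, 2, 3, 5],
    "coverage": [True, False, False, False, True, True],
})

# Boolean indexing first: keep only the covered rows.
covered = df[df["coverage"]]

# Turn the index into a Series, group it by the matching names,
# and collect each group's index labels into a list.
index_lists = (
    covered.index.to_series()
    .groupby(covered["name"])
    .agg(list)
    .rename("indices")
)
print(index_lists)
```

The result is a Series indexed by name, so index_lists["Jason"] gives that person's list directly.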
ANSWER 3
Score 0
This should work:
grouped = df.groupby('name').apply(lambda x: x.index[x.coverage].values)
Output:
name
Jason    [0, 5]
Tina        [4]
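One thing to watch with this approach: it groups before filtering, so a name with no coverage == True rows still gets an entry, just an empty one. The sketch below uses hypothetical data to show this, and applies the function to the 'coverage' column only, which is a small change from the answer's code. The names per_name and non_empty are made up for illustration.

```python
import pandas as pd

# Hypothetical data (an assumption, not the question's actual frame).
df = pd.DataFrame({
    "name": ["Jason", "Molly", "Jake", "Amy", "Tina", "Jason"],
    "reports": [4, 24, 31, 2, 3, 5],
    "coverage": [True, False, False, False, True, True],
})

# For each name, keep the index labels of the rows where coverage is True.
# s is the boolean 'coverage' Series for one group, so s[s] keeps the True rows.
per_name = df.groupby("name")["coverage"].apply(lambda s: s[s].index.tolist())

# Names with no covered rows come back as empty lists; drop them if unwanted.
non_empty = per_name[per_name.map(len) > 0]
print(non_empty)
```

If the empty entries are not needed, filtering the frame before grouping, as in the other answers, avoids producing them in the first place.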