The Python Oracle

Turn Pandas Multi-Index into column

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Romantic Lands Beckon

--

Chapters
00:00 Turn Pandas Multi-Index Into Column
00:34 Accepted Answer Score 331
00:59 Answer 2 Score 38
02:01 Answer 3 Score 19
02:19 Answer 4 Score 5
02:37 Thank you

--

Full question
https://stackoverflow.com/questions/2011...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe #flatten #multiindex

#avk47



ACCEPTED ANSWER

Score 331


The reset_index() is a pandas DataFrame method that will transfer index values into the DataFrame as columns. The default setting for the parameter is drop=False (which will keep the index values as columns).

All you have to do call .reset_index() after the name of the DataFrame:

df = df.reset_index()  



ANSWER 2

Score 38


This doesn't really apply to your case but could be helpful for others (like myself 5 minutes ago) to know. If one's multindex have the same name like this:

                         value
Trial        Trial
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

df.reset_index(inplace=True) will fail, cause the columns that are created cannot have the same names.

So then you need to rename the multindex with df.index = df.index.set_names(['Trial', 'measurement']) to get:

                           value
Trial    measurement       

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

And then df.reset_index(inplace=True) will work like a charm.

I encountered this problem after grouping by year and month on a datetime-column(not index) called live_date, which meant that both year and month were named live_date.




ANSWER 3

Score 19


As @cs95 mentioned in a comment, to drop only one level, use:

df.reset_index(level=[...])

This avoids having to redefine your desired index after reset.




ANSWER 4

Score 5


I ran into Karl's issue as well. I just found myself renaming the aggregated column then resetting the index.

df = pd.DataFrame(df.groupby(['arms', 'success'])['success'].sum()).rename(columns={'success':'sum'})

enter image description here

df = df.reset_index()

enter image description here