The Python Oracle

What does `ValueError: cannot reindex from a duplicate axis` mean?

Full question
https://stackoverflow.com/questions/2723...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas



ACCEPTED ANSWER

Score 330


This error usually arises when you join on, or assign to, a column while the index has duplicate values. Since you are assigning to a row, I suspect that there is a duplicate value in affinity_matrix.columns, perhaps one not shown in your question.
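A minimal sketch of one way to trigger the error, assuming a small made-up Series and DataFrame (the names s, df and error_message are illustrative, not from the question). Assigning a Series whose index contains a repeated label forces pandas to align it against the frame's index, and that alignment fails. The exact wording of the message varies between pandas versions.

```python
import pandas as pd

# Hypothetical data: the label "a" appears twice in the Series index.
s = pd.Series([10, 20, 30], index=["a", "a", "b"])
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])

try:
    # pandas must reindex s to df.index, which is ambiguous for "a".
    df["y"] = s
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```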




ANSWER 2

Score 265


As others have said, you've probably got duplicate values in your original index. To find them do this:

df[df.index.duplicated()]
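By default, duplicated() flags only the second and later occurrences of a label. The sketch below, on a made-up frame, also shows keep=False to see every row that shares a label, and one possible way to keep only the first occurrence. Whether dropping rows is appropriate depends on your data.

```python
import pandas as pd

# Hypothetical frame in which the label "a" is repeated.
df = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "a", "c"])

# Default keep="first": only the later repeats are flagged.
later_repeats = df[df.index.duplicated()]

# keep=False: every row whose label occurs more than once.
all_repeats = df[df.index.duplicated(keep=False)]

# One option for cleaning up: keep the first row for each label.
deduplicated = df[~df.index.duplicated(keep="first")]

print(later_repeats)
print(all_repeats)
print(deduplicated)
```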



ANSWER 3

Score 72


Indices with duplicate values often arise if you create a DataFrame by concatenating other DataFrames. If you don't care about preserving the values of your index and you want them to be unique, set ignore_index=True when you concatenate the data.

Alternatively, to overwrite your current index with a new one, instead of using df.reindex(), set:

df.index = new_index
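Both suggestions can be sketched on two made-up frames (left, right and the other names here are illustrative). Each frame starts with the default 0, 1 index, so a plain concatenation repeats those labels.

```python
import pandas as pd

# Hypothetical inputs: both frames carry the default index 0, 1.
left = pd.DataFrame({"x": [1, 2]})
right = pd.DataFrame({"x": [3, 4]})

# Plain concatenation keeps the original labels, so they repeat.
stacked = pd.concat([left, right])
stacked_labels = stacked.index.tolist()

# ignore_index=True discards the old labels and numbers rows afresh.
renumbered = pd.concat([left, right], ignore_index=True)

# Overwriting the index in place also removes the duplicates.
stacked.index = range(len(stacked))

print(stacked_labels)
print(renumbered.index.tolist())
print(stacked.index.tolist())
```

If the old labels carry no meaning, stacked.reset_index(drop=True) is another way to get a fresh index.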



ANSWER 4

Score 43


For people who are still struggling with this error, it can also happen if you accidentally create two columns with the same name. Remove duplicate columns like so:

df = df.loc[:,~df.columns.duplicated()]
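A short sketch of that one-liner on a made-up frame with a repeated column name. Note that it keeps the first column for each name and silently drops the later ones, so it is worth checking that the dropped columns really are redundant.

```python
import pandas as pd

# Hypothetical frame with the column name "a" used twice.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# Boolean mask over columns: True for second and later repeats.
repeated = df.columns.duplicated()

# Keep only the first column for each name.
deduped = df.loc[:, ~repeated]

print(repeated.tolist())
print(deduped)
```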