The Python Oracle

replace() method not working on Pandas DataFrame

Full question
https://stackoverflow.com/questions/3759...

Question links:
[Understanding inplace=True in pandas]: https://stackoverflow.com/questions/4389...

Answer 2 links:
[Series.str.replace]: https://pandas.pydata.org/pandas-docs/st...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe #numpy #replace




ANSWER 1

Score 188


Given that this is the top Google result when searching for "Pandas replace is not working" I'd like to also mention that:

By default, replace only matches whole cell values; it does not touch substrings. Pass regex=True and it will perform partial (substring) replacements as well.

This took me 30 minutes to find out, so hopefully I've saved the next person 30 minutes.
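
To illustrate the difference described above, here is a minimal sketch. The sample DataFrame and the "off-white" value are made up for the demonstration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data: one exact match and one value that only contains "white".
df = pd.DataFrame({"color": ["white", "off-white", "blue"]})

# Default behaviour: only cells whose entire value equals "white" are replaced.
whole = df.replace("white", "black")

# regex=True: the pattern is searched for inside each string, so substrings change too.
partial = df.replace("white", "black", regex=True)

print(whole["color"].tolist())    # ['black', 'off-white', 'blue']
print(partial["color"].tolist())  # ['black', 'off-black', 'blue']
```

Note that with regex=True the search string is treated as a regular expression, so characters such as "." or "(" would need escaping.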




ACCEPTED ANSWER

Score 43


You need to assign the result back:

df = df.replace('white', np.nan)

or pass inplace=True:

In [50]:
import numpy as np
import pandas as pd

d = {'color': pd.Series(['white', 'blue', 'orange']),
     'second_color': pd.Series(['white', 'black', 'blue']),
     'value': pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan, inplace=True)
df

Out[50]:
    color second_color  value
0     NaN          NaN    1.0
1    blue        black    2.0
2  orange         blue    3.0

Most pandas operations return a new object rather than modifying the original, and many accept an inplace parameter, which defaults to False.
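
As a quick check of the point about return values, the following sketch calls replace without keeping the result, then again with assignment. The small two-row DataFrame is made up for the demonstration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for the demonstration.
df = pd.DataFrame({"color": ["white", "blue"], "value": [1.0, 2.0]})

# The result is discarded here, so df itself is unchanged.
df.replace("white", np.nan)
before = df["color"].tolist()
print(before)  # ['white', 'blue']

# Assigning the result back rebinds df to the new DataFrame.
df = df.replace("white", np.nan)
after = df["color"].isna().tolist()
print(after)  # [True, False]
```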




ANSWER 3

Score 19


Neither inplace=True nor regex=True worked in my case, so I used Series.str.replace instead. It is useful when you need to replace a substring.

In [4]: df['color'] = df.color.str.replace('e', 'E!')
In [5]: df  
Out[5]: 
     color second_color  value
0   whitE!        white    1.0
1    bluE!        black    2.0
2  orangE!         blue    3.0

You can also restrict the replacement to selected rows:

In [10]: df.loc[df.color=='blue', 'color'] = df.color.str.replace('e', 'E!')
In [11]: df  
Out[11]: 
    color second_color  value
0   white        white    1.0
1   bluE!        black    2.0
2  orange         blue    3.0
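
One caveat when using Series.str.replace: whether the pattern is treated as a regular expression depends on the regex argument, and its default changed to False in pandas 2.0. The sketch below passes regex explicitly so the behaviour does not depend on the installed version; the sample values are made up for the demonstration.

```python
import pandas as pd

colors = pd.Series(["white", "blue", "orange"])

# Literal substring replacement, as in the answer above.
literal = colors.str.replace("e", "E!", regex=False)
print(literal.tolist())  # ['whitE!', 'bluE!', 'orangE!']

# Hypothetical values that show why the flag matters: "." is a regex wildcard.
dotted = pd.Series(["a.b", "acb"])
as_text = dotted.str.replace(".", "-", regex=False)   # only the literal dot
as_regex = dotted.str.replace(".", "-", regex=True)   # every character matches
print(as_text.tolist())   # ['a-b', 'acb']
print(as_regex.tolist())  # ['---', '---']
```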



ANSWER 4

Score 7


df.replace() returns a new DataFrame and leaves df unchanged. Use either of the following lines to modify df:

df = df.replace('white', np.nan)
df.replace('white', np.nan, inplace=True)