I have looked up this issue and most questions are for more complex replacements. However in my case I have a very simple dataframe as a test dummy.
The aim is to replace a string anywhere in the dataframe with NaN, but this does not seem to work (i.e. nothing is replaced, and no errors are raised). I’ve tried replacing with another string and that does not work either. E.g.
import numpy as np
import pandas as pd

d = {'color': pd.Series(['white', 'blue', 'orange']),
     'second_color': pd.Series(['white', 'black', 'blue']),
     'value': pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan)
The output is still:
    color second_color  value
0   white        white      1
1    blue        black      2
2  orange         blue      3
Answers:
Method 1
Given that this is the top Google result when searching for “Pandas replace is not working”, I’d like to also mention that:
replace only matches whole cell values unless you turn on the regex switch. Use regex=True, and it should perform partial (substring) replacements as well.
This took me 30 minutes to find out, so hopefully I’ve saved the next person 30 minutes.
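To make the difference concrete, here is a minimal sketch. The 'off-white' value and the variable names are made up for illustration and are not part of the original question:

```python
import pandas as pd

# Hypothetical data: one cell equals "white", another merely contains it.
df = pd.DataFrame({"color": ["white", "off-white", "blue"]})

# Default behaviour: only cells whose entire value is "white" are replaced.
exact = df.replace("white", "black")

# regex=True: the pattern is searched for inside each string,
# so the substring in "off-white" is replaced as well.
partial = df.replace("white", "black", regex=True)

print(exact["color"].tolist())    # ['black', 'off-white', 'blue']
print(partial["color"].tolist())  # ['black', 'off-black', 'blue']
```

Keep in mind that with regex=True the search value is interpreted as a regular expression, so characters such as '.' or '(' need escaping if you mean them literally.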
Method 2
You need to assign the result back:
df = df.replace('white', np.nan)
or pass the parameter inplace=True:
In [50]:
d = {'color' : pd.Series(['white', 'blue', 'orange']),
'second_color': pd.Series(['white', 'black', 'blue']),
'value' : pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan, inplace=True)
df
Out[50]:
    color second_color  value
0     NaN          NaN    1.0
1    blue        black    2.0
2  orange         blue    3.0
Most pandas operations return a copy, and many accept an inplace parameter, which usually defaults to False.
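A short sketch of why the call in the question appears to do nothing, using a cut-down, hypothetical version of the question's dataframe:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"color": ["white", "blue"], "value": [1.0, 2.0]})

# The return value is discarded here, so df itself is left untouched.
df.replace("white", np.nan)
unchanged = df["color"].tolist()

# Rebinding the name keeps the replaced copy.
df = df.replace("white", np.nan)
n_missing = int(df["color"].isna().sum())

print(unchanged)   # ['white', 'blue']
print(n_missing)   # 1
```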
Method 3
Neither inplace=True nor regex=True worked in my case.
So I found a solution using Series.str.replace instead. It can be useful if you need to replace a substring.
In [4]: df['color'] = df.color.str.replace('e', 'E!')
In [5]: df
Out[5]:
     color second_color  value
0  whitE!        white    1.0
1   bluE!        black    2.0
2 orangE!         blue    3.0
or even combined with boolean indexing, so that only the selected rows are changed:
In [10]: df.loc[df.color=='blue', 'color'] = df.color.str.replace('e', 'E!')
In [11]: df
Out[11]:
    color second_color  value
0   white        white    1.0
1   bluE!        black    2.0
2  orange         blue    3.0
Method 4
You might need to check the data type of the column before using replace directly. On an object (string) column, Series.replace only replaces cells whose entire value equals the search value. To replace a substring inside each cell, go through the .str accessor and use str.replace instead.
Wrong (when 'abc' is only part of the cell value):
df["column-name"] = df["column-name"].replace('abc', 'def')
Correct:
df["column-name"] = df["column-name"].str.replace('abc', 'def')
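The two calls behave differently on the same data. A minimal sketch, with made-up values and with regex=False passed explicitly so the search string is treated as a literal regardless of the pandas version's default:

```python
import pandas as pd

# Hypothetical column: one exact match, one cell that only contains "abc".
s = pd.Series(["abc", "xabcx", "def"])

# Series.replace compares against the whole cell value.
whole = s.replace("abc", "def")

# Series.str.replace works on substrings inside each cell.
sub = s.str.replace("abc", "def", regex=False)

print(whole.tolist())  # ['def', 'xabcx', 'def']
print(sub.tolist())    # ['def', 'xdefx', 'def']
```

So neither call is wrong in general; which one you want depends on whether you are matching whole values or parts of them.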
Method 5
When you call df.replace(), it returns a new DataFrame and leaves yours unmodified. Use either of the following two lines to modify df:
df = df.replace('white', np.nan)
df.replace('white', np.nan, inplace = True)
Method 6
What worked for me was using this dict notation.
{old_value:new_value}
df.replace({10:100},inplace=True)
Check the documentation for more info:
https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.replace.html
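As a rough illustration of the dict notation, the sketch below maps several values at once and also uses the nested form that restricts a replacement to one column. The column names and numbers are invented for the example:

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30], "b": [10, 5, 20]})

# Flat dict: {old_value: new_value}, applied to every column.
everywhere = df.replace({10: 100, 20: 200})

# Nested dict: {column: {old_value: new_value}}, applied to that column only.
only_a = df.replace({"a": {10: -1}})

print(everywhere["a"].tolist(), everywhere["b"].tolist())  # [100, 200, 30] [100, 5, 200]
print(only_a["a"].tolist(), only_a["b"].tolist())          # [-1, 20, 30] [10, 5, 20]
```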
Method 7
df.replace({'white': np.nan}, inplace=True, regex=True)
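One thing worth knowing about this variant: as far as I can tell, when the replacement is NaN rather than a string, any cell in which the pattern matches is replaced as a whole, not just the matching part. A small sketch with a made-up 'off-white' value:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"color": ["white", "off-white", "blue"]})

# Each dict key is treated as a regular expression because regex=True.
# The replacement is NaN (not a string), so a matching cell becomes NaN entirely.
out = df.replace({"white": np.nan}, regex=True)

# Boolean mask of the cells that ended up missing.
missing = out["color"].isna().tolist()
print(missing)
```

If you only want cells that are exactly 'white' to become NaN, drop regex=True or anchor the pattern (for example '^white$').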
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.