How to replace text in a string column of a Pandas dataframe?

I have a column in my dataframe like this:

range
"(2,30)"
"(50,290)"
"(400,1000)"
...

and I want to replace the comma (,) with a dash (-). I’m currently using this method, but nothing changes.

org_info_exc['range'].replace(',', '-', inplace=True)

Can anybody help?

Answers:

Method 1

Use the vectorised str method replace:

df['range'] = df['range'].str.replace(',','-')

df
      range
0    (2-30)
1  (50-290)

EDIT: so if we look at what you tried and why it didn’t work:

df['range'].replace(',','-',inplace=True)

from the docs we see this description:

str or regex: str: string exactly matching to_replace will be replaced
with value

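In other words, Series.replace only replaces a value when the whole string matches to_replace exactly; it does not look inside the string. None of your values is exactly ',', so no replacement occurs. Compare with the following: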

import pandas as pd

df = pd.DataFrame({'range':['(2,30)',',']})
df['range'] = df['range'].replace(',','-')

df['range']

0    (2,30)
1         -
Name: range, dtype: object

here we get an exact match on the second row and the replacement occurs.
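
The two behaviours can be compared side by side. This is a minimal sketch on made-up sample data modelled on the question's 'range' column; the variable names whole_value and substring are only for illustration.

```python
import pandas as pd

# Sample data mirroring the question's 'range' column (illustrative only).
df = pd.DataFrame({'range': ['(2,30)', '(50,290)', ',']})

# Series.replace compares whole cell values: only the cell equal to ',' changes.
whole_value = df['range'].replace(',', '-')

# Series.str.replace works on substrings inside each string.
# regex=False treats ',' as a literal string rather than a pattern.
substring = df['range'].str.replace(',', '-', regex=False)

print(whole_value.tolist())
print(substring.tolist())
```

Both calls return a new Series, so the result has to be assigned back to the column to keep it.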

Method 2

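For anyone else arriving here from a Google search on how to do a string replacement on all columns (for example, if one has multiple columns like the OP’s ‘range’ column):
Pandas has a built-in replace method available on a dataframe object. With regex=True it matches inside each string instead of requiring the whole value to match. It returns a new dataframe, so assign the result:

df = df.replace(',', '-', regex=True)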

Source: Docs
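
As a rough illustration, here is a sketch on a made-up frame with two string columns and one numeric column (the column names other_range and count are hypothetical):

```python
import pandas as pd

# Two hypothetical string columns plus a numeric one (made-up data).
df = pd.DataFrame({
    'range': ['(2,30)', '(50,290)'],
    'other_range': ['(1,5)', '(7,9)'],
    'count': [3, 4],
})

# With regex=True, ',' is treated as a pattern and matched inside each string.
# Non-string columns are left unchanged.
# replace returns a new DataFrame, so the result is assigned to a new name.
result = df.replace(',', '-', regex=True)
print(result)
```

The original df is untouched here; use df = df.replace(...) to overwrite it.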

Method 3

Replace all spaces with underscores in the column names:

data.columns = data.columns.str.replace(' ', '_', regex=True)
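
A small sketch of renaming columns this way, assuming made-up column names that contain spaces. Since a space is a literal character, regex=False is used here, which is an assumption about what is wanted rather than a requirement.

```python
import pandas as pd

# Hypothetical column names containing spaces.
data = pd.DataFrame({'first name': ['Ann'], 'last name': ['Lee']})

# data.columns is an Index, which also exposes the vectorised .str methods.
data.columns = data.columns.str.replace(' ', '_', regex=False)

print(list(data.columns))
```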

Method 4

If you only need to replace characters in one specific column, and regex=True and inplace=True both failed for you, I think this way will work:

data["column_name"] = data["column_name"].apply(lambda x: x.replace("characters_need_to_replace", "new_characters"))

The lambda is an anonymous function that apply calls once per entry, much like the body of a for loop.
Here x represents each entry in the current column, so x.replace is Python’s ordinary string method.

All you need to change is “column_name”, “characters_need_to_replace” and “new_characters”.
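
One caveat with this approach is that a missing value is not a string, so calling replace on it raises an AttributeError. A possible guard, sketched on made-up data with one missing entry:

```python
import pandas as pd

# Made-up data with a missing value in the middle.
data = pd.DataFrame({'range': ['(2,30)', None, '(400,1000)']})

# x is one cell value at a time. Only strings have .replace,
# so anything else (such as a missing value) is passed through unchanged.
data['range'] = data['range'].apply(
    lambda x: x.replace(',', '-') if isinstance(x, str) else x
)

print(data['range'].tolist())
```

The vectorised .str.replace skips missing values on its own, which is one reason it is usually preferred over apply.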

Method 5

In addition, for those looking to replace more than one character in a column, you can do it using regular expressions:

import re
chars_to_remove = ['.', '-', '(', ')']
# Build a character class such as [\.\-\(\)] that matches any one of the characters
regular_expression = '[' + re.escape(''.join(chars_to_remove)) + ']'

df['string_col'] = df['string_col'].str.replace(regular_expression, '', regex=True)
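
To see what the pattern does, here is a sketch on made-up phone-number strings (the data and the name cleaned are assumptions for illustration):

```python
import re
import pandas as pd

# Made-up strings containing the characters to strip.
df = pd.DataFrame({'string_col': ['(555) 123-4567', '555.123.4567']})

chars_to_remove = ['.', '-', '(', ')']
# re.escape backslash-escapes the special characters so they are safe
# to place inside a [...] character class.
pattern = '[' + re.escape(''.join(chars_to_remove)) + ']'

# Each character matched by the class is replaced with an empty string.
cleaned = df['string_col'].str.replace(pattern, '', regex=True)
print(cleaned.tolist())
```

The space in the first value is kept because it is not in chars_to_remove.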

Method 6

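Similar to the answer by Nancy K, this works for me. Note the double brackets: they select a dataframe, so apply passes each column to the lambda as a Series, which has the .str accessor. With single brackets, x would be a plain string and x.str would raise an AttributeError.

data[["column_name"]] = data[["column_name"]].apply(lambda x: x.str.replace("characters_need_to_replace", "new_characters"))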
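
The .str accessor inside apply only exists when apply is called on a dataframe, because each column then arrives as a Series. A sketch of that form on several columns at once, assuming made-up column names low and high:

```python
import pandas as pd

# Made-up data: two string columns with thousands separators.
data = pd.DataFrame({'low': ['1,000', '2,500'], 'high': ['3,000', '9,999']})
cols = ['low', 'high']

# DataFrame.apply passes each selected column as a Series,
# so the vectorised .str.replace is available on it.
data[cols] = data[cols].apply(lambda col: col.str.replace(',', '', regex=False))

print(data)
```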

Method 7

If you want to remove two or more characters from a string, for example the characters ‘$’ and ‘,’:

Column_Name
===========
$100,000
$1,100,000

… then use:

data.Column_Name.str.replace("[$,]", "", regex=True)

=> [ 100000, 1100000 ]
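
Note that the values are still strings after the replacement. If numbers are wanted, a conversion step is needed; the sketch below uses pd.to_numeric as one option, on the sample values shown above.

```python
import pandas as pd

data = pd.DataFrame({'Column_Name': ['$100,000', '$1,100,000']})

# Inside a character class, '$' and ',' are matched literally.
cleaned = data['Column_Name'].str.replace('[$,]', '', regex=True)

# The result is still text; convert it explicitly to get integers.
as_numbers = pd.to_numeric(cleaned)

print(cleaned.tolist())
print(as_numbers.tolist())
```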


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
