I have a 20×5 data table and I want to find the mean of one of its columns, the price column. I know I can use this method to find the mean:
mean = df["price"].mean()
The problem is that in my data file the prices are stored as strings rather than numbers, and all of them are written in the format 2000dollars, not just 2000. How can I strip the currency suffix from the values and then find their mean? (The currency is the same in every row.)
Answers:
Method 1
Try normalising your data first:
df["price"] = df["price"].apply(lambda x: float(x.replace("dollars", "")))
I am assuming the prices are stored like 5000dollars; if the values contain any other irregularity, you can replace it with an empty string in the same way.
Or, if you don't want to overwrite this column, you can create a new column and use that one for the mean:
df["new_price"] = df["price"].apply(lambda x: float(x.replace("dollars", "")))
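If the column does contain other irregularities, one possible approach is to pull out just the numeric part and let pandas turn anything unparseable into NaN, which mean() skips by default. This is only a sketch: the sample values below (the space, the comma and the "n/a" entry) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name "price" comes from the question.
df = pd.DataFrame({"price": ["2000dollars", "1500 dollars", "n/a", "3,000dollars"]})

cleaned = (
    df["price"]
    .str.replace(",", "", regex=False)    # drop thousands separators
    .str.extract(r"(\d+(?:\.\d+)?)")[0]   # keep the first run of digits (optionally with decimals)
)

# Entries with no digits (here "n/a") become NaN instead of raising an error.
df["new_price"] = pd.to_numeric(cleaned, errors="coerce")

# Series.mean() ignores NaN by default, so only the three valid prices are averaged.
mean_price = df["new_price"].mean()
```

Whether silently skipping bad rows is acceptable depends on the data; checking df["new_price"].isna().sum() first shows how many entries failed to parse.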
Method 2
df['price'].str.replace('dollars', '').astype('float').mean()
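For reference, here is a small self-contained run of this one-liner. The three prices are invented sample values, and regex=False is added here (it is not in the answer above) so that "dollars" is treated as a literal string rather than a pattern.

```python
import pandas as pd

# Hypothetical sample data in the format described in the question.
df = pd.DataFrame({"price": ["2000dollars", "3500dollars", "1250dollars"]})

# Strip the literal suffix, convert to float, then average.
mean_price = (
    df["price"]
    .str.replace("dollars", "", regex=False)
    .astype(float)
    .mean()
)

print(mean_price)  # 2250.0
```

Unlike Method 1, this leaves the original column untouched and does not create a new one.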
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.