I have a dataframe (df) that looks like:
date          A
2001-01-02    1.0022
2001-01-03    1.1033
2001-01-04    1.1496
2001-01-05    1.1033
2015-03-30    126.3700
2015-03-31    124.4300
2015-04-01    124.2500
2015-04-02    124.8900
For the entire time series, I’m trying to divide today’s value by yesterday’s and take the log of the result using the following:
df["B"] = math.log(df["A"] / df["A"].shift(1))
However I get the following error:
TypeError: cannot convert the series to <class 'float'>
How can I fix this? I’ve tried to cast as float using:
df["B"].astype(float)
But can’t get anything to work.
Answers:
Method 1
You can use numpy.log instead. math.log expects a single number, not an array or a Series, whereas numpy.log works element-wise.
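A minimal sketch of this suggestion, assuming a small frame built from the first four rows of the question's data (the date index is reconstructed here for illustration only):

```python
import numpy as np
import pandas as pd

# Sample values taken from the first rows of the question's table
df = pd.DataFrame(
    {"A": [1.0022, 1.1033, 1.1496, 1.1033]},
    index=pd.to_datetime(["2001-01-02", "2001-01-03", "2001-01-04", "2001-01-05"]),
)

# np.log works element-wise on a whole Series. The first row of B is NaN
# because shift(1) has no previous value to divide by there.
df["B"] = np.log(df["A"] / df["A"].shift(1))
print(df)
```

The whole column is computed in one vectorized call, so no explicit loop or cast to float is needed.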
Method 2
You can use a lambda with .apply to run a function on each element of a pandas DataFrame or Series. More specifically, if you want to convert each element of a column to a floating-point number, you can do it like this:
df['A'].apply(lambda x: float(x))
Here the lambda takes each value in that column (as x) and returns it as a float.
Method 3
If you just write df["A"].astype(float) you will not change df. You need to assign the output of the astype call to something, for example back to the existing column using df['A'] = df['A'].astype(float). You might also want to either use numpy as @user3582076 suggests, or use .apply on the Series that results from dividing today’s value by yesterday’s.
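Both points can be sketched as follows. The frame below is hypothetical: it assumes column A was read in as strings, which is one situation where the cast matters.

```python
import math
import pandas as pd

# Hypothetical frame where column A holds numbers stored as strings
df = pd.DataFrame({"A": ["1.0022", "1.1033", "1.1496"]})

df["A"].astype(float)              # returns a new Series; df itself is unchanged
dtype_before = df["A"].dtype       # not a float dtype yet

df["A"] = df["A"].astype(float)    # assign back to actually convert the column
dtype_after = df["A"].dtype

# .apply calls math.log once per element, so math.log only ever sees scalars
df["B"] = (df["A"] / df["A"].shift(1)).apply(math.log)
print(dtype_before, dtype_after)
print(df)
```

The first value of B is NaN, since the first row has no previous value to divide by.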
Method 4
I had the same issue. For me, the answer was to look at why I had a series in the first place. After spending a long time looking for ways to convert the series to a different data type, I realised that I had defined the same column name twice in the dataframe, and that was why I had a series.
Removing the accidental duplication of column name removes this issue 🙂
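One way to check for this is sketched below, assuming a hypothetical frame in which the name "A" was accidentally used for two columns:

```python
import pandas as pd

# Hypothetical frame in which the column name "A" appears twice by accident
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], columns=["A", "A"])

# With a duplicated name, df["A"] selects both columns and is a DataFrame,
# so each row's "A" entry is a Series rather than a single number
selected = df["A"]
print(type(selected).__name__)

# List the duplicated names, then keep only the first occurrence of each
print(df.columns[df.columns.duplicated()].tolist())
deduped = df.loc[:, ~df.columns.duplicated()]
print(type(deduped["A"]).__name__)
```

Keeping the first occurrence is only one option; renaming the second column may be the better fix if both hold useful data.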
Method 5
I did it in a slightly different way, but it is the same as @cemosambora’s answer:
(df.A).apply(lambda x: float(x))
Here, df is the pandas dataframe and A is a column name.
Method 6
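pandas does support arithmetic directly on a Series; the error comes from math.log, which accepts only a single number. If you want to keep using math.log, you can work around this with a for loop in Python, although this is slower than a vectorized call.
Here is a code example:
import math
prev = None
results = []
for i, row in df.iterrows():
    # the first row has no previous value, so store NaN there
    results.append(math.log(row['A'] / prev) if prev is not None else float('nan'))
    prev = row['A']
df['B'] = results
print(df)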
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.