I am trying to compare values from a JSON file (which I can already work with) to values from a CSV file (which might be where the issue is). My current code looks like this:
for data in trades['timestamp']:
    data = pd.to_datetime(data)
    print(data)
    if data == ask_minute['lastUpdated']:
        # ... do something
Which gives:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
My current print(data) looks like this:
2018-10-03 18:03:38.067000
2018-10-03 18:03:38.109000
2018-10-03 18:04:28
2018-10-03 18:04:28.685000
However, I am still unable to compare these timestamps from my CSV file to those from my JSON file. Does someone have an idea?
Answers:
Method 1
Let’s reduce it to a simpler example, for instance the following comparison:
3 == pd.Series([3,2,4,1])

0     True
1    False
2    False
3    False
dtype: bool
The result is a Series of booleans, equal in size to the pd.Series on the right-hand side of the expression. What happens here is that the integer is broadcast across the Series and compared element by element. So when you do:
if 3 == pd.Series([3,2,4,1]):
    pass

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
You get an error. The problem is that you are comparing a pd.Series with a single value, so the result is a whole Series of booleans that can mix True and False values, as in the case above. This is ambiguous as an if condition, since the result as a whole is neither True nor False.
So you need to further aggregate the result so that a single boolean value results from the operation. For that you’ll have to use either any or all depending on whether you want at least one (any) or all values to satisfy the condition.
(3 == pd.Series([3,2,4,1])).all() # False
or
(3 == pd.Series([3,2,4,1])).any() # True
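Applied to the timestamps in the question, the same idea could look roughly like the sketch below. The two small frames are made-up stand-ins for trades and ask_minute (the real values would come from the JSON and CSV files), and the isin variant at the end is only one possible alternative to the loop, not something the question used.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's JSON and CSV contents.
trades = pd.DataFrame({"timestamp": [
    "2018-10-03 18:03:38.067",
    "2018-10-03 18:04:28.000",
    "2018-10-03 18:04:28.685",
]})
ask_minute = pd.DataFrame({"lastUpdated": pd.to_datetime([
    "2018-10-03 18:04:28.000",
    "2018-10-03 18:05:00.000",
])})

matches = []
for raw in trades["timestamp"]:
    ts = pd.to_datetime(raw)
    # The comparison yields a boolean Series; .any() collapses it to one bool.
    if (ts == ask_minute["lastUpdated"]).any():
        matches.append(ts)

# Loop-free alternative: flag every trade whose timestamp occurs in lastUpdated.
mask = pd.to_datetime(trades["timestamp"]).isin(ask_minute["lastUpdated"])
```

Here .any() answers "does this trade timestamp appear anywhere in lastUpdated?". Use .all() only if every row is supposed to match.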
Method 2
The problem I see is that even if the dataframe you are comparing against holds only one row, pandas still treats the column as a Series that could hold many rows. It does not assume you want the only row that exists, so you have to select it explicitly. In the question, data is already a single timestamp, so the selection goes on the ask_minute side:
if data == ask_minute['lastUpdated'].iloc[0]:
Then the comparison is between two single values, and the result is a plain boolean.
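A minimal sketch of the single-row case, assuming ask_minute has exactly one row and data is a scalar timestamp as in the question's loop. Under those assumptions the positional selection belongs on the Series side. The sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical one-row frame standing in for ask_minute.
ask_minute = pd.DataFrame({"lastUpdated": pd.to_datetime(["2018-10-03 18:04:28"])})
data = pd.to_datetime("2018-10-03 18:04:28")

# .iloc[0] extracts the single Timestamp, so == compares two scalars.
last_updated = ask_minute["lastUpdated"].iloc[0]
is_match = data == last_updated

if is_match:
    print("timestamps match")
```

If ask_minute can hold more than one row, .iloc[0] silently ignores everything after the first, so the any/all approach from Method 1 is the safer choice.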
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.