I’m trying to do boolean indexing with a couple of conditions in pandas. My original DataFrame is called df. If I run the code below, I get the expected result:
temp = df[df["bin"] == 3]
temp = temp[(~temp["Def"])]
temp = temp[temp["days since"] > 7]
temp.head()
However, if I do this (which I think should be equivalent), I get no rows back:
temp2 = df[df["bin"] == 3]
temp2 = temp2[~temp2["Def"] & temp2["days since"] > 7]
temp2.head()
Any idea what accounts for the difference?
Answers:
Method 1
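Use parentheses around each comparison, because & has higher precedence than > in Python. Without them, ~temp2["Def"] & temp2["days since"] > 7 is evaluated as (~temp2["Def"] & temp2["days since"]) > 7: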
temp2 = df[~df["Def"] & (df["days since"] > 7) & (df["bin"] == 3)]
Alternatively, create the conditions on separate lines:
cond1 = df["bin"] == 3
cond2 = df["days since"] > 7
cond3 = ~df["Def"]
temp2 = df[cond1 & cond2 & cond3]
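To see why the unparenthesized version returns no rows, here is a minimal sketch that builds the masks step by step on a small illustrative frame. The names not_def, days, unparenthesized, explicit_grouping, and intended are assumptions made for this example, not part of the original question.

```python
import pandas as pd

# Small illustrative frame (values chosen for this sketch).
df = pd.DataFrame({
    "Def": [True, True, False, False, False, False],
    "days since": [7, 8, 9, 14, 2, 13],
    "bin": [1, 3, 5, 3, 3, 3],
})

not_def = ~df["Def"]
days = df["days since"]

# Without parentheses, Python applies & before the > comparison ...
unparenthesized = not_def & days > 7
# ... so the expression is the same as this explicit grouping.
explicit_grouping = (not_def & days) > 7

# The intended mask compares first, then combines two boolean Series.
intended = not_def & (days > 7)

print(unparenthesized.equals(explicit_grouping))
print(df[intended])
```

Because the comparison with 7 is applied to the result of the & rather than to the "days since" column, the unparenthesized mask selects nothing on this data, while the intended mask keeps the rows where Def is False and "days since" exceeds 7.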
Sample:
import pandas as pd

df = pd.DataFrame({'Def': [True] * 2 + [False] * 4,
                   'days since': [7, 8, 9, 14, 2, 13],
                   'bin': [1, 3, 5, 3, 3, 3]})
print(df)
     Def  bin  days since
0   True    1           7
1   True    3           8
2  False    5           9
3  False    3          14
4  False    3           2
5  False    3          13
temp2 = df[~df["Def"] & (df["days since"] > 7) & (df["bin"] == 3)]
print(temp2)
     Def  bin  days since
3  False    3          14
5  False    3          13
Method 2
OR: combine the conditions with |, wrapping each comparison in parentheses:
df_train[(df_train["fold"]==1) | (df_train["fold"]==2)]
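AND: combine the conditions with &. (This particular example can never match a row, since fold cannot equal both 1 and 2; it only illustrates the syntax.)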
df_train[(df_train["fold"]==1) & (df_train["fold"]==2)]
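When the OR conditions all test the same column for equality, Series.isin is a different technique that should give the same selection with less punctuation. The sketch below compares the two on a made-up df_train; the frame contents and the names via_or and via_isin are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data; the original df_train is not shown in the answer.
df_train = pd.DataFrame({
    "fold": [0, 1, 2, 3, 1],
    "value": [10, 20, 30, 40, 50],
})

# OR of two equality tests on the same column.
via_or = df_train[(df_train["fold"] == 1) | (df_train["fold"] == 2)]

# Same selection expressed as a membership test.
via_isin = df_train[df_train["fold"].isin([1, 2])]

print(via_or.equals(via_isin))
```

The isin form only applies to equality tests on one column; conditions that mix columns or use other comparisons still need the explicit | and & forms.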
Method 3
Alternatively, you can use the query method. Column names that contain spaces must be wrapped in backticks (supported since pandas 0.25):
df.query('not Def & (`days since` > 7) & (bin == 3)')
Method 4
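If you want to mix AND and OR conditions, remember that & binds more tightly than |, so the expression below selects rows that are (NSW and route 2) or (VIC and route 3):
Del_Det_5k_top_10 = Del_Det[(Del_Det['State'] == 'NSW') & (Del_Det['route'] == 2) |
                            (Del_Det['State'] == 'VIC') & (Del_Det['route'] == 3)]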
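One way to make the grouping in a mixed AND/OR filter explicit is to name each AND group before combining them with |. This is only a sketch: the sample rows and the names is_nsw_route_2, is_vic_route_3, and selected are hypothetical, since the original Del_Det frame is not shown.

```python
import pandas as pd

# Hypothetical sample rows standing in for the original Del_Det frame.
Del_Det = pd.DataFrame({
    "State": ["NSW", "NSW", "VIC", "VIC", "QLD"],
    "route": [2, 3, 3, 2, 2],
})

# Each AND group gets its own named mask.
is_nsw_route_2 = (Del_Det["State"] == "NSW") & (Del_Det["route"] == 2)
is_vic_route_3 = (Del_Det["State"] == "VIC") & (Del_Det["route"] == 3)

# Combine the groups with OR.
selected = Del_Det[is_nsw_route_2 | is_vic_route_3]
print(selected)
```

Naming the groups avoids relying on the reader remembering that & is evaluated before |, and each mask can be inspected on its own when a filter returns unexpected rows.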
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.