pandas equivalent of np.where

np.where has the semantics of a vectorized if/else (similar to Apache Spark’s when/otherwise DataFrame method). I know that I can use np.where on pandas.Series, but pandas often defines its own API to use instead of raw numpy functions, which is usually more convenient with pd.Series/pd.DataFrame.

Sure enough, I found pandas.DataFrame.where. However, at first glance, it has completely different semantics. I could not find a way to rewrite the most basic example of np.where using pandas where:

# df is pd.DataFrame
# how to write this using df.where?
df['C'] = np.where((df['A']<0) | (df['B']>0), df['A']+df['B'], df['A']/df['B'])

Am I missing something obvious? Or is pandas’ where intended for a completely different use case, despite same name as np.where?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Try:

(df['A'] + df['B']).where((df['A'] < 0) | (df['B'] > 0), df['A'] / df['B'])

The difference between the numpy where and DataFrame where is that the default values are supplied by the DataFrame that the where method is being called on (docs).

I.e.

np.where(m, A, B)

is roughly equivalent to

A.where(m, B)

If you wanted a similar call signature using pandas, you could take advantage of the way method calls work in Python:

pd.DataFrame.where(cond=(df['A'] < 0) | (df['B'] > 0), self=df['A'] + df['B'], other=df['A'] / df['B'])

or without kwargs (Note: that the positional order of arguments is different from the numpy where argument order):

pd.DataFrame.where(df['A'] + df['B'], (df['A'] < 0) | (df['B'] > 0), df['A'] / df['B'])

Method 2

I prefer using pandas’ mask over where since it is less counterintuitive (at least for me).

(df['A']/df['B']).mask(df['A']<0) | (df['B']>0), df['A']+df['B'])

Here, column A and B are added where the condition holds, otherwise their ratio stays untouched.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x