How to filter in NaN (pandas)?

I have a pandas dataframe (df), and I want to do something like:

newdf = df[(df.var1 == 'a') & (df.var2 == NaN)]

I’ve tried replacing NaN with np.NaN, or 'NaN' or 'nan' etc, but nothing evaluates to True. There’s no pd.NaN.

I can use df.fillna(np.nan) before evaluating the above expression but that feels hackish and I wonder if it will interfere with other pandas operations that rely on being able to identify pandas-format NaN’s later.

I get the feeling there should be an easy answer to this question, but somehow it has eluded me. Any advice is appreciated. Thank you.

Answers:


Method 1

Simplest of all solutions:

filtered_df = df[df['var2'].isnull()]

This returns only the rows whose value in the 'var2' column is NaN.

Method 2

The comparison df.var2 == NaN doesn’t work because NaN isn’t equal to anything, including itself. Use pd.isnull(df.var2) instead.
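
As a minimal sketch of how this applies to the filter in the question: the sample data below is invented for illustration, and only the column names var1 and var2 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "var1": ["a", "a", "b", "a"],
    "var2": [1.0, np.nan, np.nan, 4.0],
})

# Comparing with == never matches, because NaN is not equal to NaN.
never_matches = df[(df.var1 == "a") & (df.var2 == np.nan)]

# pd.isnull builds the boolean mask the comparison was meant to produce.
newdf = df[(df.var1 == "a") & pd.isnull(df.var2)]
```

Here never_matches is empty, while newdf keeps the single row where var1 is 'a' and var2 is NaN.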

Method 3

df[df['var'].isna()]

where

df  : The DataFrame
var : The Column Name

Method 4

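Pandas uses numpy’s NaN value. Use numpy.isnan to obtain a Boolean vector from a pandas Series. Note that numpy.isnan only accepts numeric data; on an object (e.g. string) column it raises a TypeError, so isnull() or isna() is the safer choice there.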
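
A short sketch of the numpy.isnan approach, using made-up data. It also shows the case where numpy.isnan is not applicable (an object-dtype column), which is an addition for illustration and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column containing one NaN.
df = pd.DataFrame({"var2": [1.0, np.nan, 3.0]})

# np.isnan applied to a float Series returns a boolean Series.
mask = np.isnan(df["var2"])
filtered_df = df[mask]

# On an object-dtype column np.isnan raises TypeError; isna() still works.
mixed = pd.Series(["x", np.nan, "z"], dtype=object)
try:
    np.isnan(mixed)
    isnan_failed = False
except TypeError:
    isnan_failed = True
object_mask = mixed.isna()
```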

Method 5

You can also use query here:

df.query('var2 != var2')

This works because np.nan != np.nan evaluates to True, so only the rows where var2 is NaN satisfy the condition.
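
A minimal sketch of the query approach, including one way to combine it with the var1 condition from the question. The sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "var1": ["a", "a", "b"],
    "var2": [np.nan, 2.0, np.nan],
})

# NaN is the only value that is unequal to itself, so this keeps NaN rows.
nan_rows = df.query("var2 != var2")

# Combined with the other condition from the question.
newdf = df.query("var1 == 'a' and var2 != var2")
```

The self-comparison trick is compact but easy to misread, so a short comment next to it is probably worthwhile in real code.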


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.
