How can I obtain the element-wise logical NOT of a pandas Series?

I have a pandas Series object containing boolean values. How can I get a series containing the logical NOT of each value?

For example, consider a series containing:

True
True
True
False

The series I’d like to get would contain:

False
False
False
True

This seems like it should be reasonably simple, but apparently I’ve misplaced my mojo =(

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

To invert a boolean Series, use ~s:

In [7]: s = pd.Series([True, True, False, True])

In [8]: ~s
Out[8]: 
0    False
1    False
2     True
3    False
dtype: bool

Using Python2.7, NumPy 1.8.0, Pandas 0.13.1:

In [119]: s = pd.Series([True, True, False, True]*10000)

In [10]:  %timeit np.invert(s)
10000 loops, best of 3: 91.8 µs per loop

In [11]: %timeit ~s
10000 loops, best of 3: 73.5 µs per loop

In [12]: %timeit (-s)
10000 loops, best of 3: 73.5 µs per loop

As of Pandas 0.13.0, Series are no longer subclasses of numpy.ndarray; they are now subclasses of pd.NDFrame. This might have something to do with why np.invert(s) is no longer as fast as ~s or -s.

Caveat: timeit results may vary depending on many factors including hardware, compiler, OS, Python, NumPy and Pandas versions.

Method 2

@unutbu’s answer is spot on, just wanted to add a warning that your mask needs to be dtype bool, not ‘object’. Ie your mask can’t have ever had any nan’s. See here – even if your mask is nan-free now, it will remain ‘object’ type.

The inverse of an ‘object’ series won’t throw an error, instead you’ll get a garbage mask of ints that won’t work as you expect.

In[1]: df = pd.DataFrame({'A':[True, False, np.nan], 'B':[True, False, True]})
In[2]: df.dropna(inplace=True)
In[3]: df['A']
Out[3]:
0    True
1   False
Name: A, dtype object
In[4]: ~df['A']
Out[4]:
0   -2
0   -1
Name: A, dtype object

After speaking with colleagues about this one I have an explanation: It looks like pandas is reverting to the bitwise operator:

In [1]: ~True
Out[1]: -2

As @geher says, you can convert it to bool with astype before you inverse with ~

~df['A'].astype(bool)
0    False
1     True
Name: A, dtype: bool
(~df['A']).astype(bool)
0    True
1    True
Name: A, dtype: bool

Method 3

I just give it a shot:

In [9]: s = Series([True, True, True, False])

In [10]: s
Out[10]: 
0     True
1     True
2     True
3    False

In [11]: -s
Out[11]: 
0    False
1    False
2    False
3     True

Method 4

You can also use numpy.invert:

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: s = pd.Series([True, True, False, True])

In [4]: np.invert(s)
Out[4]: 
0    False
1    False
2     True
3    False

EDIT: The difference in performance appears on Ubuntu 12.04, Python 2.7, NumPy 1.7.0 – doesn’t seem to exist using NumPy 1.6.2 though:

In [5]: %timeit (-s)
10000 loops, best of 3: 26.8 us per loop

In [6]: %timeit np.invert(s)
100000 loops, best of 3: 7.85 us per loop

In [7]: %timeit ~s
10000 loops, best of 3: 27.3 us per loop

Method 5

In support to the excellent answers here, and for future convenience, there may be a case where you want to flip the truth values in the columns and have other values remain the same (nan values for instance)

In[1]: series = pd.Series([True, np.nan, False, np.nan])
In[2]: series = series[series.notna()] #remove nan values
 
In[3]: series # without nan                                            
Out[3]: 
0     True
2    False
dtype: object

# Out[4] expected to be inverse of Out[3], pandas applies bitwise complement 
# operator instead as in `lambda x : (-1*x)-1`

In[4]: ~series
Out[4]: 
0    -2
2    -1
dtype: object

as a simple non-vectorized solution you can just, 1. check types2. inverse bools

In[1]: series = pd.Series([True, np.nan, False, np.nan])

In[2]: series = series.apply(lambda x : not x if x is bool else x)
Out[2]: 
Out[2]: 
0     True
1      NaN
2    False
3      NaN
dtype: object

Method 6

NumPy is slower because it casts the input to boolean values (so None and 0 becomes False and everything else becomes True).

import pandas as pd
import numpy as np
s = pd.Series([True, None, False, True])
np.logical_not(s)

gives you

0    False
1     True
2     True
3    False
dtype: object

whereas ~s would crash. In most cases tilde would be a safer choice than NumPy.

Pandas 0.25, NumPy 1.17


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x