I have a dataframe that look like this:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(4,4), columns=list('abcd'))
df
a b c d
0 0.418762 0.042369 0.869203 0.972314
1 0.991058 0.510228 0.594784 0.534366
2 0.407472 0.259811 0.396664 0.894202
3 0.726168 0.139531 0.324932 0.906575
How I can get all columns except b?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
When the columns are not a MultiIndex, df.columns is just an array of column names so you can do:
df.loc[:, df.columns != 'b']
a c d
0 0.561196 0.013768 0.772827
1 0.882641 0.615396 0.075381
2 0.368824 0.651378 0.397203
3 0.788730 0.568099 0.869127
Method 2
Don’t use ix. It’s deprecated. The most readable and idiomatic way of doing this is df.drop():
>>> df
a b c d
0 0.175127 0.191051 0.382122 0.869242
1 0.414376 0.300502 0.554819 0.497524
2 0.142878 0.406830 0.314240 0.093132
3 0.337368 0.851783 0.933441 0.949598
>>> df.drop('b', axis=1)
a c d
0 0.175127 0.382122 0.869242
1 0.414376 0.554819 0.497524
2 0.142878 0.314240 0.093132
3 0.337368 0.933441 0.949598
Note that by default, .drop() does not operate inplace; despite the ominous name, df is unharmed by this process. If you want to permanently remove b from df, do df.drop('b', inplace=True).
df.drop() also accepts a list of labels, e.g. df.drop(['a', 'b'], axis=1) will drop column a and b.
Method 3
df[df.columns.difference(['b'])]
Out:
a c d
0 0.427809 0.459807 0.333869
1 0.678031 0.668346 0.645951
2 0.996573 0.673730 0.314911
3 0.786942 0.719665 0.330833
Method 4
You can use df.columns.isin()
df.loc[:, ~df.columns.isin(['b'])]
When you want to drop multiple columns, as simple as:
df.loc[:, ~df.columns.isin(['col1', 'col2'])]
Method 5
You can drop columns in index:
df[df.columns.drop('b')]
or
df.loc[:, df.columns.drop('b')]
Output:
a c d 0 0.418762 0.869203 0.972314 1 0.991058 0.594784 0.534366 2 0.407472 0.396664 0.894202 3 0.726168 0.324932 0.906575
Method 6
Here is another way:
df[[i for i in list(df.columns) if i != '<your column>']]
You just pass all columns to be shown except of the one you do not want.
Method 7
Here is a one line lambda:
df[map(lambda x :x not in ['b'], list(df.columns))]
before:
import pandas
import numpy as np
df = pd.DataFrame(np.random.rand(4,4), columns = list('abcd'))
df
a b c d
0 0.774951 0.079351 0.118437 0.735799
1 0.615547 0.203062 0.437672 0.912781
2 0.804140 0.708514 0.156943 0.104416
3 0.226051 0.641862 0.739839 0.434230
after:
df[map(lambda x :x not in ['b'], list(df.columns))]
a c d
0 0.774951 0.118437 0.735799
1 0.615547 0.437672 0.912781
2 0.804140 0.156943 0.104416
3 0.226051 0.739839 0.434230
Method 8
I think the best way to do is the way mentioned by @Salvador Dali. Not that the others are wrong.
Because when you have a data set where you just want to select one column and put it into one variable and the rest of the columns into another for comparison or computational purposes. Then dropping the column of the data set might not help. Of course there are use cases for that as well.
x_cols = [x for x in data.columns if x != 'name of column to be excluded']
Then you can put those collection of columns in variable x_cols into another variable like x_cols1 for other computation.
ex: x_cols1 = data[x_cols]
Method 9
Another slight modification to @Salvador Dali enables a list of columns to exclude:
df[[i for i in list(df.columns) if i not in [list_of_columns_to_exclude]]]
or
df.loc[:,[i for i in list(df.columns) if i not in [list_of_columns_to_exclude]]]
Method 10
I think a nice solution is with the function filter of pandas and regex (match everything except “b”):
df.filter(regex="^(?!b$)")
Method 11
Similar to @Toms answer, it is also possible to select all columns except “b” without using .loc, like so:
df[df.columns[~df.columns.isin(['b'])]]
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0