How do I add multiple empty columns to a DataFrame from a list?
I can do:
df["B"] = None
df["C"] = None
df["D"] = None
But I can’t do:
df[["B", "C", "D"]] = None
KeyError: "['B' 'C' 'D'] not in index"
Answers:
Method 1
You could use df.reindex to add new columns:
In [18]: df = pd.DataFrame(np.random.randint(10, size=(5,1)), columns=['A'])
In [19]: df
Out[19]:
A
0 4
1 7
2 0
3 7
4 6
In [20]: df.reindex(columns=list('ABCD'))
Out[20]:
A B C D
0 4 NaN NaN NaN
1 7 NaN NaN NaN
2 0 NaN NaN NaN
3 7 NaN NaN NaN
4 6 NaN NaN NaN
reindex will return a new DataFrame, with columns appearing in the order they are listed:
In [31]: df.reindex(columns=list('DCBA'))
Out[31]:
D C B A
0 NaN NaN NaN 4
1 NaN NaN NaN 7
2 NaN NaN NaN 0
3 NaN NaN NaN 7
4 NaN NaN NaN 6
The reindex method has a fill_value parameter as well:
In [22]: df.reindex(columns=list('ABCD'), fill_value=0)
Out[22]:
A B C D
0 4 0 0 0
1 7 0 0 0
2 0 0 0 0
3 7 0 0 0
4 6 0 0 0
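Because reindex returns a new DataFrame rather than modifying df, the result has to be assigned back if the new columns should persist. A minimal sketch of that point, assuming a small hand-written frame (the values and the new_cols list are illustrative and not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 7, 0]})
new_cols = ["B", "C", "D"]  # hypothetical list of columns to add

# Calling reindex without assigning the result leaves df untouched
df.reindex(columns=["A", *new_cols])
unchanged_columns = list(df.columns)

# Assign the result back to keep the new (all-NaN) columns
df = df.reindex(columns=["A", *new_cols])
final_columns = list(df.columns)
```

After the first call df still has only column A. The new columns appear only after the reassignment.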
Method 2
I’d concat using a DataFrame:
In [23]:
df = pd.DataFrame(columns=['A'])
df
Out[23]:
Empty DataFrame
Columns: [A]
Index: []
In [24]:
pd.concat([df,pd.DataFrame(columns=list('BCD'))])
Out[24]:
Empty DataFrame
Columns: [A, B, C, D]
Index: []
Passing concat a list containing your original df and a new, empty DataFrame with the columns you wish to add returns a new df with the additional columns.
Caveat: See the discussion of performance in the other answers and/or the comment discussions. reindex may be preferable where performance is critical.
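For a DataFrame that already has rows, one possible variation on the concat idea is to build the empty frame on the same index and concatenate along axis=1. This is a sketch under that assumption, not the exact call shown in the answer. The sample values and the name empty are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 7, 0]})

# All-NaN frame that shares df's index, so the rows line up
empty = pd.DataFrame(columns=list("BCD"), index=df.index)

# axis=1 places the new columns next to the existing ones
result = pd.concat([df, empty], axis=1)
```

As with the version above, concat returns a new DataFrame and leaves df itself unchanged.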
Method 3
If you don’t want to retype the names of the existing columns, you can unpack them inside reindex:
df.reindex(columns=[*df.columns.tolist(), 'new_column1', 'new_column2'], fill_value=0)
Full example:
In [1]: df = pd.DataFrame(np.random.randint(10, size=(3,1)), columns=['A'])
In [2]: df
Out[2]:
A
0 4
1 7
2 0
In [3]: df.reindex(columns=[*df.columns.tolist(), 'col1', 'col2'], fill_value=0)
Out[3]:
A col1 col2
0 4 0 0
1 7 0 0
2 0 0 0
And if you already have a list with the column names:
In [4]: my_cols_list = ['col1', 'col2']
In [5]: df.reindex(columns=[*df.columns.tolist(), *my_cols_list], fill_value=0)
Out[5]:
A col1 col2
0 4 0 0
1 7 0 0
2 0 0 0
Method 4
Summary of alternative solutions:
columns_add = ['a', 'b', 'c']
- for loop:
for newcol in columns_add:
    df[newcol] = None
- dict method:
df.assign(**dict([(_, None) for _ in columns_add]))
- tuple assignment:
df['a'], df['b'], df['c'] = None, None, None
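One difference between the alternatives above is worth spelling out: the for loop and the tuple assignment modify df in place, while assign returns a new DataFrame. A small sketch of that difference, assuming a toy frame (df_loop, df_orig and df_assigned are illustrative names, and a dict comprehension stands in for the dict(...) call):

```python
import pandas as pd

columns_add = ["a", "b", "c"]

# for loop: modifies the frame in place
df_loop = pd.DataFrame({"A": [1, 2, 3]})
for newcol in columns_add:
    df_loop[newcol] = None

# assign: returns a new frame, the original is left untouched
df_orig = pd.DataFrame({"A": [1, 2, 3]})
df_assigned = df_orig.assign(**{col: None for col in columns_add})
```

With assign, the result has to be bound to a name (or back to df) for the new columns to be kept.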
Method 5
Why not just use a loop:
for newcol in ['B','C','D']:
    df[newcol] = np.nan
Method 6
You can make use of Pandas broadcasting:
df = pd.DataFrame({'A': [1, 1, 1]})
df[['B', 'C']] = 2, 3
# df[['B', 'C']] = [2, 3]
Result:
A B C
0 1 2 3
1 1 2 3
2 1 2 3
To add empty columns:
df[['B', 'C', 'D']] = 3 * [np.nan]
Result:
A B C D
0 1 NaN NaN NaN
1 1 NaN NaN NaN
2 1 NaN NaN NaN
Method 7
I’d use
df["B"], df["C"], df["D"] = None, None, None
or
df["B"], df["C"], df["D"] = [None for _ in range(3)]
Method 8
Just to add to the list of funny ways:
columns_add = ['a', 'b', 'c']
df = df.assign(**dict(zip(columns_add, [0] * len(columns_add))))
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.