Can I insert a column at a specific column index in pandas?
import pandas as pd
df = pd.DataFrame({'l':['a','b','c','d'], 'v':[1,2,1,2]})
df['n'] = 0
This will put column n as the last column of df, but isn’t there a way to tell df to put n at the beginning?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
see docs: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.insert.html
using loc = 0 will insert at the beginning
df.insert(loc, column, value)
df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})
df
Out:
B C
0 1 4
1 2 5
2 3 6
idx = 0
new_col = [7, 8, 9] # can be a list, a Series, an array or a scalar
df.insert(loc=idx, column='A', value=new_col)
df
Out:
A B C
0 7 1 4
1 8 2 5
2 9 3 6
Method 2
If you want a single value for all rows:
df.insert(0,'name_of_column','') df['name_of_column'] = value
Edit:
You can also:
df.insert(0,'name_of_column',value)
Method 3
You could try to extract columns as list, massage this as you want, and reindex your dataframe:
>>> cols = df.columns.tolist() >>> cols = [cols[-1]]+cols[:-1] # or whatever change you need >>> df.reindex(columns=cols) n l v 0 0 a 1 1 0 b 2 2 0 c 1 3 0 d 2
EDIT: this can be done in one line ; however, this looks a bit ugly. Maybe some cleaner proposal may come…
>>> df.reindex(columns=['n']+df.columns[:-1].tolist()) n l v 0 0 a 1 1 0 b 2 2 0 c 1 3 0 d 2
Method 4
df.insert(loc, column_name, value)
This will work if there is no other column with the same name. If a column, with your provided name already exists in the dataframe, it will raise a ValueError.
You can pass an optional parameter allow_duplicates with True value to create a new column with already existing column name.
Here is an example:
>>> df = pd.DataFrame({'b': [1, 2], 'c': [3,4]})
>>> df
b c
0 1 3
1 2 4
>>> df.insert(0, 'a', -1)
>>> df
a b c
0 -1 1 3
1 -1 2 4
>>> df.insert(0, 'a', -2)
Traceback (most recent call last):
File "", line 1, in
File "C:Python39libsite-packagespandascoreframe.py", line 3760, in insert
self._mgr.insert(loc, column, value, allow_duplicates=allow_duplicates)
File "C:Python39libsite-packagespandascoreinternalsmanagers.py", line 1191, in insert
raise ValueError(f"cannot insert {item}, already exists")
ValueError: cannot insert a, already exists
>>> df.insert(0, 'a', -2, allow_duplicates = True)
>>> df
a a b c
0 -2 -1 1 3
1 -2 -1 2 4
Method 5
Here is a very simple answer to this(only one line).
You can do that after you added the ‘n’ column into your df as follows.
import pandas as pd
df = pd.DataFrame({'l':['a','b','c','d'], 'v':[1,2,1,2]})
df['n'] = 0
df
l v n
0 a 1 0
1 b 2 0
2 c 1 0
3 d 2 0
# here you can add the below code and it should work.
df = df[list('nlv')]
df
n l v
0 0 a 1
1 0 b 2
2 0 c 1
3 0 d 2
However, if you have words in your columns names instead of letters. It should include two brackets around your column names.
import pandas as pd
df = pd.DataFrame({'Upper':['a','b','c','d'], 'Lower':[1,2,1,2]})
df['Net'] = 0
df['Mid'] = 2
df['Zsore'] = 2
df
Upper Lower Net Mid Zsore
0 a 1 0 2 2
1 b 2 0 2 2
2 c 1 0 2 2
3 d 2 0 2 2
# here you can add below line and it should work
df = df[list(('Mid','Upper', 'Lower', 'Net','Zsore'))]
df
Mid Upper Lower Net Zsore
0 2 a 1 0 2
1 2 b 2 0 2
2 2 c 1 0 2
3 2 d 2 0 2
Method 6
A general 4-line routine
You can have the following 4-line routine whenever you want to create a new column and insert into a specific location loc.
df['new_column'] = ... #new column's definition col = df.columns.tolist() col.insert(loc, col.pop()) #loc is the column's index you want to insert into df = df[col]
In your example, it is simple:
df['n'] = 0 col = df.columns.tolist() col.insert(0, col.pop()) df = df[col]
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0