df = pd.DataFrame([[1,2,3], [10,20,30], [100,200,300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))
df
returns
     a         d
     b    c    f
0    1    2    3
1   10   20   30
2  100  200  300
and
df.columns.levels[1]
returns
Index([u'b', u'c', u'f'], dtype='object')
I want to rename "f" to "e". Following the documentation for pandas.MultiIndex.rename, I ran:
df.columns.rename(["b1", "c1", "f1"], level=1)
But it raises
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-110-b171a2b5706c> in <module>()
----> 1 df.columns.rename(["b1", "c1", "f1"], level=1)

C:\Users\USERNAME\AppData\Local\Continuum\Miniconda2\lib\site-packages\pandas\indexes\base.pyc in set_names(self, names, level, inplace)
    994         if level is not None and not is_list_like(level) and is_list_like(
    995                 names):
--> 996             raise TypeError("Names must be a string")
    997
    998         if not is_list_like(names) and level is None and self.nlevels > 1:

TypeError: Names must be a string
I use Python 2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)] and pandas 0.19.1.
Answers:
Method 1
Use set_levels:
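In [22]:
# inplace=True was removed from set_levels in pandas 2.0; assigning the result works in every version
df.columns = df.columns.set_levels(['b1','c1','f1'], level=1)
df
Out[22]:
     a         d
    b1   c1   f1
0    1    2    3
1   10   20   30
2  100  200  300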
rename sets the name of an index level; it does not rename the column labels:
In [26]:
df.columns = df.columns.rename("b1", level=1)
df
Out[26]:
      a         d
b1    b    c    f
0     1    2    3
1    10   20   30
2   100  200  300
This is why you get the error: with a single level, rename expects one string (the new level name), not a list of labels.
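To make the distinction concrete, here is a minimal sketch on the question's frame (the level name "lower" is just a placeholder I picked). rename, which on a MultiIndex behaves like set_names, only touches the level name, while set_levels replaces the labels:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))

# rename changes the *name* of level 1; "lower" is an illustrative placeholder.
named = df.columns.rename("lower", level=1)
names_after_rename = list(named.names)        # [None, 'lower']
labels_after_rename = list(named.levels[1])   # ['b', 'c', 'f'] (labels untouched)

# set_levels replaces the labels of level 1 and returns a new MultiIndex.
relabelled = df.columns.set_levels(["b1", "c1", "f1"], level=1)
labels_after_set_levels = list(relabelled.levels[1])  # ['b1', 'c1', 'f1']
```

Neither call modifies df here; assign the result back to df.columns to keep it.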
Method 2
In pandas 0.21.0+ use parameter level=1:
d = dict(zip(df.columns.levels[1], ["b1", "c1", "f1"]))
print (d)
{'c': 'c1', 'b': 'b1', 'f': 'f1'}
df = df.rename(columns=d, level=1)
print (df)
     a         d
    b1   c1   f1
0    1    2    3
1   10   20   30
2  100  200  300
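If only "f" needs to become "e", as in the question, the mapping can be written by hand instead of zipping all level values. A small sketch, assuming pandas 0.21.0 or later:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))

# Only labels on level 1 are looked up in the mapping; level 0 is left alone.
renamed = df.rename(columns={"f": "e"}, level=1)
renamed_columns = list(renamed.columns)
```

Labels that are not keys of the mapping ("b" and "c" here) stay as they are.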
Method 3
You can use pandas.DataFrame.rename() directly
Say you have the following dataframe
print(df)
     a         d
     b    c    f
0    1    2    3
1   10   20   30
2  100  200  300
df = df.rename(columns={'f': 'f1', 'd': 'd1'})
print(df)
     a        d1
     b    c   f1
0    1    2    3
1   10   20   30
2  100  200  300
As you can see, the column-name mapper is applied to every level; it is not tied to a particular one.
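A quick sketch of what that means in practice. The frame below is a made-up variation in which the label "a" occurs on both levels; without level the mapper is applied to every level, while level=0 (pandas 0.21.0+) restricts it:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
# Hypothetical columns: "a" appears on level 0 and on level 1.
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "a"), ("d", "f")))

# No level given: every occurrence of "a" is renamed, whatever its level.
everywhere = list(df.rename(columns={"a": "x"}).columns)

# level=0: only the top level is looked up in the mapping.
only_top = list(df.rename(columns={"a": "x"}, level=0).columns)
```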
Say you have the following dataframe
     a         d
     b    f    f
0    1    2    3
1   10   20   30
2  100  200  300
If you want to rename the f under a, you can do
df.columns = df.columns.values
df.columns = pd.MultiIndex.from_tuples(df.rename(columns={('a', 'f'): ('a', 'af')}))
print(df)
     a         d
     b   af    f
0    1    2    3
1   10   20   30
2  100  200  300
Method 4
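There is also Index.set_names which, like rename, sets the names of the levels rather than the labels (the two names below are only illustrative):
df.columns.set_names(["upper", "lower"], inplace=True)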
Method 5
Another thing you can’t do is
df.rename(columns={('d', 'f'): ('e', 'g')}), even though it seems correct. In other words: .rename() does not do what one expects, <…> (Lukas, in a comment)
A "hacky" workaround is something like this (as of pandas 1.0.5):
def rename_columns(df, columns, inplace=False):
    """Rename dataframe columns.

    Parameters
    ----------
    df : pandas.DataFrame
        Dataframe.
    columns : dict-like
        Mapping from old to new column labels. If `df.columns` is a
        :obj:`pandas.MultiIndex` with several levels, pass equal-size tuples.
    inplace : bool, default False
        If True, modify `df` in place and return ``None``.

    Returns
    -------
    pandas.DataFrame or None
        Returns dataframe with modified columns or ``None`` (depends on `inplace` parameter value).

    Examples
    --------
    >>> columns = pd.Index([1, 2, 3])
    >>> df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)
    >>> df
    ...     1   2   3
    ... 0   1   2   3
    ... 1  10  20  30
    >>> rename_columns(df, columns={1 : 10})
    ...    10   2   3
    ... 0   1   2   3
    ... 1  10  20  30

    MultiIndex

    >>> columns = pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1"), ("A2", "B2", "")])
    >>> df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)
    >>> df
    ...    A0  A1  A2
    ...    B0  B1  B2
    ...    C0  C1
    ... 0   1   2   3
    ... 1  10  20  30
    >>> rename_columns(df, columns={("A2", "B2", "") : ("A3", "B3", "")})
    ...    A0  A1  A3
    ...    B0  B1  B3
    ...    C0  C1
    ... 0   1   2   3
    ... 1  10  20  30
    """
    # Look up each column label (a tuple for a MultiIndex) in the mapping.
    columns_new = []
    for col in df.columns.values:
        if col in columns:
            columns_new.append(columns[col])
        else:
            columns_new.append(col)
    # tupleize_cols=True turns a list of tuples back into a MultiIndex.
    columns_new = pd.Index(columns_new, tupleize_cols=True)

    if inplace:
        df.columns = columns_new
    else:
        df_new = df.copy()
        df_new.columns = columns_new
        return df_new
So just
>>> df = pd.DataFrame([[1,2,3], [10,20,30], [100,200,300]])
>>> df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))
>>> rename_columns(df, columns={('d', 'f'): ('e', 'g')})
...      a         e
...      b    c    g
... 0    1    2    3
... 1   10   20   30
... 2  100  200  300
What does the pandas-team think about this? Why is this behavior not default?
Method 6
Using dicts to rename tuples
Since a MultiIndex exposes its entries as tuples, and Python dicts accept tuples as keys and values, we can replace them using a dict.
mapping_dict = {("d","f"):("d","e")}
# Dictionary allows using tuples as keys and values

def rename_tuple(tuple_, dict_):
    """Replaces tuple if present in tuple dict"""
    if tuple_ in dict_.keys():
        return dict_[tuple_]
    return tuple_

# Rename chosen elements from list of tuples from df.columns
altered_index_list = [rename_tuple(tuple_,mapping_dict) for tuple_ in df.columns.to_list()]
# Update columns with new renamed columns
df.columns = pd.Index(altered_index_list)
This returns the intended df:
     a         d
     b    c    e
0    1    2    3
1   10   20   30
2  100  200  300
Aggregating in a function
This can then be wrapped in a function to simplify things:
def rename_multi_index(index,mapper):
    """Renames pandas multi_index"""
    return pd.Index([rename_tuple(tuple_,mapper) for tuple_ in index])

# And now simply do
df.columns = rename_multi_index(df.columns,mapping_dict)
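One caveat: pd.Index(...) built from plain tuples does not carry over level names, if the columns have any. A possible variation, as a sketch only (rename_column_tuples and the level names "upper" and "lower" are made up for illustration), rebuilds the MultiIndex explicitly and passes the names along:

```python
import pandas as pd

def rename_column_tuples(columns, mapper):
    """Hypothetical helper: swap whole column tuples, keeping level names."""
    # Iterating a MultiIndex yields one tuple per column.
    new_tuples = [mapper.get(col, col) for col in columns]
    return pd.MultiIndex.from_tuples(new_tuples, names=columns.names)

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples(
    (("a", "b"), ("a", "c"), ("d", "f")),
    names=["upper", "lower"],  # illustrative level names
)

df.columns = rename_column_tuples(df.columns, {("d", "f"): ("d", "e")})
```

Tuples that are not keys of the mapping pass through unchanged, so the mapping only needs to list the columns being renamed.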
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.