Rename MultiIndex columns in Pandas

df = pd.DataFrame([[1,2,3], [10,20,30], [100,200,300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))
df

returns

     a         d
     b    c    f
0    1    2    3
1   10   20   30
2  100  200  300

and

df.columns.levels[1]

returns

Index([u'b', u'c', u'f'], dtype='object')

I want to rename "f" to "e". Following the documentation for pandas.MultiIndex.rename, I run:

df.columns.rename(["b1", "c1", "f1"], level=1)

But it raises

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-110-b171a2b5706c> in <module>()
----> 1 df.columns.rename(["b1", "c1", "f1"], level=1)

C:\Users\USERNAME\AppData\Local\Continuum\Miniconda2\lib\site-packages\pandas\indexes\base.pyc in set_names(self, names, level, inplace)
    994         if level is not None and not is_list_like(level) and is_list_like(
    995                 names):
--> 996             raise TypeError("Names must be a string")
    997 
    998         if not is_list_like(names) and level is None and self.nlevels > 1:

TypeError: Names must be a string

I use Python 2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)] and pandas 0.19.1.

Answers:

Method 1

Use set_levels:

In [22]:
df.columns = df.columns.set_levels(['b1','c1','f1'], level=1)
df

Out[22]:
     a         d
    b1   c1   f1
0    1    2    3
1   10   20   30
2  100  200  300

rename sets the name of an index level; it does not rename the column labels:

In [26]:
df.columns = df.columns.rename("b1", level=1)
df

Out[26]:
      a         d
b1    b    c    f
0     1    2    3
1    10   20   30
2   100  200  300

This is why you get the error: with a scalar level, rename expects a single name (a string), not a list of names.
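To make the distinction concrete, here is a minimal sketch that rebuilds the frame from the question and compares the two calls. The variable names are only illustrative, and the result of set_levels is kept in a new variable rather than modified in place.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("d", "f")])

# rename changes the *name* of a level; the labels stay as they were.
named = df.columns.rename("b1", level=1)
names_after_rename = list(named.names)
labels_after_rename = list(named.levels[1])

# set_levels replaces the labels of a level and returns a new MultiIndex.
relabeled = df.columns.set_levels(["b1", "c1", "f1"], level=1)
labels_after_set_levels = list(relabeled)
```

After running this, names_after_rename holds the level names and labels_after_set_levels holds the relabeled column tuples, so the two effects can be inspected side by side.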

Method 2

In pandas 0.21.0+ use parameter level=1:

d = dict(zip(df.columns.levels[1], ["b1", "c1", "f1"]))
print (d)
{'c': 'c1', 'b': 'b1', 'f': 'f1'}

df = df.rename(columns=d, level=1)
print (df)
     a         d
    b1   c1   f1
0    1    2    3
1   10   20   30
2  100  200  300
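For the case in the question (renaming only "f" to "e"), the mapping does not need to cover every label. The following is a small sketch under that assumption; labels missing from the mapping should simply be left alone.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("d", "f")])

# Only labels on level 1 are looked up in the mapping;
# "b" and "c" are not in it, so they are kept unchanged.
renamed = df.rename(columns={"f": "e"}, level=1)
new_columns = list(renamed.columns)
```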

Method 3

You can use pandas.DataFrame.rename() directly.

Say you have the following dataframe

print(df)

     a         d
     b    c    f
0    1    2    3
1   10   20   30
2  100  200  300

df = df.rename(columns={'f': 'f1', 'd': 'd1'})
print(df)

     a        d1
     b    c   f1
0    1    2    3
1   10   20   30
2  100  200  300

As you can see, the mapper is applied to the labels of every level; it does not care which level a label belongs to.
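A quick way to see this is a frame where the same label appears on both levels. The sketch below uses a made-up set of columns (not from the question) and compares a plain rename with one restricted by the level argument.

```python
import pandas as pd

# Hypothetical columns in which "f" occurs on level 0 and on level 1.
columns = pd.MultiIndex.from_tuples([("a", "b"), ("a", "f"), ("f", "f")])
df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)

# Without level, every "f" is replaced, whichever level it sits on.
all_levels = list(df.rename(columns={"f": "x"}).columns)

# With level=0, only the top-level "f" is replaced.
top_only = list(df.rename(columns={"f": "x"}, level=0).columns)
```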

Say you have the following dataframe

     a         d
     b    f    f
0    1    2    3
1   10   20   30
2  100  200  300

If you want to rename the f under a, you can do

df.columns = df.columns.values
df.columns = pd.MultiIndex.from_tuples(df.rename(columns={('a', 'f'): ('a', 'af')}))
print(df)

     a         d
     b   af    f
0    1    2    3
1   10   20   30
2  100  200  300

Method 4

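There is also Index.set_names, which sets the names of the index levels rather than the labels inside them. It expects one name per level, so the call below assumes an index with three levels: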

df.index.set_names(["b1", "c1", "f1"], inplace=True)
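Applied to the columns of the frame from the question, which have two levels, a sketch could look like this. The level names "upper" and "lower" are invented for illustration. Note that the column labels themselves are not changed by this call.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("d", "f")])

# One name per level: these columns have two levels.
# "upper" and "lower" are hypothetical level names.
df.columns = df.columns.set_names(["upper", "lower"])

level_names = list(df.columns.names)
column_labels = list(df.columns)
```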

Method 5

Another thing you can’t do is df.rename(columns={('d', 'f'): ('e', 'g')}), even though it seems correct. In other words: .rename() does not do what one expects, <…>

— Lukas at comment
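The behaviour described in the comment can be checked with a short sketch. It assumes the default errors="ignore" of DataFrame.rename, under which unmatched keys are skipped silently: the tuple key is compared against the individual labels of each level, never against a whole column tuple, so nothing matches.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("d", "f")])

# The key ("d", "f") is never equal to a single label such as "d" or "f",
# so the rename finds nothing to replace and the columns stay the same.
attempt = df.rename(columns={("d", "f"): ("e", "g")})
columns_after = list(attempt.columns)
```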

The “hacky” way is something like this (as of pandas 1.0.5):

def rename_columns(df, columns, inplace=False):
    """Rename dataframe columns.

    Parameters
    ----------
    df : pandas.DataFrame
        Dataframe.
    columns : dict-like
        Mapping from old column labels to new ones. If `df.columns` is a
        :obj:`pandas.MultiIndex`, pass tuples with one entry per level.
    inplace : bool, default False
        If True, modify `df` in place and return ``None``.

    Returns
    -------
    pandas.DataFrame or None
        Dataframe with modified columns, or ``None`` if `inplace` is True.
    
    Examples
    --------
    >>> columns = pd.Index([1, 2, 3])
    >>> df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)
    ...     1   2   3
    ... 0   1   2   3
    ... 1  10  20  30
    >>> rename_columns(df, columns={1 : 10})
    ...    10   2   3
    ... 0   1   2   3
    ... 1  10  20  30
    
    MultiIndex
    
    >>> columns = pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1"), ("A2", "B2", "")])
    >>> df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)
    >>> df
    ...    A0  A1  A2
    ...    B0  B1  B2
    ...    C0  C1
    ... 0   1   2   3
    ... 1  10  20  30
    >>> rename_columns(df, columns={("A2", "B2", "") : ("A3", "B3", "")})
    ...    A0  A1  A3
    ...    B0  B1  B3
    ...    C0  C1
    ... 0   1   2   3
    ... 1  10  20  30
    """
    columns_new = []
    for col in df.columns.values:
        if col in columns:
            columns_new.append(columns[col])
        else:
            columns_new.append(col)
    columns_new = pd.Index(columns_new, tupleize_cols=True)

    if inplace:
        df.columns = columns_new
    else:
        df_new = df.copy()
        df_new.columns = columns_new
        return df_new

So just

>>> df = pd.DataFrame([[1,2,3], [10,20,30], [100,200,300]])
>>> df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))
>>> rename_columns(df, columns={('d', 'f'): ('e', 'g')})
...      a         e
...      b    c    g
... 0    1    2    3
... 1   10   20   30
... 2  100  200  300

What does the pandas-team think about this? Why is this behavior not default?

Method 6

Using dicts to rename tuples

Since multi-index stores values as tuples, and python dicts accept tuples as keys and values, we can replace them using a dict.

mapping_dict = {("d","f"):("d","e")}

# Dictionary allows using tuples as keys and values
def rename_tuple(tuple_, dict_):
    """Return the replacement for tuple_ if dict_ has one, else tuple_ unchanged."""
    return dict_.get(tuple_, tuple_)

# Rename chosen elements from list of tuples from df.columns
altered_index_list = [rename_tuple(tuple_,mapping_dict) for tuple_ in df.columns.to_list()]

# Update columns with new renamed columns
df.columns = pd.Index(altered_index_list)

Which returns the intended df

     a         d
     b    c    e
0    1    2    3
1   10   20   30
2  100  200  300

Aggregating in a function

This could then be aggregated in a function to simplify things

def rename_multi_index(index,mapper):
    """Renames pandas multi_index"""
    return pd.Index([rename_tuple(tuple_,mapper) for tuple_ in index])

# And now simply do
df.columns = rename_multi_index(df.columns,mapping_dict)
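One thing to watch for: building the new index from a plain list of tuples does not carry over any level names, since none are passed. If the columns have named levels, a variant along the following lines may be preferable. The function name rename_multi_index_keep_names and the level names "upper" and "lower" are made up for this sketch.

```python
import pandas as pd

def rename_multi_index_keep_names(index, mapper):
    """Replace whole column tuples found in mapper, keeping the level names."""
    new_tuples = [mapper.get(tuple_, tuple_) for tuple_ in index]
    return pd.MultiIndex.from_tuples(new_tuples, names=index.names)

# Hypothetical frame whose column levels are named.
columns = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c"), ("d", "f")], names=["upper", "lower"]
)
df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)

df.columns = rename_multi_index_keep_names(df.columns, {("d", "f"): ("d", "e")})
```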


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
