Unpivot Pandas Data

I currently have a DataFrame laid out as:

        Jan Feb Mar Apr ...
2001    1   12  12  19  
2002    9   ...
2003    ...

and I would like to “unpivot” the data to look like:

Date    Value
Jan 2001    1
Feb 2001    1
Mar 2001    12
...
Jan 2002    9

What is the best way to accomplish this using pandas/NumPy?

Contents hide

Answers:

Method 1

Method 2

Method 3

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

You just have to do df.unstack() and that will create a MultiIndexed Series with month as a first level and the year as the second level index. If you want them to be columns then just call reset_index() after that.

>>> df
      Jan  Feb
2001    3    4
2002    2    7
>>> df.unstack()
Jan  2001    3
     2002    2
Feb  2001    4
     2002    7
>>> df = df.unstack().reset_index(name='value')
>>> df
  level_0  level_1  value
0     Jan     2001      3
1     Jan     2002      2
2     Feb     2001      4
3     Feb     2002      7
>>> df.rename(columns={'level_0': 'month', 'level_1': 'year'}, inplace=True)
>>> df
  month  year  value
0   Jan  2001      3
1   Jan  2002      2
2   Feb  2001      4
3   Feb  2002      7

Method 2

Another solution would be to use pandas.melt to avoid unnecessary creation of a MultiIndex, though this isn’t that expensive if your frame is small and with my solution you still have to create a temporary for the “molten” data. The guts of melt suggest that both id_vars and value are copied since id_vars creation uses tile and value creation uses df.values.ravel('F') ~~which I believe makes a copy if your data are not in Fortran order~~.

EDIT: I’m not exactly sure when a copy is made when ravel is called since the order parameter only indicates how you want your data read and the docstring says a copy is made only when needed.

In [99]: mons
Out[99]:
['Jan',
 'Feb',
 'Mar',
 'Apr',
 'May',
 'Jun',
 'Jul',
 'Aug',
 'Sep',
 'Oct',
 'Nov',
 'Dec']

In [100]: df = DataFrame(randn(201, len(mons)), columns=mons, index=map(str, arange(1901, 2102)))

In [101]: df.head()
Out[101]:
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct  
1901  1.141 -0.270  0.329  0.214 -1.030  0.324 -1.448  2.003 -0.061  0.477
1902  0.136  0.151  0.447 -0.493  1.329  1.410  0.020 -0.705  0.870  0.478
1903 -0.000  0.689  1.768 -0.057 -1.471  0.515 -0.315  0.703  2.511  0.592
1904  1.199  1.246 -0.255  0.182 -0.454 -0.452  1.074  0.178  2.495 -0.543
1905  1.073  1.375 -1.837  1.048 -0.139 -0.273 -0.958 -1.164 -1.012  0.950

        Nov    Dec
1901  0.102  0.122
1902  2.941  0.654
1903  0.347 -1.636
1904 -0.047  0.457
1905  1.277 -0.284

In [102]: df.reset_index(inplace=True)

In [103]: df.head()
Out[103]:
  index    Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct  
0  1901  1.141 -0.270  0.329  0.214 -1.030  0.324 -1.448  2.003 -0.061  0.477
1  1902  0.136  0.151  0.447 -0.493  1.329  1.410  0.020 -0.705  0.870  0.478
2  1903 -0.000  0.689  1.768 -0.057 -1.471  0.515 -0.315  0.703  2.511  0.592
3  1904  1.199  1.246 -0.255  0.182 -0.454 -0.452  1.074  0.178  2.495 -0.543
4  1905  1.073  1.375 -1.837  1.048 -0.139 -0.273 -0.958 -1.164 -1.012  0.950

     Nov    Dec
0  0.102  0.122
1  2.941  0.654
2  0.347 -1.636
3 -0.047  0.457
4  1.277 -0.284

In [104]: res = pd.melt(df, id_vars=['index'], var_name=['months'])

In [105]: res['date'] = res['months'] + ' ' + res['index']

In [106]: res.head()
Out[106]:
  index months  value      date
0  1901    Jan  1.141  Jan 1901
1  1902    Jan  0.136  Jan 1902
2  1903    Jan -0.000  Jan 1903
3  1904    Jan  1.199  Jan 1904
4  1905    Jan  1.073  Jan 1905

Method 3

df.reset_index().melt(...) should unpivot it (credit):

>>> df = pd.DataFrame({
...     'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
...     'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
...     'baz': [1, 2, 3, 4, 5, 6],
... })
>>> df
   foo bar  baz
0  one   A    1
1  one   B    2
2  one   C    3
3  two   A    4
4  two   B    5
5  two   C    6

>>> df_piv = df.pivot(index='foo', columns='bar', values='baz')
>>> df_piv
bar  A   B   C
foo
one  1   2   3
two  4   5   6

>>> def_piv_unpiv = df_piv.reset_index().melt(id_vars='foo', var_name='bar', value_name='baz')
>>> def_piv_unpiv
   foo bar  baz
0  one   A    1
1  two   A    4
2  one   B    2
3  two   B    5
4  one   C    3
5  two   C    6

Edit: I later realized this is basically a more concise version of Phillip Cloud’s answer.

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes

Article Rating