Pandas: Shift down values by one row within a group

I have a Pandas dataframe, and I want to create a new column whose values are that of another column, shifted down by one row. The last row should show NaN.

The catch is that I want to do this by group, with the last row of each group showing NaN. NOT have the last row of a group “steal” a value from a group that happens to be adjacent in the dataframe.

My attempted implementation is quite shamefully broken, so I’m clearly misunderstanding something fundamental.

df['B_shifted'] = df.groupby(['A'])['B'].transform(lambda x:x.values[1:])

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Newer versions of pandas can now perform a shift on a group:

df['B_shifted'] = df.groupby(['A'])['B'].shift(1)

Note that when shifting down, it’s the first row that has NaN.

Method 2

Shift works on the output of the groupby clause:

>>> df = pandas.DataFrame(numpy.random.randint(1,3, (10,5)), columns=['a','b','c','d','e'])
>>> df
   a  b  c  d  e
0  2  1  2  1  1
1  2  1  1  1  1
2  1  2  2  1  2
3  1  2  1  1  2
4  2  2  1  1  2
5  2  2  2  2  1
6  2  2  1  1  1
7  2  2  2  1  1
8  2  2  2  2  1
9  2  2  2  2  1


for k, v in df.groupby('a'):
    print k
    print 'normal'
    print v
    print 'shifted'
    print v.shift(1)

1
normal
   a  b  c  d  e
2  1  2  2  1  2
3  1  2  1  1  2
shifted
    a   b   c   d   e
2 NaN NaN NaN NaN NaN
3   1   2   2   1   2
2
normal
   a  b  c  d  e
0  2  1  2  1  1
1  2  1  1  1  1
4  2  2  1  1  2
5  2  2  2  2  1
6  2  2  1  1  1
7  2  2  2  1  1
8  2  2  2  2  1
9  2  2  2  2  1
shifted
    a   b   c   d   e
0 NaN NaN NaN NaN NaN
1   2   1   2   1   1
4   2   1   1   1   1
5   2   2   1   1   2
6   2   2   2   2   1
7   2   2   1   1   1
8   2   2   2   1   1
9   2   2   2   2   1

Method 3

@EdChum’s comment is a better answer to this question, so I’m posting it here for posterity:

df['B_shifted'] = df.groupby(['A'])['B'].transform(lambda x:x.shift())

or similarly

df['B_shifted'] = df.groupby(['A'])['B'].transform('shift').

The former notation is more flexible, of course (e.g. if you want to shift by 2).


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x