Shift NaNs to the end of their respective rows

I have a DataFrame like :

     0    1    2
0  0.0  1.0  2.0
1  NaN  1.0  2.0
2  NaN  NaN  2.0

What I want to get is

Out[116]: 
     0    1    2
0  0.0  1.0  2.0
1  1.0  2.0  NaN
2  2.0  NaN  NaN

This is my approach as of now.

df.apply(lambda x : (x[x.notnull()].values.tolist()+x[x.isnull()].values.tolist()),1)
Out[117]: 
     0    1    2
0  0.0  1.0  2.0
1  1.0  2.0  NaN
2  2.0  NaN  NaN

Is there any efficient way to achieve this ? apply Here is way to slow .
Thank you for your assistant!:)

My real data size

df.shape
Out[117]: (54812040, 1522)

Contents hide

Answers:

Method 1

Method 2

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Here’s a NumPy solution using justify –

In [455]: df
Out[455]: 
     0    1    2
0  0.0  1.0  2.0
1  NaN  1.0  2.0
2  NaN  NaN  2.0

In [456]: pd.DataFrame(justify(df.values, invalid_val=np.nan, axis=1, side='left'))
Out[456]: 
     0    1    2
0  0.0  1.0  2.0
1  1.0  2.0  NaN
2  2.0  NaN  NaN

If you want to save memory, assign it back instead –

df[:] = justify(df.values, invalid_val=np.nan, axis=1, side='left')

Method 2

Your ~~best~~ easiest option is to use sorted on df.apply/df.transform and sort by nullity.

df = df.apply(lambda x: sorted(x, key=pd.isnull), 1)
df
     0    1    2
0  0.0  1.0  2.0
1  1.0  2.0  NaN
2  2.0  NaN  NaN

You may also pass np.isnan to the key argument.

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes

Article Rating