How to remove NaN values while combining two columns in a pandas DataFrame?

I am trying to combine two columns of a DataFrame, but I can't get rid of the nan values in the result.

The data looks like this:

feedback_id                  _id
568a8c25cac4991645c287ac     nan    
568df45b177e30c6487d3603     nan    
nan                          568df434832b090048f34974       
nan                          568cd22e9e82dfc166d7dff1   
568df3f0832b090048f34711     nan
nan                          568e5a38b4a797c664143dda

I want:

feedback_request_id
568a8c25cac4991645c287ac
568df45b177e30c6487d3603
568df434832b090048f34974
568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711
568e5a38b4a797c664143dda

Here is my code:

df3['feedback_request_id'] = ('' if df3['_id'].empty else df3['_id'].map(str)) + ('' if df3['feedback_id'].empty else df3['feedback_id'].map(str))

Output I’m getting:

feedback_request_id
568a8c25cac4991645c287acnan
568df45b177e30c6487d3603nan
nan568df434832b090048f34974
nan568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711nan
nan568e5a38b4a797c664143dda

I have also tried this:

df3['feedback_request_id'] = ('' if df3['_id']=='nan' else df3['_id'].map(str)) + ('' if df3['feedback_id']=='nan' else df3['feedback_id'].map(str))

But it’s giving the error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answers:

Method 1

You can use combine_first or fillna:

print(df['feedback_id'].combine_first(df['_id']))
0    568a8c25cac4991645c287ac
1    568df45b177e30c6487d3603
2    568df434832b090048f34974
3    568cd22e9e82dfc166d7dff1
4    568df3f0832b090048f34711
5    568e5a38b4a797c664143dda
Name: feedback_id, dtype: object

print(df['feedback_id'].fillna(df['_id']))
0    568a8c25cac4991645c287ac
1    568df45b177e30c6487d3603
2    568df434832b090048f34974
3    568cd22e9e82dfc166d7dff1
4    568df3f0832b090048f34711
5    568e5a38b4a797c664143dda
Name: feedback_id, dtype: object
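
To get the column the question asks for, the result can be assigned back to the frame. This is a minimal sketch, assuming the missing entries are real NaN values (not the string 'nan') and reusing the question's df3 and feedback_request_id names; the sample frame is rebuilt here only so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question's data (assumes real NaN, not the string 'nan').
df3 = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", "568df45b177e30c6487d3603",
                    np.nan, np.nan, "568df3f0832b090048f34711", np.nan],
    "_id": [np.nan, np.nan, "568df434832b090048f34974",
            "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda"],
})

# Take feedback_id where it is present, otherwise fall back to _id.
df3["feedback_request_id"] = df3["feedback_id"].combine_first(df3["_id"])

print(df3["feedback_request_id"])
```

If both columns hold a value in the same row, the one from feedback_id wins, so the order of the two columns in the call matters.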

Method 2

If you want a solution that doesn’t require referencing df twice or any of its columns explicitly:

df.bfill(axis=1).iloc[:, 0]

With two columns, this will copy non-null values from the right column into the left, then select the left column.
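
A short sketch of the back-fill approach on a cut-down version of the question's data. The frame below and the name feedback_request_id given to the result are illustrative assumptions, and the missing entries are assumed to be real NaN values:

```python
import numpy as np
import pandas as pd

# Three sample rows; each row has a value in exactly one of the two columns.
df = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", np.nan, np.nan],
    "_id": [np.nan, "568df434832b090048f34974", "568cd22e9e82dfc166d7dff1"],
})

# Back-fill along each row so a missing left value is taken from the right,
# then keep only the first (left-most) column.
combined = df.bfill(axis=1).iloc[:, 0].rename("feedback_request_id")

print(combined)
```

Because this works on the whole frame, any extra columns to the right would also take part in the back-fill, so it fits best when the frame contains only the columns to be merged.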

Method 3

For an in-place solution, you can use pd.Series.update with pd.DataFrame.pop:

df['feedback_id'].update(df.pop('_id'))

print(df)

                feedback_id
0  568a8c25cac4991645c287ac
1  568df45b177e30c6487d3603
2  568df434832b090048f34974
3  568cd22e9e82dfc166d7dff1
4  568df3f0832b090048f34711
5  568e5a38b4a797c664143dda
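
A caveat, as far as I know: with Copy-on-Write enabled in pandas, df['feedback_id'] behaves as a copy, so a chained update like the one above may leave df unchanged. A sketch of an assignment-based variant that gives the same one-column result without relying on in-place mutation (the sample frame is an assumption for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", np.nan, "568df3f0832b090048f34711"],
    "_id": [np.nan, "568df434832b090048f34974", np.nan],
})

# pop() removes _id from df and returns it; fillna() uses it to fill the gaps
# in feedback_id, and the result is assigned back explicitly.
df["feedback_id"] = df["feedback_id"].fillna(df.pop("_id"))

print(df)
```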

Method 4

The following should work. If it doesn't, check how the missing values in your columns are stored: bfill only fills real missing values (np.nan, None or pd.NaT), not the literal string 'nan'.

df[['col1','col2']].bfill(axis=1).iloc[:, 0]
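
One case where the back-fill leaves everything untouched is when the gaps are stored as the text 'nan' instead of a real missing value. A minimal sketch of handling that, assuming hypothetical columns col1 and col2 with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical data where missing entries were stored as the string 'nan'.
df = pd.DataFrame({
    "col1": ["a1", "nan", "a3"],
    "col2": ["nan", "b2", "nan"],
})

# Turn the placeholder strings into real missing values first...
cleaned = df[["col1", "col2"]].replace("nan", np.nan)

# ...then back-fill across each row and keep the first column.
result = cleaned.bfill(axis=1).iloc[:, 0]

print(result)
```

Checking df.isna().sum() before and after the replace is a quick way to see whether the columns held real missing values to begin with.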

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
