I am trying to combine two columns of a DataFrame, but I can't get rid of the nan values.
The data looks like this:
feedback_id                 _id
568a8c25cac4991645c287ac    nan
568df45b177e30c6487d3603    nan
nan                         568df434832b090048f34974
nan                         568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711    nan
nan                         568e5a38b4a797c664143dda
I want:
feedback_request_id
568a8c25cac4991645c287ac
568df45b177e30c6487d3603
568df434832b090048f34974
568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711
568e5a38b4a797c664143dda
Here is my code:
df3['feedback_request_id'] = ('' if df3['_id'].empty else df3['_id'].map(str)) + ('' if df3['feedback_id'].empty else df3['feedback_id'].map(str))
Output I’m getting:
feedback_request_id
568a8c25cac4991645c287acnan
568df45b177e30c6487d3603nan
nan568df434832b090048f34974
nan568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711nan
nan568e5a38b4a797c664143dda
I have tried this, also:
df3['feedback_request_id'] = ('' if df3['_id']=='nan' else df3['_id'].map(str)) + ('' if df3['feedback_id']=='nan' else df3['feedback_id'].map(str))
But it’s giving the error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Answers:
Method 1
You can use combine_first or fillna:
print(df['feedback_id'].combine_first(df['_id']))
0    568a8c25cac4991645c287ac
1    568df45b177e30c6487d3603
2    568df434832b090048f34974
3    568cd22e9e82dfc166d7dff1
4    568df3f0832b090048f34711
5    568e5a38b4a797c664143dda
Name: feedback_id, dtype: object

print(df['feedback_id'].fillna(df['_id']))
0    568a8c25cac4991645c287ac
1    568df45b177e30c6487d3603
2    568df434832b090048f34974
3    568cd22e9e82dfc166d7dff1
4    568df3f0832b090048f34711
5    568e5a38b4a797c664143dda
Name: feedback_id, dtype: object
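For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the frame from the question and stores the result in a new column. It assumes the missing entries are real NaN values (np.nan) rather than the string 'nan'; the names combined and filled are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; np.nan marks the missing entries.
df = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", "568df45b177e30c6487d3603",
                    np.nan, np.nan, "568df3f0832b090048f34711", np.nan],
    "_id": [np.nan, np.nan, "568df434832b090048f34974",
            "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda"],
})

# Keep feedback_id where it is present, otherwise fall back to _id.
combined = df["feedback_id"].combine_first(df["_id"])

# fillna with a Series fills each NaN from the row with the same index label.
filled = df["feedback_id"].fillna(df["_id"])

df["feedback_request_id"] = combined
print(df["feedback_request_id"])
```

As for why the attempts in the question did not work: Series.empty is True only when the Series has no elements at all, so it never strips individual nan values, and df3['_id'] == 'nan' produces a boolean Series, which cannot be used as the condition of a plain Python if/else.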
Method 2
If you want a solution that doesn’t require referencing df twice or any of its columns explicitly:
df.bfill(axis=1).iloc[:, 0]
With two columns, this will copy non-null values from the right column into the left, then select the left column.
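A small sketch of the same idea on the question's data, again assuming real NaN values; the column name feedback_request_id is taken from the question and merged is just an illustrative variable.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; np.nan marks the missing entries.
df = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", "568df45b177e30c6487d3603",
                    np.nan, np.nan, "568df3f0832b090048f34711", np.nan],
    "_id": [np.nan, np.nan, "568df434832b090048f34974",
            "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda"],
})

# Back-fill along each row: a NaN takes the next non-null value to its right.
# The first column then holds the first non-null value of every row.
merged = df.bfill(axis=1).iloc[:, 0].rename("feedback_request_id")
print(merged)
```

Because this operates on the whole frame, it is probably safest to select only the columns you want to merge first if df has other columns.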
Method 3
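For an in-place solution, you can use pd.Series.update with pd.DataFrame.pop. Note that this relies on df['feedback_id'] giving access to the column's underlying data; with Copy-on-Write enabled (the default from pandas 3.0), the update is applied to a temporary object and df['feedback_id'] is left unchanged: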
df['feedback_id'].update(df.pop('_id'))
print(df)
feedback_id
0 568a8c25cac4991645c287ac
1 568df45b177e30c6487d3603
2 568df434832b090048f34974
3 568cd22e9e82dfc166d7dff1
4 568df3f0832b090048f34711
5 568e5a38b4a797c664143dda
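If you would rather not depend on the chained call modifying df, one possible variant is to update a copy and assign it back. This is a sketch under the same assumption of real NaN values; the variable s is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; np.nan marks the missing entries.
df = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", "568df45b177e30c6487d3603",
                    np.nan, np.nan, "568df3f0832b090048f34711", np.nan],
    "_id": [np.nan, np.nan, "568df434832b090048f34974",
            "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda"],
})

# Work on an explicit copy so the result does not depend on view semantics.
s = df["feedback_id"].copy()

# pop removes _id from df and returns it; update overwrites s with its non-NA values.
s.update(df.pop("_id"))

# Write the merged column back explicitly.
df["feedback_id"] = s
print(df)
```

One difference from combine_first is worth keeping in mind: update overwrites existing values with any non-NA value from the other Series, so if a row had both columns filled, _id would win here, whereas df['feedback_id'].combine_first(df['_id']) would keep feedback_id.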
Method 4
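The following should work. If it does not, check how the nulls in your columns are actually stored: bfill only fills real missing values such as np.nan, None or pd.NaT, so a literal string like 'nan' is left untouched and has to be converted to a real missing value first.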
df[['col1','col2']].bfill(axis=1).iloc[:, 0]
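To illustrate the check above, here is a sketch with hypothetical data in which the "missing" entries are the string 'nan'. The values and the variable names before, cleaned and after are made up for the example; only col1 and col2 come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame where missing entries were stored as the text 'nan'.
df = pd.DataFrame({
    "col1": ["a", "nan", "c"],
    "col2": ["nan", "b", "nan"],
})

# The string 'nan' is an ordinary value, so bfill has nothing to fill.
before = df[["col1", "col2"]].bfill(axis=1).iloc[:, 0]

# Convert the placeholder strings to real NaN, then back-fill across columns.
cleaned = df[["col1", "col2"]].replace("nan", np.nan)
after = cleaned.bfill(axis=1).iloc[:, 0]

print(before.tolist())
print(after.tolist())
```

A quick way to see which case you are in is df.isna().sum(): if it reports zero for a column that visibly contains nan, the values are most likely strings.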
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.