I am trying to fill None values in a Pandas DataFrame with 0s, but only for a subset of columns.
When I do:
import pandas as pd
df = pd.DataFrame(data={'a':[1,2,3,None],'b':[4,5,None,6],'c':[None,None,7,8]})
print(df)
df.fillna(value=0, inplace=True)
print(df)
The output:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 NaN 7.0
3 NaN 6.0 8.0
a b c
0 1.0 4.0 0.0
1 2.0 5.0 0.0
2 3.0 0.0 7.0
3 0.0 6.0 8.0
It replaces every None with 0. What I want is to replace None only in columns a and b, but not in c.
What is the best way of doing this?
Answers:
Method 1
You can select your desired columns and do it by assignment:
df[['a', 'b']] = df[['a','b']].fillna(value=0)
The resulting output is as expected:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
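As a minimal, self-contained sketch of the assignment approach above, the snippet below rebuilds the question's frame and fills only the chosen columns. The name cols_to_fill is an illustrative assumption, not something from the original answer.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Hypothetical name: the columns whose NaN values should become 0.
cols_to_fill = ['a', 'b']

# Fill NaN in the selected columns only, then assign the result back.
df[cols_to_fill] = df[cols_to_fill].fillna(0)

print(df)
```

Keeping the column names in one list means the selection on both sides of the assignment cannot drift apart.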
Method 2
You can pass a dict to fillna to specify a different fill value for each column:
df.fillna({'a':0,'b':0})
Out[829]:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
Then assign the result back:
df=df.fillna({'a':0,'b':0})
df
Out[831]:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
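To illustrate the "different value for different column" part, here is a small sketch that uses a different fill value per column. The value -1 for column b is chosen purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Keys are column names, values are the per-column fill values.
# Column 'c' is not listed, so its NaN values are left untouched.
filled = df.fillna({'a': 0, 'b': -1})

print(filled)
```

Because fillna returns a new frame here, df itself still contains its NaN values until the result is assigned back.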
Method 3
You can avoid making a copy of the object by using Wen’s solution (Method 2) with inplace=True:
df.fillna({'a':0, 'b':0}, inplace=True)
print(df)
Which yields:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
Method 4
Using the top answer produces a warning about making changes to a copy of a DataFrame slice. Assuming that you have other columns, a better way to do this is to pass a dictionary:
df.fillna({'A': 'NA', 'B': 'NA'}, inplace=True)
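When many columns share the same fill value, the dictionary can be built from a list instead of typed out. This is only a sketch: the frame and the names cols_to_fill and fill_map are made up for illustration, and the result is assigned back rather than using inplace=True.

```python
import pandas as pd

# Hypothetical frame with string columns and some missing entries.
df = pd.DataFrame({'A': ['x', None],
                   'B': [None, 'y'],
                   'C': [None, 'z']})

cols_to_fill = ['A', 'B']

# dict.fromkeys maps every listed column to the same fill value.
fill_map = dict.fromkeys(cols_to_fill, 'NA')

df = df.fillna(fill_map)

print(df)
```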
Method 5
This should work without raising a SettingWithCopyWarning:
df[['a', 'b']] = df.loc[:,['a', 'b']].fillna(value=0)
Method 6
This one-liner is often suggested, but it does not change df:
df[['a', 'b']].fillna(value=0, inplace=True)
Breakdown: df[['a', 'b']] selects the columns you want to fill NaN values for, and value=0 tells it to fill NaNs with zero. However, the selection is a new DataFrame, so inplace=True only modifies that temporary copy and df itself keeps its NaN values (see Method 8). Assign the result back instead, as in Method 1.
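A quick way to check what this one-liner actually does is to count the NaN values afterwards. The sketch below suppresses warnings because some pandas versions emit one on this pattern. The variable names are illustrative.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

with warnings.catch_warnings():
    # Some pandas versions warn about this pattern; silence it for the demo.
    warnings.simplefilter('ignore')
    # df[['a', 'b']] is a new DataFrame, so this fills a temporary object.
    df[['a', 'b']].fillna(value=0, inplace=True)

# Count the NaN values still present in columns a and b of df.
nan_after_inplace = int(df[['a', 'b']].isna().sum().sum())

# Assigning the filled selection back does update df.
df[['a', 'b']] = df[['a', 'b']].fillna(value=0)
nan_after_assign = int(df[['a', 'b']].isna().sum().sum())

print(nan_after_inplace, nan_after_assign)
```

The first count stays above zero because the fill happened on a copy. Only the assignment writes the filled values into df.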
Method 7
Or something like:
df.loc[df['a'].isnull(),'a']=0
df.loc[df['b'].isnull(),'b']=0
and if there are more columns:
for i in your_list:
    df.loc[df[i].isnull(),i]=0
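Put together as a runnable sketch, the loop could look like the following. Here your_list is assumed to hold the names of the columns to fill, and the frame is the one from the question.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Assumed content of your_list: the columns that should be filled.
your_list = ['a', 'b']

for col in your_list:
    # Boolean mask selects the rows where this column is missing,
    # and .loc writes 0 into exactly those cells.
    df.loc[df[col].isnull(), col] = 0

print(df)
```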
Method 8
For some odd reason this DID NOT work (using Pandas: ‘0.25.1’)
df[['col1', 'col2']].fillna(value=0, inplace=True)
Another solution:
subset_cols = ['col1','col2']
[df[col].fillna(0, inplace=True) for col in subset_cols]
Example:
import numpy as np
df = pd.DataFrame(data={'col1':[1,2,np.nan], 'col2':[1,np.nan,3], 'col3':[np.nan,2,3]})
output:
col1 col2 col3
0 1.00 1.00 nan
1 2.00 nan 2.00
2 nan 3.00 3.00
Apply a list comprehension to fill the values:
subset_cols = ['col1','col2']
[df[col].fillna(0, inplace=True) for col in subset_cols]
Output:
col1 col2 col3
0 1.00 1.00 nan
1 2.00 0.00 2.00
2 0.00 3.00 3.00
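Calling an inplace method on df[col] depends on df[col] sharing data with df, and newer pandas releases may warn about this pattern or leave df unchanged. A variant that does not rely on that behaviour is to assign each filled column back, sketched here with the same example data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, np.nan],
                   'col2': [1, np.nan, 3],
                   'col3': [np.nan, 2, 3]})

subset_cols = ['col1', 'col2']

for col in subset_cols:
    # fillna returns a new Series; assigning it back updates df explicitly.
    df[col] = df[col].fillna(0)

print(df)
```

A plain for loop is used instead of a list comprehension because the loop is run only for its side effect.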
Method 9
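This syntax won't work, because fillna() is called without a fill value:
df[['col1','col2']] = df[['col1','col2']].fillna()
Pass the value explicitly instead:
df[['col1','col2']] = df[['col1','col2']].fillna(0)
Note that df['col1','col2'] (single brackets) looks up the tuple ('col1', 'col2') as one column label, so it raises a KeyError on a frame with ordinary column names.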
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.