Compute row average in pandas

       Y1961      Y1962      Y1963      Y1964      Y1965  Region
0  82.567307  83.104757  83.183700  83.030338  82.831958  US
1   2.699372   2.610110   2.587919   2.696451   2.846247  US
2  14.131355  13.690028  13.599516  13.649176  13.649046  US
3   0.048589   0.046982   0.046583   0.046225   0.051750  US
4   0.553377   0.548123   0.582282   0.577811   0.620999  US

In the above dataframe, I would like to get the average of each row. Currently, I am doing this:

df.mean(axis=0)

However, this drops the Region column as well. How can I compute the mean and also retain the Region column?

Answers:

Method 1

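You can assign the result to a new column. You also need to compute the mean along the rows, so use axis=1. Because Region holds strings, pass numeric_only=True as well; pandas 2.0 and later no longer skip non-numeric columns by default and raise a TypeError otherwise.

>>> df['mean'] = df.mean(axis=1, numeric_only=True)
>>> df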
       Y1961      Y1962      Y1963      Y1964      Y1965 Region       mean
0  82.567307  83.104757  83.183700  83.030338  82.831958     US  82.943612
1   2.699372   2.610110   2.587919   2.696451   2.846247     US   2.688020
2  14.131355  13.690028  13.599516  13.649176  13.649046     US  13.743824
3   0.048589   0.046982   0.046583   0.046225   0.051750     US   0.048026
4   0.553377   0.548123   0.582282   0.577811   0.620999     US   0.576518
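
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the first two rows of the question's table (the values are copied from the post, the two-row subset is just for brevity) and adds the row mean while keeping Region:

```python
import pandas as pd

# First two rows of the table from the question.
df = pd.DataFrame({
    "Y1961": [82.567307, 2.699372],
    "Y1962": [83.104757, 2.610110],
    "Y1963": [83.183700, 2.587919],
    "Y1964": [83.030338, 2.696451],
    "Y1965": [82.831958, 2.846247],
    "Region": ["US", "US"],
})

# axis=1 averages across the columns of each row;
# numeric_only=True leaves the string column "Region" out of the calculation.
df["mean"] = df.mean(axis=1, numeric_only=True)
print(df[["Region", "mean"]])
```

Region stays in the frame because the result is assigned as a new column instead of replacing df.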

Method 2

We can find the mean of each row over a range of columns using iloc, i.e. in your case, from the Y1961 column to the Y1965 column. The end of an iloc slice is exclusive, so 0:5 covers positions 0 through 4:

df['mean'] = df.iloc[:, 0:5].mean(axis=1)

And if you want to select individual columns:

df['mean'] = df.iloc[:, [0,1,2,3,4]].mean(axis=1)
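
Positional slices are easy to get off by one, so it may help to check which columns a slice actually picks up before averaging. A small sketch, again using two rows from the question's table:

```python
import pandas as pd

df = pd.DataFrame({
    "Y1961": [82.567307, 2.699372],
    "Y1962": [83.104757, 2.610110],
    "Y1963": [83.183700, 2.587919],
    "Y1964": [83.030338, 2.696451],
    "Y1965": [82.831958, 2.846247],
    "Region": ["US", "US"],
})

# iloc slices exclude the end position:
# 0:4 stops at Y1964, while 0:5 also includes Y1965.
four_cols = list(df.iloc[:, 0:4].columns)
five_cols = list(df.iloc[:, 0:5].columns)
print(four_cols)
print(five_cols)

# Average the five year columns by position; Region (position 5) is untouched.
df["mean"] = df.iloc[:, 0:5].mean(axis=1)
print(df["mean"])
```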

Method 3

I think this is what you are looking for:

df.drop('Region', axis=1).apply(lambda x: x.mean(), axis=1)
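
As far as I can tell, the apply with a lambda is not strictly needed here, since calling mean(axis=1) on the frame without Region should give the same numbers. A quick sketch to compare the two (the names years, via_apply and via_mean are just made up for this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Y1961": [82.567307, 2.699372],
    "Y1962": [83.104757, 2.610110],
    "Y1963": [83.183700, 2.587919],
    "Y1964": [83.030338, 2.696451],
    "Y1965": [82.831958, 2.846247],
    "Region": ["US", "US"],
})

# drop returns a new frame here, so df itself still has Region.
years = df.drop("Region", axis=1)

via_apply = years.apply(lambda row: row.mean(), axis=1)  # row-by-row Python call
via_mean = years.mean(axis=1)                            # built-in row mean

# Compare with a tolerance rather than exact float equality.
print(np.allclose(via_apply, via_mean))

# Assign back to keep Region alongside the mean.
df["mean"] = via_mean
```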

Method 4

Taking the mean based on the column names

I am sharing this because it might be useful for those who want to average a few columns based on their names instead of counting column indexes. This can be done using pandas's loc instead of iloc. For instance, the odd-year average would be:

df["mean_odd_year"] = df.loc[:, ["Y1961","Y1963","Y1965"]].mean(axis = 1)
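
If there are many year columns, listing each name by hand gets tedious. Two name-based alternatives, sketched under the assumption that every year column is named "Y" followed by four digits (the regex and the new column names are my own choices, not from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Y1961": [82.567307, 2.699372],
    "Y1962": [83.104757, 2.610110],
    "Y1963": [83.183700, 2.587919],
    "Y1964": [83.030338, 2.696451],
    "Y1965": [82.831958, 2.846247],
    "Region": ["US", "US"],
})

# Option 1: pick columns whose names match a pattern (assumed naming scheme).
year_cols = df.filter(regex=r"^Y\d{4}$").columns
df["mean_all_years"] = df[year_cols].mean(axis=1)

# Option 2: a label slice with loc; unlike iloc, both endpoints are included.
df["mean_range"] = df.loc[:, "Y1961":"Y1965"].mean(axis=1)

print(df[["Region", "mean_all_years", "mean_range"]])
```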

Method 5

If you are looking to average column-wise instead, try this:

df.drop('Region', axis=1).apply(lambda x: x.mean())

# Note: with inplace=True, this removes the Region column from df itself
df.drop('Region', axis=1, inplace=True)
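
If the goal is only the per-column averages, the original frame probably does not need to be modified at all. A minimal sketch (col_means is a name chosen for this example):

```python
import pandas as pd

df = pd.DataFrame({
    "Y1961": [82.567307, 2.699372],
    "Y1962": [83.104757, 2.610110],
    "Y1963": [83.183700, 2.587919],
    "Y1964": [83.030338, 2.696451],
    "Y1965": [82.831958, 2.846247],
    "Region": ["US", "US"],
})

# drop without inplace returns a copy, so df keeps its Region column.
# mean() with the default axis=0 gives one average per column.
col_means = df.drop(columns="Region").mean()
print(col_means)
```

The result is a Series indexed by the year column names, one value per column.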


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
