pandas get column average/mean

I can’t get the average or mean of a column in pandas. A have a dataframe. Neither of things I tried below gives me the average of the column weight

>>> allDF 
         ID           birthyear  weight
0        619040       1962       0.1231231
1        600161       1963       0.981742
2      25602033       1963       1.3123124     
3        624870       1987       0.94212

The following returns several values, not one:

allDF[['weight']].mean(axis=1)

So does this:

allDF.groupby('weight').mean()

Contents hide

Answers:

Method 1

Method 2

Method 3

Method 4

Method 5

Method 6

Method 7

Method 8

Method 9

Method 10

Method 11

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

If you only want the mean of the weight column, select the column (which is a Series) and call .mean():

In [479]: df
Out[479]: 
         ID  birthyear    weight
0    619040       1962  0.123123
1    600161       1963  0.981742
2  25602033       1963  1.312312
3    624870       1987  0.942120

In [480]: df["weight"].mean()
Out[480]: 0.83982437500000007

Method 2

Try df.mean(axis=0) , axis=0 argument calculates the column wise mean of the dataframe so the result will be axis=1 is row wise mean so you are getting multiple values.

Method 3

Do try to give print (df.describe()) a shot. I hope it will be very helpful to get an overall description of your dataframe.

Method 4

Mean for each column in df :

    A   B   C
0   5   3   8
1   5   3   9
2   8   4   9

df.mean()

A    6.000000
B    3.333333
C    8.666667
dtype: float64

and if you want average of all columns:

df.stack().mean()
6.0

Method 5

you can use

df.describe()

you will get basic statistics of the dataframe and to get mean of specific column you can use

df["columnname"].mean()

Method 6

You can also access a column using the dot notation (also called attribute access) and then calculate its mean:

df.your_column_name.mean()

Method 7

You can use either of the two statements below:

numpy.mean(df['col_name'])
# or
df['col_name'].mean()

Method 8

Additionally if you want to get the round value after finding the mean.

#Create a DataFrame
df1 = {
    'Subject':['semester1','semester2','semester3','semester4','semester1',
               'semester2','semester3'],
   'Score':[62.73,47.76,55.61,74.67,31.55,77.31,85.47]}
df1 = pd.DataFrame(df1,columns=['Subject','Score'])

rounded_mean = round(df1['Score'].mean()) # specified nothing as decimal place
print(rounded_mean) # 62

rounded_mean_decimal_0 = round(df1['Score'].mean(), 0) # specified decimal place as 0
print(rounded_mean_decimal_0) # 62.0

rounded_mean_decimal_1 = round(df1['Score'].mean(), 1) # specified decimal place as 1
print(rounded_mean_decimal_1) # 62.2

Method 9

You can simply go for:
df.describe()
that will provide you with all the relevant details you need, but to find the min, max or average value of a particular column (say ‘weights’ in your case), use:

    df['weights'].mean(): For average value
    df['weights'].max(): For maximum value
    df['weights'].min(): For minimum value

Method 10

Do note that it needs to be in the numeric data type in the first place.

 import pandas as pd
 df['column'] = pd.to_numeric(df['column'], errors='coerce')

Next find the mean on one column or for all numeric columns using describe().

df['column'].mean()
df.describe()

Example of result from describe:

          column 
count    62.000000 
mean     84.678548 
std     216.694615 
min      13.100000 
25%      27.012500 
50%      41.220000 
75%      70.817500 
max    1666.860000

Method 11

You can easily follow the following code

import pandas as pd 
import numpy as np 
        
classxii = {'Name':['Karan','Ishan','Aditya','Anant','Ronit'],
            'Subject':['Accounts','Economics','Accounts','Economics','Accounts'],
            'Score':[87,64,58,74,87],
            'Grade':['A1','B2','C1','B1','A2']}

df = pd.DataFrame(classxii,index = ['a','b','c','d','e'],columns=['Name','Subject','Score','Grade'])
print(df)

#use the below for mean if you already have a dataframe
print('mean of score is:')
print(df[['Score']].mean())

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes

Article Rating