I am trying to build a distribution table using my own conditions. Below you can see the data and the code.
import pandas as pd
import numpy as np
data_test = {
'sales_2017': [100,0,300,0,200],
'profit_2017': [20,0,30,50,0],
}
df = pd.DataFrame(data_test, columns = ['sales_2017','profit_2017','sales_2018','profit_2018'])
df['effective']= df['profit_2017']/df['sales_2017']
df
# Create distribution table
conditions = [
(df['effective'] == 0),
(df['effective'] > 0.1) & (df['effective'] < 0.20),
(df['effective'] > 0.20),
(df['effective'] == "NaN"),
(df['effective'] == "inf"),
]
values = ['Equal to zero','Between 0.1 and 0.2', 'Above 0.2', 'Equal to NaN', "Equal to infinity"]
df['effective_range'] = np.select(conditions, values)
distribution_table = df.groupby('effective_range').agg(count=('effective_range','count'))
So the main idea here is to create a distribution table according to these conditions: 'Equal to zero', 'Between 0.1 and 0.2', 'Above 0.2', 'Equal to NaN', 'Equal to infinity'.
My data set contains NaN and inf values, and this causes a problem in the final table.
Can anybody help me solve this problem and get a table like the one below?
effective_range      count
Equal to zero        1
Between 0.1 and 0.2  0
Above 0.2            1
Equal to NaN         1
Equal to infinity    1
Answers:
Method 1
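Comparing a float column with the strings "NaN" or "inf" never matches, so those two conditions are always False. Use the Series.isna and numpy.isinf methods instead: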
# Create distribution table
conditions = [
(df['effective'] == 0),
(df['effective'] > 0.1) & (df['effective'] < 0.20),
(df['effective'] > 0.20),
(df['effective'].isna()),
(np.isinf(df['effective'])),
]
values = ['Equal to zero','Between 0.1 and 0.2', 'Above 0.2',
'Equal to NaN', "Equal to infinity"]
df['effective_range'] = np.select(conditions, values)
distribution_table = df.groupby('effective_range').agg(count=('effective_range','count'))
print (distribution_table)
                 count
effective_range
0                    2
Above 0.2            1
Equal to NaN         1
Equal to zero        1
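Note that the output above still differs from the table requested in the question. np.select uses the first condition that matches, so the inf row is labelled 'Above 0.2' (inf > 0.20 is True) before the infinity check is reached, and the ratios 0.1 and 0.2 match none of the strict inequalities, so they fall through to the default label 0. Below is a minimal sketch of one possible adjustment. It assumes that the NaN and infinity checks should run first and that the 0.1 and 0.2 boundaries are meant to be inclusive; neither is stated in the question, and the 'Other' label is a hypothetical name for anything that matches no condition.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    'sales_2017': [100, 0, 300, 0, 200],
    'profit_2017': [20, 0, 30, 50, 0],
})
df['effective'] = df['profit_2017'] / df['sales_2017']
eff = df['effective']

# Order matters: np.select picks the first matching condition,
# so NaN and infinity are tested before the numeric ranges.
conditions = [
    eff.isna(),
    np.isinf(eff),                # note: also True for -inf
    eff == 0,
    (eff >= 0.1) & (eff <= 0.2),  # assumption: inclusive boundaries
    eff > 0.2,
]
values = ['Equal to NaN', 'Equal to infinity', 'Equal to zero',
          'Between 0.1 and 0.2', 'Above 0.2']

# A string default keeps all labels the same type
# ('Other' is a hypothetical catch-all label).
df['effective_range'] = np.select(conditions, values, default='Other')

# Reindex so that categories with no rows still appear with a count of 0
order = ['Equal to zero', 'Between 0.1 and 0.2', 'Above 0.2',
         'Equal to NaN', 'Equal to infinity', 'Other']
counts = df['effective_range'].value_counts().reindex(order, fill_value=0)
print(counts)
```

With the sample data this gives one row each for 'Equal to zero', 'Equal to NaN' and 'Equal to infinity', two rows for 'Between 0.1 and 0.2' (the ratios 0.1 and 0.2), and zero for the rest. If the boundaries should stay exclusive, keep the original strict inequalities; the two boundary rows will then be counted under 'Other'.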
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
