Please see the attached screenshot. How can I get from the input dataframe to the desired output shown in the image description?
Code to generate input dataframe:
import pandas as pd
df = pd.DataFrame({'timestamp': pd.date_range('2022-04-30 00:00:00', periods=19, freq='s'),
                   'fault_code': ['A']*4 + ['B']*4 + ['A']*2 + ['C']*5 + ['B']*2 + ['A']*2})
Answers:
Method 1
You can try something like this:
import pandas as pd
import numpy as np

df = pd.DataFrame({'timestamp': pd.date_range('2022-04-30 00:00:00', periods=19, freq='s'),
                   'fault_code': ['A']*4 + ['B']*4 + ['A']*2 + ['C']*5 + ['B']*2 + ['A']*2})

# Label each run of consecutive identical fault codes with its own group number.
df['group'] = (df['fault_code'] != df['fault_code'].shift()).cumsum()

# Duration of each run in seconds (last timestamp minus first timestamp).
df_s = (df.groupby(['fault_code', 'group'], as_index=False)['timestamp']
          .agg(lambda x: int(np.ptp(x).total_seconds())))

# Per fault code: number of runs, list of run durations, and their sum.
df_out = df_s.groupby('fault_code').agg(occurrence=('fault_code', 'count'),
                                        duration=('timestamp', list),
                                        total_duration=('timestamp', 'sum'))
df_out
Output:
            occurrence   duration  total_duration
fault_code
A                    3  [3, 1, 1]               5
B                    2     [3, 1]               4
C                    1        [4]               4
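If the grouping step is unclear, the following is a minimal sketch of what the shift/cumsum comparison does on its own. It uses the same input as the question; the names `is_new_run`, `run_id` and `runs` are made up for illustration and are not part of the answer above. It also computes each run's duration as max minus min, which is an assumed equivalent of the `np.ptp` call rather than the answer's exact method.

```python
import pandas as pd

df = pd.DataFrame({
    'timestamp': pd.date_range('2022-04-30 00:00:00', periods=19, freq='s'),
    'fault_code': ['A']*4 + ['B']*4 + ['A']*2 + ['C']*5 + ['B']*2 + ['A']*2,
})

# True wherever the code differs from the previous row. The first row always
# counts as a change because shift() leaves a missing value there.
is_new_run = df['fault_code'] != df['fault_code'].shift()

# The running total of those flags numbers the runs 1, 2, 3, ...
# (renamed so it does not clash with the 'fault_code' column below).
run_id = is_new_run.cumsum().rename('run')

# Hypothetical helper table: one row per run with its code, start and end.
runs = df.groupby(run_id).agg(
    fault_code=('fault_code', 'first'),
    start=('timestamp', 'min'),
    end=('timestamp', 'max'),
)

# Duration of each run in whole seconds (last timestamp minus first).
runs['seconds'] = (runs['end'] - runs['start']).dt.total_seconds().astype(int)

print(runs[['fault_code', 'seconds']])
# Codes per run:    A, B, A, C, B, A
# Seconds per run:  3, 3, 1, 4, 1, 1
```

Grouping `runs` by `fault_code` and collecting `seconds` into a count, a list and a sum should then give the same table as the output above.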
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
