I want to evaluate a dataset of precipitation data. The data is available as a CSV file, which I have read into a pandas DataFrame. This gives the following table:
       year  month  day      value
0      1981      1    1   0.522592
1      1981      1    2   2.692495
2      1981      1    3   0.556698
3      1981      1    4   0.000000
4      1981      1    5   0.000000
...     ...    ...  ...        ...
43824  2100     12   27   0.000000
43825  2100     12   28   0.185120
43826  2100     12   29  10.252080
43827  2100     12   30  13.389290
43828  2100     12   31   3.523566
Now I want to convert the daily precipitation values into monthly totals, that is, the sum of all daily values for each month of each year. I probably need a loop or something similar, but I do not know how to proceed. Maybe a conditional selection on ‘year’ and ‘month’?
I would be very happy about feedback! 🙂
That's what I have tried so far:
for i in range(len(dataframe)):
    print(dataframe.loc[i, 'year'], dataframe.loc[i, 'month'])
Answers:
Method 1
I would start out by making a single column with the date:
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
From here you can make the date the index:
df.set_index('date', inplace=True)
# I'll drop the unneeded year, month, and day columns as well.
df = df[['value']]
My data now looks like:
               value
date
1981-01-01  0.522592
1981-01-02  2.692495
1981-01-03  0.556698
1981-01-04  0.000000
1981-01-05  0.000000
From here, let’s try resampling the data!
# Let's do a 2-day sum. For monthly sums, replace '2d' with 'M' ('ME' from pandas 2.2 on, where 'M' is deprecated).
df.resample('2d').sum()
Output:
               value
date
1981-01-01  3.215087
1981-01-03  0.556698
1981-01-05  0.000000
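For the monthly totals the question asks about, the same steps might look like the sketch below. The sample values and the names `raw`, `daily`, and `monthly` are made up for illustration and are not from the original data. It uses the 'MS' (month start) alias, which labels each month by its first day and sidesteps the 'M' versus 'ME' difference between pandas versions.

```python
import pandas as pd

# Hypothetical sample: a few days from two months, in the question's layout.
raw = pd.DataFrame({
    "year":  [1981, 1981, 1981, 1981, 1981],
    "month": [1, 1, 1, 2, 2],
    "day":   [1, 2, 3, 1, 2],
    "value": [0.5, 2.5, 1.0, 0.0, 4.0],
})

# Assemble a datetime column from the year/month/day columns and index by it.
raw["date"] = pd.to_datetime(raw[["year", "month", "day"]])
daily = raw.set_index("date")[["value"]]

# Sum all daily values per calendar month; 'MS' labels each bin
# with the first day of that month.
monthly = daily.resample("MS").sum()
print(monthly)
```

Note that `resample` produces a row for every month between the first and last date, so a month with no rows at all would show up with a sum of 0.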
Hopefully this gives you something to start with~
Method 2
Have you tried groupby?
df.groupby(['year', 'month'])['value'].agg('sum')
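As a rough, self-contained sketch of how that could play out (the sample values and the names `raw` and `monthly` are invented for illustration), grouping on both columns gives one sum per year/month pair, and `reset_index()` turns the grouped result back into an ordinary table:

```python
import pandas as pd

# Hypothetical sample: a few days from two months, in the question's layout.
raw = pd.DataFrame({
    "year":  [1981, 1981, 1981, 1981, 1981],
    "month": [1, 1, 1, 2, 2],
    "day":   [1, 2, 3, 1, 2],
    "value": [0.5, 2.5, 1.0, 0.0, 4.0],
})

# One row per (year, month) pair, with the daily values summed.
monthly = raw.groupby(["year", "month"])["value"].sum().reset_index()
print(monthly)
```

This needs no date column at all, and only year/month combinations that actually occur in the data appear in the result.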
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.