Evaluation of a data set with conditional selection of columns

I want to evaluate a data set of precipitation data. The data is in a CSV file, which I have read into a pandas DataFrame. It looks like this:

       year  month  day      value
0      1981      1    1   0.522592
1      1981      1    2   2.692495
2      1981      1    3   0.556698
3      1981      1    4   0.000000
4      1981      1    5   0.000000
...     ...    ...  ...        ...
43824  2100     12   27   0.000000
43825  2100     12   28   0.185120
43826  2100     12   29  10.252080
43827  2100     12   30  13.389290
43828  2100     12   31   3.523566

Now I want to convert the daily precipitation values into monthly totals, i.e. the sum of all daily values for each month of each year. I probably need a loop or something similar, but I do not know how to proceed. Maybe via a conditional selection on 'year' and 'month'?
I would appreciate any feedback! 🙂

This is what I have tried so far:

for i in range(len(dataframe)):
    print(dataframe.loc[i, 'year'], dataframe.loc[i, 'month'])

Answers:


Method 1

I would start out by making a single column with the date:

df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

From here you can make the date the index:

df.set_index('date', inplace=True)
# I'll drop the unneeded year, month, and day columns as well.
df = df[['value']]

My data now looks like:

               value
date
1981-01-01  0.522592
1981-01-02  2.692495
1981-01-03  0.556698
1981-01-04  0.000000
1981-01-05  0.000000

From here, let’s try resampling the data!

# Let's do a 2-day sum. For monthly sums, replace '2d' with 'M' ('ME' in pandas 2.2 and later).
df.resample('2d').sum()

Output:

               value
date
1981-01-01  3.215087
1981-01-03  0.556698
1981-01-05  0.000000
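
For the monthly totals the question actually asks about, here is a minimal end-to-end sketch of the same steps. The four-row sample frame is made up for illustration, and I have swapped in the 'MS' (month start) alias instead of 'M', so each total is labelled with the first day of its month and the code does not depend on which pandas version you have:

```python
import pandas as pd

# Hypothetical sample in the same layout as the question's table.
df = pd.DataFrame({
    "year":  [1981, 1981, 1981, 1981],
    "month": [1, 1, 2, 2],
    "day":   [1, 2, 1, 2],
    "value": [0.5, 2.5, 1.0, 0.0],
})

# Build a datetime column from the year/month/day columns.
df["date"] = pd.to_datetime(df[["year", "month", "day"]])

# Index by date and sum the daily values within each calendar month.
# 'MS' labels each monthly bin with the first day of that month.
monthly = df.set_index("date")["value"].resample("MS").sum()
print(monthly)
```

The result is a Series with one row per month, indexed by date, which is handy if you want to plot it or slice it by date range later.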

Hopefully this gives you something to start with.

Method 2

Have you tried groupby?

df.groupby(['year', 'month'])['value'].agg('sum')
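
To see what the grouped sum returns, here is a small self-contained sketch. The sample values are invented for illustration. Calling reset_index() afterwards is optional; I add it here on the assumption that you want 'year' and 'month' back as ordinary columns rather than as a MultiIndex:

```python
import pandas as pd

# Hypothetical sample in the same layout as the question's table.
df = pd.DataFrame({
    "year":  [1981, 1981, 1981, 1981],
    "month": [1, 1, 2, 2],
    "day":   [1, 2, 1, 2],
    "value": [0.5, 2.5, 1.0, 0.0],
})

# Sum the daily values within each (year, month) pair.
# The result is a Series indexed by a (year, month) MultiIndex.
monthly_series = df.groupby(["year", "month"])["value"].sum()

# Turn the index levels back into regular columns.
monthly = monthly_series.reset_index()
print(monthly)
```

No loop and no datetime conversion is needed with this route, since the 'year' and 'month' columns already identify each month.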


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
