Based on the following sample data, the following data frame is built:
day = [1, 2, 3, 2, 3, 1, 2] item_id = [1, 1, 1, 2, 2, 3, 3] item_name = ['A', 'A', 'A', 'B', 'B', 'C', 'C'] increase = [4, 0, 4, 3, 3, 3, 3] decrease = [2, 2, 2, 1, 1, 1, 1] my_df = pd.DataFrame(list(zip(day, item_id, item_name, increase, decrease)), columns=['day', 'item_id', 'item_name', 'increase', 'decrease']) my_df = my_df.set_index(['item_id', 'item_name'])
I would like to create two new columns:
- starting_quantity[0] would have each initial value set to 0 for the index (or multi-index)
- ending_quantity adds the
increase
and subtracts thedecrease
- starting_quantity[1, 2, 3, …] is equal to the ending_quantity of the previous day.
The output I’d like to create is as follows:
I appreciate if you could assist with any or all of the 3 steps above!
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
Try:
my_df = my_df.set_index(["item_id", "item_name"])
g = my_df.groupby(level=0)
my_df["tmp"] = my_df["increase"] - my_df["decrease"]
my_df["starting_quantity"] = g["tmp"].shift().fillna(0)
my_df["starting_quantity"] = g["starting_quantity"].cumsum().astype(int)
my_df["ending_quantity"] = g["tmp"].cumsum()
my_df = my_df.drop(columns="tmp")
print(my_df)
Prints:
day increase decrease starting_quantity ending_quantity
item_id item_name
1 A 1 4 2 0 2
A 2 0 2 2 0
A 3 4 2 0 2
2 B 2 3 1 0 2
B 3 3 1 2 4
3 C 1 3 1 0 2
C 2 3 1 2 4
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0