I have a pandas dataframe that has been filtered without resetting the index. It looks like this:
    Animal_name  Shape_name  Entry_frame  Exit_frame
1      Animal_1    open arm          136         142
2      Animal_1    open arm          143         148
3      Animal_1    open arm          149         190
4      Animal_1    open arm          191         318
5      Animal_1    open arm          320         340
6      Animal_1    open arm          341         350
7      Animal_1    open arm          352         357
8      Animal_1    open arm          358         398
9      Animal_1    open arm          400         408
10     Animal_1    open arm          409         410
11     Animal_1    open arm          417         420
12     Animal_1    open arm          421         433
61     Animal_1    open arm         1034        1038
I would like to split this dataframe into groups of rows whose original index values are adjacent.
Thus, I would like to have the rows with index values 1-9, 10-12, and 61 in three different groups.
How would I go about doing this?
Answers:
Method 1
Assuming you want to group rows based on whether or not their row indices differ by 1, we can try the following:
import numpy as np
import pandas as pd

# Generate toy data
random_data = np.random.random(14)

# Create a dataframe with toy data. Adjacent indices don't always differ by 1 here.
df = pd.DataFrame(data={'Exit_frame': random_data}, index=[1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 18, 19, 20])
# print(df)

# Store each group of rows as a dataframe in a list
grouped_rows = []

# `df.index` is a pandas Index. Track where the current group starts by storing its position in `df.index`.
first_in_group = 0

# Start at 1 to skip the first row. We need two indices to compare!
for i in range(1, len(df.index)):
    # If adjacent indices don't differ by 1, the current group ends at the previous row
    if df.index[i] - df.index[i-1] != 1:
        # Select the adjacent rows with `.loc`, which slices by label and includes both endpoints
        grouped_rows.append(df.loc[df.index[first_in_group]:df.index[i-1]])
        # Update the position of the first element for the upcoming group
        first_in_group = i

# The loop only closes a group when it sees a gap, so append the final group here
grouped_rows.append(df.loc[df.index[first_in_group]:])

print(grouped_rows)
print(type(grouped_rows[0]))

With random data, the output looks like this (values differ between runs; the last group, indices 18 to 20, is elided here as `...`):

[   Exit_frame
1    0.236379
2    0.217071
3    0.470676
4    0.767448
5    0.125440,    Exit_frame
7    0.930271
8    0.985220,     Exit_frame
11    0.042104
12    0.483714
13    0.270400,     Exit_frame
15    0.047957, ...]
<class 'pandas.core.frame.DataFrame'>
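As an alternative to the explicit loop, the same grouping can probably be sketched with vectorized pandas operations: take the difference between consecutive index values, mark every position where that difference is not 1 as the start of a new run, and use the cumulative sum of those marks as a group key. The names `starts_new_run`, `run_id`, and `groups` are illustrative, and the sketch assumes an integer index sorted in ascending order.

```python
import pandas as pd

# Toy dataframe with the same gappy integer index as the example above
index = [1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 18, 19, 20]
df = pd.DataFrame({'Exit_frame': range(len(index))}, index=index)

# True wherever the index does not follow its predecessor by exactly 1.
# The first row compares against NaN, so it also starts a run.
starts_new_run = df.index.to_series().diff() != 1

# The cumulative sum turns run starts into a run label: 1, 1, 1, 1, 1, 2, 2, 3, ...
run_id = starts_new_run.cumsum()

# One dataframe per run of consecutive index values
groups = [group for _, group in df.groupby(run_id)]

for group in groups:
    print(group.index.tolist())
```

If a list of dataframes is not needed, `df.groupby(run_id)` can be followed directly by an aggregation instead of being collected into `groups`.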
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.