Splitting pandas dataframe based on index value

I have a pandas dataframe that has been filtered without resetting the index. It looks like this:

    Animal_name Shape_name  Entry_frame  Exit_frame
1      Animal_1   open arm          136         142
2      Animal_1   open arm          143         148
3      Animal_1   open arm          149         190
4      Animal_1   open arm          191         318
5      Animal_1   open arm          320         340
6      Animal_1   open arm          341         350
7      Animal_1   open arm          352         357
8      Animal_1   open arm          358         398
9      Animal_1   open arm          400         408
10     Animal_1   open arm          409         410
11     Animal_1   open arm          417         420
12     Animal_1   open arm          421         433
61     Animal_1   open arm         1034        1038

I would like to split this dataframe into groups of rows whose original index values are consecutive.

Thus, I would like to have the rows with index values 1-9, 10-12, and 61 in three different groups.

How would I go about doing this?

Answers:

Method 1

Assuming you want to group rows based on whether or not their row indices differ by 1, we can try the following:

import pandas as pd
import numpy as np

# Generate toy data
random_data = np.random.random(14) 

# Create a dataframe with toy data. Adjacent indices don't always differ by 1 here. 
df = pd.DataFrame(data={'Exit_frame': random_data}, index=[1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 18, 19, 20])
# print(df)

# Store each group of rows as a dataframe in a list
grouped_rows = []

# `df.index` is a pandas Index. Track where the current group starts by storing its position in `df.index`.
first_in_group = 0

# Start at 1: the first row has no previous index to compare against.
for i in range(1, len(df.index)):

    # If adjacent indices don't differ by 1, the current group ends at the previous row
    if df.index[i] - df.index[i-1] != 1:

        # Select the group's rows by label; `.loc` slices include both endpoints
        grouped_rows.append(df.loc[df.index[first_in_group]:df.index[i-1]])

        # Update the position of the first element for the upcoming group
        first_in_group = i

# The loop only closes a group when it finds a gap, so append the last group here
grouped_rows.append(df.loc[df.index[first_in_group]:])

print(grouped_rows)
print(type(grouped_rows[0]))

This gives output like the following. The values are random, so they differ on each run, and the final group (indices 18 to 20) is not shown:

[   Exit_frame
1    0.236379
2    0.217071
3    0.470676
4    0.767448
5    0.125440,    Exit_frame
7    0.930271
8    0.985220,     Exit_frame
11    0.042104
12    0.483714
13    0.270400,     Exit_frame
15    0.047957]
<class 'pandas.core.frame.DataFrame'>
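The same grouping can also be done without an explicit loop. The following is a minimal sketch, not part of the original answer, assuming the index is integer-valued, sorted, and free of duplicates. It flags every row whose index does not follow the previous one by exactly 1, turns those flags into a running group id with `cumsum`, and passes that id to `groupby`. The names `starts_new_run` and `run_id` are illustrative, and the toy values are plain integers rather than random numbers.

```python
import pandas as pd

# Toy data with the same index as the example above (values are arbitrary integers).
index = [1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 18, 19, 20]
df = pd.DataFrame({'Exit_frame': range(len(index))}, index=index)

# A new run starts wherever the index does not follow its predecessor by exactly 1.
# The first difference is NaN, and NaN != 1 is True, so the first row opens the first run.
starts_new_run = df.index.to_series().diff() != 1

# The cumulative sum of the boolean flags assigns each row a run id (1, 2, 3, ...).
run_id = starts_new_run.cumsum()

# Collect each run of consecutive indices as its own dataframe.
grouped_rows = [group for _, group in df.groupby(run_id)]

for group in grouped_rows:
    print(group.index.tolist())
```

For this index the sketch yields five groups: 1 to 5, 7 to 8, 11 to 13, 15 alone, and 18 to 20. Because the group id is computed for every row, the trailing run is included without any special handling.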


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
