What is the pythonic way to slice a dataframe by more index ranges (eg. by 10:12 and 25:28)?
I want this in a more elegant way:
df = pd.DataFrame({'a':range(10,100)})
df.iloc[[i for i in range(10,12)] + [i for i in range(25,28)]]
Result:
a 10 20 11 21 25 35 26 36 27 37
Something like this would be more elegant:
df.iloc[(10:12, 25:28)]
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can use numpy’s r_ “slicing trick”:
df = pd.DataFrame({'a':range(10,100)})
df.iloc[pd.np.r_[10:12, 25:28]]
NOTE: this now gives a warning The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead. To do that, you can import numpy as np and then slice the following way:
df.iloc[np.r_[10:12, 25:28]]
This gives:
a 10 20 11 21 25 35 26 36 27 37
Method 2
You can take advantage of pandas isin function.
df = pd.DataFrame({'a':range(10,100)})
ls = [i for i in range(10,12)] + [i for i in range(25,28)]
df[df.index.isin(ls)]
a
10 20
11 21
25 35
26 36
27 37
Method 3
Building off of @KevinOelen’s use of Panda’s isin function, here is a pythonic way (Python 3.8) to glance at a Pandas DataFrame or GeoPandas GeoDataFrame revealing just a few rows of the head and tail. This method does not require importing numpy.
To use just call glance(your_df). Additional explanation in docstring.
import pandas as pd
import geopandas as gpd # if not needed, remove gpd.GeoDataFrame from the type hinting and no need to import Union
from typing import Union
def glance(df: Union[pd.DataFrame, gpd.GeoDataFrame], size: int = 2) -> None:
""" Provides a shortened head and tail summary of a Dataframe or GeoDataFrame in Jupyter Notebook or Lab.
Usage
----------
# default glance (2 head rows, 2 tail rows)
glance( df )
# glance defined number of rows in head and tail (3 head rows, 3 tails rows)
glance( df, size=3 )
Parameters
----------
:param df: Union[pd.DataFrame, gpd.GeoDataFrame]: A (Geo)Pandas data frame to glance at.
:param size: int: The number of rows in the head and tail to display, total rows will be double provided size.
:return: None: Displays result in Notebook or Lab.
"""
# min and max of the provided dataframe index
min_ = df.index.min()
max_ = df.index.max()
# define slice
sample = [i for i in range(min_, size)] + [i for i in range(max_ - size, max_)]
# slice
df = df[df.index.isin(sample)]
# display
display( df )
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0