Finding label location in a DataFrame Index

I have a pandas dataframe:

import pandas as pnd
d = pnd.Timestamp('2013-01-01 16:00')
dates = pnd.bdate_range(start=d, end = d+pnd.DateOffset(days=10), normalize = False)

df = pnd.DataFrame(index=dates, columns=['a'])
df['a'] = 6

print(df)
                     a
2013-01-01 16:00:00  6
2013-01-02 16:00:00  6
2013-01-03 16:00:00  6
2013-01-04 16:00:00  6
2013-01-07 16:00:00  6
2013-01-08 16:00:00  6
2013-01-09 16:00:00  6
2013-01-10 16:00:00  6
2013-01-11 16:00:00  6

I am interested in find the label location of one of the labels, say,

ds = pnd.Timestamp('2013-01-02 16:00')

Looking at the index values, I know that is integer location of this label 1. How can get pandas to tell what the integer value of this label is?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

You’re looking for the index method get_loc:

In [11]: df.index.get_loc(ds)
Out[11]: 1

Method 2

Get dataframe integer index given a date key:

>>> import pandas as pd

>>> df = pd.DataFrame(
    index=pd.date_range(pd.datetime(2008,1,1), pd.datetime(2008,1,5)),
    columns=("foo", "bar"))

>>> df["foo"] = [10,20,40,15,10]

>>> df["bar"] = [100,200,40,-50,-38]

>>> df
            foo  bar
2008-01-01   10  100
2008-01-02   20  200
2008-01-03   40   40
2008-01-04   15  -50
2008-01-05   10  -38

>>> df.index.get_loc(df["bar"].argmax())
1

>>> df.index.get_loc(df["foo"].argmax())
2

In column bar, the index of the maximum value is 1

In column foo, the index of the maximum value is 2

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.get_loc.html

Method 3

get_loc can be used for rows and columns according to:

import pandas as pnd
d = pnd.Timestamp('2013-01-01 16:00')
dates = pnd.bdate_range(start=d, end = d+pnd.DateOffset(days=10), normalize = False)

df = pnd.DataFrame(index=dates)
df['a'] = 5
df['b'] = 6
print(df.head())    
                     a  b
2013-01-01 16:00:00  5  6
2013-01-02 16:00:00  5  6
2013-01-03 16:00:00  5  6
2013-01-04 16:00:00  5  6
2013-01-07 16:00:00  5  6

#for rows
print(df.index.get_loc('2013-01-01 16:00:00'))  
 0
#for columns
print(df.columns.get_loc('b'))
 1

Method 4

Because get_loc returns a mask rather than a list of integer index locations when there are multiple instances of the key in the index, I was toying with an answer using reset_index():

# Add a duplicate!!!
dup = pd.Timestamp('2013-01-07 16:00')
df = df.append(pd.DataFrame([7],columns=['a'],index=[dup]))
df

                    a
2013-01-01 16:00:00 6
2013-01-02 16:00:00 6
2013-01-03 16:00:00 6
2013-01-04 16:00:00 6
2013-01-07 16:00:00 6
2013-01-08 16:00:00 6
2013-01-09 16:00:00 6
2013-01-10 16:00:00 6
2013-01-11 16:00:00 6
2013-01-07 16:00:00 7
2013-01-08 16:00:00 3

# Only use this method if the key has duplicates
if (df.loc[dup].index.has_duplicates):
    df.reset_index().loc[df.index.get_loc(dup)].index.to_list()

array([4, 9])


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x