Convert pandas DatetimeIndex to Unix Time?

What is the idiomatic way of converting a pandas DatetimeIndex to (an iterable of) Unix time?
This is probably not the way to go:

[time.mktime(t.timetuple()) for t in my_data_frame.index.to_pydatetime()]

Answers:

Method 1

Because a DatetimeIndex is an ndarray of nanosecond integers under the hood, you can do the conversion without a list comprehension, which is much faster.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: from datetime import datetime

In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
   ...: index = pd.DatetimeIndex(dates)
   ...: 
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000], 
        dtype=int64)

In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)

%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop

%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop

Method 2

Note: a Timestamp's value is just Unix time in nanoseconds (so floor-divide it by 10**9):

[t.value // 10 ** 9 for t in tsframe.index]

For example:

In [1]: t = pd.Timestamp('2000-02-11 00:00:00')

In [2]: t
Out[2]: <Timestamp: 2000-02-11 00:00:00>

In [3]: t.value
Out[3]: 950227200000000000L

In [4]: time.mktime(t.timetuple())
Out[4]: 950227200.0
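
Note that `time.mktime` interprets the time tuple in the machine's local time zone, so it matches `t.value // 10**9` only when the local zone is UTC. A minimal sketch of the same comparison using `calendar.timegm` from the standard library, which treats the tuple as UTC (the variable names are chosen here for illustration):

```python
import calendar

import pandas as pd

t = pd.Timestamp("2000-02-11 00:00:00")

# t.value is nanoseconds since the Unix epoch, treating the naive Timestamp as UTC.
seconds_from_value = t.value // 10**9

# calendar.timegm reads the time tuple as UTC, unlike time.mktime (local time).
seconds_from_timegm = calendar.timegm(t.timetuple())

print(seconds_from_value, seconds_from_timegm)
```

Both values agree regardless of the time zone the code runs in.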

As @root points out, it’s faster to extract the array of values directly:

tsframe.index.astype(np.int64) // 10 ** 9

Method 3

A summary of other answers:

df['<time_col>'].astype(np.int64) // 10**9

If you want to keep the milliseconds, divide by 10**6 instead.
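
As a rough illustration of the seconds and milliseconds variants side by side, here is a sketch on a made-up frame. The column name `time_col` and the sample values are assumptions, and the explicit cast to `datetime64[ns]` is there so the nanosecond assumption holds even on pandas versions that may infer a coarser unit:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two timestamps with sub-second parts.
df = pd.DataFrame(
    {"time_col": pd.to_datetime(["2012-05-01 00:00:00.250",
                                 "2012-05-02 00:00:00.750"])}
)

# Assumption: make sure the values are stored as nanoseconds before casting.
ns = df["time_col"].astype("datetime64[ns]").astype(np.int64)

seconds = ns // 10**9  # whole seconds since the epoch
millis = ns // 10**6   # milliseconds since the epoch

print(seconds.tolist())
print(millis.tolist())
```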

Method 4

Complementing the other answers: // 10**9 does a floor division, which truncates to the last full second rather than giving the nearest second. A simple way to get more reasonable rounding, if that is desired, is to add 5*10**8 - 1 before doing the floor division.
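
A small sketch of the difference, using three made-up timestamps at 1.4 s, 1.5 s and 1.6 s after the epoch. With the `5*10**8 - 1` offset, a value exactly halfway between two seconds still rounds down:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 1.4 s, 1.5 s and 1.6 s after the epoch, stored as nanoseconds.
values = np.array(
    ["1970-01-01T00:00:01.4", "1970-01-01T00:00:01.5", "1970-01-01T00:00:01.6"],
    dtype="datetime64[ns]",
)
index = pd.DatetimeIndex(values)

ns = index.astype(np.int64)

floored = ns // 10**9                    # truncates to the last full second
rounded = (ns + 5 * 10**8 - 1) // 10**9  # nearest second, exact halves round down

print(floored.tolist())
print(rounded.tolist())
```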

Method 5

To address the case of NaT, which the above solutions convert to large negative ints, a possible solution in pandas>=0.24 would be:

def datetime_to_epoch(ser):
    """Don't convert NaT to large negative values."""
    if ser.hasnans:
        res = ser.dropna().astype('int64').astype('Int64').reindex(index=ser.index)
    else:
        res = ser.astype('int64')

    return res // 10**9

In the case of missing values this will return the nullable int type ‘Int64’ (ExtensionType pd.Int64Dtype):

In [5]: dt = pd.to_datetime(pd.Series(["2019-08-21", "2018-07-28", np.nan]))
In [6]: datetime_to_epoch(dt)
Out[6]: 
0    1566345600
1    1532736000
2           NaN
dtype: Int64

Otherwise a regular int64:

In [7]: datetime_to_epoch(dt[:2])
Out[7]: 
0    1566345600
1    1532736000
dtype: int64

Method 6

If you have tried this on the datetime column of your dataframe:

dframe['datetime'].astype(np.int64) // 10**9

and you get the following error: TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp', you can use these two lines:

dframe.index = pd.DatetimeIndex(dframe['datetime'])
dframe['datetime'] = dframe.index.astype(np.int64) // 10**9
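
One likely cause of that TypeError is a column with `object` dtype that holds Timestamp objects rather than a real `datetime64` column. Under that assumption, a sketch that converts the column with `pd.to_datetime` first and leaves the frame's index untouched (the sample data and the `epoch` column name are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose 'datetime' column is object dtype holding Timestamps.
dframe = pd.DataFrame(
    {"datetime": pd.Series([pd.Timestamp("2012-05-01"),
                            pd.Timestamp("2012-05-02")], dtype=object)}
)
print(dframe["datetime"].dtype)  # object, not datetime64

# Convert to a proper datetime64[ns] column, then take whole seconds.
as_datetime = pd.to_datetime(dframe["datetime"]).astype("datetime64[ns]")
dframe["epoch"] = as_datetime.astype(np.int64) // 10**9

print(dframe["epoch"].tolist())
```

Writing to a separate column keeps the original values available; assigning back to `dframe['datetime']` works the same way if that is preferred.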

Method 7

The code from the other answers

dframe['datetime'].astype(np.int64) // 10**9

prints the following warning as of the time of my post:

FutureWarning: casting datetime64[ns] values to int64 with
.astype(...) is deprecated and will raise in a future version. Use
.view(...) instead.

So use the following instead:

dframe['datetime'].view(np.int64) // 10 ** 9
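
Which of `.astype` and `.view` warns depends on the pandas version (newer releases deprecate `Series.view` in turn, so check the release notes for the version you run). A minimal sketch of a route that avoids the integer cast altogether, following the epoch-subtraction pattern from the pandas time series user guide; the sample frame is made up for illustration:

```python
import pandas as pd

# Hypothetical frame with a datetime64 column.
dframe = pd.DataFrame(
    {"datetime": pd.to_datetime(["2012-05-01", "2012-05-02", "2012-05-03"])}
)

# Subtract the epoch to get timedeltas, then count whole seconds in each.
epoch = pd.Timestamp("1970-01-01")
seconds = (dframe["datetime"] - epoch) // pd.Timedelta("1s")

print(seconds.tolist())
```

This does not depend on the values being stored as nanoseconds, which is the assumption behind dividing by 10**9.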


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
