I have a pandas dataframe with over 1000 timestamps (below) that I would like to loop through:
2016-02-22 14:59:44.561776
I’m having a hard time splitting this timestamp into two columns, ‘date’ and ‘time’. The date format can stay the same, but the time needs to be converted to CST (including milliseconds).
Thanks for the help
Answers:
Method 1
I had the same problem, and this worked for me.
Suppose the date column in your dataset is called “date”:
import pandas as pd
df = pd.read_csv(file_path)
df['Dates'] = pd.to_datetime(df['date']).dt.date
df['Time'] = pd.to_datetime(df['date']).dt.time
This gives you two columns, “Dates” and “Time”, holding the date part and the time part of each timestamp.
Method 2
I’m not sure why you would want to do this in the first place, but if you really must…
df = pd.DataFrame({'my_timestamp': pd.date_range('2016-1-1 15:00', periods=5)})
>>> df
my_timestamp
0 2016-01-01 15:00:00
1 2016-01-02 15:00:00
2 2016-01-03 15:00:00
3 2016-01-04 15:00:00
4 2016-01-05 15:00:00
df['new_date'] = [d.date() for d in df['my_timestamp']]
df['new_time'] = [d.time() for d in df['my_timestamp']]
>>> df
my_timestamp new_date new_time
0 2016-01-01 15:00:00 2016-01-01 15:00:00
1 2016-01-02 15:00:00 2016-01-02 15:00:00
2 2016-01-03 15:00:00 2016-01-03 15:00:00
3 2016-01-04 15:00:00 2016-01-04 15:00:00
4 2016-01-05 15:00:00 2016-01-05 15:00:00
The conversion to CST is trickier. I assume that the current timestamps are ‘unaware’ (naive), i.e. they do not have a timezone attached? If so, you need to know which timezone they were recorded in before you can convert them.
For more details:
https://docs.python.org/2/library/datetime.html
How to make an unaware datetime timezone aware in python
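As a rough illustration of the conversion, here is a minimal sketch. It assumes the naive timestamps were recorded in UTC and that “CST” means US Central time; neither is stated in the question. Note that the "America/Chicago" zone observes daylight saving time, so summer dates come out as CDT.

```python
import pandas as pd

df = pd.DataFrame({"my_timestamp": pd.to_datetime(["2016-02-22 14:59:44.561776"])})

# Assumption: the naive timestamps are in UTC. Attach that zone first,
# then convert to US Central time.
central = (
    df["my_timestamp"]
    .dt.tz_localize("UTC")
    .dt.tz_convert("America/Chicago")
)

# .dt.date and .dt.time return the local (Central) date and time of day
df["new_date"] = central.dt.date
df["new_time"] = central.dt.time
```

If the timestamps were recorded in some other zone, pass that zone name to tz_localize instead of "UTC".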
EDIT
An alternative method that only loops once across the timestamps instead of twice:
new_dates, new_times = zip(*[(d.date(), d.time()) for d in df['my_timestamp']])
df = df.assign(new_date=new_dates, new_time=new_times)
Method 3
The easiest way is to use the pandas.Series dt accessor, which works on columns with a datetime dtype (see pd.to_datetime). For this case, pd.date_range creates an example column with a datetime dtype, therefore use .dt.date and .dt.time:
df = pd.DataFrame({'full_date': pd.date_range('2016-1-1 10:00:00.123', periods=10, freq='5H')})
df['date'] = df['full_date'].dt.date
df['time'] = df['full_date'].dt.time
In [166]: df
Out[166]:
full_date date time
0 2016-01-01 10:00:00.123 2016-01-01 10:00:00.123000
1 2016-01-01 15:00:00.123 2016-01-01 15:00:00.123000
2 2016-01-01 20:00:00.123 2016-01-01 20:00:00.123000
3 2016-01-02 01:00:00.123 2016-01-02 01:00:00.123000
4 2016-01-02 06:00:00.123 2016-01-02 06:00:00.123000
5 2016-01-02 11:00:00.123 2016-01-02 11:00:00.123000
6 2016-01-02 16:00:00.123 2016-01-02 16:00:00.123000
7 2016-01-02 21:00:00.123 2016-01-02 21:00:00.123000
8 2016-01-03 02:00:00.123 2016-01-03 02:00:00.123000
9 2016-01-03 07:00:00.123 2016-01-03 07:00:00.123000
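Since the question asks for the time including milliseconds, one option is to format the time as text and trim the microsecond digits down to three. A minimal sketch, assuming a string column is acceptable; the column name time_ms is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'full_date': pd.to_datetime(['2016-01-01 10:00:00.123', '2016-01-01 15:00:00.123'])
})

# %f always produces six microsecond digits; dropping the last three leaves milliseconds.
# 'time_ms' is a hypothetical column name; its values are strings, not datetime.time objects.
df['time_ms'] = df['full_date'].dt.strftime('%H:%M:%S.%f').str[:-3]
```

This truncates rather than rounds, so 10:00:00.123999 would become 10:00:00.123.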
Method 4
If your timestamps already have a pandas datetime dtype (not strings), then use the .dt accessor:
df["date"] = df["timestamp"].dt.date
df["time"] = df["timestamp"].dt.time
If your timestamp is a string, you can parse it using the datetime module:
from datetime import datetime
df["timestamp"] = df["timestamp"].apply(lambda x:
    datetime.strptime(x, "%Y-%m-%d %H:%M:%S.%f"))
Source:
http://pandas.pydata.org/pandas-docs/stable/timeseries.html
Method 5
If your timestamp is a string, you can convert it to a datetime object:
from datetime import datetime
timestamp = '2016-02-22 14:59:44.561776'
dt = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S.%f')
From then on you can bring it to whatever format you like.
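For example, the parsed object can be split into date and time parts, either as objects or as formatted text. A small sketch using only the standard library; the variable names are chosen just for illustration:

```python
from datetime import datetime

timestamp = '2016-02-22 14:59:44.561776'
dt = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S.%f')

date_part = dt.date()                    # datetime.date object
time_part = dt.time()                    # datetime.time object, microseconds kept

date_text = dt.strftime('%Y-%m-%d')      # date as a string
time_text = dt.strftime('%H:%M:%S.%f')   # time as a string with six fractional digits
```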
Method 6
Try
s = '2016-02-22 14:59:44.561776'
date, time = s.split()
then convert time as needed.
If you want to further split the time,
hour, minute, second = time.split(':')
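Keep in mind that the pieces are still strings, and the seconds piece carries the fractional digits. A minimal sketch of turning them into numbers; the names whole_second and fraction are made up for illustration:

```python
s = '2016-02-22 14:59:44.561776'
date, time = s.split()

# All three pieces are strings; the seconds piece still contains the fraction
hour, minute, second = time.split(':')

# partition('.') separates whole seconds from the fractional digits, if any
whole_second, _, fraction = second.partition('.')
hour, minute, whole_second = int(hour), int(minute), int(whole_second)
```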
Method 7
try this:
def time_date(datetime_obj):
    date_time = datetime_obj.split(' ')
    time = date_time[1].split('.')
    return date_time[0], time[0]
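A quick usage sketch, to show what this returns. Despite the parameter name, the function expects a string, and it discards the fractional seconds, which may not be what the question wants:

```python
def time_date(datetime_obj):
    # Expects a string such as '2016-02-22 14:59:44.561776'
    date_time = datetime_obj.split(' ')
    time = date_time[1].split('.')
    return date_time[0], time[0]

# Both results are plain strings; the part after the '.' is dropped
date_str, time_str = time_date('2016-02-22 14:59:44.561776')
```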
Method 8
Building on @Alexander’s answer, if you want a one-liner:
df['new_date'],df['new_time'] = zip(*[(d.date(), d.time()) for d in df['my_timestamp']])
Method 9
If your timestamp is a string, you can convert it to a pandas timestamp before splitting it.
# convert to pandas timestamp
data["old_date"] = pd.to_datetime(data.old_date)
# split columns
data["new_date"] = data["old_date"].dt.date
data["new_time"] = data["old_date"].dt.time
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.