Hi, I am using pandas to extract the month from a date column.
When I read my data, the values are objects:
Date    object
dtype: object
So I first convert them to datetime and then try to extract the month:
import pandas as pd
file = '/pathtocsv.csv'
df = pd.read_csv(file, sep=',', encoding='utf-8-sig', usecols=['Date', 'ids'])
df['Date'] = pd.to_datetime(df['Date'])
df['Month'] = df['Date'].dt.month
Also if that helps:
In [10]: df['Date'].dtype
Out[10]: dtype('O')
So, the error I get is like this:
/Library/Frameworks/Python.framework/Versions/2.7/bin/User/lib/python2.7/site-packages/pandas/core/series.pyc in _make_dt_accessor(self)
2526 return maybe_to_datetimelike(self)
2527 except Exception:
-> 2528 raise AttributeError("Can only use .dt accessor with datetimelike "
2529 "values")
2530
AttributeError: Can only use .dt accessor with datetimelike values
EDITED:
Date columns are like this:
0   2014-01-01
1   2014-01-01
2   2014-01-01
3   2014-01-01
4   2014-01-03
5   2014-01-03
6   2014-01-03
7   2014-01-07
8   2014-01-08
9   2014-01-09
Do you have any ideas?
Thank you very much!
Answers:
Method 1
Your problem here is that to_datetime silently failed, so the dtype remained str/object. If you pass errors='coerce', any string that fails to convert is set to NaT instead:
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
So you need to find out what is wrong with those specific row values.
See the pandas.to_datetime docs.
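One way to find the offending values is to compare the coerced result with the original column. The following is a minimal sketch, assuming a small made-up DataFrame in place of the real CSV; the sample values and the name bad_rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: one value is not a valid date.
df = pd.DataFrame({"Date": ["2014-01-01", "2014-01-03", "not a date", "2014-01-09"]})

# With errors="coerce", strings that cannot be parsed become NaT.
parsed = pd.to_datetime(df["Date"], errors="coerce")

# Rows that had a value originally but failed to parse.
bad_rows = df.loc[parsed.isna() & df["Date"].notna(), "Date"]
print(bad_rows)

# Once the column is datetime, the .dt accessor works.
df["Date"] = parsed
df["Month"] = df["Date"].dt.month
```

Inspecting bad_rows should show which strings need cleaning before the conversion can succeed for every row.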
Method 2
First, you need to specify the format of the date column.
df['Date'] = pd.to_datetime(df.Date, format='%Y-%m-%d %H:%M:%S')
For your case, the format can be set to:
df['Date'] = pd.to_datetime(df.Date, format='%Y-%m-%d')
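After that, you can change the output to your desired format as follows. Note that strftime returns strings, so the column is no longer datetime after this step: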
df['Date'] = df['Date'].dt.strftime('%Y-%m-%d')
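If both the month and a formatted string are needed, one option is to keep the datetime column and write the formatted text to a separate column. This is only a sketch on made-up data; the column name Date_str is hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data in the same format as the question.
df = pd.DataFrame({"Date": ["2014-01-01", "2014-01-03", "2014-02-07"]})

# Parse with an explicit format; the column becomes datetime64.
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
df["Month"] = df["Date"].dt.month

# Store formatted strings in a separate column so that "Date"
# stays datetime and .dt keeps working on it.
df["Date_str"] = df["Date"].dt.strftime("%m/%d")
```

Overwriting df['Date'] with the strftime result would turn it back into strings, and a later df['Date'].dt call would raise the same AttributeError again.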
Method 3
Your problem here is that the dtype of 'Date' remained str/object. You can use the parse_dates parameter of read_csv to parse the column while reading the file:
import pandas as pd
file = '/pathtocsv.csv'
col = 'Date'  # the date column to parse
df = pd.read_csv(file, sep=',', parse_dates=[col], encoding='utf-8-sig', usecols=['Date', 'ids'])
df['Month'] = df['Date'].dt.month
From the documentation for the parse_dates parameter
parse_dates : bool or list of int or names or list of lists or dict, default False
The behavior is as follows:
- boolean. If True -> try parsing the index.
- list of int or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
- list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
- dict, e.g. {'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'
If a column or index cannot be represented as an array of datetimes, say because of an unparseable value or a mixture of timezones, the column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See Parsing a CSV with mixed timezones for more.
Note: A fast-path exists for iso8601-formatted dates.
The relevant case for this question is the “list of int or names” one.
Here col identifies the 'Date' column (by name or by column index), which is parsed as a separate date column.
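The same idea can be tried without a file on disk. The sketch below assumes a small in-memory CSV (csv_text is made up for illustration) and passes the column name directly to parse_dates.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = "Date,ids,other\n2014-01-01,1,a\n2014-01-03,2,b\n2014-02-07,3,c\n"

# parse_dates=["Date"] asks read_csv to parse that column as datetime.
df = pd.read_csv(io.StringIO(csv_text), usecols=["Date", "ids"], parse_dates=["Date"])

# The column is already datetime64, so .dt works without a separate to_datetime call.
df["Month"] = df["Date"].dt.month
print(df.dtypes)
```

As the quoted documentation says, a column containing an unparseable value is returned unaltered as object, so it is worth checking df.dtypes after reading.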
Method 4
# Convert the column to datetime so that datetime operations can be performed on it
df_Time_Table["Date"] = pd.to_datetime(df_Time_Table["Date"])
# Calculate the year
df_Time_Table['Year'] = df_Time_Table['Date'].dt.strftime('%Y')
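Note that strftime('%Y') yields the year as a string. If a numeric year is preferred, the .dt.year attribute is an alternative. A short sketch on made-up data, with the column name Year_str being hypothetical:

```python
import pandas as pd

# Hypothetical sample data, already converted to datetime.
df = pd.DataFrame({"Date": pd.to_datetime(["2014-01-01", "2015-06-15"])})

# .dt.year gives integers, convenient for arithmetic and comparisons.
df["Year"] = df["Date"].dt.year

# .dt.strftime gives strings, convenient for display or labels.
df["Year_str"] = df["Date"].dt.strftime("%Y")
```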
Method 5
If you write
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df['Date'] = df['Date'].dt.strftime('%m/%d')
the error is fixed.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.