Scenario: I have a dataframe with multiple columns retrieved from Excel worksheets. Some of these columns are dates: some have just the date (yyyy:mm:dd) and some have the date and a timestamp (yyyy:mm:dd 00.00.000000).
Question: How can I remove the time stamp from the dates when they are not the index of my dataframe?
What I already tried: From other posts here in SO (working with dates in pandas – remove unseen characters in datetime and convert to string and How to strip a pandas datetime of date, hours and seconds) I found:
pd.DatetimeIndex(dfST['timestamp']).date
and
strftime: df['timestamp'].apply(lambda x: x.strftime('%Y-%m-%d'))
But I can’t seem to find a way to apply those directly to the wanted column when it is not the index of my dataframe.
Answers:
Method 1
You can do the following:
dfST['timestamp'] = pd.to_datetime(dfST['timestamp'])
to_datetime() will infer the format of the date column. If the column contains non-date values, you can also pass errors='coerce', which converts any value that cannot be parsed to NaT instead of raising an error.
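A minimal sketch of the conversion step, in case it helps to see errors='coerce' in action. The sample values are made up for illustration; only the dfST and timestamp names come from the question.

```python
import pandas as pd

# Hypothetical sample data: two datetime strings and one non-date value.
dfST = pd.DataFrame(
    {"timestamp": ["2015-03-01 08:15:30", "2015-03-02 00:00:00", "not a date"]}
)

# errors='coerce' turns values that cannot be parsed into NaT instead of raising.
dfST["timestamp"] = pd.to_datetime(dfST["timestamp"], errors="coerce")

# The column now has a datetime dtype, and the bad value shows up as missing.
print(dfST["timestamp"].dtype)
print(dfST["timestamp"].isna().sum())
```

Checking isna() after the conversion is a quick way to see how many rows failed to parse before going further.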
After completing the above, you’ll be able to create a new column containing only date values:
dfST['new_date_column'] = dfST['timestamp'].dt.date
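Putting it together on a column that is not the index, here is a small sketch with made-up sample values. The midnight and date_string columns are not part of the answer above; they are hypothetical names used to show two alternatives: .dt.normalize(), which keeps the datetime dtype and sets the time to 00:00:00, and .dt.strftime(), which gives plain strings.

```python
import datetime

import pandas as pd

# Hypothetical sample data; 'timestamp' is an ordinary column, not the index.
dfST = pd.DataFrame(
    {"timestamp": pd.to_datetime(["2015-03-01 08:15:30", "2015-03-02 17:45:00"])}
)

# .dt.date returns Python datetime.date objects (the column becomes object dtype).
dfST["new_date_column"] = dfST["timestamp"].dt.date

# Alternative (not from the answer): keep a datetime dtype, time reset to midnight.
dfST["midnight"] = dfST["timestamp"].dt.normalize()

# Alternative (not from the answer): format the dates as strings.
dfST["date_string"] = dfST["timestamp"].dt.strftime("%Y-%m-%d")

print(dfST)
```

Which one fits depends on what comes next: .dt.date and .dt.strftime() drop the datetime dtype, so further date arithmetic is easier on the normalized column.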
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.