I’m using pandas and I’m wondering what’s the easiest way to get the business days between a start and end date using pandas?
There are a lot of posts out there regarding doing this in Python (for example), but I would be interested to use directly pandas as I think that pandas can probably handle this quite easy.
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can also use date_range for this purpose.
In [3]: pd.date_range('2011-01-05', '2011-01-09', freq=BDay())
Out[3]: DatetimeIndex(['2011-01-05', '2011-01-06', '2011-01-07'], dtype='datetime64[ns]', freq='B', tz=None)
EDIT
Or even simpler
In [7]: pd.bdate_range('2011-01-05', '2011-01-09')
Out[7]: DatetimeIndex(['2011-01-05', '2011-01-06', '2011-01-07'], dtype='datetime64[ns]', freq='B', tz=None)
Note that both start and end dates are inclusive.
Source: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.bdate_range.html
Method 2
As of v0.14 you can use holiday calendars.
from pandas.tseries.holiday import USFederalHolidayCalendar from pandas.tseries.offsets import CustomBusinessDay us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar()) print pd.DatetimeIndex(start='2010-01-01',end='2010-01-15', freq=us_bd)
returns:
DatetimeIndex(['2010-01-04', '2010-01-05', '2010-01-06', '2010-01-07',
'2010-01-08', '2010-01-11', '2010-01-12', '2010-01-13',
'2010-01-14', '2010-01-15'],
dtype='datetime64[ns]', freq='C')
Method 3
Just be careful when using bdate_range or BDay() – the name might mislead you to think that it is a range of business days, whereas in reality it’s just calendar days with weekends stripped out (ie. it doesn’t take holidays into account).
Method 4
Use BDay() to get the business days in range.
from pandas.tseries.offsets import * In [185]: s Out[185]: 2011-01-01 -0.011629 2011-01-02 -0.089666 2011-01-03 -1.314430 2011-01-04 -1.867307 2011-01-05 0.779609 2011-01-06 0.588950 2011-01-07 -2.505803 2011-01-08 0.800262 2011-01-09 0.376406 2011-01-10 -0.469988 Freq: D In [186]: s.asfreq(BDay()) Out[186]: 2011-01-03 -1.314430 2011-01-04 -1.867307 2011-01-05 0.779609 2011-01-06 0.588950 2011-01-07 -2.505803 2011-01-10 -0.469988 Freq: B
With slicing:
In [187]: x=datetime(2011, 1, 5) In [188]: y=datetime(2011, 1, 9) In [189]: s.ix[x:y] Out[189]: 2011-01-05 0.779609 2011-01-06 0.588950 2011-01-07 -2.505803 2011-01-08 0.800262 2011-01-09 0.376406 Freq: D In [190]: s.ix[x:y].asfreq(BDay()) Out[190]: 2011-01-05 0.779609 2011-01-06 0.588950 2011-01-07 -2.505803 Freq: B
and count()
In [191]: s.ix[x:y].asfreq(BDay()).count() Out[191]: 3
Method 5
On top of this answer and xone, we can write a short function to return the trading days of US exchange:
from xone import calendar
def business_dates(start, end):
us_cal = calendar.USTradingCalendar()
kw = dict(start=start, end=end)
return pd.bdate_range(**kw).drop(us_cal.holidays(**kw))
In [1]: business_dates(start='2018-12-20', end='2018-12-31')
Out[1]: DatetimeIndex(['2018-12-20', '2018-12-21', '2018-12-24', '2018-12-26',
'2018-12-27', '2018-12-28', '2018-12-31'],
dtype='datetime64[ns]', freq=None)
Edit March 2019:
Replace DatetimeIndex with bdate_range for pandas 0.24.0 update:
- Creating a TimedeltaIndex, DatetimeIndex, or PeriodIndex by passing range arguments start, end, and periods is deprecated in favor of timedelta_range(), date_range(), or period_range()
Method 6
We can use pd.bdate_range.
Example:
In [1]: pd.bdate_range("2020-01-01", "2020-01-06")
Out[1]: DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-06'], dtype='datetime64[ns]', freq='B')
Method 7
If you also work on Saturdays or have an unusual working week, you also want to exclude public holidays in your country.
import pandas as pd
from datetime import datetime
weekmask = 'Sun Mon Tue Wed Thu'
exclude = [pd.datetime(2020, 5, 1),
pd.datetime(2020, 5, 2),
pd.datetime(2020, 5, 3)]
pd.bdate_range('2020/4/30','2020/5/26',
freq='C',
weekmask = weekmask,
holidays=exclude )
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0