Extracting just Month and Year separately from Pandas Datetime column
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: The World Wide Mind
--
Chapters
00:00 Extracting Just Month And Year Separately From Pandas Datetime Column
01:07 Answer 1 Score 165
02:35 Accepted Answer Score 618
02:54 Answer 3 Score 50
03:15 Answer 4 Score 374
03:40 Thank you
--
Full question
https://stackoverflow.com/questions/2514...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #datetime
#avk47
ACCEPTED ANSWER
Score 618
If you want new columns showing year and month separately you can do this:
df['year'] = pd.DatetimeIndex(df['ArrivalDate']).year
df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month
or...
df['year'] = df['ArrivalDate'].dt.year
df['month'] = df['ArrivalDate'].dt.month
Then you can combine them or work with them just as they are.
ANSWER 2
Score 374
The df['date_column'] has to be in date time format.
df['month_year'] = df['date_column'].dt.to_period('M')
You could also use D for Day, 2M for 2 Months etc. for different sampling intervals, and in case one has time series data with time stamp, we can go for granular sampling intervals such as 45Min for 45 min, 15Min for 15 min sampling etc.
ANSWER 3
Score 165
You can directly access the year and month attributes, or request a datetime.datetime:
In [15]: t = pandas.tslib.Timestamp.now()
In [16]: t
Out[16]: Timestamp('2014-08-05 14:49:39.643701', tz=None)
In [17]: t.to_pydatetime() #datetime method is deprecated
Out[17]: datetime.datetime(2014, 8, 5, 14, 49, 39, 643701)
In [18]: t.day
Out[18]: 5
In [19]: t.month
Out[19]: 8
In [20]: t.year
Out[20]: 2014
One way to combine year and month is to make an integer encoding them, such as: 201408 for August, 2014. Along a whole column, you could do this as:
df['YearMonth'] = df['ArrivalDate'].map(lambda x: 100*x.year + x.month)
or many variants thereof.
I'm not a big fan of doing this, though, since it makes date alignment and arithmetic painful later and especially painful for others who come upon your code or data without this same convention. A better way is to choose a day-of-month convention, such as final non-US-holiday weekday, or first day, etc., and leave the data in a date/time format with the chosen date convention.
The calendar module is useful for obtaining the number value of certain days such as the final weekday. Then you could do something like:
import calendar
import datetime
df['AdjustedDateToEndOfMonth'] = df['ArrivalDate'].map(
    lambda x: datetime.datetime(
        x.year,
        x.month,
        max(calendar.monthcalendar(x.year, x.month)[-1][:5])
    )
)
If you happen to be looking for a way to solve the simpler problem of just formatting the datetime column into some stringified representation, for that you can just make use of the strftime function from the datetime.datetime class, like this:
In [5]: df
Out[5]: 
            date_time
0 2014-10-17 22:00:03
In [6]: df.date_time
Out[6]: 
0   2014-10-17 22:00:03
Name: date_time, dtype: datetime64[ns]
In [7]: df.date_time.map(lambda x: x.strftime('%Y-%m-%d'))
Out[7]: 
0    2014-10-17
Name: date_time, dtype: object
ANSWER 4
Score 50
If you want the month year unique pair, using apply is pretty sleek.
df['mnth_yr'] = df['date_column'].apply(lambda x: x.strftime('%B-%Y')) 
Outputs month-year in one column.
Don't forget to first change the format to date-time before, I generally forget.
df['date_column'] = pd.to_datetime(df['date_column'])