The Python Oracle

Create a weighted mean for an irregular timeseries in pandas

Full question
https://stackoverflow.com/questions/2634...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #weightedaverage




ACCEPTED ANSWER

Score 3


Well, I figured out how to solve my problem. I don't know whether it is a nice solution, but it works.

I changed the original code in the question by replacing datetime.time with datetime.datetime; otherwise it won't work (there is no total_seconds() method for datetime.time objects). I also had to import numpy to be able to use numpy.average.
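To illustrate why the switch matters, here is a small check that is not part of the original code. Subtracting two datetime.datetime values gives a timedelta, which has total_seconds(). As far as I know, two datetime.time values cannot be subtracted at all. The variable names are only for illustration.

```python
import datetime

# Two datetime.datetime values can be subtracted, giving a timedelta.
delta = datetime.datetime(2007, 1, 1, 0, 5) - datetime.datetime(2007, 1, 1, 0, 0)
gap_seconds = delta.total_seconds()

# Two datetime.time values do not support subtraction; Python raises TypeError.
try:
    datetime.time(0, 5) - datetime.time(0, 0)
    time_subtraction_works = True
except TypeError:
    time_subtraction_works = False
```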

So now the code would be:

import datetime
import numpy as np
import pandas as pd
time_vec = [datetime.datetime(2007, 1, 1, 0, 0),
            datetime.datetime(2007, 1, 1, 0, 0),
            datetime.datetime(2007, 1, 1, 0, 5),
            datetime.datetime(2007, 1, 1, 0, 7),
            datetime.datetime(2007, 1, 1, 0, 10)]
df = pd.DataFrame([1, 2, 4, 3, 6], index=time_vec)

This little function solved my problem:

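def time_based_weighted_mean(tv_df):
    # tv_df is a Series (e.g. df[0]) indexed by datetime.datetime values.
    # Seconds between each pair of consecutive timestamps.
    time_delta = [(x - y).total_seconds()
                  for x, y in zip(tv_df.index[1:], tv_df.index[:-1])]
    # Weight of each point = gap to the previous point + gap to the next point.
    weights = [x + y for x, y in zip([0] + time_delta, time_delta + [0])]
    res = np.average(tv_df, weights=weights)
    return res

print(time_based_weighted_mean(df[0]))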

I first tried to use pd.index.diff() to compute the time_delta array, but this resulted in a numpy.datetime64 Series, and I did not know how to convert its values into floats, which np.average requires as the input type for weights.
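One possible way around the conversion problem, as a sketch rather than a tested replacement: assuming the index is a DatetimeIndex, np.diff on its values gives timedelta64 gaps, and dividing those by np.timedelta64(1, "s") turns them into float seconds. The function name time_based_weighted_mean_diff and the variable example are made up for this illustration.

```python
import datetime

import numpy as np
import pandas as pd


def time_based_weighted_mean_diff(series):
    # Hypothetical variant: same weighting as above, but using np.diff.
    idx = pd.DatetimeIndex(series.index)
    # timedelta64 gaps divided by one second give plain float seconds.
    gaps = np.diff(idx.values) / np.timedelta64(1, "s")
    # Weight of each point = gap to the previous point + gap to the next point.
    weights = np.concatenate(([0.0], gaps)) + np.concatenate((gaps, [0.0]))
    return np.average(series.to_numpy(), weights=weights)


time_vec = [datetime.datetime(2007, 1, 1, 0, 0),
            datetime.datetime(2007, 1, 1, 0, 0),
            datetime.datetime(2007, 1, 1, 0, 5),
            datetime.datetime(2007, 1, 1, 0, 7),
            datetime.datetime(2007, 1, 1, 0, 10)]
example = pd.Series([1, 2, 4, 3, 6], index=time_vec)
result = time_based_weighted_mean_diff(example)
```

With the sample data the weights come out as 0, 300, 420, 300 and 180 seconds, so the first of the two duplicate timestamps contributes nothing to the mean.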

I'm thankful for any suggestions to improve the code.