The Python Oracle

Combining multiple timeseries data to one 2d numpy array



Full question
https://stackoverflow.com/questions/1168...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #numpy #timeseries #pandas




ACCEPTED ANSWER

Score 4


Half a million is a number you can easily manage with a Python dictionary.

Read the data for all sensors from the database, fill a dictionary, and then build a numpy array or, even better, convert it to a pandas.DataFrame:

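import pandas as pd

# Sample readings per sensor as (timestamp, value) pairs.
inp1 = [(1316275620,   1), (1316275680,   2)]
inp2 = [(1316275620,  10), (1316275740,  20)]
inp3 = [(1316275680, 100), (1316275740, 200)]

inps = [('s1', inp1), ('s2', inp2), ('s3', inp3)]

# Build a nested dictionary: {sensor name: {timestamp: value}}.
data = {}
for name, inp in inps:
    d = data.setdefault(name, {})
    for timestamp, value in inp:
        d[timestamp] = value

# Outer keys become columns, inner keys become the index.
# sort_index() keeps the rows in timestamp order regardless of insertion order.
df = pd.DataFrame.from_dict(data).sort_index()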

df is now:

            s1  s2   s3
1316275620   1  10  NaN
1316275680   2 NaN  100
1316275740 NaN  20  200
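
If a plain 2d numpy array is needed rather than a DataFrame, the same frame can be converted. The following is a minimal sketch, assuming a pandas version that provides DataFrame.to_numpy() (older versions expose the same data as df.values). The names arr and timestamps are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same sample readings as above: (timestamp, value) pairs per sensor.
inps = [
    ("s1", [(1316275620, 1), (1316275680, 2)]),
    ("s2", [(1316275620, 10), (1316275740, 20)]),
    ("s3", [(1316275680, 100), (1316275740, 200)]),
]

# Nested dictionary {sensor name: {timestamp: value}}.
data = {name: dict(readings) for name, readings in inps}

# Rows are timestamps, columns are sensors; missing readings become NaN.
df = pd.DataFrame.from_dict(data).sort_index()

# 2d float array of shape (number of timestamps, number of sensors).
arr = df.to_numpy()

# Row labels (timestamps) as a separate 1d array, aligned with the rows of arr.
timestamps = df.index.to_numpy()

print(arr.shape)
print(timestamps)
```

Because missing readings are stored as NaN, the array has a floating-point dtype even though the inputs are integers. Aggregations that should ignore the gaps can use the NaN-aware numpy functions such as np.nanmean.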