Combining multiple timeseries data to one 2d numpy array

Full question
https://stackoverflow.com/questions/1168...

Question links:
[solution from the Pandas Docs]: http://pandas.sourceforge.net/dsintro.ht...

Accepted answer links:
[pandas.DataFrame]: http://pandas.pydata.org/

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #numpy #timeseries #pandas

ACCEPTED ANSWER

Score 4

Half a million entries is well within what a Python dictionary can handle.

Read the data for all sensors from the database, fill a dictionary, and then build a NumPy array or, even better, convert it to a pandas.DataFrame:

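import pandas as pd

# Sample (timestamp, value) readings for three sensors
inp1 = [(1316275620,   1), (1316275680,   2)]
inp2 = [(1316275620,  10), (1316275740,  20)]
inp3 = [(1316275680, 100), (1316275740, 200)]

inps = [('s1', inp1), ('s2', inp2), ('s3', inp3)]

# Build {sensor name: {timestamp: value}}
data = {}
for name, inp in inps:
    d = data.setdefault(name, {})
    for timestamp, value in inp:
        d[timestamp] = value

# Outer keys become columns, inner keys become the index;
# missing readings are filled with NaN
df = pd.DataFrame.from_dict(data)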

df is now:

            s1  s2   s3
1316275620   1  10  NaN
1316275680   2 NaN  100
1316275740 NaN  20  200
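
For the plain NumPy route mentioned above (fill a dictionary, then build the array), a minimal sketch could look like the following. It reuses the sample inputs from the answer; the names timestamps, row_of and arr are made up for illustration, and NaN is assumed to be an acceptable marker for missing readings.

```python
import numpy as np

inp1 = [(1316275620, 1), (1316275680, 2)]
inp2 = [(1316275620, 10), (1316275740, 20)]
inp3 = [(1316275680, 100), (1316275740, 200)]
inps = [('s1', inp1), ('s2', inp2), ('s3', inp3)]

# The sorted union of all timestamps defines the row order.
timestamps = sorted({ts for _, inp in inps for ts, _ in inp})
# Dictionary lookup from timestamp to row index (hypothetical helper name).
row_of = {ts: i for i, ts in enumerate(timestamps)}

# One row per timestamp, one column per sensor; start with NaN everywhere
# so that readings a sensor never reported stay marked as missing.
arr = np.full((len(timestamps), len(inps)), np.nan)
for col, (_, inp) in enumerate(inps):
    for ts, value in inp:
        arr[row_of[ts], col] = value
```

Here arr is a 3x3 float array whose rows follow timestamps and whose columns follow the order of inps. If the DataFrame from the answer is already available, df.sort_index().to_numpy() should give the same values without the manual loop.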