Combining multiple timeseries data to one 2d numpy array
--
Track title: CC P Beethoven - Piano Sonata No 2 in A
--
Chapters
00:00 Question
02:57 Accepted answer (Score 4)
03:44 Thank you
--
Full question
https://stackoverflow.com/questions/1168...
Question links:
[solution from the Pandas Docs]: http://pandas.sourceforge.net/dsintro.ht...
Accepted answer links:
[pandas.DataFrame]: http://pandas.pydata.org/
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #timeseries #pandas
#avk47
--
ACCEPTED ANSWER
Score 4
Half a million is a size that a Python dictionary can handle without trouble.
Read the data for all sensors from the database, fill a dictionary, and then build a NumPy array or, even better, convert it to a pandas.DataFrame:
    import pandas as pd

    # Each input is a list of (unix_timestamp, value) pairs for one sensor.
    inp1 = [(1316275620, 1), (1316275680, 2)]
    inp2 = [(1316275620, 10), (1316275740, 20)]
    inp3 = [(1316275680, 100), (1316275740, 200)]
    inps = [('s1', inp1), ('s2', inp2), ('s3', inp3)]

    # Build {sensor_name: {timestamp: value}}.
    data = {}
    for name, inp in inps:
        d = data.setdefault(name, {})
        for timestamp, value in inp:
            d[timestamp] = value

    # Outer keys become columns, inner keys become the index.
    df = pd.DataFrame.from_dict(data)
df is now:
                 s1    s2    s3
    1316275620    1    10   NaN
    1316275680    2   NaN   100
    1316275740  NaN    20   200
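The answer also mentions building a NumPy array directly. A minimal sketch of that route, reusing the sample data above, might look like the following. The names `timestamps`, `row`, and `arr` are illustrative and not from the original answer, and the row order (sorted timestamps) is an assumption.

```python
import numpy as np

# Same sample readings as above: (unix_timestamp, value) pairs per sensor.
inps = [
    ('s1', [(1316275620, 1), (1316275680, 2)]),
    ('s2', [(1316275620, 10), (1316275740, 20)]),
    ('s3', [(1316275680, 100), (1316275740, 200)]),
]

# The sorted union of all timestamps defines the rows (assumed ordering).
timestamps = sorted({ts for _, inp in inps for ts, _ in inp})
row = {ts: i for i, ts in enumerate(timestamps)}

# One row per timestamp, one column per sensor; NaN marks a missing reading.
arr = np.full((len(timestamps), len(inps)), np.nan)
for col, (name, inp) in enumerate(inps):
    for ts, value in inp:
        arr[row[ts], col] = value
```

If the DataFrame has already been built, `df.to_numpy()` gives a comparable 2D array without the manual loop.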