The Python Oracle

Convert Python dict into a dataframe

Full question
https://stackoverflow.com/questions/1883...

Question links:
[Unicode]: http://en.wikipedia.org/wiki/Unicode

Answer 1 links:
[the pandas docs]: https://pandas.pydata.org/pandas-docs/st...

Answer 2 links:
[pandas.DataFrame.from_dict]: https://pandas.pydata.org/pandas-docs/st...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe



ACCEPTED ANSWER

Score 868


The error here arises from calling the DataFrame constructor with scalar values (where it expects each value to be a list/dict/..., i.e. a whole column rather than a single value):

pd.DataFrame(d)
ValueError: If using all scalar values, you must pass an index
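
To make the failure concrete, here is a minimal sketch that reproduces the error and shows the literal fix the message asks for. The dictionary is a shortened copy of the question's data, and index=[0] is only an assumption about what you might want: it produces a single row with one column per key, which is usually not the two-column layout asked for.

```python
import pandas as pd

# Shortened copy of the question's dictionary (date string -> scalar value).
d = {'2012-07-01': 391, '2012-07-02': 392, '2012-07-03': 392}

# All values are scalars, so pandas cannot infer how many rows to build.
try:
    pd.DataFrame(d)
except ValueError as exc:
    error_message = str(exc)

# Supplying an index tells pandas to build exactly one row:
# the keys become column labels and the values fill that row.
one_row = pd.DataFrame(d, index=[0])
```

The approaches that follow avoid this by turning the key-value pairs into rows instead.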

You could take the items from the dictionary (i.e. the key-value pairs):

In [11]: pd.DataFrame(d.items())  # or list(d.items()) in python 3
Out[11]:
            0    1
0  2012-07-01  391
1  2012-07-02  392
2  2012-07-03  392
3  2012-07-04  392
4  2012-07-05  392
5  2012-07-06  392

In [12]: pd.DataFrame(d.items(), columns=['Date', 'DateValue'])
Out[12]:
         Date  DateValue
0  2012-07-01        391
1  2012-07-02        392
2  2012-07-03        392
3  2012-07-04        392
4  2012-07-05        392
5  2012-07-06        392

But I think it makes more sense to use the Series constructor:

In [20]: s = pd.Series(d, name='DateValue')

In [21]: s
Out[21]:
2012-07-01    391
2012-07-02    392
2012-07-03    392
2012-07-04    392
2012-07-05    392
2012-07-06    392
Name: DateValue, dtype: int64

In [22]: s.index.name = 'Date'

In [23]: s.reset_index()
Out[23]:
         Date  DateValue
0  2012-07-01        391
1  2012-07-02        392
2  2012-07-03        392
3  2012-07-04        392
4  2012-07-05        392
5  2012-07-06        392
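
The same Series route can be written as one chained expression. This is just a sketch of an equivalent spelling, using rename_axis in place of assigning s.index.name; the dictionary values are copied from the output above.

```python
import pandas as pd

d = {
    '2012-07-01': 391,
    '2012-07-02': 392,
    '2012-07-03': 392,
    '2012-07-04': 392,
    '2012-07-05': 392,
    '2012-07-06': 392,
}

# Keys become the index, values become the data; rename_axis names the
# index so that reset_index turns it into a 'Date' column.
df = pd.Series(d, name='DateValue').rename_axis('Date').reset_index()
```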



ANSWER 2

Score 388


When converting a dictionary into a pandas dataframe where you want the keys to be the columns of said dataframe and the values to be the row values, you can simply put brackets around the dictionary like this:

>>> dict_ = {'key 1': 'value 1', 'key 2': 'value 2', 'key 3': 'value 3'}
>>> pd.DataFrame([dict_])
 
    key 1     key 2     key 3
0   value 1   value 2   value 3

EDIT: In the pandas docs one option for the data parameter in the DataFrame constructor is a list of dictionaries. Here we're passing a list with one dictionary in it.
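
As a sketch of how the list-of-dictionaries form extends beyond a single row: each dictionary becomes one row, and a key missing from a dictionary is filled with NaN. The records below are made up for illustration.

```python
import pandas as pd

# Hypothetical records; the second one lacks 'key 2' and adds 'key 3'.
records = [
    {'key 1': 'a', 'key 2': 'b'},
    {'key 1': 'c', 'key 3': 'd'},
]

# One row per dictionary, one column per distinct key.
df = pd.DataFrame(records)

# Cells with no corresponding key are missing values.
missing = df.isna()
```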




ANSWER 3

Score 183


As explained in another answer, using DataFrame() directly here will not behave as you might expect.

What you can do is use DataFrame.from_dict() with orient='index':

In [5]: d = {
   ...:     u'2012-07-01': 391,
   ...:     u'2012-07-02': 392,
   ...:     u'2012-07-03': 392,
   ...:     u'2012-07-04': 392,
   ...:     u'2012-07-05': 392,
   ...:     u'2012-07-06': 392}

In [6]: df = pd.DataFrame.from_dict(d, orient='index', columns=['DateValue'])

In [7]: df
Out[7]: 
            DateValue
2012-07-01        391
2012-07-02        392
2012-07-03        392
2012-07-04        392
2012-07-05        392
2012-07-06        392

To get exactly what you wanted:

In [8]: df.reset_index(names='Date')
Out[8]: 
         Date  DateValue
0  2012-07-01        391
...
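
The names argument of reset_index is, as far as I know, only available from pandas 1.5 onward. If you are on an older version, a sketch of an equivalent (assuming the same d as above) is to name the index first with rename_axis:

```python
import pandas as pd

d = {
    '2012-07-01': 391,
    '2012-07-02': 392,
    '2012-07-03': 392,
    '2012-07-04': 392,
    '2012-07-05': 392,
    '2012-07-06': 392,
}

# orient='index' makes each dictionary key a row label.
df = pd.DataFrame.from_dict(d, orient='index', columns=['DateValue'])

# Name the index, then move it into a regular column.
result = df.rename_axis('Date').reset_index()
```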



ANSWER 4

Score 86


Pass the items of the dictionary to the DataFrame constructor, and give the column names. After that, parse the Date column to get Timestamp values.

Note the difference between Python 2.x and 3.x:

In Python 2.x:

df = pd.DataFrame(data.items(), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])

In Python 3.x (requiring an additional list() call):

df = pd.DataFrame(list(data.items()), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])
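
To check that the parsing step did what you wanted, here is a small self-contained sketch of the Python 3 variant. The contents of data are an assumption, shortened from the question's dictionary.

```python
import pandas as pd

# Assumed sample of the question's data (date string -> value).
data = {'2012-07-01': 391, '2012-07-02': 392}

df = pd.DataFrame(list(data.items()), columns=['Date', 'DateValue'])

# Before parsing, 'Date' holds plain strings; to_datetime converts
# the column to a datetime dtype whose elements are Timestamp values.
df['Date'] = pd.to_datetime(df['Date'])

is_datetime = pd.api.types.is_datetime64_any_dtype(df['Date'])
first_date = df['Date'].iloc[0]
```

Once the column has a datetime dtype, the .dt accessor (for example df['Date'].dt.day) becomes available for date arithmetic and filtering.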