The Python Oracle

Python pandas insert list into a cell

Full question
https://stackoverflow.com/questions/2648...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #list #pandas #insert #dataframe



ANSWER 1

Score 206


set_value has been deprecated since pandas 0.21.0, so you should now use at. Unlike loc, it can insert a list into a cell without raising a ValueError. I think this is because at always refers to a single value, while loc can refer to single values as well as whole rows and columns.

import pandas as pd

df = pd.DataFrame(data={'A': [1, 2, 3], 'B': ['x', 'y', 'z']})

# at addresses a single cell, so the whole list is stored in it
df.at[1, 'B'] = ['m', 'n']

df =
   A       B
0  1       x
1  2  [m, n]
2  3       z
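
The contrast between at and loc described above can be checked with a minimal sketch. It assumes a reasonably recent pandas; the exact exception type and message raised by loc may differ between versions, so the sketch only records whatever is raised rather than relying on a specific error. The column is created with dtype=object explicitly, which is an assumption added here so the example does not depend on how a given pandas version infers string columns.

```python
import pandas as pd

# 'B' is created as an object column on purpose (assumption for this sketch).
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': pd.Series(['x', 'y', 'z'], dtype=object),
})

# Try loc on a copy and record the error, if any, instead of crashing.
loc_error = None
try:
    probe = df.copy()
    probe.loc[1, 'B'] = ['m', 'n']
except Exception as exc:
    loc_error = exc

# at targets exactly one cell, so the list is stored as a single object.
df.at[1, 'B'] = ['m', 'n']

print("loc raised:", repr(loc_error))
print(df)
```

If loc_error prints as None on your version, loc accepted the assignment there; the at call is the part this answer relies on.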

You also need to make sure the column you are inserting into has dtype=object. For example:

>>> df = pd.DataFrame(data={'A': [1, 2, 3], 'B': [1,2,3]})
>>> df.dtypes
A    int64
B    int64
dtype: object

>>> df.at[1, 'B'] = [1, 2, 3]
ValueError: setting an array element with a sequence

>>> df['B'] = df['B'].astype('object')
>>> df.at[1, 'B'] = [1, 2, 3]
>>> df
   A          B
0  1          1
1  2  [1, 2, 3]
2  3          3
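
The cast-then-assign steps above can be wrapped in a small helper. This is only a sketch: set_list_cell is a hypothetical name, not a pandas function, and it modifies the DataFrame it is given.

```python
import pandas as pd

def set_list_cell(df, row_label, column, value):
    """Store value (e.g. a list) in one cell, casting the column to object first if needed.

    Hypothetical helper for illustration; it mutates df in place.
    """
    if df[column].dtype != object:
        df[column] = df[column].astype(object)
    df.at[row_label, column] = value
    return df

df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, 3]})
set_list_cell(df, 1, 'B', [1, 2, 3])

print(df)
print(df.dtypes)
```

Note that the cast changes the dtype of the whole column, so numeric operations on 'B' will be slower or may stop working afterwards.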



ANSWER 2

Score 65


Pandas >= 0.21

set_value has been deprecated. You can now use DataFrame.at to set by label, and DataFrame.iat to set by integer position.

Setting Cell Values with at/iat

# Setup
>>> df = pd.DataFrame({'A': [12, 23], 'B': [['a', 'b'], ['c', 'd']]})
>>> df

    A       B
0  12  [a, b]
1  23  [c, d]

>>> df.dtypes

A     int64
B    object
dtype: object

If you want to set the value in the second row of the "B" column to some new list, use DataFrame.at:

>>> df.at[1, 'B'] = ['m', 'n']
>>> df

    A       B
0  12  [a, b]
1  23  [m, n]

You can also set by integer position using DataFrame.iat:

>>> df.iat[1, df.columns.get_loc('B')] = ['m', 'n']
>>> df

    A       B
0  12  [a, b]
1  23  [m, n]

What if I get ValueError: setting an array element with a sequence?

I'll try to reproduce this with:

>>> df
    A   B
0  12 NaN
1  23 NaN

>>> df.dtypes
A      int64
B    float64
dtype: object
>>> df.at[1, 'B'] = ['m', 'n']
# ValueError: setting an array element with a sequence.

This happens because the column has float64 dtype, whereas lists are Python objects, so the dtypes do not match. In this situation, convert the column to object first.

>>> df['B'] = df['B'].astype(object)
>>> df.dtypes

A     int64
B    object
dtype: object

Then, it works:

>>> df.at[1, 'B'] = ['m', 'n']
>>> df
    
    A       B
0  12     NaN
1  23  [m, n]
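
The reproduction above does not show how the NaN column was built, so here is a self-contained sketch of the whole sequence. The setup line is an assumption (a column of np.nan is inferred as float64), and the first attempt is made on a copy and its error only recorded, since the exact exception and message can vary between pandas versions.

```python
import numpy as np
import pandas as pd

# Assumed setup: a column holding only NaN is inferred as float64.
df = pd.DataFrame({'A': [12, 23], 'B': [np.nan, np.nan]})
print(df.dtypes)

# First attempt, on a copy, while 'B' is still float64.
first_error = None
try:
    probe = df.copy()
    probe.at[1, 'B'] = ['m', 'n']
except Exception as exc:
    first_error = exc
print("before casting:", repr(first_error))

# Cast the column to object, then the same assignment goes through.
df['B'] = df['B'].astype(object)
df.at[1, 'B'] = ['m', 'n']
print(df)
```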

Possible, But Hacky

Even more wacky, I've found that you can hack through DataFrame.loc to achieve something similar if you pass nested lists.

>>> df.loc[1, 'B'] = [['m'], ['n'], ['o'], ['p']]
>>> df

    A             B
0  12        [a, b]
1  23  [m, n, o, p]

You can read more about why this works here.




ACCEPTED ANSWER

Score 41


df3.set_value(1, 'B', abc) works for any dataframe, but take care with the data type of column 'B'. For example, a list cannot be inserted into a float column; in that case, df['B'] = df['B'].astype(object) can help.




ANSWER 4

Score 8


Quick work around

Simply enclose the list within a new list, as done for col2 in the data frame below. This works because pandas takes the outer list (of lists) and converts it into a column, treating each inner list as a single cell value as if it were an ordinary scalar.

import pandas as pd

mydict = {'col1': [1, 2, 3], 'col2': [[1, 4], [2, 5], [3, 6]]}
data = pd.DataFrame(mydict)
data

   col1    col2
0     1  [1, 4]
1     2  [2, 5]
2     3  [3, 6]
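
To confirm that each cell really holds a list rather than a flattened value, the frame can be inspected as in the sketch below. It reuses the names from the example above; first_cell is a name introduced only for illustration.

```python
import pandas as pd

mydict = {'col1': [1, 2, 3], 'col2': [[1, 4], [2, 5], [3, 6]]}
data = pd.DataFrame(mydict)

# The column of lists is stored with object dtype.
print(data['col2'].dtype)

# Each cell is an ordinary Python list and can be used as one.
first_cell = data.at[0, 'col2']
print(type(first_cell), first_cell)
print(sum(first_cell))
```

Because the cells are plain Python objects, vectorized numeric operations do not apply to col2; work on the cells goes through apply or a loop.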