Truly deep copying Pandas DataFrames
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Ancient Construction
--
Chapters
00:00 Question
01:11 Accepted answer (Score 2)
01:34 Answer 2 (Score 1)
02:21 Thank you
--
Full question
https://stackoverflow.com/questions/6631...
Answer 1 links:
[documents]: https://pandas.pydata.org/pandas-docs/st...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #dataframe #deepcopy
#avk47
ACCEPTED ANSWER
Score 2
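One way is to convert df_in to a Python dictionary, which copy.deepcopy copies recursively, and build a new DataFrame from the result:
import copy
import pandas as pd

def pop(df_in):
    # Round-trip through a dict so that deepcopy also copies the sets themselves
    df = pd.DataFrame(copy.deepcopy(df_in.to_dict()))
    print(df['sets'].apply(lambda x: set([x.pop()])))

for i in range(3):
    pop(df)  # df is the DataFrame from the question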
Output:
0 {1}
Name: sets, dtype: object
0 {1}
Name: sets, dtype: object
0 {1}
Name: sets, dtype: object
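As a possible alternative to the dictionary round trip, the same effect can be sketched by deep-copying the values of each object column individually. The helper name deep_copy_frame is hypothetical and not part of pandas; this is a minimal sketch that assumes only object-dtype columns hold mutable Python objects.

```python
import copy
import pandas as pd

def deep_copy_frame(df_in):
    """Hypothetical helper: copy a frame and its Python objects."""
    out = df_in.copy(deep=True)
    for col in out.columns:
        # Assumption: only object-dtype columns contain mutable objects
        if out[col].dtype == object:
            # Replace each shared reference with an independent copy
            out[col] = out[col].map(copy.deepcopy)
    return out

df = pd.DataFrame({'sets': [{1, 2}]}, index=[0])
df_copy = deep_copy_frame(df)

# Popping from the copy leaves the set in the original frame untouched
df_copy['sets'].iloc[0].pop()
print(df['sets'].iloc[0])
```

Unlike the dictionary approach, this keeps the index and column dtypes as pandas copied them and only touches the object columns.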
ANSWER 2
Score 1
The problem is that your objects are sets, which are mutable. The documentation explicitly calls out this behavior with a warning (emphasis my own):
When deep=True, data is copied but actual Python objects will not be copied recursively, only the reference to the object.
As always with references to mutable objects, changing the object through one reference changes it everywhere it is referenced. We can see for ourselves that, despite the deep copy, the objects have the same ID:
import pandas as pd
df = pd.DataFrame({'sets': [{1, 2}]}, index=[0])
df1 = df.copy(deep=True)
id(df['sets'].iloc[0])
#4592957024
id(df1['sets'].iloc[0])
#4592957024
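The practical consequence of the shared ID can be illustrated with a short sketch that mutates the set through the copy and then inspects the original. It reuses the df and df1 names from the snippet above; the same_object variable is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'sets': [{1, 2}]}, index=[0])
df1 = df.copy(deep=True)

# Both frames hold a reference to the very same set object
same_object = df['sets'].iloc[0] is df1['sets'].iloc[0]
print(same_object)

# Mutate the set in place through the copy...
df1['sets'].iloc[0].add(3)

# ...and the change is visible through the original frame too
print(df['sets'].iloc[0])
```

This is why popping from a set in the "copied" frame also empties the set in the original.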