Finding the difference between two lists of dictionaries in Python
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Luau
--
Chapters
00:00 Finding The Difference Between Two Lists Of Dictionaries In Python
01:02 Accepted Answer Score 6
01:50 Answer 2 Score 0
02:27 Thank you
--
Full question
https://stackoverflow.com/questions/3679...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #python3x #list #dictionary
#avk47
ACCEPTED ANSWER
Score 6
A set is a natural fit for this problem. Unfortunately, Python will not let you add dictionaries to a set: they are mutable and therefore unhashable, because their hash could change between insertion and lookup.
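To make the restriction concrete, here is a minimal sketch. The sample row is made up for illustration and is not data from the question. It shows that adding a dict to a set raises TypeError, while an immutable frozenset built from the same dict's (key, value) pairs can be added, assuming every value in the dict is itself hashable.

```python
# Hypothetical sample row, used only for illustration.
row = {"table_name": "CONFIG", "column_name": "CONFIG_ID"}

seen = set()
try:
    seen.add(row)  # dicts are mutable, so Python refuses to hash them
    dict_is_hashable = True
except TypeError as exc:
    dict_is_hashable = False
    print(f"cannot add dict to set: {exc}")

# A frozenset of the (key, value) pairs is immutable and hashable,
# as long as every value is itself hashable.
frozen = frozenset(row.items())
seen.add(frozen)
print(frozen in seen)
```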
If you "freeze" each dictionary's items to make them immutable (this requires every value to be hashable), you can put them in set objects instead of lists and then take the set difference with the minus operator:
In [20]: i_set = { frozenset(row.items()) for row in incoming_rows }
In [21]: a_set = { frozenset(row.items()) for row in available_row }
In [22]: (i_set - a_set)
Out[22]:
{frozenset({('column_name', 'CONFIG_ID'),
            ('data_type', 'numeric(10,0)'),
            ('table_name', 'CONFIG')}),
 frozenset({('column_name', 'CREATE_DATE'),
            ('data_type', 'VARCHAR(20)'),
            ('table_name', 'CONFIG')}),
 frozenset({('column_name', 'CONFIG_TYPE'),
            ('data_type', 'varchar(1)'),
            ('table_name', 'CONFIG')})}
Edit: To unfreeze:
In [25]: [dict(i) for i in i_set - a_set]
Out[25]:
[{'column_name': 'CONFIG_ID',
  'data_type': 'numeric(10,0)',
  'table_name': 'CONFIG'},
 {'column_name': 'CREATE_DATE',
  'data_type': 'VARCHAR(20)',
  'table_name': 'CONFIG'},
 {'column_name': 'CONFIG_TYPE',
  'data_type': 'varchar(1)',
  'table_name': 'CONFIG'}]
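A set difference discards the original ordering and collapses duplicate rows. If that matters, one possible variant (a sketch, not part of the answer above) freezes only the second list and filters the first list against it. The helper name rows_missing_from and the sample data are hypothetical, and the code assumes every dictionary value is hashable.

```python
def rows_missing_from(incoming, available):
    """Return the dicts in `incoming` that have no equal dict in `available`.

    Keeps the order (and any duplicates) of `incoming`.
    Assumes every dictionary value is hashable.
    """
    available_keys = {frozenset(row.items()) for row in available}
    return [row for row in incoming
            if frozenset(row.items()) not in available_keys]


# Hypothetical sample data, loosely modelled on the rows shown above.
incoming_rows = [
    {"table_name": "CONFIG", "column_name": "CONFIG_ID", "data_type": "numeric(10,0)"},
    {"table_name": "CONFIG", "column_name": "CONFIG_NAME", "data_type": "varchar(50)"},
    {"table_name": "CONFIG", "column_name": "CREATE_DATE", "data_type": "VARCHAR(20)"},
]
# Same content as the second incoming row, but with keys in a different order.
available_row = [
    {"data_type": "varchar(50)", "column_name": "CONFIG_NAME", "table_name": "CONFIG"},
]

missing = rows_missing_from(incoming_rows, available_row)
for row in missing:
    print(row["column_name"])
```

Because the comparison is done on frozensets of items, the key order inside each dictionary does not affect the result, and the returned rows are the original dict objects rather than rebuilt copies.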
ANSWER 2
Score 0
For large datasets, and especially when you are working with numeric data, you may find better performance with third-party libraries. For example, pandas accepts lists of dictionaries directly:
import pandas as pd
# convert lists of dictionaries to dataframes
df_incoming, df_available = map(pd.DataFrame, (incoming_rows, available_row))
# merge data, adding indicator, and filter
res = df_available.merge(df_incoming, indicator=True, how='outer')
res = res[res['_merge'] == 'right_only']
print(res)
   column_name      data_type  table_name      _merge
3  CREATE_DATE    VARCHAR(20)      CONFIG  right_only
4  CONFIG_TYPE     varchar(1)      CONFIG  right_only
5    CONFIG_ID  numeric(10,0)      CONFIG  right_only
If you require a list of dictionaries as output:
# pass the column by keyword: the positional axis argument was removed in pandas 2.0
print(res.drop(columns='_merge').to_dict('records'))
[{'column_name': 'CREATE_DATE', 'data_type': 'VARCHAR(20)', 'table_name': 'CONFIG'},
{'column_name': 'CONFIG_TYPE', 'data_type': 'varchar(1)', 'table_name': 'CONFIG'},
{'column_name': 'CONFIG_ID', 'data_type': 'numeric(10,0)', 'table_name': 'CONFIG'}]