Pandas check if row exist in another dataframe and append index
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Mysterious Puzzle
--
Chapters
00:00 Pandas Check If Row Exist In Another Dataframe And Append Index
02:01 Accepted Answer Score 9
02:38 Thank you
--
Full question
https://stackoverflow.com/questions/3958...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
    Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Mysterious Puzzle
--
Chapters
00:00 Pandas Check If Row Exist In Another Dataframe And Append Index
02:01 Accepted Answer Score 9
02:38 Thank you
--
Full question
https://stackoverflow.com/questions/3958...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
ACCEPTED ANSWER
Score 9
you can do it this way:
Data (pay attention at the index in the B DF):
In [276]: cols = ['SampleID', 'ParentID']
In [277]: A
Out[277]:
   Real_ID  SampleID  ParentID Something AnotherThing
0      NaN        10        11         a            b
1      NaN        20        21         a            b
2      NaN        40        51         a            b
In [278]: B
Out[278]:
   SampleID  ParentID
3        10        11
5        20        21
Solution:
In [279]: merged = pd.merge(A[cols], B, on=cols, how='outer', indicator=True)
In [280]: merged
Out[280]:
   SampleID  ParentID     _merge
0        10        11       both
1        20        21       both
2        40        51  left_only
In [281]: B = pd.concat([B, merged.ix[merged._merge=='left_only', cols]])
In [282]: B
Out[282]:
   SampleID  ParentID
3        10        11
5        20        21
2        40        51
In [285]: A['Real_ID'] = pd.merge(A[cols], B.reset_index(), on=cols)['index']
In [286]: A
Out[286]:
   Real_ID  SampleID  ParentID Something AnotherThing
0        3        10        11         a            b
1        5        20        21         a            b
2        2        40        51         a            b