The Python Oracle

Wrong result while comparing two columns of DataFrames in Python

--

Full question
https://stackoverflow.com/questions/4540...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe #dataanalysis




ACCEPTED ANSWER

Score 1


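EDIT: If you need the mask to be all False when the lookup column is empty, add a condition that checks the column is not empty. Without it, joining an empty list produces the empty pattern '', and str.contains matches that against every string: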

import pandas as pd

# Case 1: the lookup column P1 is empty.
df = pd.DataFrame(columns=['P1','P2','P3'])
print (df)
Empty DataFrame
Columns: [P1, P2, P3]
Index: []

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

# Joining an empty list gives the pattern '', which matches every name.
mask = df4["Names"].str.contains('|'.join(df["P1"].tolist()), na=False)
# Force every entry to False when P1 has no values.
mask = mask & (not df['P1'].empty)
print (mask)
0    False
1    False
Name: Names, dtype: bool

# Case 2: P1 holds one value, so the guard leaves the mask unchanged.
df = pd.DataFrame({'P1':['Kumar']}, columns=['P1','P2','P3'])
print (df)
      P1   P2   P3
0  Kumar  NaN  NaN

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

mask = df4["Names"].str.contains('|'.join(df["P1"].tolist()), na=False)
mask = mask & (not df['P1'].empty)
print (mask)
0     True
1    False
Name: Names, dtype: bool
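
The same guard can be wrapped in a small function. The sketch below is one possible way to do it, not part of the original answer: the name contains_any is hypothetical, and it assumes the values in P1 should be matched as literal text, so it escapes them with re.escape and skips missing or empty values. If the values are meant to be regular expressions, the escaping step would need to be dropped.

```python
import re

import pandas as pd


def contains_any(names: pd.Series, patterns: pd.Series) -> pd.Series:
    """Hypothetical helper: True where a name contains any of the patterns."""
    # Skip NaN and empty strings, and escape regex metacharacters so that
    # each remaining value is matched literally (an assumption of this sketch).
    terms = [re.escape(str(p)) for p in patterns.dropna() if str(p)]
    if not terms:
        # An empty pattern would match every string, so return all False.
        return pd.Series(False, index=names.index, name=names.name)
    return names.str.contains("|".join(terms), na=False)


names = pd.Series(["Kumar", "Ravi"], name="Names")
empty_mask = contains_any(names, pd.Series([], dtype=object))
kumar_mask = contains_any(names, pd.Series(["Kumar", None]))
print(empty_mask.tolist())
print(kumar_mask.tolist())
```

Returning early for an empty term list avoids building the mask and then overwriting it, and it also covers a column that contains only NaN, which the `.empty` check alone would not catch.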