The Python Oracle

Wrong result while comparing two columns of DataFrames in Python

Full question
https://stackoverflow.com/questions/4540...

Accepted answer links:
[empty]: https://pandas.pydata.org/pandas-docs/st...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe #dataanalysis



ACCEPTED ANSWER

Score 1


EDIT: If you need every row to return False when the column is empty, you can add a condition that checks that the column is not empty:

import pandas as pd

df = pd.DataFrame(columns=['P1','P2','P3'])
print(df)
Empty DataFrame
Columns: [P1, P2, P3]
Index: []

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

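# An empty P1 column joins to the pattern '', which matches every name.
mask = df4["Names"].str.contains('|'.join(df["P1"].tolist()), na=False)
# Force every row to False when P1 has no values.
mask = mask & (not df['P1'].empty)
print(mask)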
0    False
1    False
Name: Names, dtype: bool

With a non-empty P1 column, the same check keeps the real matches:

df = pd.DataFrame({'P1':['Kumar']}, columns=['P1','P2','P3'])
print(df)
      P1   P2   P3
0  Kumar  NaN  NaN

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

mask = df4["Names"].str.contains('|'.join(df["P1"].tolist()), na=False)
# P1 is not empty here, so the mask is left unchanged.
mask = mask & (not df['P1'].empty)
print(mask)
0     True
1    False
Name: Names, dtype: bool
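
The two cases above could be wrapped in one function. The following is only a sketch: `names_matching` is a hypothetical helper name, and it makes choices that are not part of the answer. It drops NaN and empty strings, and it escapes the values with `re.escape` so they are matched literally rather than as regular expressions. It also shows why the empty column gives a wrong result in the first place: an empty pattern matches every string.

```python
import re

import pandas as pd


def names_matching(names: pd.Series, patterns: pd.Series) -> pd.Series:
    """Hypothetical helper: True where a name contains any usable pattern."""
    # Drop NaN and empty strings, and escape regex metacharacters so the
    # values are matched literally (an assumption, not in the answer).
    terms = [re.escape(str(p)) for p in patterns.dropna() if str(p)]
    if not terms:
        # An empty pattern would match every string, so return all False.
        return pd.Series(False, index=names.index, name=names.name)
    return names.str.contains('|'.join(terms), na=False)


names = pd.Series(['Kumar', 'Ravi'], name='Names')

# The empty regex matches every string: this is the source of the wrong result.
print(names.str.contains('', na=False).tolist())

# Empty pattern column: every row is False.
print(names_matching(names, pd.Series([], dtype=object)).tolist())

# Non-empty pattern column: only the real match is True.
print(names_matching(names, pd.Series(['Kumar'])).tolist())
```

Returning early for an empty list avoids building the mask and then cancelling it, and filtering NaN first means a partly filled column does not break `'|'.join`.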