The Python Oracle

Pandas DataFrame: how can I check whether the values in two columns of a row are equal to the values in the same columns of the next row?

--

Full question
https://stackoverflow.com/questions/6735...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas



ACCEPTED ANSWER

Score 5


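You had the right idea with a shifted comparison, but you need to shift backwards with shift(-1) so that each row is compared with the next one rather than the previous one. Then use all(axis=1) to require that every selected column matches within a row. The last row has no successor, so it is compared against NaN and always comes out False: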

df['Validity'] = df[['Fruit', 'Color']].eq(df[['Fruit', 'Color']].shift(-1)).all(axis=1)

df
    Fruit   Color  Weight  Validity
0   apple     red      50      True
1   apple     red      75     False
2   apple   green      45     False
3  orange  orange      80      True
4  orange  orange      90     False
5  orange     red      90     False
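
For a self-contained run, the sketch below rebuilds the frame from the output shown above and applies the same shifted comparison step by step. The construction of df and the names cols and next_rows are assumptions added for illustration; the question's original setup code is not reproduced here.

```python
import pandas as pd

# Sample data reconstructed from the output above (assumed, not taken from the question)
df = pd.DataFrame({
    "Fruit": ["apple", "apple", "apple", "orange", "orange", "orange"],
    "Color": ["red", "red", "green", "orange", "orange", "red"],
    "Weight": [50, 75, 45, 80, 90, 90],
})

cols = ["Fruit", "Color"]
# shift(-1) moves every row up by one, so row i lines up with row i + 1
next_rows = df[cols].shift(-1)
# eq() compares element-wise; all(axis=1) requires every column to match
df["Validity"] = df[cols].eq(next_rows).all(axis=1)

print(df)
```

Keeping the column list in one variable makes it easy to extend the check to more columns without repeating the selection.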



ANSWER 2

Score 0


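Another alternative is to join the two columns into a single string per row and compare that key with the next row's key (this needs NumPy and assumes both columns hold strings):

import numpy as np
# One combined key per row, e.g. 'applered'
subset_df = df[['Fruit', 'Color']].apply(''.join, axis=1)
df['Validity'] = np.where(subset_df == subset_df.shift(-1), True, False)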
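
One caveat with joined keys: without a separator, different pairs can collapse to the same string ('ab' + 'c' and 'a' + 'bc' both give 'abc'). The sketch below uses made-up data, chosen only to trigger that collision, and compares the joined-key result with the column-wise comparison from the accepted answer. The names joined, by_join and by_columns are hypothetical.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 differ, but both join to "abc"
df = pd.DataFrame({"Fruit": ["ab", "a", "a"], "Color": ["c", "bc", "bc"]})

cols = ["Fruit", "Color"]
# Joined keys: every row becomes "abc"
joined = df[cols].apply("".join, axis=1)
# Key comparison: row 0 is reported as matching row 1, which is wrong
by_join = joined == joined.shift(-1)
# Column-wise comparison keeps the two columns separate
by_columns = df[cols].eq(df[cols].shift(-1)).all(axis=1)

print(by_join.tolist())
print(by_columns.tolist())
```

If the joined-key approach is preferred anyway, joining with a separator that cannot occur in the data avoids this particular collision.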



ANSWER 3

Score 0


Similar to the other answers, but building the shifted frame explicitly with pd.concat:

df['Validity'] = (df[['Fruit', 'Color']] == pd.concat([df['Fruit'].shift(-1), df['Color'].shift(-1)], axis=1)).all(axis=1)

>>> print(df)
    Fruit   Color  Weight  Validity
0   apple     red      50      True
1   apple     red      75     False
2   apple   green      45     False
3  orange  orange      80      True
4  orange  orange      90     False
5  orange     red      90     False