Assigning values to a pandas column with conditions depending on other columns
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Cool Puzzler LoFi
--
Chapters
00:00 Assigning Values To A Pandas Column With Conditions Depending On Other Columns
01:01 Accepted Answer Score 7
01:32 Answer 2 Score 5
02:00 Thank you
--
Full question
https://stackoverflow.com/questions/7114...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #dataframe
#avk47
ACCEPTED ANSWER
Score 7
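Create two boolean masks, combine them, then flag the row with the highest id_res value in each remaining col group:
# m1: rows whose col value appears more than once
m1 = df['col'].duplicated(keep=False)
# m2: rows whose id_res value appears exactly once in the whole column
m2 = ~df['id_res'].duplicated(keep=False)
# among rows passing both masks, mark the index of the largest id_res per col
df['check'] = df.index.isin(df[m1 & m2].groupby('col')['id_res'].idxmax())
print(df)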
# Output
      col  id_res  check
0   paris      12  False
1   paris      12  False
2  nantes      14  False
3  berlin      28   True
4  berlin       8  False
5  berlin       4  False
6   tokyo      89  False
Details:
>>> pd.concat([df, m1.rename('m1'), m2.rename('m2')], axis=1)
      col  id_res  check     m1     m2
0   paris      12  False   True  False
1   paris      12  False   True  False
2  nantes      14  False  False   True
3  berlin      28   True   True   True  # <-  group to check
4  berlin       8  False   True   True  # <-     because 
5  berlin       4  False   True   True  # <- m1 and m2 are True
6   tokyo      89  False  False   True
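The snippet above assumes df already exists. A minimal self-contained sketch follows. The sample data is reconstructed from the printed output, so the DataFrame constructor call is an assumption rather than part of the original answer, and the intermediate name winners is introduced here only for readability.

```python
import pandas as pd

# Sample data reconstructed from the printed output above (an assumption).
df = pd.DataFrame({
    'col': ['paris', 'paris', 'nantes', 'berlin', 'berlin', 'berlin', 'tokyo'],
    'id_res': [12, 12, 14, 28, 8, 4, 89],
})

# Rows whose col value occurs more than once.
m1 = df['col'].duplicated(keep=False)
# Rows whose id_res value occurs exactly once in the whole column.
m2 = ~df['id_res'].duplicated(keep=False)

# Index label of the largest id_res per col, among rows passing both masks.
winners = df[m1 & m2].groupby('col')['id_res'].idxmax()
df['check'] = df.index.isin(winners)

# Side-by-side view of the masks; axis=1 adds them as columns.
details = pd.concat([df, m1.rename('m1'), m2.rename('m2')], axis=1)
print(details)
```

Note that m2 tests uniqueness across the whole id_res column, not within each col group.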
ANSWER 2
Score 5
There are three conditions, so build one mask per condition and combine them with a logical AND (&):
g = df_test.groupby('col')['id_res']
# is col duplicated?
m1 = df_test['col'].duplicated(keep=False)
# [ True  True False  True  True  True False]
# is id_res max of its group?
m2 = df_test['id_res'].eq(g.transform('max'))
# [ True  True  True  True False False  True]
# is group diverse? (more than 1 id_res)
m3 = g.transform('nunique').gt(1)
# [False False False  True  True  True False]
# check that all conditions are True
df_test['check'] = m1 & m2 & m3
Output:
      col  id_res  check
0   paris      12  False
1   paris      12  False
2  nantes      14  False
3  berlin      28   True
4  berlin       8  False
5  berlin       4  False
6   tokyo      89  False
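Both answers give the same result on the sample data, but they are not equivalent in general. The sketch below wraps each approach in a hypothetical helper (check_accepted and check_answer2 are names introduced here, not taken from the answers) and runs both on the sample and on a made-up tie case where the group maximum is repeated. The sample frame is reconstructed from the printed output and the tie frame is invented for illustration.

```python
import pandas as pd

def check_accepted(df):
    # Accepted answer: uniqueness of id_res is tested across the whole frame.
    m1 = df['col'].duplicated(keep=False)
    m2 = ~df['id_res'].duplicated(keep=False)
    winners = df[m1 & m2].groupby('col')['id_res'].idxmax()
    return pd.Series(df.index.isin(winners), index=df.index)

def check_answer2(df):
    # Answer 2: max and diversity are both tested per col group.
    g = df.groupby('col')['id_res']
    m1 = df['col'].duplicated(keep=False)
    m2 = df['id_res'].eq(g.transform('max'))
    m3 = g.transform('nunique').gt(1)
    return m1 & m2 & m3

# Sample data reconstructed from the printed output (an assumption).
sample = pd.DataFrame({
    'col': ['paris', 'paris', 'nantes', 'berlin', 'berlin', 'berlin', 'tokyo'],
    'id_res': [12, 12, 14, 28, 8, 4, 89],
})
# Made-up tie case: the group maximum appears twice.
tie = pd.DataFrame({'col': ['rome', 'rome', 'rome'], 'id_res': [5, 5, 3]})

same_on_sample = check_accepted(sample).equals(check_answer2(sample))
tie_accepted = check_accepted(tie).tolist()
tie_answer2 = check_answer2(tie).tolist()
print(same_on_sample, tie_accepted, tie_answer2)
```

In the tie case the accepted answer drops the repeated maximum before taking idxmax, while Answer 2 flags every row equal to the group maximum. Which behaviour is wanted depends on the rule in the original question.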