The Python Oracle

Check uniqueness for a specific value in each column

Full question
https://stackoverflow.com/questions/6415...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe


ACCEPTED ANSWER

Score 2


Use DataFrame.eq to create a boolean mask for each candidate in cand, then sum to count how often that candidate occurs in each column, and finally .lt(2) with .all to check that it occurs at most once in every column:

pd.DataFrame([{'cand': c, 'unique': df.eq(c).sum().lt(2).all()} for c in cand])

  cand  unique
0    a    True
1    b   False
2    c    True
3    g    True
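
The chain above can be unpacked step by step. The following is a minimal sketch, assuming a small made-up frame (the question's actual data is not reproduced here); the column names, the sample values, and the helper name is_unique_in_every_column are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, not the frame from the original question.
df = pd.DataFrame({
    "col1": ["a", "b", "b", "d"],
    "col2": ["c", "a", "e", "f"],
    "col3": ["b", "c", "d", "e"],
})
cand = ["a", "b", "c", "g"]


def is_unique_in_every_column(frame, value):
    """Return True if `value` occurs at most once in each column of `frame`."""
    mask = frame.eq(value)   # boolean frame: True where a cell equals the value
    counts = mask.sum()      # number of occurrences per column
    return bool(counts.lt(2).all())  # fewer than 2 occurrences in every column


result = pd.DataFrame(
    [{"cand": c, "unique": is_unique_in_every_column(df, c)} for c in cand]
)
print(result)
```

With this sample frame only b is reported as not unique, because it appears twice in col1. Note that g does not occur at all and is still reported as unique, since zero occurrences also satisfies the "fewer than 2" test.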



ANSWER 2

Score 1


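I changed your idea: count the values in each column, use DataFrame.reindex to filter the counts by the candidate list, compare them with gt(1), test with any(axis=1) whether at least one column holds a repeated value, invert the mask with ~, and finally convert the Series to a two-column DataFrame: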

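# pd.Series.value_counts replaces the top-level pd.value_counts,
# which is deprecated in recent pandas versions
df1 = ((~df.apply(pd.Series.value_counts)
           .reindex(candidates)
           .gt(1)
           .any(axis=1))
           .rename_axis('cand')
           .reset_index(name='unique'))
print(df1)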
  cand  unique
0    a    True
1    b   False
2    c    True
3    g    True
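
The intermediate results of this approach are easier to follow when each step gets its own name. The sketch below is illustrative only: the sample frame and the variable names counts and repeated are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data, not the frame from the original question.
df = pd.DataFrame({
    "col1": ["a", "b", "b", "d"],
    "col2": ["c", "a", "e", "f"],
    "col3": ["b", "c", "d", "e"],
})
candidates = ["a", "b", "c", "g"]

# Count every value per column; the result has one row per distinct value
# and NaN where a value does not occur in a column.
counts = df.apply(lambda s: s.value_counts())

# Keep only the candidate rows; candidates missing from the data become all-NaN rows.
counts = counts.reindex(candidates)

# True for a candidate that occurs more than once in at least one column.
# Comparisons against NaN evaluate to False, so absent values are never "repeated".
repeated = counts.gt(1).any(axis=1)

# Invert the mask and reshape the Series into a two-column DataFrame.
df1 = (~repeated).rename_axis("cand").reset_index(name="unique")
print(df1)
```

Compared with the list comprehension in the accepted answer, this version counts all values once up front instead of scanning the frame once per candidate.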