The Python Oracle

Check uniqueness for a specific value in each column

Full question
https://stackoverflow.com/questions/6415...

Question links:
[pd.DataFrame.apply()]: https://pandas.pydata.org/pandas-docs/st...

Accepted answer links:
[DataFrame.eq]: https://pandas.pydata.org/pandas-docs/st...
[sum]: https://pandas.pydata.org/pandas-docs/st...
[.lt]: https://pandas.pydata.org/pandas-docs/st...
[.all]: https://pandas.pydata.org/pandas-docs/st...

Answer 2 links:
[DataFrame.reindex]: http://pandas.pydata.org/pandas-docs/sta...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe




ACCEPTED ANSWER

Score 2


Use DataFrame.eq to create a boolean mask for each candidate, then sum to count its occurrences in each column, and finally .lt + .all to check that it occurs fewer than twice (that is, at most once) in every column:

pd.DataFrame([{'cand': c, 'unique': df.eq(c).sum().lt(2).all()} for c in cand])

  cand  unique
0    a    True
1    b   False
2    c    True
3    g    True
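
The original question's data is not shown here, so the following is only a minimal sketch of the same approach on a hypothetical DataFrame. The column names `x` and `y`, their values, and the `cand` list are assumptions made for illustration; they were chosen so that `b` repeats within one column and `g` does not occur at all.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "x": ["a", "b", "b", "c"],
    "y": ["a", "d", "b", "e"],
})
cand = ["a", "b", "c", "g"]

rows = []
for c in cand:
    # df.eq(c) is a boolean mask; summing it counts matches per column.
    counts = df.eq(c).sum()
    # The candidate passes if it occurs fewer than 2 times in every column.
    rows.append({"cand": c, "unique": bool(counts.lt(2).all())})

result = pd.DataFrame(rows)
print(result)
```

Note that a candidate that never appears (here `g`) also passes, because a count of 0 is less than 2.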



ANSWER 2

Score 1


I changed your idea: I added DataFrame.reindex to filter by the list of candidates, then compared for values greater than 1, tested whether any column in a row is True, inverted the mask with ~, and finally converted the Series to a two-column DataFrame:

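# pd.Series.value_counts is used here because the top-level
# pd.value_counts is deprecated in recent pandas versions
df1 = ((~df.apply(pd.Series.value_counts)
           .reindex(candidates)
           .gt(1)
           .any(axis=1))
           .rename_axis('cand')
           .reset_index(name='unique'))
print(df1)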
  cand  unique
0    a    True
1    b   False
2    c    True
3    g    True
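
The chain above can be unpacked step by step. This is a sketch on a hypothetical DataFrame, not the asker's data: the columns `x` and `y`, their values, and the `candidates` list are assumptions. It also uses `pd.Series.value_counts` rather than the top-level `pd.value_counts`, on the assumption that a recent pandas version is installed, where the top-level function is deprecated.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "x": ["a", "b", "b", "c"],
    "y": ["a", "d", "b", "e"],
})
candidates = ["a", "b", "c", "g"]

# Rows: distinct values, columns: original columns, cells: occurrence counts.
counts = df.apply(pd.Series.value_counts)

# Keep only the candidates; values absent from df become all-NaN rows.
counts = counts.reindex(candidates)

# True where a candidate occurs more than once in at least one column.
# NaN compares as False, so absent candidates are never flagged.
repeated = counts.gt(1).any(axis=1)

# Invert the mask and turn the Series into a two-column DataFrame.
df1 = (~repeated).rename_axis("cand").reset_index(name="unique")
print(df1)
```

Computing all counts once with `value_counts` avoids rescanning the DataFrame for every candidate, which may matter when the candidate list is long.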