The Python Oracle

Pandas counting occurrence of list contained in column of lists

Full question
https://stackoverflow.com/questions/4741...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #vectorization




ACCEPTED ANSWER

Score 5


You can use Series.apply (df.m is a single column, so it is a Series rather than a DataFrame) along with the builtin set.issubset method and then .sum(). The subset test and the sum both run at a lower level (normally C level) than hand-written Python equivalents do.

subset_wanted = {2, 3}
count = df.m.apply(subset_wanted.issubset).sum()

I can't see a way to shave more time off that, short of writing a custom C-level function: the equivalent of a custom sum with a subset check that yields 0 or 1 on a row-by-row basis. In the time it takes to write that, you could have run this thousands upon thousands of times anyway.
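To make the snippet above concrete, here is a minimal self-contained sketch. The question's DataFrame is not reproduced here, so the rows below are hypothetical and only mimic its shape (a column m holding one list per row).

```python
import pandas as pd

# Hypothetical data: stands in for the question's frame, which is not shown.
df = pd.DataFrame({"m": [[1, 2, 3], [2, 3], [1, 4], [3, 2, 5], [2]]})

subset_wanted = {2, 3}

# set.issubset accepts any iterable, so each list is passed in directly.
# The result is a boolean Series; summing it counts the True rows.
mask = df["m"].apply(subset_wanted.issubset)
count = int(mask.sum())

print(mask.tolist())  # [True, True, False, True, False]
print(count)          # 3
```

Passing the bound method subset_wanted.issubset avoids a lambda, so there is no extra Python-level function call per row.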




ANSWER 2

Score 0


Since you are looking for more set-like behavior:

df.m.apply(lambda x: set(x).intersection({2, 3}) == {2, 3}).sum()

Returns

3
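
As a rough check, the intersection test here and the issubset test from the accepted answer should pick out the same rows. The sketch below compares them on hypothetical data (not the question's frame), including a row with a repeated element and an empty row.

```python
import pandas as pd

# Hypothetical rows: reversed order, a duplicate element, and an empty list.
df = pd.DataFrame({"m": [[3, 2], [2, 2, 3, 7], [2], [], [5, 3, 1, 2]]})
wanted = {2, 3}

# This answer's approach: intersect with the target, then compare.
# (& binds tighter than ==, so this is (set(x) & wanted) == wanted.)
via_intersection = df["m"].apply(lambda x: set(x) & wanted == wanted)

# The accepted answer's approach: direct subset test.
via_issubset = df["m"].apply(wanted.issubset)

print(via_intersection.tolist())    # [True, True, False, False, True]
print(via_issubset.tolist())        # [True, True, False, False, True]
print(int(via_intersection.sum()))  # 3
```

Both give the same mask; the issubset version skips building a new set for every row.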