The Python Oracle

How to vectorize this operation

Full question
https://stackoverflow.com/questions/5784...

Question links:
[IOB labeling of text chunks]: https://en.wikipedia.org/wiki/Inside%E2%...)

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #numpy #vectorization



ACCEPTED ANSWER

Score 2


I would stay away from calling them 'intersection' and 'union', since those operations have well-defined meanings on sets and the operation you're looking to perform is neither of them.
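For comparison, here is a small sketch of what the actual set operations return on the two example lists used in this answer. The names `common` and `combined` are only illustrative; note that sets discard both position and duplicates, which is why they do not match the element-wise result being asked for.

```python
l0 = [0, 4, 4, 4, 0, 0, 0, 8, 8, 0]
l1 = [0, 1, 1, 1, 0, 0, 0, 8, 8, 8]

# Set intersection: values that occur anywhere in both lists.
common = sorted(set(l0) & set(l1))

# Set union: values that occur anywhere in either list.
combined = sorted(set(l0) | set(l1))

print(common)    # [0, 8]
print(combined)  # [0, 1, 4, 8]
```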

However, to do what you want:

l0 = [0, 4, 4, 4, 0, 0, 0, 8, 8, 0]
l1 = [0, 1, 1, 1, 0, 0, 0, 8, 8, 8]

values = [
    (
        x if x == y else 0,
        0 if x == y == 0
        else x if y == 0
        else y if x == 0
        else [x, y],
    )
    for x, y in zip(l0, l1)
]

result_a, result_b = map(list, zip(*values))

print(result_a)
print(result_b)

This is more than fast enough for thousands, or even millions, of elements, since the operation is so simple. Of course, if we're talking billions, you may want to look at numpy anyway.
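If numpy does turn out to be needed, the same rules might be sketched roughly as follows. This is only an illustration, assuming equal-length integer inputs; the names `merged` and `both_nonzero` are made up for the example. The `[x, y]` pairs still need a Python-level loop, because a regular array cannot mix scalars and lists.

```python
import numpy as np

l0 = np.array([0, 4, 4, 4, 0, 0, 0, 8, 8, 0])
l1 = np.array([0, 1, 1, 1, 0, 0, 0, 8, 8, 8])

# First result: keep the value where both inputs agree, otherwise 0.
result_a = np.where(l0 == l1, l0, 0)

# Second result, scalar cases: if y == 0 take x, otherwise take y.
# This covers "both zero", "only y is zero" and "only x is zero".
merged = np.where(l1 == 0, l0, l1)

# Positions where both values are nonzero become [x, y] pairs.
both_nonzero = (l0 != 0) & (l1 != 0)

result_b = merged.tolist()
for i in np.flatnonzero(both_nonzero):
    result_b[i] = [int(l0[i]), int(l1[i])]

print(result_a.tolist())
print(result_b)
```

Only the positions flagged in `both_nonzero` go through the loop, so the cost of the Python-level part depends on how many such pairs there are.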




ANSWER 2

Score 0


A semi-vectorized solution for the union and a fully vectorized one for the intersection:

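import numpy as np

l0 = np.array([0, 4, 4, 4, 0, 0, 0, 8, 8, 0])
l1 = np.array([0, 1, 1, 1, 0, 0, 0, 8, 8, 8])

# Intersection: keep the value where both arrays agree, 0 elsewhere.
intersec = np.zeros(l0.shape[0])
intersec_idx = np.where(l0 == l1)
intersec[intersec_idx] = l0[intersec_idx]
intersec = intersec.astype(int).tolist()

# Union: same as the intersection where the arrays agree...
union = np.zeros(l0.shape[0])
union_idx = np.where(l0 == l1)
union[union_idx] = l0[union_idx]
no_union_idx = np.where(l0 != l1)
union = union.astype(int).tolist()
# ...and a [l0, l1] pair wherever they differ (this part is a plain loop).
for idx in no_union_idx[0]:
    union[idx] = [int(l0[idx]), int(l1[idx])]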

and the output:

>>> intersec
[0, 0, 0, 0, 0, 0, 0, 8, 8, 0]
>>> union
[0, [4, 1], [4, 1], [4, 1], 0, 0, 0, 8, 8, [0, 8]]

NB: I think your original union solution is incorrect. Compare the last element of the output: 8 vs [0, 8].