Numpy: Duplicate mask for an array (returning True if we've seen that value before, False otherwise)
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Romantic Lands Beckon
--
Chapters
00:00 Numpy: Duplicate Mask For An Array (Returning True If We've Seen That Value Before, False Otherw
00:31 Accepted Answer Score 3
01:21 Answer 2 Score 3
01:33 Answer 3 Score 0
01:50 Thank you
--
Full question
https://stackoverflow.com/questions/6396...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #list #numpy
#avk47
ACCEPTED ANSWER
Score 3
Approach #1 : With sorting
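import numpy as np

def mask_firstocc(a):
    # Stable sort keeps equal values in their original relative order
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    # In sorted order, a value is a repeat if it equals its predecessor;
    # sidx.argsort() maps the mask back to the original positions
    out = np.r_[False, b[:-1] == b[1:]][sidx.argsort()]
    return out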
We can boost performance further by using array assignment, which avoids the second argsort -
def mask_firstocc_v2(a):
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    mask = np.r_[False, b[:-1] == b[1:]]
    # Scatter the sorted-order mask back with one assignment
    # instead of a second argsort
    out = np.empty(len(a), dtype=bool)
    out[sidx] = mask
    return out
Sample run -
In [166]: a
Out[166]: array([2, 1, 1, 0, 0, 4, 0, 3])
In [167]: mask_firstocc(a)
Out[167]: array([False, False, True, False, True, False, True, False])
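To make the sorting approach easier to follow, here is a step-by-step trace on the same sample array. It is only an illustrative sketch; the variable name sorted_mask is introduced here and is not part of the original answer.

```python
import numpy as np

a = np.array([2, 1, 1, 0, 0, 4, 0, 3])

# Indices that sort a; the stable sort keeps equal values in original order
sidx = a.argsort(kind='stable')
print(sidx)         # [3 4 6 1 2 0 7 5]

# The sorted values: duplicates are now adjacent
b = a[sidx]
print(b)            # [0 0 0 1 1 2 3 4]

# In sorted order, an element is a repeat if it equals its predecessor
sorted_mask = np.r_[False, b[:-1] == b[1:]]
print(sorted_mask)  # [False  True  True False  True False False False]

# Scatter the mask back to the original positions
out = np.empty(len(a), dtype=bool)
out[sidx] = sorted_mask
print(out)          # [False False  True False  True False  True False]
```

Because the sort is stable, the first element of each run of equal values in b is also the earliest occurrence in a, which is why that element gets False.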
Approach #2 : With np.unique(..., return_index)
We can leverage np.unique with return_index=True, which returns the index of the first occurrence of each unique element. Starting from an all-True mask and setting those indices to False gives the result -
def mask_firstocc_with_unique(a):
    mask = np.ones(len(a), dtype=bool)
    mask[np.unique(a, return_index=True)[1]] = False
    return mask
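As a quick sanity check, the sorting-based and np.unique-based approaches can be compared on random input. This is a minimal sketch: the function names mask_by_sorting and mask_by_unique, the seed, and the array size are assumptions chosen for illustration.

```python
import numpy as np

def mask_by_sorting(a):
    # Sorting-based mask: True where the value appeared earlier in a
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    out = np.empty(len(a), dtype=bool)
    out[sidx] = np.r_[False, b[:-1] == b[1:]]
    return out

def mask_by_unique(a):
    # np.unique-based mask: mark first occurrences False, everything else True
    mask = np.ones(len(a), dtype=bool)
    mask[np.unique(a, return_index=True)[1]] = False
    return mask

rng = np.random.default_rng(0)
a = rng.integers(0, 50, size=1000)

agree = np.array_equal(mask_by_sorting(a), mask_by_unique(a))
print(agree)  # True
```

Both masks contain exactly one False per distinct value, so the number of False entries equals len(np.unique(a)).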
ANSWER 2
Score 3
Use np.unique
import numpy as np

a = np.array([1, 2, 1, 2, 3])
_, ix = np.unique(a, return_index=True)  # ix: index of each value's first occurrence
b = np.full(a.shape, True)
b[ix] = False
In [45]: b
Out[45]: array([False, False, True, True, False])
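One way to use such a mask, sketched here under the assumption that the goal is to separate first occurrences from repeats: index the array with the mask or its negation. The names first_seen and repeats are hypothetical.

```python
import numpy as np

a = np.array([1, 2, 1, 2, 3])
_, ix = np.unique(a, return_index=True)
b = np.full(a.shape, True)
b[ix] = False

# Negating the mask keeps only first occurrences, in original order
first_seen = a[~b]
# The mask itself selects the repeated entries
repeats = a[b]

print(first_seen)  # [1 2 3]
print(repeats)     # [1 2]
```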
ANSWER 3
Score 0
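You can achieve this with the built-in enumerate function, which lets you loop over each index and value together. A value is a duplicate when list.index, which returns the position of its first occurrence, differs from the current index: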
array = [1, 2, 1, 2, 3]
mask = []
for i, v in enumerate(array):
    if array.index(v) == i:
        mask.append(False)
    else:
        mask.append(True)
print(mask)
Output:
[False, False, True, True, False]
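Note that list.index rescans the list from the start on every iteration, so the loop above takes quadratic time on long inputs. A possible alternative, not taken from the original answer, is to remember seen values in a set. The function name duplicate_mask is hypothetical, and the sketch assumes the values are hashable.

```python
def duplicate_mask(values):
    """Return a list that is True where the value appeared earlier in values."""
    seen = set()
    mask = []
    for v in values:
        # Set membership is a constant-time check on average
        mask.append(v in seen)
        seen.add(v)
    return mask

print(duplicate_mask([1, 2, 1, 2, 3]))
# [False, False, True, True, False]
```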