The Python Oracle

Python: numpy array sublist match with large list where sequence matter

--

Full question
https://stackoverflow.com/questions/4210...

Answer 1 links:
[slicing]: https://docs.scipy.org/doc/numpy/referen...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #numpy



ANSWER 1

Score 2


One simple solution is to slice the data array according to the sublist pattern, check the slices for that signature, and get the matching indices:

np.flatnonzero(~a[:-2] & a[1:-1] & a[2:]) # a as data array

Explanation

We take three slices of the data array: one that starts at index 0 and leaves out the last two elements, another that starts at index 1 and ends at the second-to-last element, and a third that starts at index 2 and runs to the end. Position i in these slices corresponds to elements i, i+1 and i+2 of the data array, so the three slices line up with the three elements of the sublist pattern [False, True, True].

The first element must be False, which means its negation must be True. Negation in NumPy is done with the ~ operator. Combining the three slices with & gives a mask that is True wherever the pattern starts, and np.flatnonzero returns the corresponding indices.

For the given data, this results in:

In [79]: np.flatnonzero(~a[:-2] & a[1:-1] & a[2:])
Out[79]: array([ 12,  31,  68, 112])
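
To make the slicing idea concrete, here is a minimal sketch on a small made-up boolean array. The sample values are an assumption for illustration only; they are not the data from the question, so the indices differ from the output above.

```python
import numpy as np

# Hypothetical sample data (not the array from the question).
a = np.array([True, False, True, True, False, False, True, True, True])

# Position i of each slice maps to a[i], a[i+1], a[i+2] respectively.
first = ~a[:-2]    # True where a[i] is False
second = a[1:-1]   # True where a[i+1] is True
third = a[2:]      # True where a[i+2] is True

# The combined mask is True wherever [False, True, True] starts.
mask = first & second & third
idx = np.flatnonzero(mask)
print(idx)  # [1 5]
```

Note that this form is tied to the specific pattern [False, True, True]: a different sublist needs a different combination of slices and negations.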



ACCEPTED ANSWER

Score 0


The in operator won't check for sub-arrays. Instead, it checks for elements.

You will have to do something like this:
(A is the big array and b is the sub-list, named this way for readability.)

n = len(b)
# range is the Python 3 spelling; on Python 2, xrange avoids building a list
c = [i for i in range(len(A)-n+1) if (b==A[i:i+n]).all()]

c is the required list of indexes.

Explanation:
This is a basic list comprehension in Python.
The idea is to take every length-n sub-array of the big array and check whether it matches the sublist.

Breaking down the statement for better understanding:

c = []
for i in range(len(A)-n+1):      # every possible start index (xrange on Python 2)
    if (b==A[i:i+n]).all():      # if the sublist and the window match element-wise
        c.append(i)
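
The loop above can also be written without a Python-level loop. The sketch below is not part of the original answer: it assumes NumPy 1.20 or later (for sliding_window_view), and the helper name find_sublist and the sample data are hypothetical. It runs the list-comprehension version next to the windowed version so the two can be compared.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_sublist(A, b):
    """Return the start indices where sequence b occurs in 1-D array A.

    Hypothetical helper for illustration; assumes NumPy >= 1.20.
    """
    A = np.asarray(A)
    b = np.asarray(b)
    n = len(b)
    if n == 0 or n > len(A):
        return np.array([], dtype=int)
    # Each row of `windows` is one length-n view A[i:i+n]; no data is copied.
    windows = sliding_window_view(A, n)
    # A row matches when all n elements equal b.
    return np.flatnonzero((windows == b).all(axis=1))


# Hypothetical sample data (not the array from the question).
A = np.array([True, False, True, True, False, False, True, True, True])
b = [False, True, True]

# The list-comprehension approach from the answer, in Python 3 form.
n = len(b)
loop_result = [i for i in range(len(A) - n + 1) if (b == A[i:i + n]).all()]

vec_result = find_sublist(A, b)
print(loop_result)  # [1, 5]
print(vec_result)   # [1 5]
```

The comparison windows == b builds an intermediate boolean array of shape (len(A) - n + 1, n), so for a very large array and a long sublist the plain loop may use less memory.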