How to find, faster, the last True position in the group of True values that starts at the first position?

Full question
https://stackoverflow.com/questions/7030...

Accepted answer links:
[this]: https://stackoverflow.com/a/53020765/290...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Chapters
00:00 Question
01:34 Accepted answer
01:56 Answer 2
02:49 Thank you

--

Tags
#python #pandas #dataframe

ACCEPTED ANSWER

Score 4

Use numba to scan the values only up to the end of the first block of True values, inspired by this solution:

from numba import njit

@njit
def sort_order3(a, b):
    if not a[0]:
        return 0
    else:
        for i in range(1, len(a)):
            if not a[i]:
                return b[i - 1]
        return b[-1]

# generate_data() is the helper defined in the original question
df = generate_data()
print(sort_order3(df['data'].to_numpy(), df['order'].to_numpy()))
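
The snippet above depends on generate_data() from the question, which is not reproduced here. As a minimal sketch of the same loop, the version below drops the @njit decorator and runs on small hand-made lists. The function name last_of_leading_true and the sample values are made up for illustration and are not from the original post.

```python
def last_of_leading_true(a, b):
    """Return b at the last position of the leading run of True values in a.

    Mirrors the loop in the accepted answer, without numba.
    Returns 0 when a does not start with True.
    """
    if not a[0]:
        return 0
    for i in range(1, len(a)):
        if not a[i]:
            # a[i] is the first False, so the leading run ended at i - 1
            return b[i - 1]
    # No False found: the run covers the whole input
    return b[-1]


# Hypothetical sample data (not from the question)
data = [True, True, True, False, True, False]
order = [10, 20, 30, 40, 50, 60]

result_mixed = last_of_leading_true(data, order)
result_all_true = last_of_leading_true([True, True], [7, 8])
result_starts_false = last_of_leading_true([False, True], [7, 8])
print(result_mixed, result_all_true, result_starts_false)
```

The loop stops at the first False, so the cost depends on the length of the leading block rather than on the length of the whole column.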

ANSWER 2

Score 1

Maybe I am missing something, but why don't you just get the index of the first False in df.data and then use that index to get the value in the df.order column?

For example:

def sort_order3(df):
    try:
        idx = df.data.to_list().index(False)
    except ValueError: # meaning there is no False in df.data
        idx = df.data.size - 1
    return df.order[idx]

Or, for really large data, NumPy might be faster:

import numpy as np

def sort_order4(df):
    try:
        idx = np.argwhere(~df.data.values)[0, 0]
    except IndexError: # meaning there is no False in df.data
        idx = df.data.size - 1
    return df.order[idx]
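
To make the np.argwhere step concrete, here is a small sketch on a plain NumPy array. The helper name first_false_index and the sample mask are assumptions for illustration. It checks for an empty result explicitly instead of catching IndexError, which is a stylistic swap rather than the answer's own code.

```python
import numpy as np


def first_false_index(mask):
    """Index of the first False in a 1-D boolean array, or None if there is none."""
    hits = np.argwhere(~mask)  # shape (n_false, 1) for a 1-D input
    if hits.shape[0] == 0:
        return None
    return int(hits[0, 0])


# Hypothetical sample mask (not from the question)
mask = np.array([True, True, True, False, True, False])

idx_mixed = first_false_index(mask)
idx_all_true = first_false_index(np.array([True, True]))
print(idx_mixed, idx_all_true)
```

With this sample, index 3 is the first False, so the leading block of True values ends at index 2. That is why the accepted answer's loop reads b[i - 1] rather than b[i].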

The timing on my device:

%timeit sort_order(df.copy())
565 µs ± 6.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit sort_order2(df.copy())
443 µs ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit sort_order3(df.copy())
96.5 µs ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit sort_order4(df.copy())
112 µs ± 5.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)