The Python Oracle

Label contiguous groups of True elements within a pandas Series

Full question
https://stackoverflow.com/questions/5293...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #timeseries #series




ACCEPTED ANSWER

Score 4


Here's an unlikely but simple, working solution:

import scipy.ndimage as mnts  # the scipy.ndimage.measurements namespace is deprecated

labeled, clusters = mnts.label(df.A.values)
# labeled is what you want; clusters is the number of clusters.

df['Labels'] = labeled  # puts it into df (attribute assignment would not create a column)

Tested as:

import numpy as np

a = np.array([False, False,  True,  True,  True, False,  True, False, False,
        True, False,  True,  True,  True,  True,  True,  True,  True,
        False, True], dtype=bool)

labeled, clusters = mnts.label(a)

>>> labeled
array([0, 0, 1, 1, 1, 0, 2, 0, 0, 3, 0, 4, 4, 4, 4, 4, 4, 4, 0, 5], dtype=int32)

>>> clusters
5
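
If SciPy is not available, the same labels can be reproduced for the 1-D case with NumPy alone. The following is only a minimal sketch, not part of the original answer; the function name label_runs is hypothetical. It marks each position where a run of True values starts and takes a cumulative sum of those marks.

```python
import numpy as np

def label_runs(mask):
    """Label contiguous runs of True in a 1-D boolean array (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    # prev[i] holds mask[i - 1]; the first element has no predecessor, so it stays False.
    prev = np.zeros_like(mask)
    prev[1:] = mask[:-1]
    # A run starts where the current value is True and the previous one is not.
    starts = mask & ~prev
    # Counting the starts seen so far numbers the runs; multiplying by mask zeroes the gaps.
    labels = np.cumsum(starts) * mask
    return labels, int(starts.sum())

a = np.array([False, False, True, True, True, False, True, False, False,
              True, False, True, True, True, True, True, True, True,
              False, True], dtype=bool)
labeled, clusters = label_runs(a)
```

On the test array above this gives the same labels and cluster count as the SciPy call.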



ANSWER 2

Score 2


With cumsum and factorize:

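import numpy as np
import pandas as pd

a = df.A.values
z = np.zeros(a.shape, int)

# (~a).cumsum() is constant within each run of True values;
# factorize renumbers those values 0, 1, 2, ... in order of appearance.
z[a] = pd.factorize((~a).cumsum()[a])[0] + 1

df.assign(Label=z)  # returns a new DataFrame; df itself is unchanged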

        A  Label
0   False      0
1    True      1
2    True      1
3    True      1
4   False      0
5   False      0
6    True      2
7   False      0
8   False      0
9    True      3
10   True      3
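
To see why this works, it can help to look at the intermediate values. The sketch below rebuilds the frame shown in the output above (the column values are read off that table) and splits the one-liner into steps; the names false_count, run_keys and codes are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data matching the table above.
df = pd.DataFrame({"A": [False, True, True, True, False, False,
                         True, False, False, True, True]})
a = df["A"].to_numpy()

# Running count of False values: it only increases on False,
# so it stays constant across each run of True values.
false_count = (~a).cumsum()      # [1 1 1 1 2 3 3 4 5 5 5]

# Keep the count only at the True positions.
run_keys = false_count[a]        # [1 1 1 3 5 5]

# factorize maps the distinct keys to 0, 1, 2, ... in order of first appearance.
codes = pd.factorize(run_keys)[0]  # [0 0 0 1 2 2]

z = np.zeros(a.shape, int)
z[a] = codes + 1                 # labels start at 1; False positions stay 0
```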



ANSWER 3

Score 2


You can use cumsum and groupby + ngroup to mark groups.

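# Give every run of True values a shared key; False rows are masked, then back-filled.
v = (~df.A).cumsum().where(df.A).bfill()
df['Label'] = (
    # fillna(..., downcast=...) is deprecated in recent pandas; cast explicitly instead.
    v.groupby(v).ngroup().add(1).where(df.A).fillna(0).astype(int))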

df
        A  Label
0   False      0
1    True      1
2    True      1
3    True      1
4   False      0
5   False      0
6    True      2
7   False      0
8   False      0
9    True      3
10   True      3
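
The chained expression is dense, so here is a step-by-step sketch of the same idea on the sample data shown above. The intermediate names (group_no, label) are illustrative, and the final cast uses astype(int) on the assumption that an integer column is wanted.

```python
import pandas as pd

# Sample data matching the table above.
df = pd.DataFrame({"A": [False, True, True, True, False, False,
                         True, False, False, True, True]})

# Key that is constant within each run of True; False rows become NaN and are
# then back-filled so that every row has some key for the groupby.
v = (~df["A"]).cumsum().where(df["A"]).bfill()
# v: 1, 1, 1, 1, 3, 3, 3, 5, 5, 5, 5

# ngroup numbers the groups 0, 1, 2, ... in sorted key order.
group_no = v.groupby(v).ngroup()
# group_no: 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2

# Shift to 1-based labels, blank out the False rows, and fill them with 0.
label = group_no.add(1).where(df["A"]).fillna(0).astype(int)
df["Label"] = label
```

The back-filled keys on the False rows are only placeholders; the second where(df["A"]) discards them before the final fill.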