Label contiguous groups of True elements within a pandas Series
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle2
--
Chapters
00:00 Label Contiguous Groups Of True Elements Within A Pandas Series
00:25 Accepted Answer Score 4
00:49 Answer 2 Score 2
01:05 Answer 3 Score 2
01:20 Thank you
--
Full question
https://stackoverflow.com/questions/5293...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #timeseries #series
#avk47
ACCEPTED ANSWER
Score 4
Here's an unlikely but simple, working solution:
from scipy import ndimage  # the scipy.ndimage.measurements namespace is deprecated

labeled, clusters = ndimage.label(df.A.values)
# labeled holds the group label of each element; clusters is the number of groups.
df['Labels'] = labeled  # bracket assignment is needed to create a new column
Tested as:
import numpy as np

a = np.array([False, False, True, True, True, False, True, False, False,
              True, False, True, True, True, True, True, True, True,
              False, True], dtype=bool)
labeled, clusters = ndimage.label(a)
>>> labeled
array([0, 0, 1, 1, 1, 0, 2, 0, 0, 3, 0, 4, 4, 4, 4, 4, 4, 4, 0, 5], dtype=int32)
>>> clusters
5
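For readers who want to run this against a DataFrame, here is a minimal sketch. The sample frame is an assumption made for illustration (it mirrors the data printed in the other answers); only `ndimage.label` comes from the answer itself.

```python
import pandas as pd
from scipy import ndimage

# Assumed sample data, chosen to match the frame shown in the other answers.
df = pd.DataFrame({"A": [False, True, True, True, False, False,
                         True, False, False, True, True]})

# ndimage.label returns the labeled array and the number of groups found.
# On a 1-D input, adjacent True elements are treated as connected.
labeled, clusters = ndimage.label(df["A"].to_numpy())
df["Labels"] = labeled

print(df)
print("groups:", clusters)
```

False elements get the label 0 and each run of True values gets the next integer, so this frame ends up with three groups.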
ANSWER 2
Score 2
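With cumsum. The cumulative sum of ~a only increases at False elements, so it stays constant within each run of True values; factorizing it at the True positions numbers the runs consecutively.
import numpy as np
import pandas as pd

a = df.A.values
z = np.zeros(a.shape, int)  # False elements keep the label 0
z[a] = pd.factorize((~a).cumsum()[a])[0] + 1
df.assign(Label=z)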
        A  Label
0   False      0
1    True      1
2    True      1
3    True      1
4   False      0
5   False      0
6    True      2
7   False      0
8   False      0
9    True      3
10   True      3
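To make the cumsum trick easier to follow, the sketch below splits the one-liner into its intermediate steps. The input array is assumed to be the `A` column printed above, and the names `run_id`, `codes` and `uniques` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed input: the A column from the output above.
a = np.array([False, True, True, True, False, False,
              True, False, False, True, True])

# Step 1: count the False elements seen so far. The count is constant
# within a run of True values and differs between separate runs.
run_id = (~a).cumsum()

# Step 2: keep the counts at True positions only and map each distinct
# count to 0, 1, 2, ... in order of first appearance.
codes, uniques = pd.factorize(run_id[a])

# Step 3: write the codes (shifted to start at 1) back into a zero array.
z = np.zeros(a.shape, dtype=int)
z[a] = codes + 1

print(run_id)
print(z)
```

Because `pd.factorize` numbers values in order of first appearance, the labels follow the order of the runs in the data.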
ANSWER 3
Score 2
You can use cumsum and groupby + ngroup to mark groups.
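# v holds one key per run of True values; bfill only fills the gaps between runs.
v = (~df.A).cumsum().where(df.A).bfill()
# astype(int) replaces fillna's downcast='infer', which newer pandas deprecates.
df['Label'] = (
    v.groupby(v).ngroup().add(1).where(df.A).fillna(0).astype(int))
df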
        A  Label
0   False      0
1    True      1
2    True      1
3    True      1
4   False      0
5   False      0
6    True      2
7   False      0
8   False      0
9    True      3
10   True      3
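The groupby approach can also be wrapped in a small helper and checked against the test array from the accepted answer. This is a sketch under assumptions: the function name `label_true_runs` is hypothetical, and `.fillna(0).astype(int)` is used here in place of `downcast='infer'`.

```python
import pandas as pd


def label_true_runs(s: pd.Series) -> pd.Series:
    """Label each contiguous run of True values 1, 2, 3, ...; False gets 0.

    Hypothetical helper that follows the cumsum + groupby + ngroup idea.
    """
    s = s.astype(bool)
    # One key per run of True values; NaN at False positions is back-filled.
    v = (~s).cumsum().where(s).bfill()
    # ngroup numbers the keys in sorted order, which is also run order here.
    labels = v.groupby(v).ngroup().add(1)
    # Reset every False position to 0 and return integers.
    return labels.where(s).fillna(0).astype(int)


# Test array taken from the accepted answer.
a = pd.Series([False, False, True, True, True, False, True, False, False,
               True, False, True, True, True, True, True, True, True,
               False, True])
result = label_true_runs(a)
print(result.tolist())
```

On this input the helper should reproduce the labels that the accepted answer reports for the same array.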