Pandas: Setting True to False in a column, if it appears less than n times in a row
--
Music by Eric Matyas
https://www.soundimage.org
Track title: A Thousand Exotic Places Looping v001
--
Chapters
00:00 Question
02:00 Accepted answer (Score 2)
03:37 Answer 2 (Score 1)
04:25 Thank you
--
Full question
https://stackoverflow.com/questions/6328...
Accepted answer links:
[Series.cumsum]: http://pandas.pydata.org/pandas-docs/sta...
[Series.where]: http://pandas.pydata.org/pandas-docs/sta...
[Series.map]: http://pandas.pydata.org/pandas-docs/sta...
[Series.value_counts]: http://pandas.pydata.org/pandas-docs/sta...
[Series.gt]: http://pandas.pydata.org/pandas-docs/sta...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
--
ACCEPTED ANSWER
Score 2
The idea is to label each run of consecutive True values. Take the cumulative sum of the inverted boolean mask with Series.cumsum, so every True in a run shares one label, then replace the labels of the non-matching (False) rows with NaN using Series.where. Finally, count the rows in each group with Series.map and Series.value_counts, and compare the counts against the threshold with Series.gt:
s = (~df['input']).cumsum().where(df['input'])
df['out'] = s.map(s.value_counts()).gt(4)
print (df)
input output out
0 False False False
1 False False False
2 False False False
3 False False False
4 True False False
5 True False False
6 False False False
7 False False False
8 True False False
9 False False False
10 False False False
11 False False False
12 True False False
13 True False False
14 True False False
15 False False False
16 False False False
17 False False False
18 True True True
19 True True True
20 True True True
21 True True True
22 True True True
23 False False False
Details:
s = (~df['input']).cumsum().where(df['input'])
print (df.assign(inv = (~df['input']),
                 cumsum = (~df['input']).cumsum(),
                 s = (~df['input']).cumsum().where(df['input']),
                 count = s.map(s.value_counts()),
                 out = s.map(s.value_counts()).gt(4)))
input output inv cumsum s count out
0 False False True 1 NaN NaN False
1 False False True 2 NaN NaN False
2 False False True 3 NaN NaN False
3 False False True 4 NaN NaN False
4 True False False 4 4.0 2.0 False
5 True False False 4 4.0 2.0 False
6 False False True 5 NaN NaN False
7 False False True 6 NaN NaN False
8 True False False 6 6.0 1.0 False
9 False False True 7 NaN NaN False
10 False False True 8 NaN NaN False
11 False False True 9 NaN NaN False
12 True False False 9 9.0 3.0 False
13 True False False 9 9.0 3.0 False
14 True False False 9 9.0 3.0 False
15 False False True 10 NaN NaN False
16 False False True 11 NaN NaN False
17 False False True 12 NaN NaN False
18 True True False 12 12.0 5.0 True
19 True True False 12 12.0 5.0 True
20 True True False 12 12.0 5.0 True
21 True True False 12 12.0 5.0 True
22 True True False 12 12.0 5.0 True
23 False False True 13 NaN NaN False
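The steps in the details table can be wrapped into a small reusable function. This is only a sketch: the sample frame is rebuilt from the printed input column, and the names keep_long_runs and min_len are hypothetical, not part of the answer. It uses Series.ge with a minimum run length, so min_len=5 corresponds to the gt(4) comparison used above.

```python
import pandas as pd

# Sample data rebuilt from the printed 'input' column (24 rows).
values = ([False] * 4 + [True] * 2 + [False] * 2 + [True] + [False] * 3
          + [True] * 3 + [False] * 3 + [True] * 5 + [False])
df = pd.DataFrame({"input": values})


def keep_long_runs(mask: pd.Series, min_len: int) -> pd.Series:
    """Return True only for rows inside a run of at least min_len Trues.

    Hypothetical helper name; it follows the cumsum/where/map steps above.
    """
    # Every False row bumps the counter, so all Trues of one run share a label.
    # where() then blanks the label (NaN) on the False rows themselves.
    run_id = (~mask).cumsum().where(mask)
    # value_counts() ignores NaN, so it yields the length of each True run;
    # map() broadcasts that length back onto the rows of the run.
    run_len = run_id.map(run_id.value_counts())
    # Comparing NaN gives False, so the False rows stay False.
    return run_len.ge(min_len)


df["out"] = keep_long_runs(df["input"], min_len=5)
print(df[df["out"]])
```

With the sample data only the five-row run at the end (index 18 to 22) survives; the shorter runs are set to False.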
ANSWER 2
Score 1
Here's a way to do that:
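N = 4
# Start a new group at every False row and count the rows in each group
df["group_size"] = df.assign(group = (~df["input"]).cumsum()).groupby("group")["input"].transform("count")
# Keep True only where the row is True and its group is large enough
df["output"] = (df["group_size"] > N) & df["input"]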
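The output is shown below. Note that group_size counts the False row that opens each group, so for a run of True values it is the run length + 1, and group_size > N therefore keeps runs of N or more True values. The final result is still fine for this data: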
input group_size output
0 False 1 False
1 False 1 False
2 False 1 False
3 False 3 False
4 True 3 False
5 True 3 False
6 False 1 False
7 False 2 False
8 True 2 False
9 False 1 False
10 False 1 False
11 False 4 False
12 True 4 False
13 True 4 False
14 True 4 False
15 False 1 False
16 False 1 False
17 False 6 False
18 True 6 True
19 True 6 True
20 True 6 True
21 True 6 True
22 True 6 True
23 False 1 False
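If the off-by-one in the group size is a concern, one possible variant sums the boolean column per group instead of counting rows, so the leading False row is not included. This is a sketch, not part of the answer: the column name run_length is an assumption, and the sample frame is rebuilt from the printed input column.

```python
import pandas as pd

# Sample data rebuilt from the printed 'input' column (24 rows).
values = ([False] * 4 + [True] * 2 + [False] * 2 + [True] + [False] * 3
          + [True] * 3 + [False] * 3 + [True] * 5 + [False])
df = pd.DataFrame({"input": values})
N = 4

# Start a new group at every False row, as in the answer above.
group = (~df["input"]).cumsum()
# Summing the 0/1 values counts only the True rows of each group,
# so the result is the actual run length, without the leading False.
df["run_length"] = df["input"].astype(int).groupby(group).transform("sum")
# Strictly more than N consecutive Trues are kept; everything else is False.
df["output"] = df["input"] & (df["run_length"] > N)
print(df)
```

Here the comparison run_length > N keeps runs of at least N + 1 True values, which matches the gt(4) threshold of the accepted answer.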