Remove outliers from pandas dataframe python
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Switch On Looping
--
Chapters
00:00 Question
01:51 Accepted answer (Score 3)
02:14 Thank you
--
Full question
https://stackoverflow.com/questions/4546...
Accepted answer links:
[boolean indexing]: http://pandas.pydata.org/pandas-docs/sta...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #outliers
#avk47
ACCEPTED ANSWER
Score 3
It seems you need boolean indexing with ~ to invert the condition, because you want to keep only the rows that are not outliers (and drop the outliers):
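# Flag 'Time' values more than 1.96 standard deviations from their group mean
mask = df.groupby('Data')['Time'].transform(lambda x: abs(x - x.mean()) > 1.96 * x.std())
df1 = df[~mask.astype(bool)]
print(df1)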
       Data      Time
0 -0.704239  7.304021
1 -0.704239  7.352021
2 -0.704239  7.400021
3 -0.704239  7.448021
4 -0.825279  7.496021
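
For anyone who wants to try the idea without the original data, here is a minimal self-contained sketch of the same pattern. The column names "group" and "value", the sample numbers, and the helper is_outlier are all made up for illustration and are not from the question. It builds a per-group boolean mask with groupby(...).transform(...) and inverts it with ~ so that only the non-outlier rows remain.

```python
import pandas as pd

# Hypothetical sample data: two groups, each with one obvious outlier.
df = pd.DataFrame({
    "group": ["a"] * 10 + ["b"] * 10,
    "value": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 9.0,
              5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, -3.0],
})


def is_outlier(s: pd.Series) -> pd.Series:
    """Return True where a value lies more than 1.96 std devs from the mean of s."""
    return (s - s.mean()).abs() > 1.96 * s.std()


# transform() applies is_outlier within each group and returns a Series
# aligned with df's index; astype(bool) guards against dtype coercion.
mask = df.groupby("group")["value"].transform(is_outlier).astype(bool)

# ~ inverts the mask, so boolean indexing keeps only the non-outlier rows.
df1 = df[~mask]
print(df1)
```

Note that the 1.96 threshold is simply the one used in the answer above. With very small groups a single extreme value inflates the group's standard deviation, so it may not exceed the threshold at all; that is why this sketch uses ten rows per group.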