The Python Oracle

Remove outliers from pandas dataframe python

Full question
https://stackoverflow.com/questions/4546...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #outliers

ACCEPTED ANSWER

Score 3


It seems you need boolean indexing with ~ to invert the condition, because you want to keep only the rows that are not outliers (and drop the outliers):

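# Within each 'Data' group, flag values more than 1.96 standard deviations
# from the group mean, then keep only the rows that are not flagged.
df1 = df[~df.groupby('Data').transform(lambda x: abs(x - x.mean()) > 1.96 * x.std()).values]
print(df1)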
       Data      Time
0 -0.704239  7.304021
1 -0.704239  7.352021
2 -0.704239  7.400021
3 -0.704239  7.448021
4 -0.825279  7.496021
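
The same idea can be sketched as a self-contained script. The sample values below and the helper name is_outlier are hypothetical (the question's full DataFrame is not shown here), and this variant selects the 'Time' column explicitly so the mask is a one-dimensional boolean Series rather than a 2-D array:

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question:
# two groups of eight values, each containing one obvious outlier.
df = pd.DataFrame({
    "Data": ["a"] * 8 + ["b"] * 8,
    "Time": [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 50.0,
             5.0, 5.1, 4.9, 5.0, 5.05, 4.95, 5.0, -40.0],
})


def is_outlier(s: pd.Series) -> pd.Series:
    """Flag values more than 1.96 sample standard deviations from the mean."""
    return (s - s.mean()).abs() > 1.96 * s.std()


# transform() applies is_outlier per group and returns a result aligned
# with df's index; astype(bool) guards against any dtype coercion.
mask = df.groupby("Data")["Time"].transform(is_outlier).astype(bool)

# ~ inverts the mask, so only the non-outlier rows are kept.
df1 = df[~mask]
print(df1)
```

With very small groups a single extreme value inflates the group's own standard deviation, so a 1.96 threshold may flag nothing; that is why the sketch uses eight rows per group.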