The Python Oracle

Deleting DataFrame row in Pandas based on column value

Chapters
00:00 Question
00:59 Accepted answer (Score 1475)
01:12 Answer 2 (Score 294)
01:41 Answer 3 (Score 133)
02:01 Answer 4 (Score 65)
02:24 Thank you

--

Full question
https://stackoverflow.com/questions/1817...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #dataframe #performance #deleterow




ACCEPTED ANSWER

Score 1593


If I'm understanding correctly, it should be as simple as:

df = df[df.line_race != 0]
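
For anyone who wants to try this out, here is a minimal sketch of the same boolean-mask filter on a small made-up DataFrame. The sample values and the "horse" column are assumptions for illustration only; only line_race comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the line_race column name is from the question.
df = pd.DataFrame({
    "line_race": [0, 5, 0, 12],
    "horse": ["a", "b", "c", "d"],
})

# Boolean mask: True for the rows we want to keep.
mask = df["line_race"] != 0

# Keep only the rows where the mask is True, then renumber the index.
df = df[mask].reset_index(drop=True)
print(df)
```

The reset_index(drop=True) call is optional; without it the surviving rows keep their original index labels.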



ANSWER 2

Score 314


For future readers: note that a != comparison doesn't do anything when you are trying to filter out None/missing values.

Does work:

df = df[df.line_race != 0]

Doesn't do anything:

df = df[df.line_race != None]

Does work:

df = df[df.line_race.notnull()]
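
A small sketch of the difference, assuming a float column that contains missing values (the sample data is made up). Comparing against None with != evaluates to True for every row, so nothing is removed, while notnull() or dropna() actually drops the missing rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: both np.nan and None end up as NaN in a float column.
df = pd.DataFrame({"line_race": [0.0, np.nan, 7.0, None]})

# Element-wise "!= None" is True for every row, so no rows are filtered out.
unchanged = df[df["line_race"] != None]  # noqa: E711

# notnull() is False for missing values, so those rows are dropped.
kept = df[df["line_race"].notnull()]

# dropna restricted to one column should give the same rows.
also_kept = df.dropna(subset=["line_race"])

print(len(unchanged), len(kept), len(also_kept))
```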



ANSWER 3

Score 167


Just to add another solution, particularly useful if you are using the new pandas accessors: the other solutions replace the original DataFrame and lose the accessors.

df.drop(df.loc[df['line_race']==0].index, inplace=True)
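
A rough sketch of what "in place" means here, using made-up data: the drop mutates the existing DataFrame object instead of binding the name to a new one, so any other reference to it sees the change. Whether this matters for a particular accessor setup is not something this sketch verifies.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"line_race": [0, 3, 0, 8]})

# A second reference to the same object, to show that it is modified in place.
same_object = df

# Collect the index labels of the rows to delete, then drop them in place.
labels_to_drop = df.index[df["line_race"] == 0]
df.drop(labels_to_drop, inplace=True)

print(same_object is df)
print(same_object)
```

Note that the remaining rows keep their original index labels.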



ANSWER 4

Score 72


If you want to delete rows based on multiple values of the column, you could use:

df[(df.line_race != 0) & (df.line_race != 10)]

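This drops all rows where line_race is 0 or 10. Note that the expression returns a new DataFrame, so assign the result back (df = ...) if you want to keep it.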
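
When the list of values grows, chaining != comparisons gets unwieldy. One alternative (not part of the original answer) is Series.isin with a negated mask; the sample data and the name "unwanted" below are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"line_race": [0, 10, 4, 0, 25]})

# Values whose rows should be removed.
unwanted = [0, 10]

# isin marks rows whose value is in the list; ~ inverts the mask to keep the rest.
filtered = df[~df["line_race"].isin(unwanted)]

# Same result as chaining the two != comparisons with &.
chained = df[(df["line_race"] != 0) & (df["line_race"] != 10)]

print(filtered)
```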