Check if String in List of Strings is in Pandas DataFrame Column
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Dreamlands
--
Chapters
00:00 Question
01:39 Accepted answer (Score 6)
02:50 Thank you
--
Full question
https://stackoverflow.com/questions/5617...
Accepted answer links:
[Series.isin]: http://pandas.pydata.org/pandas-docs/sta...
[Series.str.contains]: http://pandas.pydata.org/pandas-docs/sta...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #string #pandas
#avk47
ACCEPTED ANSWER
Score 9
If you need to match whole values against the list, use Series.isin:
df['Match'] = df["Brand"].isin(search_for_these_values)
print (df)
            Brand  Price Liscence Plate  Match
0     Honda Civic  22000        ABC 123  False
1  Toyota Corolla  25000        XYZ 789  False
2      Ford Focus  27000        CBA 321   True
3         Audi A4  35000        ZYX 987  False
4             NaN  29000        DEF 456  False
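The snippet above relies on df and search_for_these_values from the question, which are not reproduced here. As a minimal self-contained sketch, the data below is rebuilt from the printed output, and the search list is a hypothetical one (an assumption, not the question's actual list) chosen so that the result agrees with the Match column shown:

```python
import numpy as np
import pandas as pd

# Data rebuilt from the printed output above.
df = pd.DataFrame({
    "Brand": ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4", np.nan],
    "Price": [22000, 25000, 27000, 35000, 29000],
    "Liscence Plate": ["ABC 123", "XYZ 789", "CBA 321", "ZYX 987", "DEF 456"],
})

# Hypothetical search list: an assumption, not taken from the question.
search_for_these_values = ["Honda", "Toy", "Ford Focus", "Audi A4 2010"]

# isin compares whole cell values, so only an exact "Ford Focus" matches.
# The missing Brand value is not in the list, so it gives False as well.
df["Match"] = df["Brand"].isin(search_for_these_values)
print(df["Match"].tolist())
```

Only row 2 is True here, because "Ford Focus" is the only Brand that equals an entry of the list exactly.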
The solution with match checks for substrings rather than whole values, so its output is different.
An alternative for matching substrings is Series.str.contains with the parameter na=False, so that missing values give False instead of NaN:
df['Match'] = df["Brand"].str.contains(pattern, na=False)
print (df)
            Brand  Price Liscence Plate  Match
0     Honda Civic  22000        ABC 123   True
1  Toyota Corolla  25000        XYZ 789   True
2      Ford Focus  27000        CBA 321   True
3         Audi A4  35000        ZYX 987  False
4             NaN  29000        DEF 456  False
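The variable pattern is not defined in the answer as shown. One plausible way to build it, sketched here under that assumption, is to join the search values with the regex alternation operator |. The search list is again hypothetical, and re.escape is added so that any regex metacharacters in the values are treated literally:

```python
import re

import numpy as np
import pandas as pd

brand = pd.Series(
    ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4", np.nan],
    name="Brand",
)

# Hypothetical search list: an assumption, not taken from the question.
search_for_these_values = ["Honda", "Toy", "Ford Focus", "Audi A4 2010"]

# Assumed construction of `pattern`: one regex alternative per search value.
pattern = "|".join(re.escape(value) for value in search_for_these_values)

# True where any search value occurs inside the Brand string.
# na=False makes the missing Brand give False instead of NaN.
match = brand.str.contains(pattern, na=False)
print(match.tolist())
```

With these values, "Audi A4" is False because the cell does not contain the longer string "Audi A4 2010"; the test runs in the opposite direction from the list comprehension in the EDIT.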
EDIT:
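To test whether each Brand value is a substring of any entry in search_for_these_values, use a list comprehension: for each value, any with the in operator returns True if at least one entry contains it. The x == x test is False only for NaN, so missing values give False: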
df['Match'] = [any(x in z for z in search_for_these_values)
               if x == x
               else False
               for x in df["Brand"]]
print (df)
            Brand  Price Liscence Plate  Match
0     Honda Civic  22000        ABC 123  False
1  Toyota Corolla  25000        XYZ 789  False
2      Ford Focus  27000        CBA 321   True
3         Audi A4  35000        ZYX 987   True
4             NaN  29000        DEF 456  False
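The same check can be written with a small named helper, which some readers may find easier to follow than the x == x idiom. This is only a sketch: the helper name is_substring_of_any and the search list are hypothetical, and an isinstance test stands in for the NaN check.

```python
import numpy as np
import pandas as pd

brand = pd.Series(
    ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4", np.nan],
    name="Brand",
)

# Hypothetical search list: an assumption, not taken from the question.
search_for_these_values = ["Honda", "Toy", "Ford Focus", "Audi A4 2010"]


def is_substring_of_any(value, candidates):
    """Return True if value is a string contained in at least one candidate."""
    # Non-string cells (such as NaN for a missing Brand) never match.
    return isinstance(value, str) and any(value in c for c in candidates)


match = [is_substring_of_any(x, search_for_these_values) for x in brand]
print(match)
```

Here "Audi A4" matches because it is contained in "Audi A4 2010", while "Honda Civic" does not, since it is longer than the entry "Honda".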