What is the difference between re.search and re.match?
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Magic Ocean Looping
--
Chapters
00:00 What Is The Difference Between Re.Search And Re.Match?
00:30 Accepted Answer Score 669
02:27 Answer 2 Score 59
02:40 Answer 3 Score 143
02:53 Answer 4 Score 41
03:31 Thank you
--
Full question
https://stackoverflow.com/questions/1809...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #regex #search #match
#avk47
ACCEPTED ANSWER
Score 669
re.match is anchored at the beginning of the string. That has nothing to do with newlines, so it is not the same as using ^ in the pattern.
As the re.match documentation says:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding
MatchObjectinstance. ReturnNoneif the string does not match the pattern; note that this is different from a zero-length match.Note: If you want to locate a match anywhere in string, use
search()instead.
re.search searches the entire string, as the documentation says:
Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding
MatchObjectinstance. ReturnNoneif no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
So if you need to match at the beginning of the string, or to match the entire string use match. It is faster. Otherwise use search.
The documentation has a specific section for match vs. search that also covers multiline strings:
Python offers two different primitive operations based on regular expressions:
matchchecks for a match only at the beginning of the string, whilesearchchecks for a match anywhere in the string (this is what Perl does by default).Note that
matchmay differ fromsearcheven when using a regular expression beginning with'^':'^'matches only at the start of the string, or inMULTILINEmode also immediately following a newline. The “match” operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optionalposargument regardless of whether a newline precedes it.
Now, enough talk. Time to see some example code:
# example code:
string_with_newlines = """something
someotherthing"""
import re
print re.match('some', string_with_newlines) # matches
print re.match('someother',
string_with_newlines) # won't match
print re.match('^someother', string_with_newlines,
re.MULTILINE) # also won't match
print re.search('someother',
string_with_newlines) # finds something
print re.search('^someother', string_with_newlines,
re.MULTILINE) # also finds something
m = re.compile('thing$', re.MULTILINE)
print m.match(string_with_newlines) # no match
print m.match(string_with_newlines, pos=4) # matches
print m.search(string_with_newlines,
re.MULTILINE) # also matches
ANSWER 2
Score 143
search ⇒ find something anywhere in the string and return a match object.
match ⇒ find something at the beginning of the string and return a match object.
ANSWER 3
Score 59
re.search searches for the pattern throughout the string, whereas re.match does not search the pattern; if it does not, it has no other choice than to match it at start of the string.
ANSWER 4
Score 41
The difference is, re.match() misleads anyone accustomed to Perl, grep, or sed regular expression matching, and re.search() does not. :-)
More soberly, As John D. Cook remarks, re.match() "behaves as if every pattern has ^ prepended." In other words, re.match('pattern') equals re.search('^pattern'). So it anchors a pattern's left side. But it also doesn't anchor a pattern's right side: that still requires a terminating $.
Frankly given the above, I think re.match() should be deprecated. I would be interested to know reasons it should be retained.