The Python Oracle

python re.sub group: number after \number

--

Music by Eric Matyas
https://www.soundimage.org
Track title: Droplet of life

--

Full question
https://stackoverflow.com/questions/5984...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #regex #numbers #regexgroup



ACCEPTED ANSWER

Score 497


Use the \g<1> form of the group reference, which separates the group number from the digits that follow it:

re.sub(r'(foo)', r'\g<1>123', 'foobar')

Relevant excerpt from the docs:

In addition to character escapes and backreferences as described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE.
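The forms described in the excerpt can be compared side by side. The following is a minimal sketch, not part of the original answer; the group name "word" and the variable names are made up for illustration.

```python
import re

text = 'foobar'

# \g<1> ends the group reference explicitly, so "123" is literal text.
explicit = re.sub(r'(foo)', r'\g<1>123', text)
print(explicit)  # foo123bar

# A named group works the same way ("word" is a hypothetical name).
named = re.sub(r'(?P<word>foo)', r'\g<word>123', text)
print(named)  # foo123bar

# \g<0> refers to the whole match, so no capture group is needed.
whole = re.sub(r'foo', r'\g<0>123', text)
print(whole)  # foo123bar

# With a digit directly after \1, the template is read as a reference
# to group 10, which this pattern does not define.
try:
    ambiguous = re.sub(r'(foo)', r'\10', text)
except re.error as exc:
    ambiguous = None
    print('re.error:', exc)
```

All three working variants produce the same string; the last call shows why plain \1 cannot be followed directly by a digit.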




ANSWER 2

Score 1


For this problem I would prefer to match without capturing:

re.sub(r'(?<=foo)', r'123', 'foobar')
  #=> 'foo123bar'

which replaces the zero-width string after 'foo' (think between 'foo' and 'bar') with '123'. (?<=foo) is a positive lookbehind.
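To make the zero-width match visible, here is a small sketch that lists where the lookbehind matches before substituting. The variable names and the second input string are assumptions added for illustration.

```python
import re

pattern = re.compile(r'(?<=foo)')

# The only match is the empty string at index 3, right after 'foo'.
spans = [m.span() for m in pattern.finditer('foobar')]
print(spans)  # [(3, 3)]

# Substituting inserts '123' at that position without consuming any text.
single = pattern.sub('123', 'foobar')
print(single)  # foo123bar

# Every position preceded by 'foo' is replaced, not just the first one.
multiple = pattern.sub('123', 'foo foo')
print(multiple)  # foo123 foo123
```

Because nothing is captured, the replacement string needs no group reference, so the digit ambiguity never arises.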


There are of course situations where a capture group is needed, such as

re.sub(r'(f\w*o)', r'\g<1>123', 'foobar')

Here

re.sub(r'(?<=f\w*o)', r'123', 'foobar')

does not work because Python's standard re module does not support variable-length lookbehinds and raises re.error ("look-behind requires fixed-width pattern"). The third-party regex module on PyPI does support them.
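The difference between the two patterns can be checked with the standard library alone. This is a minimal sketch with illustrative variable names; it does not cover the PyPI regex module.

```python
import re

# A variable-length lookbehind is rejected when the pattern is compiled.
try:
    re.sub(r'(?<=f\w*o)', '123', 'foobar')
    lookbehind_failed = False
except re.error as exc:
    lookbehind_failed = True
    print('re.error:', exc)

# The capture-group version handles the variable-length prefix;
# \g<1> keeps the group reference separate from the digits after it.
captured = re.sub(r'(f\w*o)', r'\g<1>123', 'foobar')
print(captured)  # foo123bar
```

Here (f\w*o) backtracks until it ends on the last 'o', so it matches 'foo' and the result is the same as with the fixed-width lookbehind.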