The Python Oracle

Type of compiled regex object in Python

Full question
https://stackoverflow.com/questions/6102...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #regex #types



ACCEPTED ANSWER

Score 47


When the type of something isn't well specified, there's nothing wrong with using the type builtin to discover the answer at runtime:

>>> import re
>>> retype = type(re.compile('hello, world'))
>>> isinstance(re.compile('goodbye'), retype)
True
>>> isinstance(12, retype)
False

Discovering the type at runtime protects you from having to access private attributes and against future changes to the return type. There's nothing inelegant about using type here, though there may be something inelegant about wanting to know the type at all.

That said, with the passage of time, the context of this question has shifted. In Python 3.7 and later, the type returned by re.compile is available under the public name re.Pattern.

The general question of what to do when the type of something is not well specified is still valid, but in this particular case the type of re.compile(...) is now well specified.
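
As a minimal sketch of that last point, assuming Python 3.7 or later where re.Pattern exists (the helper name is_compiled_regex is made up for illustration), the check might look like this:

```python
import re


def is_compiled_regex(obj):
    """Return True if obj is a compiled regular expression."""
    # Assumption: Python 3.7+, where re.Pattern is the public name
    # of the type returned by re.compile.
    return isinstance(obj, re.Pattern)


print(is_compiled_regex(re.compile('hello, world')))
print(is_compiled_regex('hello, world'))
```

On older interpreters the runtime discovery shown earlier, type(re.compile('')), is probably the safer route.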




ANSWER 2

Score 21


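It is possible to check a compiled regular expression against the private name re._pattern_type:

import re
pattern = r'aa'
compiled_re = re.compile(pattern)
# Parenthesised print works on Python 2.7 and on Python 3
print(isinstance(compiled_re, re._pattern_type))

This prints True, at least in version 2.7. Because re._pattern_type is a private attribute, it is not guaranteed to exist in other versions; as far as I know it is no longer present from Python 3.7 on, where re.Pattern is the public name.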




ANSWER 3

Score 18


Disclaimer: This isn't intended as a direct answer for your specific needs, but rather something that may be useful as an alternative approach


You can keep with the ideals of duck typing, and use hasattr to determine if the object has certain properties that you want to utilize. For example, you could do something like:

if hasattr(possibly_a_re_object, "match"):  # Treat it like it's an re object
    possibly_a_re_object.match(thing_to_match_against)
else:
    pass  # alternative handler goes here
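
The same idea can be wrapped in a small function. This is only a sketch: the name matches and the fallback to plain equality are assumptions made for illustration, not something the question specifies.

```python
import re


def matches(candidate, text):
    """Hypothetical helper: test text against a string or a regex-like object."""
    match_method = getattr(candidate, "match", None)
    if callable(match_method):
        # Anything exposing a callable .match is treated like a compiled regex.
        return match_method(text) is not None
    # Assumed fallback: everything else is compared by plain equality.
    return candidate == text


print(matches(re.compile(r'a+'), 'aaa'))
print(matches('abc', 'abc'))
```

Using getattr with a default avoids a second attribute lookup, and the callable check guards against objects that merely carry an unrelated attribute named match.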



ANSWER 4

Score 11


Prevention is better than cure. Don't create such a heterogeneous list in the first place. Have a set of allowed strings and a list of compiled regex objects. This should make your checking code look better and run faster:

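# 'text' is the value being checked (renamed from 'input', which shadows a builtin)
ignored = True  # assumed default: ignore unless something allows the value
if text in allowed_strings:
    ignored = False
else:
    for allowed in allowed_regexed_objects:
        if allowed.match(text):
            ignored = False
            break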

If you can't avoid the creation of such a list, see if you have the opportunity to examine it once and build the two replacement objects.
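
One way to do that single pass is sketched below. The helper names split_allowed and is_allowed are hypothetical, and the sketch assumes every non-string entry in the mixed list is a compiled regex.

```python
import re


def split_allowed(mixed):
    """Hypothetical helper: split a mixed list into exact strings and regex objects."""
    allowed_strings = set()
    allowed_regexes = []
    for item in mixed:
        if isinstance(item, str):
            allowed_strings.add(item)
        else:
            # Assumption: anything that is not a string is a compiled regex.
            allowed_regexes.append(item)
    return allowed_strings, allowed_regexes


def is_allowed(text, allowed_strings, allowed_regexes):
    """Check the set first (fast lookup), then fall back to the regexes."""
    if text in allowed_strings:
        return True
    return any(rx.match(text) for rx in allowed_regexes)


strings, regexes = split_allowed(['foo', re.compile(r'ba.'), 'qux'])
print(is_allowed('foo', strings, regexes))
print(is_allowed('bar', strings, regexes))
```

The type check then happens once, when the two containers are built, rather than on every lookup.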