The Python Oracle

How to check if a string in Python is in ASCII?

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Over a Mysterious Island Looping

--

Chapters
00:00 How To Check If A String In Python Is In Ascii?
00:25 Answer 1 Score 288
00:59 Accepted Answer Score 233
01:09 Answer 3 Score 190
01:33 Answer 4 Score 176
01:55 Answer 5 Score 29
02:19 Thank you

--

Full question
https://stackoverflow.com/questions/1963...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #string #unicode #ascii

#avk47



ANSWER 1

Score 289


I think you are not asking the right question--

A string in python has no property corresponding to 'ascii', utf-8, or any other encoding. The source of your string (whether you read it from a file, input from a keyboard, etc.) may have encoded a unicode string in ascii to produce your string, but that's where you need to go for an answer.

Perhaps the question you can ask is: "Is this string the result of encoding a unicode string in ascii?" -- This you can answer by trying:

try:
    mystring.decode('ascii')
except UnicodeDecodeError:
    print "it was not a ascii-encoded unicode string"
else:
    print "It may have been an ascii-encoded unicode string"



ACCEPTED ANSWER

Score 234


def is_ascii(s):
    return all(ord(c) < 128 for c in s)



ANSWER 3

Score 190


In Python 3, we can encode the string as UTF-8, then check whether the length stays the same. If so, then the original string is ASCII.

def isascii(s):
    """Check if the characters in string s are in ASCII, U+0-U+7F."""
    return len(s) == len(s.encode())

To check, pass the test string:

>>> isascii("♥O◘♦♥O◘♦")
False
>>> isascii("Python")
True



ANSWER 4

Score 29


Vincent Marchetti has the right idea, but str.decode has been deprecated in Python 3. In Python 3 you can make the same test with str.encode:

try:
    mystring.encode('ascii')
except UnicodeEncodeError:
    pass  # string is not ascii
else:
    pass  # string is ascii

Note the exception you want to catch has also changed from UnicodeDecodeError to UnicodeEncodeError.