How to check if a string in Python is in ASCII?
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Forest of Spells Looping
--
Chapters
00:00 How To Check If A String In Python Is In Ascii?
00:28 Answer 1 Score 288
01:07 Accepted Answer Score 233
01:19 Answer 3 Score 190
01:46 Answer 4 Score 176
02:10 Answer 5 Score 29
02:36 Thank you
--
Full question
https://stackoverflow.com/questions/1963...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #string #unicode #ascii
#avk47
ANSWER 1
Score 289
I think you are not asking the right question--
A string in python has no property corresponding to 'ascii', utf-8, or any other encoding. The source of your string (whether you read it from a file, input from a keyboard, etc.) may have encoded a unicode string in ascii to produce your string, but that's where you need to go for an answer.
Perhaps the question you can ask is: "Is this string the result of encoding a unicode string in ascii?" -- This you can answer by trying:
try:
mystring.decode('ascii')
except UnicodeDecodeError:
print "it was not a ascii-encoded unicode string"
else:
print "It may have been an ascii-encoded unicode string"
ACCEPTED ANSWER
Score 234
def is_ascii(s):
return all(ord(c) < 128 for c in s)
ANSWER 3
Score 190
In Python 3, we can encode the string as UTF-8, then check whether the length stays the same. If so, then the original string is ASCII.
def isascii(s):
"""Check if the characters in string s are in ASCII, U+0-U+7F."""
return len(s) == len(s.encode())
To check, pass the test string:
>>> isascii("♥O◘♦♥O◘♦")
False
>>> isascii("Python")
True
ANSWER 4
Score 29
Vincent Marchetti has the right idea, but str.decode has been deprecated in Python 3. In Python 3 you can make the same test with str.encode:
try:
mystring.encode('ascii')
except UnicodeEncodeError:
pass # string is not ascii
else:
pass # string is ascii
Note the exception you want to catch has also changed from UnicodeDecodeError to UnicodeEncodeError.