The Python Oracle

When is `string.swapcase().swapcase()` not equal to `string`?

--

Full question
https://stackoverflow.com/questions/6156...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #python3x #string #characterencoding #character



ACCEPTED ANSWER

Score 8


A simple example would be:

s = "ß"

print(s.swapcase().swapcase())

Output:

ss

ß is the German lowercase sharp s (the "correct" uppercase version would be ẞ, U+1E9E). The round trip fails because the Unicode standard defines the uppercase mapping of ß as SS, as the comments in its SpecialCasing.txt data file explain:

# The data in this file, combined with
# the simple case mappings in UnicodeData.txt, defines the full case mappings
# Lowercase_Mapping (lc), Titlecase_Mapping (tc), and Uppercase_Mapping (uc).

...

# The entries in this file are in the following machine-readable format:
#
# <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>

...

# The German es-zed is special--the normal mapping is to SS.
# Note: the titlecase should never occur in practice. It is equal to titlecase(uppercase(<es-zed>))

00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S

(00DF is ß, 0053 is S, and 0073 is s)
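
To make the mapping concrete, here is a minimal sketch that walks through the two swapcase() calls one at a time. The variable names are only illustrative; the behaviour shown is that of Python 3 strings.

```python
s = "\u00df"  # U+00DF LATIN SMALL LETTER SHARP S

once = s.swapcase()      # lowercase -> uppercase uses the full mapping: "SS"
twice = once.swapcase()  # each "S" is lowercased on its own: "ss"

print(repr(once), len(once))    # 'SS' 2
print(repr(twice), len(twice))  # 'ss' 2
print(twice == s)               # False

# For this character swapcase() gives the same result as upper()
print(s.upper() == once)        # True
```

The first call turns one character into two, and nothing in the second call can know that those two characters came from a single ß.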




ANSWER 2

Score 6


In fact, there is a wide range of examples: it happens with some Greek, German, and Armenian letters, and with other special symbols.

To get them all:

def find_dif(s):
    # True when swapping case twice does not give back the original string
    return s.swapcase().swapcase() != s

[chr(cp) for cp in range(100000) if find_dif(chr(cp))]

and you get:

['µ', 'ß', 'İ', 'ı', 'ʼn', 'ſ', 'ǰ', 'ͅ', 'ΐ', 'ΰ', 'ς', 'ϐ', 'ϑ', 'ϕ', 'ϖ', 'ϰ', 'ϱ', 'ϴ', 'ϵ', 'և', 'ᲀ', 'ᲁ', 'ᲂ', 'ᲃ', 'ᲄ', 'ᲅ', 'ᲆ', 'ᲇ', 'ᲈ', 'ẖ', 'ẗ', 'ẘ', 'ẙ', 'ẚ', 'ẛ', 'ẞ', 'ὐ', 'ὒ', 'ὔ', 'ὖ', 'ᾀ', 'ᾁ', 'ᾂ', 'ᾃ', 'ᾄ', 'ᾅ', 'ᾆ', 'ᾇ', 'ᾐ', 'ᾑ', 'ᾒ', 'ᾓ', 'ᾔ', 'ᾕ', 'ᾖ', 'ᾗ', 'ᾠ', 'ᾡ', 'ᾢ', 'ᾣ', 'ᾤ', 'ᾥ', 'ᾦ', 'ᾧ', 'ᾲ', 'ᾳ', 'ᾴ', 'ᾶ', 'ᾷ', 'ι', 'ῂ', 'ῃ', 'ῄ', 'ῆ', 'ῇ', 'ῒ', 'ΐ', 'ῖ', 'ῗ', 'ῢ', 'ΰ', 'ῤ', 'ῦ', 'ῧ', 'ῲ', 'ῳ', 'ῴ', 'ῶ', 'ῷ', 'Ω', 'K', 'Å', 'ff', 'fi', 'fl', 'ffi', 'ffl', 'ſt', 'st', 'ﬓ', 'ﬔ', 'ﬕ', 'ﬖ', 'ﬗ']
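
The scan above stops at code point 99,999, so it cannot see anything beyond that. The following sketch covers every code point instead. The helper name breaks_round_trip is hypothetical (not from the original answer), and the exact result depends on the Unicode version bundled with the Python build, so no count is quoted here.

```python
import sys


def breaks_round_trip(ch: str) -> bool:
    """Return True if swapping case twice does not give back ch."""
    return ch.swapcase().swapcase() != ch


# sys.maxunicode is the largest valid code point (0x10FFFF in Python 3)
offenders = [
    chr(cp) for cp in range(sys.maxunicode + 1) if breaks_round_trip(chr(cp))
]

print(len(offenders))
print("\u00df" in offenders)  # sharp s is among them
print("a" in offenders)       # plain ASCII letters are not
```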

Let's check them out:

s1 = 'µ'
s2 = s1.swapcase().swapcase()

s1 == s2

False

s1 = 'ß'
s2 = s1.swapcase().swapcase()

s1 == s2

False

s1 = 'ﬗ'
s2 = s1.swapcase().swapcase()

s1 == s2

False
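
The three checks above fail for two different reasons, which becomes visible when each step is printed with its Unicode names. This is only a sketch: the describe helper is a made-up name, and the characters are written as escapes so they cannot be confused with look-alikes.

```python
import unicodedata


def describe(text: str) -> str:
    """Format each character of text as U+XXXX followed by its Unicode name."""
    return " + ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text
    )


samples = [
    "\u00b5",  # MICRO SIGN
    "\u00df",  # LATIN SMALL LETTER SHARP S
    "\ufb17",  # ARMENIAN SMALL LIGATURE MEN XEH
]

for original in samples:
    swapped_once = original.swapcase()
    swapped_twice = swapped_once.swapcase()
    print(describe(original))
    print("  once: ", describe(swapped_once))
    print("  twice:", describe(swapped_twice))
```

The micro sign stays a single character but comes back as a different one: it uppercases to GREEK CAPITAL LETTER MU, which lowercases to GREEK SMALL LETTER MU. The sharp s and the Armenian ligature instead expand to two characters on the first call and never collapse back.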