The Python Oracle

MySQL "incorrect string value" error when save unicode string in Django

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: The Builders

--

Chapters
00:00 Mysql &Quot;Incorrect String Value&Quot; Error When Save Unicode String In Django
01:55 Answer 1 Score 172
03:33 Answer 2 Score 127
04:01 Answer 3 Score 72
04:33 Answer 4 Score 28
04:50 Accepted Answer Score 8
05:18 Thank you

--

Full question
https://stackoverflow.com/questions/2108...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #mysql #django #unicode #utf8

#avk47



ANSWER 1

Score 172


None of these answers solved the problem for me. The root cause being:

You cannot store 4-byte characters in MySQL with the utf-8 character set.

MySQL has a 3 byte limit on utf-8 characters (yes, it's wack, nicely summed up by a Django developer here)

To solve this you need to:

  1. Change your MySQL database, table and columns to use the utf8mb4 character set (only available from MySQL 5.5 onwards)
  2. Specify the charset in your Django settings file as below:

settings.py

DATABASES = {
    'default': {
        'ENGINE':'django.db.backends.mysql',
        ...
        'OPTIONS': {'charset': 'utf8mb4'},
    }
}

Note: When recreating your database you may run into the 'Specified key was too long' issue.

The most likely cause is a CharField which has a max_length of 255 and some kind of index on it (e.g. unique). Because utf8mb4 uses 33% more space than utf-8 you'll need to make these fields 33% smaller.

In this case, change the max_length from 255 to 191.

Alternatively you can edit your MySQL configuration to remove this restriction but not without some django hackery

UPDATE: I just ran into this issue again and ended up switching to PostgreSQL because I was unable to reduce my VARCHAR to 191 characters.




ANSWER 2

Score 127


I had the same problem and resolved it by changing the character set of the column. Even though your database has a default character set of utf-8 I think it's possible for database columns to have a different character set in MySQL. Here's the SQL QUERY I used:

    ALTER TABLE database.table MODIFY COLUMN col VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;



ANSWER 3

Score 72


If you have this problem here's a python script to change all the columns of your mysql database automatically.

#! /usr/bin/env python
import MySQLdb

host = "localhost"
passwd = "passwd"
user = "youruser"
dbname = "yourdbname"

db = MySQLdb.connect(host=host, user=user, passwd=passwd, db=dbname)
cursor = db.cursor()

cursor.execute("ALTER DATABASE `%s` CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci'" % dbname)

sql = "SELECT DISTINCT(table_name) FROM information_schema.columns WHERE table_schema = '%s'" % dbname
cursor.execute(sql)

results = cursor.fetchall()
for row in results:
  sql = "ALTER TABLE `%s` convert to character set DEFAULT COLLATE DEFAULT" % (row[0])
  cursor.execute(sql)
db.close()



ACCEPTED ANSWER

Score 8


I just figured out one method to avoid above errors.

Save to database

user.first_name = u'Rytis'.encode('unicode_escape')
user.last_name = u'Slatkevičius'.encode('unicode_escape')
user.save()
>>> SUCCEED

print user.last_name
>>> Slatkevi\u010dius
print user.last_name.decode('unicode_escape')
>>> Slatkevičius

Is this the only method to save strings like that into a MySQL table and decode it before rendering to templates for display?