The Python Oracle

MySQL "incorrect string value" error when save unicode string in Django

Become part of the top 3% of the developers by applying to Toptal https://topt.al/25cXVn

--

Music by Eric Matyas
https://www.soundimage.org
Track title: Luau

--

Chapters
00:00 Question
02:30 Accepted answer (Score 8)
02:57 Answer 2 (Score 169)
04:43 Answer 3 (Score 125)
05:13 Answer 4 (Score 72)
05:50 Thank you

--

Full question
https://stackoverflow.com/questions/2108...

Question links:
http://www.postgresql.org/docs/8.1/inter...

Answer 1 links:
[3 byte limit on utf-8 characters]: http://dev.mysql.com/doc/refman/5.6/en/c...
[nicely summed up by a Django developer here]: https://code.djangoproject.com/ticket/18...
[utf8mb4 character set]: http://dev.mysql.com/doc/refman/5.5/en/c...
[Specified key was too long]: https://code.djangoproject.com/ticket/21...
[edit your MySQL configuration to remove this restriction]: https://code.djangoproject.com/ticket/18...
[switching to PostgreSQL]: https://wiki.postgresql.org/wiki/Things_...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #mysql #django #unicode #utf8

#avk47



ANSWER 1

Score 172


None of these answers solved the problem for me. The root cause being:

You cannot store 4-byte characters in MySQL with the utf-8 character set.

MySQL has a 3 byte limit on utf-8 characters (yes, it's wack, nicely summed up by a Django developer here)

To solve this you need to:

  1. Change your MySQL database, table and columns to use the utf8mb4 character set (only available from MySQL 5.5 onwards)
  2. Specify the charset in your Django settings file as below:

settings.py

DATABASES = {
    'default': {
        'ENGINE':'django.db.backends.mysql',
        ...
        'OPTIONS': {'charset': 'utf8mb4'},
    }
}

Note: When recreating your database you may run into the 'Specified key was too long' issue.

The most likely cause is a CharField which has a max_length of 255 and some kind of index on it (e.g. unique). Because utf8mb4 uses 33% more space than utf-8 you'll need to make these fields 33% smaller.

In this case, change the max_length from 255 to 191.

Alternatively you can edit your MySQL configuration to remove this restriction but not without some django hackery

UPDATE: I just ran into this issue again and ended up switching to PostgreSQL because I was unable to reduce my VARCHAR to 191 characters.




ANSWER 2

Score 127


I had the same problem and resolved it by changing the character set of the column. Even though your database has a default character set of utf-8 I think it's possible for database columns to have a different character set in MySQL. Here's the SQL QUERY I used:

    ALTER TABLE database.table MODIFY COLUMN col VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;



ANSWER 3

Score 72


If you have this problem here's a python script to change all the columns of your mysql database automatically.

#! /usr/bin/env python
import MySQLdb

host = "localhost"
passwd = "passwd"
user = "youruser"
dbname = "yourdbname"

db = MySQLdb.connect(host=host, user=user, passwd=passwd, db=dbname)
cursor = db.cursor()

cursor.execute("ALTER DATABASE `%s` CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci'" % dbname)

sql = "SELECT DISTINCT(table_name) FROM information_schema.columns WHERE table_schema = '%s'" % dbname
cursor.execute(sql)

results = cursor.fetchall()
for row in results:
  sql = "ALTER TABLE `%s` convert to character set DEFAULT COLLATE DEFAULT" % (row[0])
  cursor.execute(sql)
db.close()



ACCEPTED ANSWER

Score 8


I just figured out one method to avoid above errors.

Save to database

user.first_name = u'Rytis'.encode('unicode_escape')
user.last_name = u'Slatkevičius'.encode('unicode_escape')
user.save()
>>> SUCCEED

print user.last_name
>>> Slatkevi\u010dius
print user.last_name.decode('unicode_escape')
>>> Slatkevičius

Is this the only method to save strings like that into a MySQL table and decode it before rendering to templates for display?