The Python Oracle

Let JSON object accept bytes or let urlopen output strings

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Meditation

--

Chapters
00:00 Let Json Object Accept Bytes Or Let Urlopen Output Strings
00:49 Accepted Answer Score 81
01:24 Answer 2 Score 103
01:41 Answer 3 Score 67
01:55 Answer 4 Score 20
02:05 Thank you

--

Full question
https://stackoverflow.com/questions/6862...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #json #python3x #encoding #urlopen

#avk47



ANSWER 1

Score 103


Python’s wonderful standard library to the rescue…

import codecs

reader = codecs.getreader("utf-8")
obj = json.load(reader(response))

Works with both py2 and py3.

Docs: Python 2, Python3




ACCEPTED ANSWER

Score 81


HTTP sends bytes. If the resource in question is text, the character encoding is normally specified, either by the Content-Type HTTP header or by another mechanism (an RFC, HTML meta http-equiv,...).

urllib should know how to encode the bytes to a string, but it's too naïve—it's a horribly underpowered and un-Pythonic library.

Dive Into Python 3 provides an overview about the situation.

Your "work-around" is fine—although it feels wrong, it's the correct way to do it.




ANSWER 3

Score 67


I have come to opinion that the question is the best answer :)

import json
from urllib.request import urlopen

response = urlopen("site.com/api/foo/bar").read().decode('utf8')
obj = json.loads(response)



ANSWER 4

Score 20


For anyone else trying to solve this using the requests library:

import json
import requests

r = requests.get('http://localhost/index.json')
r.raise_for_status()
# works for Python2 and Python3
json.loads(r.content.decode('utf-8'))