The Python Oracle

Given a URL to a text file, what is the simplest way to read the contents of the text file?

Full question
https://stackoverflow.com/questions/1393...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python



ACCEPTED ANSWER

Score 160


Edit 09/2016: In Python 3 and later, use urllib.request instead of urllib2.

Actually the simplest way is:

import urllib2  # the lib that handles the url stuff

data = urllib2.urlopen(target_url)  # a file-like object that works much like a file
for line in data:  # file-like objects are iterable, one line at a time
    print line

You don't even need "readlines", as Will suggested. You could even shorten it to this (* a Python 3 version is given at the end of this answer):

import urllib2

for line in urllib2.urlopen(target_url):
    print line

But remember: in Python, readability matters.

This is the simplest way, but not the safest one. In network programming you usually cannot be sure the server will send only as much data as you expect. It is generally better to read a fixed, reasonable amount of data: enough for what you expect, but capped so that your script cannot be flooded:

import urllib2

data = urllib2.urlopen("http://www.google.com").read(20000)  # read at most 20,000 bytes
data = data.split("\n") # then split it into lines

for line in data:
    print line
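
For readers on Python 3, the same capped read might look roughly like the sketch below. The function name read_limited, its max_bytes and encoding parameters, and the choice of errors="replace" are assumptions made for illustration; they are not part of the original answer.

```python
from urllib.request import urlopen


def read_limited(url, max_bytes=20000, encoding="utf-8"):
    """Return at most max_bytes of the resource at url, split into lines.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    with urlopen(url) as response:
        raw = response.read(max_bytes)  # bytes object, never longer than max_bytes
    # errors="replace" avoids an exception if the cap cuts a multi-byte character in half
    return raw.decode(encoding, errors="replace").splitlines()
```

Calling read_limited(target_url) returns a list of strings without trailing newlines. If the resource is larger than the cap, the last line may be cut short.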

* Second example in Python 3:

import urllib.request  # the lib that handles the url stuff

for line in urllib.request.urlopen(target_url):
    print(line.decode('utf-8'))  # utf-8, iso8859-1, or whatever encoding the page uses
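
In Python 3 each line arrives as bytes and still carries its line ending, so printing it directly produces blank lines between the output lines. A minimal sketch of a generator that decodes each line and strips the line ending follows; the name iter_lines and the default encoding are assumptions for illustration, not something the original answer defines.

```python
from urllib.request import urlopen


def iter_lines(url, encoding="utf-8"):
    """Yield the lines of the resource at url as str, without line endings.

    Hypothetical helper: the name and default encoding are illustrative only.
    """
    with urlopen(url) as response:
        for raw_line in response:  # the response yields one bytes line at a time
            yield raw_line.decode(encoding).rstrip("\r\n")
```

It can be used as "for line in iter_lines(target_url): print(line)". Because it is a generator, the connection stays open until the loop finishes or the generator is closed.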



ANSWER 2

Score 69


I'm a newbie to Python, and the offhand comment about Python 3 in the accepted solution was confusing. For posterity, the code to do this in Python 3 is:

import urllib.request
data = urllib.request.urlopen(target_url)

for line in data:
    ...

or alternatively

from urllib.request import urlopen
data = urlopen(target_url)

Note that just "import urllib" does not work, because importing the urllib package does not automatically import its urllib.request submodule.




ANSWER 3

Score 29


There's really no need to read line by line. You can get the whole thing like this (Python 2; in Python 3 the function is urllib.request.urlopen):

import urllib
txt = urllib.urlopen(target_url).read()
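
A rough Python 3 equivalent is sketched below. It also decodes the bytes into text, using the charset declared in the response headers when there is one. The function name fetch_text and the UTF-8 fallback are assumptions for illustration.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Return the whole resource at url as a single str.

    Hypothetical helper: the name and the fallback encoding are illustrative only.
    """
    with urlopen(url) as response:
        # Use the charset from the Content-Type header if the server declares one
        charset = response.headers.get_content_charset() or default_encoding
        return response.read().decode(charset)
```

As the accepted answer points out, an unbounded read() trusts the server to send a reasonable amount of data, so this form suits sources you control or know to be small.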



ANSWER 4

Score 13


import urllib2
for line in urllib2.urlopen("http://www.myhost.com/SomeFile.txt"):
    print line