The Python Oracle

urllib.request.urlopen not accepting query string with spaces



Full question
https://stackoverflow.com/questions/4121...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #python3x



ACCEPTED ANSWER

Score 6


urllib accepts it, the server doesn't. And well it should not, because a space is not a valid URL character.

Escape your query string properly with urllib.parse.quote_plus(); it'll ensure your string is valid for use in query parameters. Or better still, use the urllib.parse.urlencode() function to encode all key-value pairs:

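import urllib.request
from urllib.parse import urlencode

# text_to_read is the string from the question; urlencode() escapes spaces and reserved characters
params = urlencode({'q': text_to_read})
connection = urllib.request.urlopen(f"http://www.wdylike.appspot.com/?{params}")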
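
To compare what each helper mentioned above produces, here is a small sketch. The sample string is an assumption chosen for illustration and does not come from the question.

```python
from urllib.parse import quote, quote_plus, urlencode

sample = "hello world & more"  # hypothetical input, not from the question

# quote() percent-encodes the space; quote_plus() and urlencode() use "+" instead.
print(quote(sample))             # hello%20world%20%26%20more
print(quote_plus(sample))        # hello+world+%26+more
print(urlencode({"q": sample}))  # q=hello+world+%26+more
```

All three results are safe to place in a query string; urlencode() additionally builds the key=value pairs for you.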



ANSWER 2

Score 5


The response below is for Python 3.*. A 400 Bad Request error occurs when the input text contains a space. To avoid it, use urllib.parse, so import it:

from urllib import request, parse

If you are sending any text along with the URL, quote the text first:

url = "http://www.wdylike.appspot.com/?q="
url = url + parse.quote(input_to_check) 

Check the explanation here - https://discussions.udacity.com/t/problem-in-profanity-with-python-3-solved/227328

The Udacity profanity checker program -

from urllib import request, parse

def read_file():
    fhand = open(r"E:\Python_Programming\Udacity\movie_quotes.txt")
    file_content = fhand.read()
    #print (file_content)
    fhand.close()
    profanity_check(file_content)

def profanity_check(input_to_check):
    url = "http://www.wdylike.appspot.com/?q="
    url = url + parse.quote(input_to_check)
    req = request.urlopen(url)
    answer = req.read()
    #print(answer)
    req.close()

    if b"true" in answer:
        print ("Profanity Alert!!!")
    else:
        print ("Nothing to worry")


read_file()
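
A possible tidier variant of the program above, offered as a sketch rather than the course's reference solution. It uses context managers so the file and the connection are closed even if an error occurs, and separates URL building so it can be checked without network access. The function names and the UTF-8 file encoding are assumptions.

```python
from urllib import parse, request

BASE_URL = "http://www.wdylike.appspot.com/?q="  # service URL used in the answers


def build_url(text):
    """Return the request URL with the text percent-encoded."""
    return BASE_URL + parse.quote(text)


def contains_profanity(text):
    """Ask the service about the text (requires network access)."""
    with request.urlopen(build_url(text)) as response:
        return b"true" in response.read()


def check_file(path):
    """Read a text file (assumed to be UTF-8) and print the verdict."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    if contains_profanity(text):
        print("Profanity Alert!!!")
    else:
        print("Nothing to worry")
```

Calling build_url("hello world") returns the URL with the space encoded as %20, which is the form the server accepts.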



ANSWER 3

Score 1


I think this code is closer to what the lesson was aiming for, illustrating the difference between built-in functions, classes, and functions inside classes:

from urllib import request, parse

def read_text():
    quotes = open('C:/Users/Alejandro/Desktop/movie_quotes.txt', 'r+')
    contents_of_file = quotes.read()
    print(contents_of_file)
    check_profanity(contents_of_file)
    quotes.close()

def check_profanity(text_to_check):
    connection = request.urlopen('http://www.wdylike.appspot.com/?q=' + parse.quote(text_to_check))
    output = connection.read()
    # print(output)
    connection.close()

    if b"true" in output:
        print("Profanity Alert!!!")
    elif b"false" in output:
        print("This document has no curse words!")
    else:
        print("Could not scan the document properly")

read_text()
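
The third branch above only covers an unexpected response body. If the request itself fails, for example with the 400 Bad Request from the question, urlopen raises an exception instead. A minimal sketch of handling that case follows; the function names and the opener parameter (there only so the function can be exercised without a network) are assumptions.

```python
from urllib import parse, request
from urllib.error import HTTPError, URLError


def fetch_verdict(text, opener=request.urlopen):
    """Return the raw response bytes, or None if the request failed."""
    url = "http://www.wdylike.appspot.com/?q=" + parse.quote(text)
    try:
        with opener(url) as response:
            return response.read()
    except HTTPError as err:
        # The server answered with an error status such as 400 Bad Request.
        # HTTPError is a subclass of URLError, so it must be caught first.
        print(f"Server rejected the request: {err.code} {err.reason}")
    except URLError as err:
        # No usable answer at all: DNS failure, refused connection, and so on.
        print(f"Could not reach the server: {err.reason}")
    return None


def describe(output):
    """Map the response bytes to the three messages used in the answer."""
    if output is not None and b"true" in output:
        return "Profanity Alert!!!"
    if output is not None and b"false" in output:
        return "This document has no curse words!"
    return "Could not scan the document properly"
```

With this split, describe(fetch_verdict(text)) gives the same three outcomes as the answer's code while also reporting failed requests.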



ANSWER 4

Score 0


I'm working on the same project, also using Python 3 like most people.

While looking for the solution in Python 3, I found this HowTo, and I decided to give it a try.

It seems that on some websites, including Google, connections made from code (for example, via the urllib module) sometimes do not work properly. Apparently this has to do with the User-Agent, which the website receives when the connection is established.

I did some further research and came up with the following solution:

First I imported URLopener from urllib.request and created a class called ForceOpen as a subclass of URLopener.

Now I could set a "regular" User-Agent by setting the variable version inside the ForceOpen class. Then I just created an instance of it and used its open method in place of urlopen to open the URL.

(It works fine, but I'd still appreciate comments, suggestions, or any feedback, because I'm not absolutely sure whether this is a good alternative. Many thanks.)


from urllib.request import URLopener


class ForceOpen(URLopener):  # create a subclass of URLopener
    version = "Mozilla/5.0 (cmp; Konqueror ...)(Kubuntu)"

force_open = ForceOpen()  # create an instance of it


def read_text():
    quotes = open(
        "/.../profanity_editor/data/quotes.txt"
    )
    contents_of_file = quotes.read()
    print(contents_of_file)
    quotes.close()
    check_profanity(contents_of_file)


def check_profanity(text_to_check):
    # now use the open method to open the URL
    connection = force_open.open(
        "http://www.wdylike.appspot.com/?q=" + text_to_check
    )
    output = connection.read()
    connection.close()

    if b"true" in output:
        print("Attention! Curse word(s) have been detected.")

    elif b"false" in output:
        print("No curse word(s) found.")

    else:
        print("Error! Unable to scan document.")


read_text()
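
As one possible alternative to subclassing URLopener, which has been deprecated since Python 3.3, the User-Agent can be set on a urllib.request.Request object and passed to urlopen. This is a sketch, not the answerer's method: the User-Agent string and the function names are assumptions, and the query is encoded with urlencode as the accepted answer suggests.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical User-Agent string, chosen only for illustration.
USER_AGENT = "Mozilla/5.0 (compatible; profanity-check-sketch)"


def build_request(text):
    """Build a request with an encoded query and a custom User-Agent header."""
    query = urlencode({"q": text})
    return Request(
        "http://www.wdylike.appspot.com/?" + query,
        headers={"User-Agent": USER_AGENT},
    )


def check_profanity(text):
    """Send the request and return the raw response bytes (requires network access)."""
    with urlopen(build_request(text)) as connection:
        return connection.read()
```

The bytes returned by check_profanity can be tested with b"true" in output exactly as in the code above.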