If I do
url = "http://example.com?p=" + urllib.quote(query)
- It doesn’t encode
/to%2F(breaks OAuth normalization) - It doesn’t handle Unicode (it throws an exception)
Is there a better library?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
Python 2
From the documentation:
urllib.quote(string[, safe])
Replace special characters in string
using the %xx escape. Letters, digits,
and the characters ‘_.-‘ are never
quoted. By default, this function is
intended for quoting the path section
of the URL.The optional safe parameter
specifies additional characters that
should not be quoted — its default
value is ‘/’
That means passing '' for safe will solve your first issue:
>>> urllib.quote('/test')
'/test'
>>> urllib.quote('/test', safe='')
'%2Ftest'
About the second issue, there is a bug report about it. Apparently it was fixed in Python 3. You can workaround it by encoding as UTF-8 like this:
>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller
By the way, have a look at urlencode.
Python 3
In Python 3, the function quote has been moved to urllib.parse:
>>> import urllib.parse
>>> print(urllib.parse.quote("Müller".encode('utf8')))
M%C3%BCller
>>> print(urllib.parse.unquote("M%C3%BCller"))
Müller
Method 2
In Python 3, urllib.quote has been moved to urllib.parse.quote, and it does handle Unicode by default.
>>> from urllib.parse import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
>>> quote('/El Niño/')
'/El%20Ni%C3%B1o/'
Method 3
I think module requests is much better. It’s based on urllib3.
You can try this:
>>> from requests.utils import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
My answer is similar to Paolo’s answer.
Method 4
If you’re using Django, you can use urlquote:
>>> from django.utils.http import urlquote >>> urlquote(u"Müller") u'M%C3%BCller'
Note that changes to Python mean that this is now a legacy wrapper. From the Django 2.1 source code for django.utils.http:
A legacy compatibility wrapper to Python's urllib.parse.quote() function.
(was used for unicode handling on Python 2)
Method 5
It is better to use urlencode here. There isn’t much difference for a single parameter, but, IMHO, it makes the code clearer. (It looks confusing to see a function quote_plus! – especially those coming from other languages.)
In [21]: query='lskdfj/sdfkjdf/ksdfj skfj'
In [22]: val=34
In [23]: from urllib.parse import urlencode
In [24]: encoded = urlencode(dict(p=query,val=val))
In [25]: print(f"http://example.com?{encoded}")
http://example.com?p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34
Documentation
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0