I open URLs with:
site = urllib2.urlopen('http://google.com')
What I want to do is connect the same way, but through a proxy.
I found a suggestion somewhere to use:
site = urllib2.urlopen('http://google.com', proxies={'http':'127.0.0.1'})
but that didn’t work either.
I know urllib2 has something like a proxy handler, but I can’t recall that function.
Answers:
Method 1
import urllib2
proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
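urllib2 exists only in Python 2. On Python 3 the same handlers live in urllib.request, so the pattern might look like the sketch below. The port 8080 and the helper name make_proxy_opener are assumptions for illustration and are not part of the original answer.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that routes plain-HTTP requests through proxy_url.

    make_proxy_opener is a hypothetical helper name used for illustration.
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Assumed proxy address; replace with a real proxy before making requests.
opener = make_proxy_opener("http://127.0.0.1:8080")

# Optional: make urllib.request.urlopen() use this opener process-wide.
urllib.request.install_opener(opener)

# No request is sent here; uncomment once a proxy is actually listening.
# print(urllib.request.urlopen("http://www.google.com").status)
```

Calling opener.open(url) directly instead of install_opener() keeps the proxy setting local to that opener.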
Method 2
You have to install a ProxyHandler:
import urllib2

urllib2.install_opener(
    urllib2.build_opener(
        urllib2.ProxyHandler({'http': '127.0.0.1'})
    )
)
urllib2.urlopen('http://www.google.com')
Method 3
You can set proxies using environment variables.
import os
os.environ['http_proxy'] = '127.0.0.1'
os.environ['https_proxy'] = '127.0.0.1'
urllib2 then adds the proxy handlers automatically. Proxies must be set for each protocol separately: a request whose scheme has no proxy entry does not raise an error, it simply bypasses the proxy.
For example:
proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
# the next request does not go through the proxy: no 'https' entry was configured
urllib2.urlopen('https://www.google.com')
Instead:
proxy = urllib2.ProxyHandler({
    'http': '127.0.0.1',
    'https': '127.0.0.1'
})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
# this way both http and https requests go through the proxy
urllib2.urlopen('http://www.google.com')
urllib2.urlopen('https://www.google.com')
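Because a missing scheme bypasses the proxy silently, it can help to check the mapping before any request is made. Below is a minimal Python 3 sketch (urllib.request replaces urllib2 there). The helper name unproxied_schemes and the proxy address with port 8080 are assumptions for illustration.

```python
import urllib.request


def unproxied_schemes(proxies, wanted=("http", "https")):
    """Return the schemes in `wanted` that have no entry in `proxies`.

    Requests for these schemes would bypass the proxy rather than raise.
    """
    configured = {scheme.lower() for scheme in proxies}
    return [scheme for scheme in wanted if scheme not in configured]


# Assumed proxy address, for illustration only.
PROXY = "http://127.0.0.1:8080"

http_only = {"http": PROXY}
both = {"http": PROXY, "https": PROXY}

print(unproxied_schemes(http_only))  # ['https']
print(unproxied_schemes(both))       # []

# Build the opener only once every scheme is covered.
opener = urllib.request.build_opener(urllib.request.ProxyHandler(both))
```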
Method 4
To use the default system proxies (e.g. from the http_proxy environment variable), the following works for a single opener, without installing it into urllib2 globally:
import urllib2

url = 'http://www.example.com/'
proxy = urllib2.ProxyHandler()
opener = urllib2.build_opener(proxy)
in_ = opener.open(url)
in_.read()
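In Python 3 the same idea can be sketched with urllib.request: a ProxyHandler created without arguments uses whatever urllib.request.getproxies() reports, which includes the http_proxy and https_proxy environment variables. The proxy address below is an assumed placeholder, and no request is actually sent.

```python
import os
import urllib.request

# Assumed placeholder proxy; set before the handler is created.
os.environ["http_proxy"] = "http://127.0.0.1:8080"
os.environ["https_proxy"] = "http://127.0.0.1:8080"

# What urllib detects from the environment (scheme -> proxy URL).
detected = urllib.request.getproxies_environment()
print(detected.get("http"))

# With no arguments, ProxyHandler uses the detected system proxies.
handler = urllib.request.ProxyHandler()
opener = urllib.request.build_opener(handler)

# opener.open(url) would use the proxy for this opener only;
# urllib.request.urlopen() is unaffected because nothing is installed.
```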
Method 5
In addition to the accepted answer:
My script gave me an error:
File "c:\Python23\lib\urllib2.py", line 580, in proxy_open
if '@' in host:
TypeError: iterable argument required
The solution was to add http:// in front of the proxy string:
proxy = urllib2.ProxyHandler({'http': 'http://proxy.xy.z:8080'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
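A small guard can avoid this class of error by making sure the proxy string carries a scheme before it reaches ProxyHandler. The sketch below targets Python 3 (urllib.request). The helper name with_scheme is hypothetical, and defaulting to http:// is an assumption that suits plain HTTP proxies only.

```python
import urllib.request


def with_scheme(proxy, default_scheme="http"):
    """Return proxy unchanged if it has a scheme, else prepend default_scheme."""
    if "://" in proxy:
        return proxy
    return f"{default_scheme}://{proxy}"


# proxy.xy.z:8080 is the placeholder host from the answer above.
proxy_url = with_scheme("proxy.xy.z:8080")
print(proxy_url)  # http://proxy.xy.z:8080

handler = urllib.request.ProxyHandler({"http": proxy_url})
opener = urllib.request.build_opener(handler)
```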
Method 6
You can also use the requests library to access a web page through a proxy. Python 3 code:
>>> import requests
>>> url = 'http://www.google.com'
>>> proxy = '169.50.87.252:80'
>>> requests.get(url, proxies={"http":proxy})
<Response [200]>
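A different proxy can also be set for each scheme. The keys of the proxies dict are schemes, so repeating the "http" key would keep only the last value.
>>> proxy1 = '169.50.87.252:80'
>>> proxy2 = '89.34.97.132:8080'
>>> requests.get(url, proxies={"http":proxy1,"https":proxy2})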
<Response [200]>
Method 7
In addition, on Windows you can set the proxy from the command line.
Open a command prompt where you want to run your script:
netsh winhttp set proxy YourProxySERVER:yourProxyPORT
Then run your script in that terminal.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.