I familiar with the fact that I should set the HTTP_RPOXY environment variable to the proxy address.
Generally urllib works fine, the problem is dealing with urllib2.
>>> urllib2.urlopen("http://www.google.com").read()
returns
urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>
or
urllib2.URLError: <urlopen error [Errno 11004] getaddrinfo failed>
Extra info:
urllib.urlopen(….) works fine! It is just urllib2 that is playing tricks…
I tried @Fenikso answer but I’m getting this error now:
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
Any ideas?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can do it even without the HTTP_PROXY environment variable. Try this sample:
import urllib2
proxy_support = urllib2.ProxyHandler({"http":"http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)
html = urllib2.urlopen("http://www.google.com").read()
print html
In your case it really seems that the proxy server is refusing the connection.
Something more to try:
import urllib2
#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"
proxies = {"http":"http://%s" % proxy}
url = "http://www.google.com/search?q=test"
headers={'User-agent' : 'Mozilla/5.0'}
proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)
req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
Edit 2014:
This seems to be a popular question / answer. However today I would use third party requests module instead.
For one request just do:
import requests
r = requests.get("http://www.google.com",
proxies={"http": "http://61.233.25.166:80"})
print(r.text)
For multiple requests use Session object so you do not have to add proxies parameter in all your requests:
import requests
s = requests.Session()
s.proxies = {"http": "http://61.233.25.166:80"}
r = s.get("http://www.google.com")
print(r.text)
Method 2
I recommend you just use the requests module.
It is much easier than the built in http clients:
http://docs.python-requests.org/en/latest/index.html
Sample usage:
r = requests.get('http://www.thepage.com', proxies={"http":"http://myproxy:3129"})
thedata = r.content
Method 3
Just wanted to mention, that you also may have to set the https_proxy OS environment variable in case https URLs need to be accessed.
In my case it was not obvious to me and I tried for hours to discover this.
My use case: Win 7, jython-standalone-2.5.3.jar, setuptools installation via ez_setup.py
Method 4
Python 3:
import urllib.request
htmlsource = urllib.request.FancyURLopener({"http":"http://127.0.0.1:8080"}).open(url).read().decode("utf-8")
Method 5
I encountered this on jython client.
The server was only talking TLS and the client using SSL context.
javax.net.ssl.SSLContext.getInstance("SSL")
Once the client was to TLS, things started working.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0