How can I open a website with urllib via proxy in Python?

I have a program that checks a website, and I want to know how I can check it via a proxy in Python.

This is the code, just as an example:

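import time
import urllib  # Python 2; on Python 3, urlopen lives in urllib.request

# website is the URL to check, defined elsewhere in the program
while True:
    try:
        h = urllib.urlopen(website)
        break
    except IOError:  # a bare except would also swallow Ctrl-C
        print '['+time.strftime('%Y/%m/%d %H:%M:%S')+'] '+'ERROR. Trying again in a few seconds...'
        time.sleep(5)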

Answers:

Method 1

By default, urlopen uses the environment variable http_proxy to determine which HTTP proxy to use:

$ export http_proxy='http://myproxy.example.com:1234'
$ python myscript.py  # Using http://myproxy.example.com:1234 as a proxy
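To check which proxy settings Python would pick up from the environment, a small sketch like the following may help. It uses Python 3's urllib.request.getproxies_environment(); the proxy address is the placeholder from the shell example and is set inside the script purely for illustration.

```python
import os
import urllib.request

# Placeholder proxy from the shell example; normally the shell exports this.
os.environ["http_proxy"] = "http://myproxy.example.com:1234"

# Collects every <scheme>_proxy environment variable into a dict keyed by scheme.
env_proxies = urllib.request.getproxies_environment()
print("HTTP proxy from environment: %s" % env_proxies.get("http"))
```

Printing this dict before calling urlopen is a quick way to confirm that the exported variable is actually visible to the script.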

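If you instead want to specify a proxy inside your application, you can pass a proxies argument to urlopen (this applies to Python 2's urllib.urlopen; Python 3's urllib.request.urlopen has no such argument):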

import urllib  # Python 2

proxies = {'http': 'http://myproxy.example.com:1234'}
print("Using HTTP proxy %s" % proxies['http'])
urllib.urlopen("http://www.google.com", proxies=proxies)
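On Python 3, one way to choose a proxy for a single request is Request.set_proxy. The sketch below only builds the request and does not touch the network; request_via_proxy is a hypothetical helper name, and the proxy host is the placeholder used above.

```python
import urllib.request


def request_via_proxy(url, proxy_hostport):
    """Build a Request for url that will be sent through proxy_hostport."""
    request = urllib.request.Request(url)
    # set_proxy(host, type) points the request at the proxy server;
    # for an http URL the full URL becomes the request selector.
    request.set_proxy(proxy_hostport, "http")
    return request


request = request_via_proxy("http://www.google.com/", "myproxy.example.com:1234")
print("Connecting to %s for %s" % (request.host, request.selector))
# urllib.request.urlopen(request) would then perform the fetch through the proxy.
```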

Edit: If I understand your comments correctly, you want to try several proxies and print each proxy as you try it. How about something like this?

import time
import urllib  # Python 2

candidate_proxies = ['http://proxy1.example.com:1234',
                     'http://proxy2.example.com:1234',
                     'http://proxy3.example.com:1234']
for proxy in candidate_proxies:
    print("Trying HTTP proxy %s" % proxy)
    try:
        result = urllib.urlopen("http://www.google.com", proxies={'http': proxy})
        print("Got URL using proxy %s" % proxy)
        break
    except IOError:  # urlopen raises IOError when the connection fails
        print("Trying next proxy in 5 seconds")
        time.sleep(5)
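The same loop can be sketched for Python 3, where urlopen takes no proxies argument and a ProxyHandler-based opener is used instead. The names make_proxy_opener and fetch_with_fallback, and the delay and timeout defaults, are assumptions for illustration rather than part of any library.

```python
import time
import urllib.request


def make_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_with_fallback(url, candidate_proxies, delay=5, timeout=10,
                        make_opener=make_proxy_opener):
    """Try each proxy in order; return (proxy, body) for the first that works."""
    for proxy in candidate_proxies:
        print("Trying HTTP proxy %s" % proxy)
        opener = make_opener(proxy)
        try:
            with opener.open(url, timeout=timeout) as response:
                body = response.read()
        except OSError as exc:  # URLError and socket timeouts derive from OSError
            print("Proxy %s failed (%s); trying next proxy in %s seconds"
                  % (proxy, exc, delay))
            time.sleep(delay)
            continue
        print("Got URL using proxy %s" % proxy)
        return proxy, body
    raise RuntimeError("none of the candidate proxies could fetch %s" % url)
```

Calling fetch_with_fallback("http://www.google.com", candidate_proxies) with the list above would try the three proxies in turn. Building a separate opener per proxy avoids install_opener, so the process-wide default opener is left unchanged.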

Method 2

Python 3 is slightly different here. It tries to auto-detect proxy settings, but if you need specific or manual proxy settings, consider code like this:

#!/usr/bin/env python3
import urllib.request

proxy_support = urllib.request.ProxyHandler({'http' : 'http://user:pass@server:port',
                                             'https': 'https://...'})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

with urllib.request.urlopen(url) as response:
    html = response.read()  # ... then process the response body as needed

Refer also to the urllib.request section of the Python 3 docs.

Method 3

Here is example code showing how to use urllib to connect via a proxy:

import urllib.request

authinfo = urllib.request.HTTPBasicAuthHandler()

proxy_support = urllib.request.ProxyHandler({"http" : "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib.request.build_opener(proxy_support, authinfo,
                                     urllib.request.CacheFTPHandler)

# install it
urllib.request.install_opener(opener)

f = urllib.request.urlopen('http://www.google.com/')

Method 4

For http and https use:

proxies = {'http':'http://proxy-source-ip:proxy-port',
           'https':'https://proxy-source-ip:proxy-port'}

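Proxies for other schemes can be added in the same way. Note that a dict holds only one entry per scheme, so repeating the 'http' key would just overwrite the earlier proxy:

proxies = {'http':'http://proxy1-source-ip:proxy-port',
           'ftp':'http://proxy2-source-ip:proxy-port',
           # ...
          }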

Usage:

filehandle = urllib.urlopen(external_url, proxies=proxies)

To bypass proxies entirely (for example, for links within your network), pass an empty dict:

filehandle = urllib.urlopen(external_url, proxies={})
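On Python 3 the same effect can be sketched with a ProxyHandler that is given an empty dict, which disables auto-detected proxies. The names no_proxy_handler and direct_opener are illustrative.

```python
import urllib.request

# An empty dict tells ProxyHandler to use no proxies at all,
# overriding any auto-detected or environment-provided settings.
no_proxy_handler = urllib.request.ProxyHandler({})
direct_opener = urllib.request.build_opener(no_proxy_handler)

# direct_opener.open(external_url) would then connect without a proxy;
# external_url is the same placeholder used in the snippets above.
```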

To authenticate with the proxy, include the username and password in the proxy URL:

proxies = {'http':'http://username:password@proxy-source-ip:proxy-port',
           'https':'https://username:password@proxy-source-ip:proxy-port'}

Note: avoid special characters such as : and @ in usernames and passwords, since they also act as separators in the proxy URL.
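If a username or password does contain such characters, one option is to percent-encode them before building the proxy URL. This is a minimal sketch, assuming the URL is consumed by Python 3's urllib.request.ProxyHandler, which as far as I know unquotes the credentials before sending them. build_proxy_url and the example host and port are hypothetical names for illustration.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including : and @
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return "%s://%s:%s@%s:%d" % (scheme, user, secret, host, port)


proxy_url = build_proxy_url("user", "p@ss:word", "proxy.example.com", 3128)
print(proxy_url)
```

The resulting string can be used as a value in a proxies dict. Whether a given proxy client decodes the credentials again should be checked before relying on this.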

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
