Making HTTP HEAD request with urllib2 from Python 2

I’m trying to do a HEAD request of a page using Python 2.

I am trying

import misc_urllib2
.....
opender = urllib2.build_opener([misc_urllib2.MyHTTPRedirectHandler(), misc_urllib2.HeadRequest()])

with misc_urllib2.py containing

class HeadRequest(urllib2.Request):
    def get_method(self):
        return "HEAD"


class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def __init__ (self):
        self.redirects = []

    def http_error_301(self, req, fp, code, msg, headers):  
        result = urllib2.HTTPRedirectHandler.http_error_301(
                self, req, fp, code, msg, headers)
        result.redirect_code = code
        return result

    http_error_302 = http_error_303 = http_error_307 = http_error_301

But I am getting

TypeError: __init__() takes at least 2 arguments (1 given)

If I just do

opender = urllib2.build_opener(misc_urllib2.MyHTTPRedirectHandler())

then it works fine

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

This works just fine:

import urllib2
request = urllib2.Request('http://localhost:8080')
request.get_method = lambda : 'HEAD'

response = urllib2.urlopen(request)
print response.info()

Tested with quick and dirty HTTPd hacked in python:

Server: BaseHTTP/0.3 Python/2.6.6
Date: Sun, 12 Dec 2010 11:52:33 GMT
Content-type: text/html
X-REQUEST_METHOD: HEAD

I’ve added a custom header field X-REQUEST_METHOD to show it works 🙂

Here is HTTPd log:

Sun Dec 12 12:52:28 2010 Server Starts - localhost:8080
localhost.localdomain - - [12/Dec/2010 12:52:33] "HEAD / HTTP/1.1" 200 -

Edit: there is also httplib2

import httplib2
h = httplib2.Http()
resp = h.request("http://www.google.com", 'HEAD')

Method 2

Try httplib

>>> import httplib
>>> conn = httplib.HTTPConnection("www.google.com")
>>> conn.request("HEAD", "/index.html")
>>> res = conn.getresponse()
>>> print res.status, res.reason
200 OK
>>> print res.getheaders()
[('content-length', '0'), ('expires', '-1'), ('server', 'gws'), ('cache-control', 'private, max-age=0'), ('date', 'Sat, 20 Sep 2008 06:43:36 GMT'), ('content-type', 'text/html; charset=ISO-8859-1')]

See How do you send a HEAD HTTP request in Python 2?

Method 3

The problem lies with your class HeadRequest, which inherits from urllib2.Request. According to doc, urllib2.Request.__init__ signature is

 __init__(self, url, data=None, headers={}, origin_req_host=None, unverifiable=False)

so you must pass an url argument to it. In your second try, you just do not use HeadRequest, this is why it works.

Method 4

you shoud not add HeadRequest to build_opener or add_handler it should be called like this

opener = urllib2.build_opener(MyHTTPRedirectHandler)
response = opener.open(HeadRequest(url))
print response.getheaders()


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x