I’m trying to make a HEAD request for a page using Python 2.
This is what I have tried:
import misc_urllib2
.....
opender = urllib2.build_opener([misc_urllib2.MyHTTPRedirectHandler(), misc_urllib2.HeadRequest()])
with misc_urllib2.py containing
class HeadRequest(urllib2.Request):
    def get_method(self):
        return "HEAD"

class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def __init__(self):
        self.redirects = []

    def http_error_301(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_301(
            self, req, fp, code, msg, headers)
        result.redirect_code = code
        return result

    http_error_302 = http_error_303 = http_error_307 = http_error_301
But I am getting
TypeError: __init__() takes at least 2 arguments (1 given)
If I just do
opender = urllib2.build_opener(misc_urllib2.MyHTTPRedirectHandler())
then it works fine
Answers:
Method 1
This works just fine:
import urllib2
request = urllib2.Request('http://localhost:8080')
request.get_method = lambda : 'HEAD'
response = urllib2.urlopen(request)
print response.info()
Tested against a quick-and-dirty HTTP server hacked together in Python, which returned these headers:
Server: BaseHTTP/0.3 Python/2.6.6
Date: Sun, 12 Dec 2010 11:52:33 GMT
Content-type: text/html
X-REQUEST_METHOD: HEAD
I’ve added a custom header field X-REQUEST_METHOD to show it works 🙂
Here is the HTTPd log:
Sun Dec 12 12:52:28 2010 Server Starts - localhost:8080
localhost.localdomain - - [12/Dec/2010 12:52:33] "HEAD / HTTP/1.1" 200 -
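The test server itself is not shown in the answer. Here is a minimal sketch of what such a server and check might look like. EchoMethodHandler, head_via_get_method, and demo are hypothetical names, and the Python 3 import fallback is an assumption added so the snippet runs on either version.

```python
from __future__ import print_function
import threading

try:  # Python 2 module names, as used in the answer
    import urllib2
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # Python 3 fallback (an assumption, not in the original)
    import urllib.request as urllib2
    from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoMethodHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that reports the request method in a header."""

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.send_header("X-REQUEST_METHOD", self.command)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def head_via_get_method(url):
    """Send a HEAD request by overriding get_method, as in Method 1."""
    request = urllib2.Request(url)
    request.get_method = lambda: "HEAD"
    response = urllib2.urlopen(request)
    try:
        return response.info()
    finally:
        response.close()


def demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoMethodHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        return head_via_get_method(url)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because the handler only defines do_HEAD, a request sent with any other method would be rejected by the server, so a successful response is itself a sign that HEAD was used.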
Edit: there is also httplib2
import httplib2
h = httplib2.Http()
resp, content = h.request("http://www.google.com", 'HEAD')
Method 2
Try httplib
>>> import httplib
>>> conn = httplib.HTTPConnection("www.google.com")
>>> conn.request("HEAD", "/index.html")
>>> res = conn.getresponse()
>>> print res.status, res.reason
200 OK
>>> print res.getheaders()
[('content-length', '0'), ('expires', '-1'), ('server', 'gws'), ('cache-control', 'private, max-age=0'), ('date', 'Sat, 20 Sep 2008 06:43:36 GMT'), ('content-type', 'text/html; charset=ISO-8859-1')]
See How do you send a HEAD HTTP request in Python 2?
Method 3
The problem lies with your class HeadRequest, which inherits from urllib2.Request. According to the documentation, the signature of urllib2.Request.__init__ is
__init__(self, url, data=None, headers={}, origin_req_host=None, unverifiable=False)
so you must pass a url argument to it. Your second attempt does not use HeadRequest at all, which is why it works.
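To see the difference without touching the network, something like the following can be run. It is only a sketch: try_construct is a hypothetical helper, the URL is a placeholder, and the Python 3 import fallback is an assumption added for portability.

```python
try:  # Python 2, as in the question
    import urllib2
except ImportError:  # Python 3 fallback (an assumption, not in the original)
    import urllib.request as urllib2


class HeadRequest(urllib2.Request):
    def get_method(self):
        return "HEAD"


def try_construct(*args):
    """Return the request method, or the TypeError text if construction fails."""
    try:
        return HeadRequest(*args).get_method()
    except TypeError as exc:
        return "TypeError: %s" % exc


if __name__ == "__main__":
    # With a url the request is built; nothing is sent yet.
    print(try_construct("http://localhost:8080/"))
    # Without a url the constructor fails, as in the question.
    print(try_construct())
```

The exact wording of the TypeError message differs between Python versions, so it is not reproduced here.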
Method 4
You should not pass HeadRequest to build_opener or add_handler, because it is a request, not a handler. Pass it to the opener's open method instead, like this:
opener = urllib2.build_opener(MyHTTPRedirectHandler)
response = opener.open(HeadRequest(url))
print response.info()
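Putting the pieces from the question together, a self-contained version might look roughly like this. The head function is a hypothetical wrapper, and the Python 3 import fallback is an assumption. The redirect_code attribute is only set by the handler when a redirect was actually followed, so the sketch reads it with getattr.

```python
from __future__ import print_function

try:  # Python 2, as in the question
    import urllib2
except ImportError:  # Python 3 fallback (an assumption, not in the original)
    import urllib.request as urllib2


class HeadRequest(urllib2.Request):
    def get_method(self):
        return "HEAD"


class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_301(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_301(
            self, req, fp, code, msg, headers)
        result.redirect_code = code  # remember which redirect status was seen
        return result

    http_error_302 = http_error_303 = http_error_307 = http_error_301


def head(url):
    """Open url with HEAD; return (status, headers, redirect code or None)."""
    # The handler goes into the opener; the request goes to open().
    opener = urllib2.build_opener(MyHTTPRedirectHandler)
    response = opener.open(HeadRequest(url))
    try:
        return (response.getcode(), response.info(),
                getattr(response, "redirect_code", None))
    finally:
        response.close()
```

Calling head("http://localhost:8080/") against a server that answers HEAD would then return the status code and headers without downloading a body.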
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.