Submitting a POST request to an ASPX page

I have an ASPX page at https://searchlight.cluen.com/E5/CandidateSearch.aspx with a form on it that I’d like to submit, then parse the response for information.

Using Python’s urllib and urllib2, I created a POST request with the proper headers and user agent, but the resulting HTML response does not contain the expected table of results. Am I misunderstanding something, or am I missing an obvious detail?

    import urllib
    import urllib2

    headers = {
        'HTTP_USER_AGENT': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13',
        'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml; q=0.9,*/*; q=0.8',
        'Content-Type': 'application/x-www-form-urlencoded'
    }
    # obtained these values from viewing the source of https://searchlight.cluen.com/E5/CandidateSearch.aspx
    viewstate = '/wEPDwULLTE3NTc4MzQwNDIPZBYCAg ... uJRWDs/6Ks1FECco='
    eventvalidation = '/wEWjQMC8pat6g4C77jgxg0CzoqI8wgC3uWinQQCwr/ ... oPKYVeb74='
    url = 'https://searchlight.cluen.com/E5/CandidateSearch.aspx'
    formData = (
        ('__VIEWSTATE', viewstate),
        ('__EVENTVALIDATION', eventvalidation),
        ('__EVENTTARGET',''),
        ('__EVENTARGUMENT',''),
        ('textcity',''),
        ('dropdownlistposition',''),
        ('dropdownlistdepartment',''),
        ('dropdownlistorderby',''),
        ('textsearch',''),
    )

    # change user agent
    from urllib import FancyURLopener
    class MyOpener(FancyURLopener):
        version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'

    myopener = MyOpener()

    # encode form data in post-request format
    encodedFields = urllib.urlencode(formData)

    f = myopener.open(url, encodedFields)
    print f.info()

    # write the response body to a file, only if the file could be opened
    try:
        fout = open('tmp.htm', 'w')
    except IOError:
        print('Could not open output file\n')
    else:
        fout.writelines(f.readlines())
        fout.close()

There are several questions on this topic that were helpful (such as how to submit query to .aspx page in python) but I’m stuck on this and asking for additional help, if that is possible.

The resulting HTML page says I may need to log in, but the ASPX page displays in my browser without any login.

Here are the results from info():

    Connection: close
    Date: Tue, 07 Jun 2011 17:05:26 GMT
    Server: Microsoft-IIS/6.0
    X-Powered-By: ASP.NET
    X-AspNet-Version: 2.0.50727
    Cache-Control: private
    Content-Type: text/html; charset=utf-8
    Content-Length: 1944

Answers:


Method 1

ASP.NET uses a security feature that protects against tampering with the ViewState by embedding specific information in it.

More than likely, the server is rejecting your request because the ViewState is being treated as though it were tampered with. I can’t say this with absolute certainty, but ASP.NET has several security features built into the framework that may be preventing a direct post.

If a session is involved at all, you will also need to take that into account. To simulate what the browser is doing, you will need to perform the following steps:

  1. Request the page.
  2. Save the collection of cookies to a variable.
  3. Extract the ViewState to a variable.
  4. Post with the appropriate form values, passing both the saved cookies and ViewState information along with the request.
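The four steps above can be sketched roughly as follows. This is only an illustration, not a tested solution for the page in the question: it uses the Python 3 standard library (`urllib.request` and `http.cookiejar`, the successors of `urllib2` and `cookielib`), and the names `HiddenFieldParser`, `extract_hidden_fields`, and `search` are made up for this example. It also assumes the page keeps its state in hidden `<input>` fields such as `__VIEWSTATE` and `__EVENTVALIDATION`.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def extract_hidden_fields(html):
    """Step 3: pull __VIEWSTATE and the other hidden fields out of the page."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def search(url, form_values):
    """Hypothetical helper: GET the page, then POST the form back to it."""
    # Steps 1-2: the cookie jar keeps any cookies set by the first request.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    page = opener.open(url).read().decode("utf-8", "replace")

    # Step 3: use the fresh hidden-field values instead of hard-coded ones.
    data = extract_hidden_fields(page)

    # Step 4: add the visible form values and post with the same opener,
    # so the saved cookies are sent along automatically.
    data.update(form_values)
    body = urllib.parse.urlencode(data).encode("ascii")
    return opener.open(url, body).read().decode("utf-8", "replace")
```

Calling something like `search(url, {'textcity': '', 'textsearch': ''})` would then return the HTML of the response. Whether that response contains the results table still depends on what else the server checks.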

A lot of work, I know, but not too awfully difficult. Again, this may not be the sole source of your problems, but it is worth reading up on in order to start troubleshooting.

Method 2

I tried both mechanize and urllib2, and mechanize handles cookies better. With mechanize I can submit the form simply by selecting it and setting the value:

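    import mechanize
    browser = mechanize.Browser()
    browser.open(url)  # load the page first so the form and any cookies are available
    browser.select_form(form_name)  # form_name: the name attribute of the form on the page
    browser.set_value("Page$Next", name="pagenumber")
    response = browser.submit()  # mechanize sends the hidden fields and cookies for you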

It was not necessary to replicate the POST request manually, and in this case mechanize was able to handle a form that relies on JavaScript.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
