Sending an ASP.net POST with Python’s Requests

I’m scraping an old ASP.net website using Python’s requests module.

I’ve spent 5+ hours trying to figure out how to simulate this POST request to no avail. Doing it the way I do it below, I essentially get a message saying “No item matches this item reference.”

Any help would be deeply appreciated – here’s the request and my code, a few things are modified out of respect to brevity and/or privacy:

My own code:

import requests

# Scraping the item number from the website, I have confirmed this is working.

#Then use the newly acquired item number to request the data.
item_url = http://www.example.com/EN/items/Pages/yourrates.aspx?vr= + item_number[0]
viewstate = r'/wEPD...' # Truncated for brevity.

# Create the appropriate request and payload.
payload = {"vr": int(item_number[0])}

item_request_body = {
        "__SPSCEditMenu": "true",
        "MSOWebPartPage_PostbackSource": "",
        "MSOTlPn_SelectedWpId": "",
        "MSOTlPn_View": 0,
        "MSOTlPn_ShowSettings": "False",
        "MSOGallery_SelectedLibrary": "",
        "MSOGallery_FilterString": "",
        "MSOTlPn_Button": "none",
        "__EVENTTARGET": "",
        "__EVENTARGUMENT": "",
        "MSOAuthoringConsole_FormContext": "",
        "MSOAC_EditDuringWorkflow": "",
        "MSOSPWebPartManager_DisplayModeName": "Browse",
        "MSOWebPartPage_Shared": "",
        "MSOLayout_LayoutChanges": "",
        "MSOLayout_InDesignMode": "",
        "MSOSPWebPartManager_OldDisplayModeName": "Browse",
        "MSOSPWebPartManager_StartWebPartEditingName": "false",
        "__VIEWSTATE": viewstate,
        "keywords": "Search our site",
        "__CALLBACKID": "ctl00$SPWebPartManager1$g_dbb9e9c7_fe1d_46df_8789_99a6c9db4b22",
        "__CALLBACKPARAM": "startvr"
    }

# Write the appropriate headers for the property information.
item_request_headers = {
    "Host": home_site,
    "Connection": "keep-alive",
    "Content-Length": len(encoded_valuation_request),
    "Cache-Control": "max-age=0",
    "Origin": home_site,
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36",
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    "Cookie": "__utma=48409910.1174413745.1405662151.1406402487.1406407024.17; __utmb=48409910.7.10.1406407024; __utmc=48409910; __utmz=48409910.1406178827.13.3.utmcsr=ratesandvallandingpage|utmccn=landingpages|utmcmd=button",
    "Accept": "*/*",
    "Referer": valuation_url,
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "en-US,en;q=0.8"
}

    response = requests.post(url=item_url, params=payload, data=item_request_body, headers=item_request_headers)
    print response.text

What Chrome is telling me the request looks like:

Remote Address:202.55.96.131:80
Request URL:http://www.example.com/EN/items/Pages/yourrates.aspx?vr=123456789
Request Method:POST
Status Code:200 OK

Request Headers
Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:21501
Content-Type:application/x-www-form-urlencoded; charset=UTF-8
Cookie:__utma=48409910.1174413745.1405662151.1406402487.1406407024.17; __utmb=48409910.7.10.1406407024; __utmc=48409910; __utmz=48409910.1406178827.13.3.utmcsr=ratesandvallandingpage|utmccn=landingpages|utmcmd=button
Host:www.site.com
Origin:www.site.com
Referer:http://www.example.com/EN/items/Pages/yourrates.aspx?vr=123456789
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36

Query String Parameters
vr:123456789

Form Data
__SPSCEditMenu:true
MSOWebPartPage_PostbackSource:
MSOTlPn_SelectedWpId:
MSOTlPn_View:0
MSOTlPn_ShowSettings:False
MSOGallery_SelectedLibrary:
MSOGallery_FilterString:
MSOTlPn_Button:none
__EVENTTARGET:
__EVENTARGUMENT:
MSOAuthoringConsole_FormContext:
MSOAC_EditDuringWorkflow:
MSOSPWebPartManager_DisplayModeName:Browse
MSOWebPartPage_Shared:
MSOLayout_LayoutChanges:
MSOLayout_InDesignMode:
MSOSPWebPartManager_OldDisplayModeName:Browse
MSOSPWebPartManager_StartWebPartEditingName:false
__VIEWSTATE:/wEPD...(Omitted for length)
keywords:Search our site
__CALLBACKID:ctl00$SPWebPartManager1$g_dbb9e9c7_fe1d_46df_8789_99a6c9db4b22
__CALLBACKPARAM:startvr

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

You have too many request parameters, and should not set the content-type, content-length, host, origin, or connection headers; leave those to requests to set.

You are also doubling up the url parameters; either add the vr parameter to the URL manually or use params, not do both.

It may well be that some of the parameters in the POST body are generated by the ASP application tied to a session. I’d use a GET request with a Session object the valuation_url, parse the form in that page to extract the __CALLBACKID parameter. The requests Session will then store any cookies the server sets and reuse those:

item_request_headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36",
    "Accept": "*/*",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "en-US,en;q=0.8"
}
payload = {"vr": int(item_number[0])}

session = requests.Session(headers=item_request_headers)

# Get form page
form_response = session.get(validation_url, params=payload) 

# parse form page; BeautifulSoup could do this for example
soup = BeautifulSoup(form_response.content)
callbackid = soup.select('input[name=__CALLBACKID]')[0]['value']

item_request_body = {
    "__SPSCEditMenu": "true",
    "MSOWebPartPage_PostbackSource": "",
    "MSOTlPn_SelectedWpId": "",
    "MSOTlPn_View": 0,
    "MSOTlPn_ShowSettings": "False",
    "MSOGallery_SelectedLibrary": "",
    "MSOGallery_FilterString": "",
    "MSOTlPn_Button": "none",
    "__EVENTTARGET": "",
    "__EVENTARGUMENT": "",
    "MSOAuthoringConsole_FormContext": "",
    "MSOAC_EditDuringWorkflow": "",
    "MSOSPWebPartManager_DisplayModeName": "Browse",
    "MSOWebPartPage_Shared": "",
    "MSOLayout_LayoutChanges": "",
    "MSOLayout_InDesignMode": "",
    "MSOSPWebPartManager_OldDisplayModeName": "Browse",
    "MSOSPWebPartManager_StartWebPartEditingName": "false",
    "__VIEWSTATE": viewstate,
    "keywords": "Search our site",
    "__CALLBACKID": callbackid,
    "__CALLBACKPARAM": "startvr"
}

item_url = 'http://www.example.com/EN/items/Pages/yourrates.aspx'

response = session.post(url=item_url, params=payload, data=item_request_body,
                        headers={'Referer': form_response.url})

The session handles the headers (setting a user agent, and accept parameters), only on the POST with the session do we add a referrer header as well.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x