Selenium driver.get (url) wait till full page load. But a scraping page try to load some dead JS script. So my Python script wait for it and doesn’t works few minutes. This problem can be on every pages of a site.
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.cortinadecor.com/productos/17/estores-enrollables-screen/estores-screen-corti-3000')
# It try load: https://www.cetelem.es/eCommerceCalculadora/resources/js/eCalculadoraCetelemCombo.js
driver.find_element_by_name('ANCHO').send_keys("100")
How to limit the time wait, block AJAX load of a file, or is other way?
Also I test my script in webdriver.Chrome(), but will use PhantomJS(), or probably Firefox(). So, if some method uses a change in browser settings, then it must be universal.
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
When Selenium loads a page/url by default it follows a default configuration with pageLoadStrategy set to normal. To make Selenium not to wait for full page load we can configure the pageLoadStrategy. pageLoadStrategy supports 3 different values as follows:
normal(full page load)eager(interactive)none
Here is the code block to configure the pageLoadStrategy :
-
Firefox :
from selenium import webdriver from selenium.webdriver.common.desired_capabilities import DesiredCapabilities caps = DesiredCapabilities().FIREFOX caps["pageLoadStrategy"] = "normal" # complete #caps["pageLoadStrategy"] = "eager" # interactive #caps["pageLoadStrategy"] = "none" driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:pathtogeckodriver.exe') driver.get("http://google.com") -
Chrome :
from selenium import webdriver from selenium.webdriver.common.desired_capabilities import DesiredCapabilities caps = DesiredCapabilities().CHROME caps["pageLoadStrategy"] = "normal" # complete #caps["pageLoadStrategy"] = "eager" # interactive #caps["pageLoadStrategy"] = "none" driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:pathtochromedriver.exe') driver.get("http://google.com")
Note :
pageLoadStrategyvaluesnormal,eagerandnoneis a requirement as per WebDriver W3C Editor’s Draft butpageLoadStrategyvalue aseageris still a WIP (Work In Progress) within ChromeDriver implementation. You can find a detailed discussion in “Eager” Page Load Strategy workaround for Chromedriver Selenium in Python
Method 2
@undetected Selenium answer works well but for the chrome, part its not working use the below answer for chrome
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities capa = DesiredCapabilities.CHROME capa["pageLoadStrategy"] = "none" browser= webdriver.Chrome(desired_capabilities=capa,executable_path='PATH',options=options)
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0