I want to use PhantomJS in Python. I googled this problem but couldn’t find proper solutions.
I find os.popen() may be a good choice. But I couldn’t pass some arguments to it.
Using subprocess.Popen() may be a proper solution for now. I want to know whether there’s a better solution or not.
Is there a way to use PhantomJS in Python?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
The easiest way to use PhantomJS in python is via Selenium. The simplest installation method is
- Install NodeJS
- Using Node’s package manager install phantomjs:
npm -g install phantomjs-prebuilt - install selenium (in your virtualenv, if you are using that)
After installation, you may use phantom as simple as:
from selenium import webdriver
driver = webdriver.PhantomJS() # or add to your PATH
driver.set_window_size(1024, 768) # optional
driver.get('https://google.com/')
driver.save_screenshot('screen.png') # save a screenshot to disk
sbtn = driver.find_element_by_css_selector('button.gbqfba')
sbtn.click()
If your system path environment variable isn’t set correctly, you’ll need to specify the exact path as an argument to webdriver.PhantomJS(). Replace this:
driver = webdriver.PhantomJS() # or add to your PATH
… with the following:
driver = webdriver.PhantomJS(executable_path='/usr/local/lib/node_modules/phantomjs/lib/phantom/bin/phantomjs')
References:
- http://selenium-python.readthedocs.io/
- How do I set a proxy for phantomjs/ghostdriver in python webdriver?
- https://dzone.com/articles/python-testing-phantomjs
Method 2
PhantomJS recently dropped Python support altogether. However, PhantomJS now embeds Ghost Driver.
A new project has since stepped up to fill the void: ghost.py. You probably want to use that instead:
from ghost import Ghost
ghost = Ghost()
with ghost.start() as session:
page, extra_resources = ghost.open("http://jeanphi.me")
assert page.http_status==200 and 'jeanphix' in ghost.content
Method 3
Now since the GhostDriver comes bundled with the PhantomJS, it has become even more convenient to use it through Selenium.
I tried the Node installation of PhantomJS, as suggested by Pykler, but in practice I found it to be slower than the standalone installation of PhantomJS. I guess standalone installation didn’t provided these features earlier, but as of v1.9, it very much does so.
- Install PhantomJS (http://phantomjs.org/download.html) (If you are on Linux, following instructions will help https://stackoverflow.com/a/14267295/382630)
- Install Selenium using pip.
Now you can use like this
import selenium.webdriver
driver = selenium.webdriver.PhantomJS()
driver.get('http://google.com')
# do some processing
driver.quit()
Method 4
Here’s how I test javascript using PhantomJS and Django:
mobile/test_no_js_errors.js:
var page = require('webpage').create(),
system = require('system'),
url = system.args[1],
status_code;
page.onError = function (msg, trace) {
console.log(msg);
trace.forEach(function(item) {
console.log(' ', item.file, ':', item.line);
});
};
page.onResourceReceived = function(resource) {
if (resource.url == url) {
status_code = resource.status;
}
};
page.open(url, function (status) {
if (status == "fail" || status_code != 200) {
console.log("Error: " + status_code + " for url: " + url);
phantom.exit(1);
}
phantom.exit(0);
});
mobile/tests.py:
import subprocess
from django.test import LiveServerTestCase
class MobileTest(LiveServerTestCase):
def test_mobile_js(self):
args = ["phantomjs", "mobile/test_no_js_errors.js", self.live_server_url]
result = subprocess.check_output(args)
self.assertEqual(result, "") # No result means no error
Run tests:
manage.py test mobile
Method 5
The answer by @Pykler is great but the Node requirement is outdated. The comments in that answer suggest the simpler answer, which I’ve put here to save others time:
-
Install PhantomJS
As @Vivin-Paliath points out, it’s a standalone project, not part of Node.
Mac:
brew install phantomjs
Ubuntu:
sudo apt-get install phantomjs
etc
-
Set up a
virtualenv(if you haven’t already):virtualenv mypy # doesn't have to be "mypy". Can be anything. . mypy/bin/activate
If your machine has both Python 2 and 3 you may need run
virtualenv-3.6 mypyor similar. -
Install selenium:
pip install selenium
-
Try a simple test, like this borrowed from the docs:
from selenium import webdriver from selenium.webdriver.common.keys import Keys driver = webdriver.PhantomJS() driver.get("http://www.python.org") assert "Python" in driver.title elem = driver.find_element_by_name("q") elem.clear() elem.send_keys("pycon") elem.send_keys(Keys.RETURN) assert "No results found." not in driver.page_source driver.close()
Method 6
this is what I do, python3.3. I was processing huge lists of sites, so failing on the timeout was vital for the job to run through the entire list.
command = "phantomjs --ignore-ssl-errors=true "+<your js file for phantom>
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
# make sure phantomjs has time to download/process the page
# but if we get nothing after 30 sec, just move on
try:
output, errors = process.communicate(timeout=30)
except Exception as e:
print("ttException: %s" % e)
process.kill()
# output will be weird, decode to utf-8 to save heartache
phantom_output = ''
for out_line in output.splitlines():
phantom_output += out_line.decode('utf-8')
Method 7
If using Anaconda, install with:
conda install PhantomJS
in your script:
from selenium import webdriver driver=webdriver.PhantomJS()
works perfectly.
Method 8
In case you are using Buildout, you can easily automate the installation processes that Pykler describes using the gp.recipe.node recipe.
[nodejs] recipe = gp.recipe.node version = 0.10.32 npms = phantomjs scripts = phantomjs
That part installs node.js as binary (at least on my system) and then uses npm to install PhantomJS. Finally it creates an entry point bin/phantomjs, which you can call the PhantomJS webdriver with. (To install Selenium, you need to specify it in your egg requirements or in the Buildout configuration.)
driver = webdriver.PhantomJS('bin/phantomjs')
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0