Is there a way to use PhantomJS in Python?

I want to use PhantomJS in Python. I googled this problem but couldn’t find proper solutions.

I find os.popen() may be a good choice. But I couldn’t pass some arguments to it.

Using subprocess.Popen() may be a proper solution for now. I want to know whether there’s a better solution or not.

Is there a way to use PhantomJS in Python?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

The easiest way to use PhantomJS in python is via Selenium. The simplest installation method is

  1. Install NodeJS
  2. Using Node’s package manager install phantomjs: npm -g install phantomjs-prebuilt
  3. install selenium (in your virtualenv, if you are using that)

After installation, you may use phantom as simple as:

from selenium import webdriver

driver = webdriver.PhantomJS() # or add to your PATH
driver.set_window_size(1024, 768) # optional
driver.get('https://google.com/')
driver.save_screenshot('screen.png') # save a screenshot to disk
sbtn = driver.find_element_by_css_selector('button.gbqfba')
sbtn.click()

If your system path environment variable isn’t set correctly, you’ll need to specify the exact path as an argument to webdriver.PhantomJS(). Replace this:

driver = webdriver.PhantomJS() # or add to your PATH

… with the following:

driver = webdriver.PhantomJS(executable_path='/usr/local/lib/node_modules/phantomjs/lib/phantom/bin/phantomjs')

References:

Method 2

PhantomJS recently dropped Python support altogether. However, PhantomJS now embeds Ghost Driver.

A new project has since stepped up to fill the void: ghost.py. You probably want to use that instead:

from ghost import Ghost
ghost = Ghost()

with ghost.start() as session:
    page, extra_resources = ghost.open("http://jeanphi.me")
    assert page.http_status==200 and 'jeanphix' in ghost.content

Method 3

Now since the GhostDriver comes bundled with the PhantomJS, it has become even more convenient to use it through Selenium.

I tried the Node installation of PhantomJS, as suggested by Pykler, but in practice I found it to be slower than the standalone installation of PhantomJS. I guess standalone installation didn’t provided these features earlier, but as of v1.9, it very much does so.

  1. Install PhantomJS (http://phantomjs.org/download.html) (If you are on Linux, following instructions will help https://stackoverflow.com/a/14267295/382630)
  2. Install Selenium using pip.

Now you can use like this

import selenium.webdriver
driver = selenium.webdriver.PhantomJS()
driver.get('http://google.com')
# do some processing

driver.quit()

Method 4

Here’s how I test javascript using PhantomJS and Django:

mobile/test_no_js_errors.js:

var page = require('webpage').create(),
    system = require('system'),
    url = system.args[1],
    status_code;

page.onError = function (msg, trace) {
    console.log(msg);
    trace.forEach(function(item) {
        console.log('  ', item.file, ':', item.line);
    });
};

page.onResourceReceived = function(resource) {
    if (resource.url == url) {
        status_code = resource.status;
    }
};

page.open(url, function (status) {
    if (status == "fail" || status_code != 200) {
        console.log("Error: " + status_code + " for url: " + url);
        phantom.exit(1);
    }
    phantom.exit(0);
});

mobile/tests.py:

import subprocess
from django.test import LiveServerTestCase

class MobileTest(LiveServerTestCase):
    def test_mobile_js(self):
        args = ["phantomjs", "mobile/test_no_js_errors.js", self.live_server_url]
        result = subprocess.check_output(args)
        self.assertEqual(result, "")  # No result means no error

Run tests:

manage.py test mobile

Method 5

The answer by @Pykler is great but the Node requirement is outdated. The comments in that answer suggest the simpler answer, which I’ve put here to save others time:

  1. Install PhantomJS

    As @Vivin-Paliath points out, it’s a standalone project, not part of Node.

    Mac:

    brew install phantomjs

    Ubuntu:

    sudo apt-get install phantomjs

    etc

  2. Set up a virtualenv (if you haven’t already):
    virtualenv mypy  # doesn't have to be "mypy". Can be anything.
    . mypy/bin/activate

    If your machine has both Python 2 and 3 you may need run virtualenv-3.6 mypy or similar.

  3. Install selenium:
    pip install selenium
  4. Try a simple test, like this borrowed from the docs:
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    
    driver = webdriver.PhantomJS()
    driver.get("http://www.python.org")
    assert "Python" in driver.title
    elem = driver.find_element_by_name("q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    assert "No results found." not in driver.page_source
    driver.close()

Method 6

this is what I do, python3.3. I was processing huge lists of sites, so failing on the timeout was vital for the job to run through the entire list.

command = "phantomjs --ignore-ssl-errors=true "+<your js file for phantom>
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)

# make sure phantomjs has time to download/process the page
# but if we get nothing after 30 sec, just move on
try:
    output, errors = process.communicate(timeout=30)
except Exception as e:
    print("ttException: %s" % e)
    process.kill()

# output will be weird, decode to utf-8 to save heartache
phantom_output = ''
for out_line in output.splitlines():
    phantom_output += out_line.decode('utf-8')

Method 7

If using Anaconda, install with:

conda install PhantomJS

in your script:

from selenium import webdriver
driver=webdriver.PhantomJS()

works perfectly.

Method 8

In case you are using Buildout, you can easily automate the installation processes that Pykler describes using the gp.recipe.node recipe.

[nodejs]
recipe = gp.recipe.node
version = 0.10.32
npms = phantomjs
scripts = phantomjs

That part installs node.js as binary (at least on my system) and then uses npm to install PhantomJS. Finally it creates an entry point bin/phantomjs, which you can call the PhantomJS webdriver with. (To install Selenium, you need to specify it in your egg requirements or in the Buildout configuration.)

driver = webdriver.PhantomJS('bin/phantomjs')


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x