Want to grab data from a website? Here's the way to go!

Use case:

  • Share prices

If you want to get share prices you often find solutions using the Yahoo Finance API.

The problem with that is: you will not find share prices in Euro, and (as far as I know) only the NYSE, AMEX and NASDAQ stock exchanges are supported.

So let’s say you want to grab that data from a website.

Just install PhantomJS, a headless WebKit browser, which is really cool. You can build it from source; this takes a few hours, but it at least worked for me on a Raspberry Pi 2.
A much faster solution is described here.

Now let’s say we want to get the share price of Google Inc. A in Euro from Tradegate (a German stock exchange).

Here we get this information:
http://kurse.boerse.ard.de/ard/kurse_einzelkurs_uebersicht.htn?i=47417135

By reading the page source or, even easier, using the Inspector in the Firefox Developer Tools, we can find the class name of this HTML element:

<span class="leftfloat big" title="aktueller Wert">
     507,90 €
</span>

So ‘leftfloat big’ is what we need.

Let’s create a simple JavaScript file we can run with PhantomJS:

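var page = require('webpage').create();
var url = 'http://kurse.boerse.ard.de/ard/kurse_einzelkurs_uebersicht.htn?i=5864734';

page.open(url, function (status) {
    // Stop early if the page could not be loaded.
    if (status !== 'success') {
        console.error('Could not load ' + url);
        phantom.exit(1);
        return;
    }

    // Runs inside the page: read the content of the first matching element.
    var returnvalue = page.evaluate(function () {
        return document.getElementsByClassName('leftfloat big')[0].innerHTML;
    });

    try {
        // Drop non-breaking spaces, blank out non-ASCII characters (the Euro sign),
        // and turn the decimal comma into a decimal point.
        var rawValue = returnvalue.replace(/&nbsp;/g, '');
        rawValue = rawValue.replace(/[^\u000A\u0020-\u007E]/g, ' ');
        rawValue = rawValue.replace(/,/g, '.');
        console.log(rawValue.trim());
    } catch (e) {
        console.error(e);
    }
    phantom.exit();
});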

Check if it’s working:

phantomjs /pathtoyourscript.js
507.90
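
If you prefer, the cleanup step can be pulled out into a small standalone function. This is only a sketch: parseGermanPrice is a hypothetical helper name (not part of PhantomJS or pimatic), and it assumes the page uses "." as thousands separator and "," as decimal separator, which I have not verified for prices above 999 Euro.

```javascript
// Hypothetical helper: turn a German-formatted price string such as
// "1.507,90 €" into a JavaScript number. Assumes "." separates thousands
// and "," is the decimal separator.
function parseGermanPrice(text) {
    var cleaned = text
        .replace(/&nbsp;/g, '')      // drop HTML non-breaking spaces
        .replace(/[^0-9.,-]/g, '')   // keep only digits, separators and a minus sign
        .replace(/\./g, '')          // remove thousands separators
        .replace(/,/g, '.');         // decimal comma -> decimal point
    var value = parseFloat(cleaned);
    if (isNaN(value)) {
        throw new Error('Could not parse a price from: ' + text);
    }
    return value;
}

// Example input shaped like the innerHTML of the span shown earlier.
console.log(parseGermanPrice('\n     507,90 €\n').toFixed(2));
console.log(parseGermanPrice('1.507,90&nbsp;€').toFixed(2));
```

Throwing on unparsable input means a changed page layout shows up as an error instead of a silently wrong number.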

Now let’s create a ShellSensor device for pimatic to display (and record!) the share price every 5 minutes:

{
  "id": "share-google",
  "name": "Google Class A",
  "class": "ShellSensor",
  "attributeName": "price",
  "attributeType": "number",
  "attributeUnit": "Euro",
  "command": "phantomjs /pathtoyourscript.js",
  "interval": 300000
}
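
The interval is given in milliseconds, so 5 minutes is 5 * 60 * 1000 = 300000. If you want to track several shares, a small script can generate the device entries for you. This is a rough sketch: shellSensorDevice is a hypothetical helper of my own, and the field names are simply the ones from the device entry above.

```javascript
// Hypothetical helper: build a ShellSensor device entry like the one above,
// taking the polling interval in minutes instead of milliseconds.
function shellSensorDevice(id, name, command, minutes) {
    return {
        id: id,
        name: name,
        class: 'ShellSensor',
        attributeName: 'price',
        attributeType: 'number',
        attributeUnit: 'Euro',
        command: command,
        interval: minutes * 60 * 1000 // minutes -> milliseconds
    };
}

var device = shellSensorDevice(
    'share-google',
    'Google Class A',
    'phantomjs /pathtoyourscript.js',
    5
);

// Print the entry as JSON, ready to paste into the devices section.
console.log(JSON.stringify(device, null, 2));
```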

Of course, please check the terms of use of the website you are scraping in advance!

You are welcome to post any other useful ideas for PhantomJS.
I guess there are a lot!

Cheers,

DerIng