Websockets with Angular, socket.io and Apache

Seeing as I just spent the better part of the afternoon trying to figure this out, I thought I’d write a quick blog post on how I eventually got this working – both for other people struggling with this and for my own archive since I’ll probably forget again and want to look it up next time I do such a project 🙂

socket.io is a pretty awesome implementation of HTML5 websockets with transparent fallbacks (polyfills) for non-supporting browsers. However, its documentation can be…sketchy. For instance, their examples all assume everything is handled by nodeJS (including your HTML pages), which no doubt some people actually do but if you’re anything like me you’ll prefer to use Apache or nginx or the like in combination with a server side language like PHP for serving up your HTML pages. That’s not documented, ehm, extensively (to use an understatement), and also stuff like authentication isn’t really covered. So this is my solution, many thanks to Google, Stackoverflow and the rest of the interwebz:

1. The node server

socket.io is in the end a node module, so there’s no escaping that. Get it installed:

$ npm install --save socket.io http express

The most basic server script would look something like this:


let http = require('http');
let express = require('express');
let server = http.createServer(app);
let io = require('socket.io').listen(server);
io.on('connection', socket => {
    socket.on('someEvent', function () {
        socket.emit('anotherEvent', {msg: 'Hi!'});
    }
});
server.listen(8080);

Port 8080 is arbirtrary, anything not already in use will do. I usally go for something in a higher range, but this is good enough for the example.

To wrap your head around this: We have an HTTP server created using an Express app with socket.io plugged in, and it listens on port 8080. We can now run it using node path/to/server.js (in real life you’ll want to use something like PM2 for running it, but that’s not the issue at hand here).

2. The Angular client

There’s a bunch of Angular modules for tying sockets into the Angular “digest flow”. I chose this one: https://github.com/btford/angular-socket-io, but others should also work. The client script will be auto-hosted under /socket.io/socket.io.js (let’s ignore our port 8080 for now), so make sure that and the Angular module are loaded in your HTML (in that order, I might add). Adding an Angular factory for your socket is then as simple as


app.factory('Socket', ['socketFactory', socketFactory => {
    if (window.io) {
        let ioSocket = window.io.connect('', {query: ''});
        return socketFactory({ioSocket});
    }
    // I'm sure you can handle this more gracefully:
    return {};
}]);

This uses the defaults (the name Socket is arbitrary, you can call it Gregory for all I care), and sets up the query parameter. This is going to come in handy later on.

That’s really all there is to it; e.g. in your controller you can now do something like this:


export default class Controller {
    constructor(Socket) {
        Socket.on('anotherEvent', data => {
            console.log(data); // object{msg: 'hi'}
        });
    }
}

Controller.$inject = ['Socket'];

3. Configuring the actual web server

This is the part that mainly had me banging my head. Of course, we could just instruct socket.io to use our “special” port 8080, but that leads to all sorts of problems with proxies, company networks, crappy clients etc. We just want to tunnel everything through port 80 (or 443 for secure sites). An important thing to understand here: A web socket connection is just a regular HTTP connection, but upgraded. That means we can pipe it through regular HTTP ports, as long as the server behind it handles the upgrade.

Apache comes with two modules (well, as of v2.4, but it’s been around for a while) to handle this: mod_proxy and mod_proxy_wstunnel. “ws” stands for “Web Socket” of course, so yay! Only, the documentation didn’t go much farther than “you can turn this on if you need it”. In any case, make sure both modules are enabled.

This is the configuration that finally got it working for me (the actual rules vary slightly between socket.io versions, but this works for 1.3.6):


RewriteEngine On
RewriteCond %{REQUEST_URI}  ^/socket.io/1/websocket  [NC]
RewriteRule /(.*)           ws://localhost:8080/$1 [P,L]

ProxyPass        /socket.io http://localhost:8080/socket.io
ProxyPassReverse /socket.io http://localhost:8080/socket.io

A breakdown:

  1. First, we check if ‘websocket’ is in the request_uri. socket.io provides fallbacks like JSON polling for older clients – which is cool – and it does that by requesting different URLs and seeing which one works. I think in my version it tries /socket.io/1/xhr-polling next if websockets fail. Anyway, the point is: if Apache gets a request for the websocket URL, we rewrite to the ws:// scheme (on localhost, which is fine for now – if you’re running a gazillion apps this way you’ll probably want to handle this differently ;)) and just pass on the URL including query parameters (the P flag) and end it there (the L flag). If that works, the request was succesfull and our client support actual sockets. If not, the rewrite returns an invalid result and we move on to the next rule…
  2. For anything else under /socket.io, we now proxy to localhost:8080 via regular http and let the fallbacks handle it. This rule also makes sure that we can safely serve socket.io.js (since the URL doesn’t contain /websocket, it just gets forwarded).

If you’re using something else than Apache (e.g. nginx) there’s similar rules and rewrites available, but I’m not experienced enough in those to offer them here 🙂 The principle will be the same though.

4. Bonus: authentication

I rarely build apps without some form of authentication involved, and I’m guessing I’m not the only one. Regular authentication in a web app is usually something with cookies and sessions, but since socket.io is actually running on a different domain (localhost) – and besides doesn’t know about cookies – this won’t work. That’s where that ‘query’ option when we connected earlier comes in.

The query parameter is just something that gets appended to the socket.io requests as a regular GET parameter, and which can be read on the server. Exactly how you implement this is up to you, but a very simple (and by the way not extremely secure) option would be to just pass the session ID:


let ioSocket = window.io.connect('', {query: 'session=' + my_session_id});

And then in your node app on the server you can do something like this:


io.on('connection', socket => {
    socket.session = socket.handshake.query.session;
    // perform some validation, perhaps including a query to a database that stores sessions?
});

Server side languages like PHP by default store their sessions in proprietary flat files, so setting the server up to store them in a way that your node script can reach them depends on your platform. At least in PHP it’s pretty trivial to use a database like MySQL for that.

And that’s it really: you can now start building a real time application!

The poor man’s PHP daemon

We have this project lying around from a while back that’s based on PHP/AngularJS and also sports a socket.io server for (among other things) a real time chat. Pretty nifty (no, we didn’t do the design :)). The socket part is powered by a NodeJS process, and since we didn’t feel like (nor did the client have budget for) rewriting all our PHP code to Javascript, we used dNode-PHP (well, actually a fork with a few small project-specific adjustments) to let the Javascript code in the NodeJS process communicate with our existing PHP libraries. So now we had two processes which is suboptimal but worked at the time.

As a poor man’s solution (time pressure, limited budget, etc.) these processes were simply kept running in a permanent screen on the server. In theory, that worked well enough for then – the idea was that once the client found new budget, we’d finally properly daemonize them. And our server doesn’t reboot that often anyway, so remembering to restart the screen and processes on those occasions wasn’t a big deal.

Of course, that day never came.

There was however one annoying issue: PHP’s database resource would go stale after a while. According to the docs it should automatically reconnect, only it didn’t. This brought the need to manually restart those processes every once in a while (we put the MySQL timeout to a week to alleviate the burden, but the exact moment was random depending on the last moment of activity. Of course, one could argue that a site that’s completely inactive for >7 days regularly isn’t worth the effort, but a) that wasn’t our problem and b) the client was still working on his marketing plan. Fair enough).

Today I got fed up with it, had an hour to spare, and the marketing plan was ready as well. Time to bite the bullet; here’s what I came up with.

1. The NodeJS process

That part was easy; essentially I followed the instructions here. We don’t use Ubuntu but rather Debian, but it should be similar on all *nix systems.

2. The dNode-PHP process

This is where it got interesting, and which had been the actual bullet I’d been putting off biting. PHP isn’t very well suited to run as a daemon. It’s possible, but that doesn’t mean it’s desirable, in the same sense as writing out all your CSS in <script>document.write('<style>...<' + '/style>')</script> tags is only a theoretical option. But in this case it was still better than duplicating code in Javascript.

Now, there are ways to turn a PHP script into an actual daemon. It was still overkill for our purposes, so I simply went with a cronjob that killed any existing process and restarts it every hour. If the client ever needs an actual daemon, we’ll get to that then 🙂

Yes, this “daemon” is a beggar, but as they say: beggars can’t be choosers…

Graceful routing fallback in hybrid AngularJS apps

My current favourite way of working with the otherwise-awesome AngularJS framework is to only apply it to those sections of a page that actually need Angular-behaviour, and let common PHP solve the rest (what I call a “hybrid Angular app”, lacking a better term). While Angular does offer stuff like ngRoute and uiRouter to handle “HTML5-mode routing”, this comes with some other problems, the most notable being search engine indexing. While there are solutions for that they have their own drawbacks, so today I’d rather just have a regular site with some <element my-awesome-directive/> where needed.

I did however run into a problem: IE<10. (Yes, it’s Explorer again…) IE9 and before don’t support the HTML5 history API of course, so Angular has a fallback using hashed URLs. Fine. I assumed that as long as you’re not actually using routes, Angular would leave your URLs alone.

I was wrong.

For any regular URL /foo/, Angular insists on redirecting to /#/foo/ in older Explorers, which of course will just render the homepage with a useless hash URL since I’m not actually letting Angular handle the routing. Bummer. The solution turned out to be simple enough though: don’t force $location.html5Mode(true) if the browser doesn’t even support it. This will cause your IE<10 to just treat all URLs as regular links, will let Angular handle HTML5 URLs where you do define them (e.g. an image gallery) in supporting browsers, will still allow old IEs to run all other Angular code and thus provides the perfect graceful fallback.

Hence:

angular.module('mythingy', []).config(['$locationProvider', $locationProvider => {
    if (!!(window.history && window.history.pushState)) {
        $locationProvider.html5Mode(true);
    }
}]);