Introducing Socket.IO 1.0

The first version of Socket.IO was created shortly after Node.JS made its first appearance. I had been looking for a framework that easily enabled me to push data from a server to a client for a long time, and even had tried other approaches to server-side JavaScript.

At the time, the main focus was on an interface equivalent to the upcoming WebSocket API that was in the process of standardization. I was lucky to receive a lot of feedback from the community at the time (including Node.JS’s creator) that helped shape the project into something significantly more useful.

Socket.IO has thus become the EventEmitter of the web. Today I want to talk about the work that has gone into 1.0 to round up this vision.

There’s a lot to say about Socket.IO 1.0, so if you’re short in time, feel free to jump to the parts that are most interesting to you:

  1. New engine
  2. Binary support
  3. Automated testing
  4. Scalability
  5. Integration
  6. Better debugging
  7. Streamlined APIs
  8. CDN delivery
  9. Future innovation
  10. Credits

New engine

The Socket.IO codebase no longer deals with transports and browser incompatibilities. That work has been relegated to a new module we’ve been perfecting for months called Engine.IO that implements a WebSocket-like API.

The benefits of this particular modularization can’t be understated:

  • For the Socket.IO end user, nothing changes. Just drop-in the new version!
  • A tremendous simplification in terms of codebase size and testing surface
    • The Socket.IO Server is now only 1234 lines of code.
    • The Socket.IO Client is now only 976 lines of code.
  • Future-proof flexibility
    • If WebSocket is the only transport you want to support moving forward, Engine.IO (with all its browser hacks and workarounds) can be seamlessly removed.
    • Alternative transports such as vanilla Node.JS TCP Sockets or Google Chrome Sockets can be trivially implemented.

This separation has also allowed us to innovate and perfect the transport layer. One of my favorite improvements was introducing the idea of what I call transport feature detection.

Once upon a time in the web, it was extremely common to simply sniff User Agents to make decisions of what APIs to use or what behaviors to enable. As JavaScript codebases became more complex and mature, it became obvious that in order to maximize reliability, it made more sense to directly test the APIs to see if they would behave as expected.

For example, simply checking that the JSON global is present does not mean that JSON.stringify works, or even exists. It could have simply meant that the user defined a JSON global of their own, or the environment could have a broken JSON implementation.

Socket.IO never assumes that WebSocket will just work, because in practice there’s a good chance that it won’t. Instead, it establishes a connection with XHR or JSONP right away, and then attempts to upgrade the connection to WebSocket. Compared to the fallback method which relies on timeouts, this means that none of your users will have a degraded experience.

Binary support

Users have been asking for the ability to send binary data for a while, especially after WebSocket added support for it.

The main issue was that if we had modeled our binary support after the WebSocket API, its usefulness would have been fairly limited. WebSocket requires that you put your Socket either into “string mode” or “binary mode”:

var socket = new WebSocket('ws://localhost');
socket.binaryType = 'arraybuffer';
socket.send(new ArrayBuffer);

This is good for a low-level API, which is why Engine.IO now supports it, but application developers most likely don’t want to send only blobs, or encode everything as a blob manually prior to sending it out.

Socket.IO now supports emitting Buffer (from Node.JS), Blob, ArrayBuffer and even File, as part of any datastructure:

var fs = require('fs');
var io = require('socket.io')(3000);
io.on('connection', function(socket){
  fs.readFile('image.png', function(err, buf){
    // it's possible to embed binary data
    // within arbitrarily-complex objects
    socket.emit('image', { image: true, buffer: buf });
  });
});

To test how useful it would be to support binary in this particular way (and as a virtualization geek), I decided to replicate the Twitch Plays Pokemon experiment 100% in JavaScript. Using a JavaScript gameboy emulator, node-canvas, socket.io we came up with a server-rendered collaborative game that even works on IE8. Check it out on http://weplay.io (source code here).

The relevant code that sends the image data is:

self.canvas.toBuffer(function(err, buf){
  if (err) throw err;
  io.emit('frame', buf);
});

The next experiment was to run an instance of QEMU running an image of Windows XP, in honor of its retirement. Every player gets a 15 second turn to control the machine. Check out the demo on http://socket.computer. Here’s a video of your typical inception scenario:

A key part of putting together this demo was connecting to the QEMU VNC server and implementing the RFB protocol. As it’s usually the case with Node.JS, the solution was a npm search rfb away.

Essentially, in order to minimize latency and have the best performance, it’s best to notify clients only of the pieces of the screen that changed. For example, if you move your mouse around, only little pieces of the screen that surround the cursor are broadcasted. The node-rfb2 module gives us a rect event with objects like the following:

{
  x: 103,
  y: 150,
  width: 200,
  height: 250,
  data: Buffer
}

It then became clear to me that our support for binary data would be genuinely useful. All I had to do was call io.emit to pass that object around, and let Socket.IO do the rest.

Just for fun, I also installed and ran one of my favorite first person shooters:

Automated Testing

Every commit to the Socket.IO codebase now triggers a testing matrix totaling to 25 browsers, including Android and iOS.

We accomplish this by having make test seamlessly set up a reverse tunnel to ephemeral ports in your computer (thus making it accessible from the outside world), and have them execute on the Sauce Labs cloud, which is in charge of virtualizing and executing browsers on all the environments we care about.

Scalability

We simplified the approach towards rooms and multi-node scalability dramatically. Instead of storing and/or replicating data across nodes, Socket.IO is now only concerned with passing events around.

If you want to scale out Socket.IO to multiple nodes, it now comes down to two simple steps:

  1. Turn on sticky load balancing (for example by origin IP address). This ensures that long-polling connections for example always route requests to the same node where buffers of messages could be stored.
  2. Implement the socket.io-redis adapter.
    var io = require('socket.io')(3000);
    var redis = require('socket.io-redis');
    io.adapter(redis({ host: 'localhost', port: 6379 }));

We have deprecated the Socket#set and Socket#get APIs. Packets now simply get encoded and distributed to other nodes whenever you broadcast, and we don’t deal with storage.

This leads directly into our next goal: integration with other backends.

Integration

Chances are good that your existing application deployments are written in a variety of languages and frameworks, and are not just limited to Node.JS. Even if it was all Node.JS, you probably at some point want to separate concerns of your application into different processes.

One of the processes might be in charge of hosting the Socket.IO server, accepting connections, performing authentication, etc, and then another part of your backend might end up in charge of producing messages.

To that end we’re introducing the socket.io-emitter project which hooks into socket.io-redis to easily allow you to emit events to browsers from anywhere:

var io = require('socket.io-emitter')();
setInterval(function(){
  io.emit('time', new Date);
}, 5000);

Tony Kovanen already created a PHP implementation:

<?php
$emitter = new SocketIOEmitter(array('port' => '6379', 'host' => '127.0.0.1'));
$emitter->emit('event', 'wow');
?>

This makes it really easy to turn any existing application into a realtime application!

Better debugging

Socket.IO is now completely instrumented by a minimalistic yet tremendously powerful utility called debug by TJ Holowaychuk.

In the past, the Socket.IO server would default to logging everything out to the console. This turned out to be annoyingly verbose for many users (although extremely useful for others), and violates the Rule of Silence of the Unix Philosophy:

Rule of Silence
Developers should design programs so that they do not print unnecessary output. This rule aims to allow other programs and developers to pick out the information they need from a program’s output without having to parse verbosity.

The basic idea is that each module used by Socket.IO provides different debugging scopes that give you insight into the internals. By default, all output is suppressed, and you can opt into seeing messages by supplying the DEBUG env variable (Node.JS) or the localStorage.debug property (Browsers).

You can see it in action for example on our homepage:

Streamlined APIs

The socket.io module now exports the attachment function directly (previously .listen).
It’s even easier now to attach socket.io to a HTTP server:

var srv = require('http').Server();
var io = require('socket.io')(srv);

or to make it listen on some port:

var io = require('socket.io')(8080);

Before, to refer to everyone connected you had to use io.sockets. Now you can call directly on io:

io.on('connection', function(socket){
  socket.emit('hi');
});
io.emit('hi everyone');

CDN delivery

One of the best decisions we made early on was that implementing a Socket.IO server would not only give you access to the realtime protocol, but Socket.IO itself would also serve the client.

Normally, all you have to do is to include a snippet like this:

<script src="/socket.io/socket.io.js"></script>

If you want to optimize access to the client by serving it near your users, provide the maximum level of gzip compression (thanks to Google’s zopfli and proper support for caching, you can now use our CDN. It’s free, forever, and has built-in SSL support:

<script src="https://cdn.socket.io/socket.io-1.0.0.js"></script>

Future innovation

The core Socket.IO projects will continue to improve with lots of more frequent releases, with the sole goal of improving reliability, speed and making the codebase smaller and easier to maintain. Socket.IO 2.0 will probably see us ditching support for some older browsers, and not bundling some modules like the JSON serializer.

Most of the innovation in the Socket.IO world will happen outside of the core codebases. The most important projects that I’ll be closely watching are the following:

socket.io-stream

By adding this plugin, you’ll be able to send Stream objects so that you can write memory-efficient programs. In the first example we loaded a file into memory prior to emitting it, but the following should be possible:

var fs = require('fs');
var io = require('socket.io')(3000);
require('socket.io-stream')(io);
io.on('connection', function(socket){
  io.emit(fs.createReadStream('file.jpg'));
});

And on the client side you’ll receive a Stream object that emits data events.

Tooling

When you use Socket.IO you don’t care about transports, packets, frames, TCP or WebSocket. You care about what events are sent back and forth.

Our goal is to have plugins for Web Inspector, Firefox Developer Tools that allow you to easily introspect what events are being sent, when, and what their parameters are.

This project is being led by the talented Nick LaGrow (Github), Samaan Ghani (Github) and David Cummings (Twitter).

New languages and frameworks

A lot of effort has gone into specing and documenting the Engine.IO protocol and Socket.IO protocol.

The main goal behind this is that the Node.JS servers and clients become the reference implementations for many other languages and frameworks. Interoperability within the larger ecosystem is one of our biggest goals for 2014 and beyond.

Credits

This release has been a big team effort. Special thanks go out to our new core team:

  • Tony Kovanen (Github / Twitter) for his amazing work on Engine.IO binary support and research into a variety of workarounds to support all versions of iOS and Internet Explorer, his help in putting together this website and rounding up the docs.
  • Kevin Roark (Github) for the entire development of the new Socket.IO parser on top of Engine, the Socket.IO Computer demo, and help with docs, issues and pull requests.
  • Roman Shtylman (Github / Twitter) for his work on zuul and localtunnel, crucial to our testing architecture and our mission of reliability.

And in no particular order:

  • Jay Borenstein (LinkedIn) for selecting Socket.IO as one of the projects to mentor students on Open Source engineering as part of the Open Academy project.
  • Michael Srb (Github), Mark Mokryn (Github), Eugen Dueck (Github), Afshin Mehrabani (Github), Christoph Dorn (Github) and Mikito Takada (Github) for several key Engine.IO patches.
  • Grant Timmerman (Github / Twitter) for his outstanding work on the new Socket.IO example chat application, and multiple patches and issues investigation.
  • Jxck (Github / Twitter) for his work on translation, documentations and patches. ありがとう
  • Arnout Kazemier (Github / Twitter) for his multiple contributions to Engine.IO and Socket.IO
  • Sauce Labs (Github / Twitter) for supporting open source projects with free testing infrastructure.
  • Shihui Song (Github), Qiming Fang (Github) and Erluo Li for their work on testing infrastructure.
  • Julian Salazar (Github) and Tianyiu Liu (Github) for their work on reconnection and ongoing research into resource sharing between browser tabs and messages synchronization.
  • Gal Koren (Github) for his fantastic work into modularization of the codebases.
  • Matt Walker (Twitter) for the beautiful Socket.IO logo.

Finally, I’m very grateful to my company Automattic for being a great home to Open Source innovation.