LakTEK A Sri Lankan, A Rubyist and A Web Dude

Posted
25 May 2010 @ 11am

Tagged
, , , , , ,

Real-time Collaborative Editing with Web Sockets, Node.js & Redis

Few months ago, I mentioned I’m developing a real-time collaborative code editor (codenamed as Realie) for my individual research project in the university. Since then I did couple of posts on the design decisions and on technologies I experimented for the project. After some excessive hacking, today I’ve got something tangible to share with you.

Currently, I have implemented the essentials for real-time collaboration including ability watch who else is editing the file, view others edits, chat with the other collaborators and replay how the edits were done. You may think this is more or less similar to what Etherpad had - yes, it is! However, this is only the first part of the project and the final goal would be to extend this to a collaborative code editor (with syntax highlighting, SCM integration).

Web Sockets

The major difference of Realie from other Real-time collaborative editors (i.e. Etherpad, Google Docs & Wave) is it uses web sockets for communication between client and server. Web Sockets perfectly suit for cases like this, where we need to have asynchronous, full-duplex connections. Compared to alternatives such as long-polling or comet, web sockets are really efficient and reliable.

In traditional HTTP messages, every message needs to be sent with HTTP headers. With web sockets, once a handshake is done between client and server, messages can be sent back and forth with a minimal overhead. This greatly reduces the bandwidth usage, thus improves the performance. Since there is an open connection, server can reliably send updates to client as soon as they become available (no client polling is required). All this makes the app truly real-time.

As of now, only browser to implement web sockets is Google Chrome. However, I hope other browsers would soon catch up and Mozilla has already shown hints for support. There are also Flash based workarounds for other browsers. For now, I decided to stick with the standard web socket API.

Taking Diffs and Applying Patches

In case if you wonder, this is how the real-time collaboration is done:

  1. when one user makes a change; a diff will be created for his change and sent to server.
  2. Then the server posts this diff to other connected collaborators of the pad.
  3. When a user receives a diff, his content will be patched with the update.

So both taking diffs and applying patches gets executed on the client side. Handling these two actions on browser was made trivial thanks to this comprehensive library written by Neil Fraser.

However, on some occasions these two actions needs to be executed concurrently. We know by default client-side scripts get executed in a single thread. This makes execution synchronous and slow. As a solution to this I tried using the Web Workers API in HTML5 (this is implemented in WebKit & Mozilla). Separate worker scripts were used for taking diffs and applying patches. The jobs were passed on to this worker scripts from the main script and the results were passed back to main script after execution was complete. Not only this made the things fast, but also more organized.

Node.js for Backend Servers

Initially, I started off the server implementation in Ruby (and Rails). Ruby happend to be my de-facto choice as it was my favorite language and I had enough competency with it. However, soon I was started feeling Ruby was not the ideal match for such asynchronous application. With EventMachine it was possible to take the things to a certain extent. Yet, most of the Ruby libraries were written in a synchronous manner (including Rails), which didn’t help the cause. As an alternative, I started to play with Node.js and soon felt `this is the tool for the job`. It brings the JavaScript’s familiar event-driven model to server, making things very flexible. On the other hand, Google’s V8 JavaScript engine turned out to be really fast. I decided to ditch the Ruby based implementation and fully use Node.js for the backend system.

Backend is consist of two parts. One for serving normal HTTP requests and other for web socket requests. For serving HTTP requests, I used Node.js based web framework called Express. It followed the same ideology as Sinatra, so it was very easy to adapt.

Web socket server was implemented based on the recently written web socket server module for Node.js by Micheil Smith. If you are interested to learn more about Node.js’ web socket server implementation please see my earlier post.

Message delivery with Redis Pub/Sub

On each pad, there are different types of messages that users send on different events. These messages needs to be propagated correctly to other users.

Mainly following messages needs to be sent:

  • When a user joins a pad
  • When a user leaves a pad
  • When a user sends a diff
  • When a user sends a chat message

For handling the message delivery, I used Redis’ newly introduced pub/sub implementation. Every time a user is connected (i.e. visits a pad) there would be two redis client instances initiated for him. One client is used for publishing his messages, while other will be used to listen to incoming messages from subscribed channels.

Redis as a persistent store

Not only for message handling, I also use Redis as the persistent data store of the application. As a key-value store Redis can provide fast in-memory data access. Also, it will write data to disk on a given interval (also, there is a much safer Append only mode, which will write every change to disk). This mechanism is invaluable for this sort of application where both fast access and integrity of the data matters.

Another advantage of using Redis is the support for different data types. In Realie, the snapshots are stored as strings. The diffs, chat messages and users for a pad are stored as lists.

There is a well written redis client for Node.js which makes the above tasks really simple.

Try it out!

I’m still in the process of setting up an online demo of the app. Meanwhile, you can checkout the code and try to run the app locally.

Here is the link to GitHub page of the project - http://github.com/laktek/realie

Please raise your ideas, suggestions and questions in the comments below. Also, let me know if you are interested to contribute to this project (this project is open source). Thanks for reading!


  • Great work, I have been looking for something like this for some time. I specially like what you did with Redis queues...
    Thanks!
  • I finally got a chance to check out the actual code, and I'm really pleased with how quickly I was able to get everything working. A few questions on some specifics though. I see that currently, a user fetches a pad by calling localhost:3000/pad directly (and ExpressJS handles the routing). On the backend, your Express node server fetches the pad and returns it when this happens. Then, once a pad is fetched, the client creates a WebSocket connection and subscribes to the pad for changes (using the "open" event on the socket connection, though I'm having a hard time finding this event in http://github.com/miksago/node-websocket-server - does this event get fired as soon as a connection occurs? And if so, is it possible, in the open event, to specify which pad you wish to subscribe to if there were multiple pads?).

    So my main question is - how do you prevent losing updates that happen between the client's open and the client's socket connection? Specifically, can't the following happen:

    1. Client A connects to a pad and enters "Hello".
    2. Client B connects to the pad and gets "Hello" as the pad data.
    3. Client A enters " World" into the pad.
    4. Client B creates its WebSockets connection and subscribes to the pad channel

    Don't we now end up in a state where Client A sees "Hello World" and Client B sees "Hello"?

    Thanks in advance for reading this long rambling question =).

    - Saikat
  • laktek
    The code on the repo is still a demo and idea for the final version is different pads will be known from their ids (i.e. http://localhost/pads/1001). So based on Web Socket handshake's origin (or using Sec-Websocket-Protocol) we could set for which pad the user is connecting.

    Regarding, loosing updates between client open and socket connection: it's something I'm currently working on. Basically, after the handshake client B will send a message to server with its snapshot id and in response server would send all diffs it received since that snapshot id.

    I'm still brainstorming and experimenting with the ideas. So any suggestions or criticisms on these approaches are most welcome :)

  • Stephen Bannasch
  • laktek
    Yep. I started initially with Bespin embedded as the editor. However, for this project we need to perfrom actions such as automatically append changes to a line, highlight each user's changes on the editor interface. Bespin's editor API didn't seems to have enough flexibility for such options.

    Hence, I decided it would be the best to write a simple editor from the scratch.
  • Kareem
    I wanted to ask about ur choice for the database to be a key per value store instead of a document base like mongodb or couchdb
  • laktek
    Though Redis is a key-value store, it supports data types such as lists, sets apart from the normal strings. This is really helpful in maintaining the data structure of Realie. Also, since it's an in-memory data store performance is also very high.

    Another advantage of using Redis was its pub/sub implementation, which handles the message delivery of the app.
  • Andrew Cates
    Great job, amazing use of your time with the new technologies. Reading articles like this makes me wish I had more free time to play around with the latest technologies. Thanks for putting it all together into sample code I can digest quicker.
  • Nice effort on using all of the latest technologies.
  • Thanks for the shout-out about the client module.

    You could have written this as a NodeRed extension. NodeRed is a generic server for Node.js and supports TCP/IP, WebSockets, and PUBSUB through redis.

    You were on your way to reimplementing NodeRed so I thought I'd tell you about it to save you some time!

    http://github.com/fictorial/nodered
  • Thanks for the suggestion on NodeRed, didn't notice that project before, will checkout and see how it fits for the task.
  • Rob
    How do you handle conflicts? For example, when User 1 deletes a paragraph at the same time User 2 modifies a word in that paragraph?
  • laktek
    Honestly, it's still a gray area of the project..Currently, due to the chronological order of the patches getting applied User 2's change would get rejected.
  • Looks great! You should definitely check out Socket.IO - http://github.com/learnboost/socket.io. It abstracts away degradation of the socket protocol for you. I used it a while ago in a Node.js dummy project (http://techblog.gomockingbird.com/socket-to-em).

    Also, did you look at Neil Fraser's collaboration framework (based on diff-match-patch)? It's what gets used by google mob-write and (until recently) Google Docs - http://neil.fraser.name/writing/sync/ is the theory and http://code.google.com/p/google-mobwrite/ is the code. I forget why his method requires as much complexity as it does though (definitely it's more complex than simply applying the patches).
  • Thanks for the reference to Neil Fraser's work, I have been thinking a lot about this problem, and now here's the solution!
  • Thanks for your comment Saikat. Initially I also started with Socket.IO and however it doesn't seems to work with current Node.js versions. Also, it didn't seems to listen Node.js' HTTP ugrade event to implement web sockets. Also, Micheil Smith's web socket server seemed more standard compliant. Please refer to my previous blog post on this: http://www.web2media.net/laktek/2010/05/04/implementing-web-socket-servers-with-node-js/

    I too felt Neil's Mob-write framework, is bit harder to grasp. So implementing something from the scratch using diff-match-patch was easy (also I wanted to learn the things in the hard way too ;)).
  • Interesting - I know that Socket.IO had a dependency on a patched Node a while back, but I believe he's fixed that. I see this commit that gives it support for at least 0.1.94 - http://github.com/LearnBoost/Socket.IO-node/commit/0459c9584880ec91024a90b80fa14bbc37047b2c .

    Also, the reason I bring up mob-write is that I believe Neil Fraser had a reason for making it as complicated as it is (and at some point I think I had figured out what his motivation for it was, but I'm not sure). His algorithm is basically keeping around a copy of the last version of the document that was in sync per client on the server, and also one on each client. Perhaps doing it this way helps to keep conflicts down?

    Anyway, great work!
  • The main complication in the differential synchronization algorithm was to deal with packet loss. Any solution which does not handle packet loss will work great for short demos and for internal applications, but once it hits the big bad flaky Internet, things start to fall apart really fast. Especially if one adds mobile phones to the mix. The addition of an edit stack, version numbers and rollbacks allows MobWrite to continue to synchronize properly with no data loss for either side when running on a best-effort network.
  • laktek
    Glad to see you commenting here Neil :) Your work was a major inspiration for the Realie project.

    I agree with you on the complications that could be caused by the uncertainty of data transfers on Internet. Also, this is why a protocol such as Web Sockets suits better for such a task than the traditional HTTP messages, where the overhead would be minimal. But I agree, that's too not totally reliable and better synchronization technique should be in place to avoid conflicts.
  • John
    This is quite interesting. Definitely going to have a look at this. Eclipse on the cloud :)
  • pixelcort
    Was their a reason you decided to use diffs and patches as compared to operational transformation?
  • Currently there's no comprehensive Open Source library to do Operational Transformations. Using the Diff-Match-Patch library was the most simplest option I found for implementing real-time collaboration. That said, if we can implement a good open source operational transformation library it would be more efficient in the long run
  • I'm looking at building exactly this - let me know if it would be useful, or if you have any suggestions.

    http://www.dontstopthesignal.com/2010/05/design-goals-for-pluto-open-source-ot.html

    Great work on Realie BTW :)
blog comments powered by Disqus