Most realtime collaborative editors make use of a class of algorithms known as Operational Transformation. Most of these algorithms require that the server and the client run the same algorithm or risk desynchronisation, the nightmare of realtime collaboration algorithm designers. I will propose a new method of finding consensus using a blockchain of the kind that originated in Bitcoin. This algorithm uses the blockchain to find consensus on the state of the collaborative document or data structure, then applies the Operational Transformation on the client side without requiring any help from the server. Such a setup simplifies the server side, allowing multiple implementations of the server, for example in different programming languages, and it also allows the client side to use encryption in order to keep the data secret from the server. Finally, I will propose an extension to this algorithm which, borrowing again from Bitcoin, makes the system able to function purely as a peer-to-peer system, providing high resilience and greater security.
Lecture Notes:
For those who don't know me, I'm Caleb James DeLisle, I'm a researcher at XWiki where we make
technology to help people collaborate and share information. Today I'm going to talk about a
realtime collaboration algorithm, but in order to fill in the history for this, I need to explain
a bit about distributed consensus. Consensus is something which we need more often than we
realise, and until recently we did not know how to do distributed consensus.
The concept of distributed consensus was made really popular by the Bitcoin protocol. Not only did
Bitcoin claim to achieve distributed consensus but it claimed the ability to achieve consensus
amongst machines operated by mutually antagonistic parties. There is, however, another aspect of
distributed consensus which gets a bit less attention but has probably had as much impact, if
not more: consensus in distributed "no-sql" databases. To give a little background, it's
interesting to look at how traditional high-volume databases are architected. At a financial
institution you usually would have an Oracle database with master/slave replication and sharding
where possible, but where you have, for example, the accounts database, you end up with a
bottleneck in a single machine. If you were to split up this master database, you would run the risk
of what in Bitcoin parlance is known as the "double spend": I send money to Alice and Bob at the
same time and my account can't support both. NodeA thinks Alice got paid and the transfer to Bob is
invalid; NodeB thinks Bob got paid and the transfer to Alice is invalid; thus the distributed
database becomes desynchronised. This problem comes up in almost every type of database, and the
traditional solution has been to make sure the main database runs on one computer.
Obviously total reliance on a single machine is a recipe for problems and to solve these problems,
organisations spent money. In traditional enterprise settings, the database master has become the
crown princess. It's not uncommon to see RAIDs containing multiple mirrored high-availability SSDs
or even battery-backed RAM-based storage. The processors look more like something one would
expect to see in a top-end gaming machine, and duplicating the entire master with a "hot spare" is
common practice.
In the tech world, there wasn't the money to run these beasts and with the margin of error in their
applications being far wider than those in the financial industry, the stage was set for something
different to emerge. Google BigTable, Amazon Dynamo, Hadoop, Cassandra and MongoDB were the
beginning of a new approach. Behind the changes to the database query language was the ability to
have multiple master nodes. Unlike Bitcoin, no-sql did not promise to work with nodes which are
trying to attack each other, but at least with honest nodes it promised to reach agreement. The
logic of this was astonishingly simple: servers would tell each other about the transactions they
received, but they would tag them with the id of the previous transaction and a timestamp. If a
server received a conflicting transaction with an older timestamp, it would simply revert its own
and apply the other.
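That conflict-resolution rule can be sketched in a few lines. This is a toy illustration with hypothetical names, not the API of any real database: each transaction names its parent, and when two transactions share a parent, every node keeps the one with the older timestamp, reverting its own choice if necessary.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    parent_id: str   # id of the transaction this one builds on
    tx_id: str
    timestamp: float
    value: str

class Node:
    """A toy replica: when two transactions conflict (same parent),
    keep the one with the older timestamp, reverting our own if needed."""

    def __init__(self):
        self.applied = {}  # parent_id -> winning Transaction

    def receive(self, tx: Transaction):
        current = self.applied.get(tx.parent_id)
        if current is None or tx.timestamp < current.timestamp:
            # Either nothing applied yet, or the incoming transaction is
            # older and wins: revert ours and apply the other.
            self.applied[tx.parent_id] = tx
```

Because the rule is deterministic, two nodes that see the same transactions in different orders still end up agreeing on the winner.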
So what does all of this have to do with collaborative documents?
Well, a collaborative document is another thing which really needs to be the same for all of
the people editing it. If you are working on a document and your version is different from the
one your collaborators see, that's a big problem. In collaborative documents, two people altering
the same content is a very common situation and instead of simply rejecting one change, the state
of the art is to try to transform it into what the user "probably would have wanted" using what is
known as Operational Transformation.
The first problem with the traditional method has been that the server and the clients both need
to implement the Operational Transformation functions, and those functions are famously difficult
to implement for collaborative editing of rich text.
Secondly, this type of algorithm implies that there must be a central server and that server must
be able to read all of the content which is being edited. From a privacy standpoint that's not the
best deal, but the real difficulty is that with the same algorithm needed on both the client side
and the server, the programming languages in which the server can be implemented are quite limited.
We have developed a new algorithm called ChainPad which makes use of a limited blockchain.
Like no-sql, ChainPad does not resolve cases of mutually antagonistic parties, but as long as the
people editing are not trying to attack you, the algorithm will find consensus, even with
out-of-order messages from a peer-to-peer transport.
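The chain-based consensus idea can be sketched as follows. This is a toy longest-chain model, not the real ChainPad implementation: every patch names the hash of its parent, messages that arrive before their parent are buffered, and every client independently selects the longest chain, so all clients converge on the same history regardless of arrival order.

```python
import hashlib

def block_hash(parent: str, patch: str) -> str:
    """Hash a patch together with its parent's hash to form a chain link."""
    return hashlib.sha256((parent + patch).encode()).hexdigest()

class Chain:
    """Toy longest-chain consensus over document patches (illustrative only,
    not the actual ChainPad API)."""

    def __init__(self, root: str = "root"):
        self.blocks = {root: (None, 0)}  # hash -> (parent hash, height)
        self.pending = []                # messages whose parent hasn't arrived

    def receive(self, parent: str, patch: str):
        self.pending.append((parent, patch))
        # Keep integrating until nothing more fits; this is what tolerates
        # out-of-order delivery from a peer-to-peer transport.
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                p, pt = msg
                if p in self.blocks:
                    self.blocks[block_hash(p, pt)] = (p, self.blocks[p][1] + 1)
                    self.pending.remove(msg)
                    progress = True

    def head(self) -> str:
        # The longest chain wins; ties broken deterministically by hash so
        # every client picks the same head.
        return max(self.blocks, key=lambda k: (self.blocks[k][1], k))
```

Two clients fed the same messages in opposite orders report the same head, which is the consensus point at which each client then applies Operational Transformation locally.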
Even without going to full peer-to-peer, we can do zero knowledge. Zero knowledge means that the
server which hosts the data does not have the ability to read it. With traditional realtime this
is not possible, but with CryptPad the server does nothing more than pass messages between the
clients.
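The zero-knowledge arrangement can be sketched as follows. Everything here is hypothetical, not CryptPad's actual code, and the cipher is a deliberately insecure SHA-256 counter-keystream placeholder standing in for a real authenticated cipher: clients share a key the server never sees, and the server only relays opaque ciphertext.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 counter keystream). A placeholder for a
    real cipher such as XSalsa20 -- do NOT use this for actual security."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(b ^ k for b, k in zip(data, out))

class RelayServer:
    """The server only forwards opaque bytes; it never holds a key."""

    def __init__(self):
        self.clients = []

    def broadcast(self, ciphertext: bytes):
        for c in self.clients:
            c.deliver(ciphertext)

class Client:
    def __init__(self, key: bytes, server: RelayServer):
        self.key, self.server = key, server
        self.received = []
        server.clients.append(self)

    def send(self, patch: str):
        # Encrypt before the patch ever leaves the client.
        self.server.broadcast(keystream_xor(self.key, patch.encode()))

    def deliver(self, ciphertext: bytes):
        self.received.append(keystream_xor(self.key, ciphertext).decode())
```

Because encryption and decryption happen only on the clients, the server's job reduces to ordering and relaying messages, which is exactly what lets it stay ignorant of the document's contents.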