Concurrent Distributed Programs

Courtesy: AtomizeJS

Writing concurrent programs is tricky. Languages tend to go down one of two routes: either threads are allowed to access the same data structures directly, and it’s left to the programmer to manage locking; or each thread has its own private memory, and data moves between threads by message passing. The former approach gives the programmer a slightly bigger gun to shoot themselves with: the ease of writing code that deadlocks is unrivalled. The latter is conceptually much simpler and more intuitive, but remains relatively niche: languages such as Erlang and dedicated actor frameworks such as Akka are about as mainstream as it gets.

Writing distributed programs is trickier still. On top of the problems of writing concurrent programs, you must also deal with safe access to, and distribution of, data across the system, along with the increased potential for partial failure.

AtomizeJS is a project which aims to make it easy to write programs in JavaScript that can be both concurrent and distributed. It does this by implementing Distributed Software Transactional Memory (DSTM).

The mental model you should have when using AtomizeJS is as follows:

  • Assume there is an object graph that gets distributed automatically to every browser looking at your site.
  • To make safe changes to this object graph, you write functions which change the objects as desired. These functions are then run by AtomizeJS as transactions. AtomizeJS ensures that these transaction functions are run:
    • atomically: Transactions are atomic (all or nothing).
    • consistently: Transactions preserve the object graph’s consistency. That is, a transaction transforms a consistent state of the object graph into another consistent state, without necessarily preserving consistency at all intermediate points.
    • in isolation: Transactions are isolated from one another. That is, even though in general there will be many transactions running concurrently, any given transaction’s updates are concealed from all the rest until that transaction commits. Put another way, for any two distinct transactions T1 and T2, T1 might see T2’s updates (after T2 has committed) or T2 might see T1’s updates (after T1 has committed), but certainly not both.
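
To make the atomicity and isolation properties above concrete, here is a minimal single-process sketch in JavaScript. It is not the AtomizeJS API: `createStore`, `atomically`, `tx.get` and `tx.set` are hypothetical names invented for this illustration, and there is no distribution or real concurrency involved. The idea it shows is only that a transaction function works against a private write buffer, and its writes are published together or not at all.

```javascript
// Toy illustration only; these names are assumptions, not the AtomizeJS API.
function createStore(initial) {
  const committed = Object.assign({}, initial);
  return {
    // Read the last committed value of a key.
    read(key) {
      return committed[key];
    },
    // Run fn against a private write buffer; publish only if fn returns normally.
    atomically(fn) {
      const writes = {};
      const tx = {
        // A transaction sees its own buffered writes, else the committed state.
        get(key) {
          return key in writes ? writes[key] : committed[key];
        },
        set(key, value) {
          writes[key] = value;
        },
      };
      const result = fn(tx); // a throw here discards every buffered write
      Object.assign(committed, writes); // commit: all writes appear together
      return result;
    },
  };
}

const store = createStore({ alice: 100, bob: 0 });

// A successful transfer: both writes become visible at the same moment.
store.atomically((tx) => {
  tx.set("alice", tx.get("alice") - 30);
  // Intermediate state (alice debited, bob not yet credited) is private to tx.
  tx.set("bob", tx.get("bob") + 30);
});

// A failing transfer: the debit is buffered, then thrown away with the error.
let failed = false;
try {
  store.atomically((tx) => {
    tx.set("alice", tx.get("alice") - 1000);
    if (tx.get("alice") < 0) throw new Error("insufficient funds");
    tx.set("bob", tx.get("bob") + 1000);
  });
} catch (e) {
  failed = true;
}

console.log(store.read("alice"), store.read("bob"), failed);
```

The second transaction leaves the committed balances untouched, which is the "all or nothing" behaviour described above; the inconsistent intermediate state inside the first transaction is never observable through `store.read`.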

Transactions cannot deadlock: when a transaction function is run, AtomizeJS detects whether other transactions have modified the same objects that this function is altering. If so, the function is automatically restarted against the current state of the world. A transaction is only ever restarted because some other transaction has committed, so the system as a whole has made progress.
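
The detect-and-restart behaviour can be sketched with version numbers. The following is an assumption-laden toy, not how AtomizeJS is implemented and not its API: `createVersionedStore`, `atomically`, `tx.get` and `tx.set` are hypothetical names, everything runs in one process, and a "concurrent" commit is simulated by committing another transaction in the middle of the first attempt.

```javascript
// Toy sketch of optimistic conflict detection; names are assumptions,
// not the AtomizeJS API. Keys are fixed at creation in this sketch.
function createVersionedStore(initial) {
  // Each key holds a value plus a version that is bumped on every commit.
  const cells = new Map();
  for (const [key, value] of Object.entries(initial)) {
    cells.set(key, { value, version: 0 });
  }

  function attempt(fn) {
    const readVersions = new Map(); // key -> version first seen by this attempt
    const writes = new Map(); // key -> buffered new value
    const tx = {
      get(key) {
        if (writes.has(key)) return writes.get(key);
        const cell = cells.get(key);
        if (!readVersions.has(key)) readVersions.set(key, cell.version);
        return cell.value;
      },
      set(key, value) {
        writes.set(key, value);
      },
    };
    const result = fn(tx);
    // Validate: fail if anything we read has been committed to since we read it.
    for (const [key, version] of readVersions) {
      if (cells.get(key).version !== version) return { ok: false };
    }
    // Commit: publish buffered writes and bump their versions.
    for (const [key, value] of writes) {
      const cell = cells.get(key);
      cell.value = value;
      cell.version += 1;
    }
    return { ok: true, result };
  }

  return {
    read: (key) => cells.get(key).value,
    // Re-run fn from scratch until an attempt validates and commits.
    atomically(fn) {
      for (;;) {
        const outcome = attempt(fn);
        if (outcome.ok) return outcome.result;
      }
    },
  };
}

const versioned = createVersionedStore({ counter: 0 });
let attempts = 0;

versioned.atomically((tx) => {
  attempts += 1;
  const seen = tx.get("counter");
  if (attempts === 1) {
    // Simulate another transaction committing between our read and our commit.
    versioned.atomically((other) => other.set("counter", other.get("counter") + 10));
  }
  tx.set("counter", seen + 1);
});

console.log(versioned.read("counter"), attempts);
```

The first attempt is rejected because the counter's version changed after it was read, so the function runs again and sees the other transaction's committed value. No lock is ever held, so nothing can deadlock, and the restart happened only because another transaction successfully committed.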