Home » 2018 » April » 24

Daily Archives: April 24, 2018

Logic and Lattices for Distributed Programming

Logic and Lattices for Distributed Programming
Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein, and David Maier, in Symposium on Cloud Computing 2012 (SOCC ’12), October 14-17, 2012, San Jose, CA.

This is definitely a different direction than we’ve had in prior papers, though I do have an ulterior motive in presenting this particular paper – we will see it used later.

In recent years there has been interest in achieving application-level
consistency criteria without the latency and availability costs of
strongly consistent storage infrastructure. A standard technique is to
adopt a vocabulary of commutative operations; this avoids the risk
of inconsistency due to message reordering. Another approach was
recently captured by the CALM theorem, which proves that logically
monotonic programs are guaranteed to be eventually consistent. In
logic languages such as Bloom, CALM analysis can automatically
verify that programs achieve consistency without coordination.
In this paper we present BloomL, an extension to Bloom that
takes inspiration from both of these traditions. BloomL generalizes
Bloom to support lattices and extends the power of CALM analysis
to whole programs containing arbitrary lattices. We show how the
Bloom interpreter can be generalized to support ecient evaluation
of lattice-based code using well-known strategies from logic
programming. Finally, we use BloomL to develop several practical
distributed programs, including a key-value store similar to Amazon
Dynamo, and show how BloomL encourages the safe composition
of small, easy-to-analyze lattices into larger programs.

Notice they do mention key-value stores, so you have another hint on how I’ll be referring back to this work in a future post.

This tends more to the theoretical side of systems.  It is not a theory paper (there just isn’t enough formalism, let alone proofs!)   It has performance graphs, which you certainly expect from a systems paper, but not from a theory paper.

The driving factor behind this is the issue of distributed consistency.  At a high level, “distributed consistency” is concerned with ensuring that a group of communicating computers, with some temporal separation, agree on the outcome of operations even when things go wrong.  Perhaps the most famous example of distributed consistency is Paxos.  These days we refer to these as consensus protocols.  I generally describe there being several such: two-phase commit is certainly one of the older ones.  Quorum protocols are another (e.g., weighted voting, which I described previously).  Viewstamped Replication is another.  These days, the popular consensus protocols are
Raft and Blockchain.

Figure 8 (From Paper)

The paper starts by pointing out that monotonic consisency provides a valuable mechanism for reasoning about distributed consistency.  Prior work by the authors establishes that all monotonic programs are “invariant to message reordering and retry”, a property they call confluent. This matters for distributed systems because such a system only moves forward (the operations are durable.)

They point out some weaknesses in the prior definition and motivate improving it by explaining one such obvious case that does not fit within the model (a voting quorum in a distributed protocol.)

Hence, they introduce the lattice.  They do this within the context of their language (BloomL), which works on top of Ruby.  I will not dwell on the details.

The authors define a bounded semijoined lattice.  My reading of what they are saying is that in such a set, there is a unique element that happened first.  They define this formally as a set S, with an operator (“bottom” that I don’t seem to have in my font set) that defines a partial ordering.  There is a unique element ⊥ that represents the least element.

From this definition, they construct their model; the paper drops the “bounded semijoined” part of the definition and simply discusses lattices from that point forward, but it is this partial ordering property that imparts the key characteristics to their subsequent operations.

Why is this important?  Because it demonstrates that these lattices – which are going to turn out to be equivalent to key operations in distributed systems – have consistency guarantees that are desirable.

The authors then turn their attention to utilizing lattices for key-value stores.  They describe data structure versioning and vector clocks.  Vector clocks have a property they desire for lattices: they are partially ordered.  They combine this with a quorum voting protocol, to provide the distributed consensus for their system.

Figure 9 (from paper)
Figure 9 (from paper)

Figure 8 (from the paper) shows the general structure of their key-value store implementation, which is implemented in BloomL and Ruby.  Their sample usage for this is a shopping cart, which they graphically describe in Figure 9 (from the paper).

As one would expect in a distributed system, the key benefit here is that there is no centralized authority deciding on the order of things.  They point out that prior work argues shopping carts are non-monotonic and thus cannot be solved in a distributed systems setting.  The authors point out that using the lattice structure, they achieve a monotonic ordering, which permits them to implement it without a centralized decision maker; in fact the decision maker in this case is really the client itself, as it has all the information from all the servers sufficient to complete the operation.

While a shopping cart might not be the killer application for a distributed systems technology, this paper does describe a powerful tool for providing distributed consensus in a system that can be implemented in a modest amount of code; compared to Paxos, Raft, or Viewstamped Replication, that is a significant contribution.

It does not appear to have byzantine protection, however, so if you live in a hostile environment it might not be the right protocol.  Similarly, if you need stronger consistency guarantees, this might not be the best model either.  But for many applications slightly relaxed consistency guarantees are often more than adequate.

We will see how this can be applied in the future.