
Potential Consistency

Thursday, April 29th, 2010

In my role as the lead on Twitter’s migration towards Cassandra, I spend a lot of time explaining the concept of eventual consistency and why it’s not as big a shift for us as people fear.

It seems that this fear stems from misunderstandings of both eventual consistency and its alternatives.

First off, people talk about eventual consistency (and I’ll be speaking in terms of how it’s implemented in Cassandra) as if it were the normal condition. It’s not: it’s the error condition. With the right parameters (R + W > N)* and no failures, you get immediate consistency. The eventual part only comes in when there are failures or you purposely tune your consistency down.
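The R + W > N condition can be sketched in a few lines. This is not Cassandra code, just a hedged illustration of the arithmetic: if the read quorum and write quorum together cover more than N replicas, any read must overlap the most recent acknowledged write.

```python
# Sketch of the R + W > N condition (see the footnote below for definitions).
# Not Cassandra's implementation -- just the quorum-overlap arithmetic.
def is_immediately_consistent(r: int, w: int, n: int) -> bool:
    """R + W > N means every read set intersects every write set,
    so at least one replica read has the latest acknowledged write."""
    return r + w > n

n = 3
quorum = n // 2 + 1  # a majority: 2 of 3 replicas

# Quorum reads + quorum writes always satisfy the condition:
print(is_immediately_consistent(quorum, quorum, n))  # True: 2 + 2 > 3

# Tuning consistency down (R=1, W=1) drops you to eventual consistency:
print(is_immediately_consistent(1, 1, n))            # False: 1 + 1 <= 3
```

With R = W = 1 against N = 3 replicas, a read can land on a replica that never saw the write, which is exactly the window where "eventual" applies.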

One alternative to Cassandra-style eventual consistency is what you see implemented in BigTable and its clones. Under normal operations with these systems you get immediate consistency. However, in the case of failures, the mutation operation can fail, requiring the client to retry (if it can). If you can’t retry (or can’t wait long enough for the retry to succeed), you’ll lose the data. These systems choose to reduce their availability in the case of failures. For some systems, this may be a great tradeoff. I think that class of systems is smaller than many assume.
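The client-side burden this implies can be sketched as a retry loop. Everything here is hypothetical (`do_mutation` is a stand-in, not a real BigTable API): the point is that when the system rejects a write during a failure, the retry logic, and the decision to give up and lose the data, lives in the client.

```python
import time

# Hypothetical sketch of a client writing to a BigTable-style system:
# on failure the mutation is rejected outright, and the client must retry
# with backoff or accept losing the write. `do_mutation` is a placeholder.
def write_with_retry(do_mutation, payload, attempts=3, backoff=0.01):
    for attempt in range(attempts):
        try:
            return do_mutation(payload)
        except IOError:
            if attempt == attempts - 1:
                raise  # out of retries: the write (and the data) is lost
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
```

If the outage lasts longer than the client can afford to wait, the write is gone, which is the availability reduction described above.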

Another alternative to eventual consistency is a pattern I call “Potential Consistency”. Some well known architectures have this property: any system that relies on asynchronous master-slave replication plus a cache (think mysql + memcache) has, at best, potential consistency.

Whether you do write-through or read-through caching (do you update the cache or invalidate it?), you can easily end up with different data in your master, in each of your slaves (replication, especially in mysql, isn’t perfect, because statement-based replication is non-deterministic), and in memcache. And there’s no guarantee that these differences will ever be resolved. Your data might be consistent, but once it becomes inconsistent there’s no guarantee it will ever become consistent again, unless you build something to repair that data. If you can do that successfully, congratulations: you’ve built an eventually consistent system.
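What “building something to repair that data” looks like can be sketched as a read-repair pass. This is a toy, not how Cassandra or memcache actually work: replicas are modeled as plain dicts mapping a key to a (timestamp, value) pair, and a read picks the newest version it sees and pushes it back to the stale copies.

```python
# Toy read-repair sketch: replicas are dicts of key -> (timestamp, value).
# On each read, find the newest version across replicas and write it back
# to any replica that is stale or missing the key.
def read_with_repair(replicas, key):
    versions = [(r[key], r) for r in replicas if key in r]
    (latest_ts, latest_val), _ = max(versions, key=lambda v: v[0][0])
    for r in replicas:
        if r.get(key, (None, None))[0] != latest_ts:
            r[key] = (latest_ts, latest_val)  # repair the stale copy
    return latest_val

master = {"user:1": (2, "alice@new")}
slave  = {"user:1": (1, "alice@old")}  # lagging asynchronous replica
cache  = {}                            # entry evicted, never refilled
print(read_with_repair([master, slave, cache], "user:1"))  # alice@new
```

After the read, the slave and the cache both hold the newest version; divergence gets resolved as a side effect of reads, which is the repair mechanism that upgrades potential consistency to eventual consistency.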

  • R = read consistency (how many replicas you block for on read); W = write consistency (how many replicas you block for on write); N = number of replicas for the data in question. You typically satisfy this condition with quorum reads and writes.