Wednesday 18 June 2008

CAP

A couple of months ago I wrote a little about the architectural concepts ACID and BASE, which describe two very different kinds of system.  In a company like ours, the business (and its pseudo-techie product managers) fails to recognize the mutual exclusivity involved in various combinations of ACID and BASE, desiring the benefits of both concurrently.  This is a pretty vast comprehension chasm to cross without a good tool to help us explain the tradeoffs - enter Eric Brewer's CAP theorem.

CAP stands for Consistency, Availability and tolerance to network Partitions and works a little like the great software triangle (scope, cost, time) in that you may only have 2 of the 3 properties in any given implementation.  Note that we talk about an implementation here because it is perfectly valid, and in many cases quite sensible, to build different features within a single system to different CAP tradeoffs.

Consider a system with high availability requirements.  From this starting point you may choose to design in strong consistency (the data is always the same from any perspective) but you will not be able to distribute the system across any network boundary.  Your other choice would be partition tolerance (it will run nicely geographically separated) but you will have to accept a window of inconsistency in both normal and failure modes.  If you have the option of doing away with your availability requirement then you might build something partitioned and consistent, but it will always have to fail (refuse requests) in order to guarantee consistency through any network event.
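
To make the last two choices concrete, here is a minimal toy sketch in Python.  It is not a real replication protocol: the names Cluster, Replica, Unavailable and demo are all invented for illustration, and it assumes just two replicas that copy every write to each other synchronously while the network is healthy (so it does not model the window of inconsistency you would also see in normal operation).  In "CP" mode a write during a partition is refused; in "AP" mode it is accepted locally and the other replica goes stale.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses a request to protect consistency."""


class Replica:
    """Hypothetical storage node holding a plain key/value dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Two replicas with a switchable network partition (toy model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistent + partition tolerant: give up availability.
                raise Unavailable("peer unreachable, refusing write")
            # Available + partition tolerant: accept locally, peer goes stale.
            replica.data[key] = value
            return
        # Healthy network: copy the write to both replicas.
        self.a.data[key] = value
        self.b.data[key] = value


def demo(mode):
    """Write once, cut the network, write again, report what each side sees."""
    cluster = Cluster(mode)
    cluster.write(cluster.a, "price", 10)
    cluster.partitioned = True
    try:
        cluster.write(cluster.a, "price", 12)
        outcome = "accepted"
    except Unavailable:
        outcome = "rejected"
    return outcome, cluster.a.data["price"], cluster.b.data["price"]


if __name__ == "__main__":
    print(demo("CP"))  # write refused, both replicas still agree
    print(demo("AP"))  # write accepted, replicas now disagree
```

The CP cluster stays consistent by failing the request; the AP cluster keeps answering but the two sides now hold different values until something reconciles them.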

Trying to keep a widely distributed data set highly available and 100% consistent at any given moment will bring you up against certain laws of physics.  Good luck with that.
