CAP Theorem

Recently I listened to CAP Theorem 23 Years Later with Eric Brewer on the Software Engineering Daily podcast. I didn’t recognize the name at first (shame on me), but it was very cool to hear about the CAP theorem from the original postulator.

I haven’t had a chance to work on distributed systems in real life, but I find the topic fascinating. Coordinated cache eviction was the closest I’ve come. I read the book Designing Data-Intensive Applications soon after it was released. Besides the in-depth content within, Martin Kleppmann includes so many footnotes in that book it is the perfect jumping-off point into any other tangent of databases, data processing, or distributed systems. I got halfway through his undergrad Distributed Systems lecture series with lecture notes he taught at Cambridge. I completely forgot about that course until writing this blog - definitely will revisit it soon.

Anyway, back to the CAP theorem . It says that in a distributed system, only two of Consistency, Availability, and Partition Tolerance can be achieved. We take partitions as a fact of life (cables cut, power outages, etc) so we are only left with trading off Consistency and Availability. This makes logical sense at a basic level. If we have a cluster that has split down the middle and one side gets a write, that write will not be visible on the other side. The system has a choice to make at this point. Don’t give any answer since it could be wrong (Consistency), or give the answer it has (Availability). Different angles on these properties exist, a common one being read availability vs write availability.

As Eric Brewer highlights on the podcast, the CAP theorem can be taken too black-and-white. There are opportunities to find niches of acceptable tradeoffs based on the desired goals. For example, Google Cloud Spanner uses Google’s TrueTime API to expose time uncertainty. The Spanner Paper (2012) explains how the database adapts commit timing based on how in-sync the clocks are to preserve transaction ordering. This allows Spanner to provide the highest levels of consistency . While driving to a 30th birthday weekend campout, I listened to an interview with Peter Mattis , co-founder and CTO of Cockroach Labs. I knew Cockroach DB had some distributed systems properties, but I didn’t realize the VC pitch when starting the company was some ex-Googlers “implementing the Spanner paper.”

Free Will and Determinism

On an in-one-sense parallel topic, I listened to the London Lyceum podcast on Free Will and Determinism . I was somewhat familiar with the two concepts but had never heard of Compatibilism and Incompatibilism . These distinctions reflect whether one can hold to both Free Will and Determinism at the same time - if they are logically compatible. As this was my first time hearing about those philosophies, I would need to do more research before commenting further.

Gray Area

Last weekend during my birthday campout, we completed a nine-mile hike Friday, four-mile hike Saturday, and two Summits on the Air . After all that, we had a four hour campfire Saturday night. It was a great time of relaxing and reflecting. One topic we talked about was how narrow our knowledge is and how we must lean on others for the rest. My friend has a master’s in early American history, a J.D., and practices law in Virginia. I work in IT around cloud, data, and most recently software engineering. At the end of the day, these are very narrow slices of knowledge. Even topics I am interested in like distributed systems I can’t talk intelligently about at a deep level. Much less philosophy. 😄

Both CAP Theorem and Free Will vs Determinism seem to contain opposing positions to the casual observer. A database can be either available OR consistent. A person can subscribe to either Free Will OR Determinism. But in reality, the subject is much more nuanced. People that research the area can navigate the minutiae and find useful ground in the middle. It benefits everyone to have those people that can dive in deeply and make discoveries for the rest of us. Distributed databases would be far less useful if we threw up our hands and accepted only fully consistent or fully available. For consumers of the technology or philosophical thought, we need to make a conscious decision on how much understanding we need to reap the benefits we are aiming for. While fuller understanding makes our lives richer and our minds more appreciative of others’ work, the beauty of abstractions is we don’t need to be an expert to partake.