A Critique of the CAP Theorem
Etyang Lennah Ipale- SCT321-C014-2789/2017- JKUAT Eldoret Campus
July 25, 2018
In the Year 2000, Dr. Eric Brewer gave a speech in a conference at the University of California. This is where he posed his ‘CAP Theorem’ which states that it is not possible to build a performance of a Distributed System that satis?es Consistency, Availability and Partition tolerance at the same time- one has to be sacri?ced. Twelve years later, Seth Gilbert and Professor Nancy Lynch situated the CAP theorem showing that it is one example of the most common tradeo? between Liveliness and safety in systems that are unreliable. Looking at the CAP in this state enables us to have a clear view of these tradeo?s and the way in which they can be put into practice. In this paper, I shall highlight the misconceptions about the CAP Theorem, criticism and alternative models to CAP theorem and examples of systems that are using eventual consistency model.
More casually, the CAP theorem says that it is not possible to build a distributed database that is able to respond to each and every request and is able to return the results as expected each time. It is an impossibility output whereby something we may want to do is completely out of reach because it is applicable to many distributed systems that have been recently built. It doesn’t mean that it is not possible to build systems that are useful even though we work within the CAP constraints. The CAP theorem puts together three properties that are desirable C (consistency), A (availability), and P (partition-tolerance). The CAP theorem states that in any distributed system where data from di?erent locations is used, it is only possible to have not more than two of the three desirable properties. Abadi (2012) has beforehand provided an explanation of how CAP is actually not about making a choice between two of the three states by providing the answer whether the system can give up consistency or availability if a partition exists. The same way systems which give up consistency always maintain a degree of consistency as well as those which sacri?ce availability do not do it in totality as well. In fact Dr Eric Brewer, shed more light on the CAP theorem stating that the 2 out of 3 justi?cation is not partitioned. Choosing between consistency and availability can happen several times within the same system at very detailed aspects of granularity. Subsystems can not only make choices di?erently but the changes can occur according to how data operations
are being performed or even as per the user involvement in the operations that take place in the system. Also, all the three properties are uninterrupted because Availability is always up to 100 percent, several degrees of Consistency exist and Partitions too have variations, including divergence in the systems to show whether they exist or not (Brewer, 2012). Greiner (2014) in his study states that in today’s technical background, there exists a strong growing desire to scale out when resources such as computers, storage are added in order to complete workloads successfully within the appropriate time. This is made possible by adding more hardware into the system to handle the workload that is increasing. Due to this strategy of scaling, the additional penalty comes up whereby complexity is as a result of this scaling strategy and this is where the CAP theorem is explained. The CAP Theorem is mostly shown as an impossible outcome in distributed systems, mostly in NoSQL distributed databases.
2 Misconceptions about the CAP theorem
While looking for the deliberations of the CAP theorem, there is an exceptional article by Eric Brewer, the father of the CAP theorem: CAP Twelve Years Later: How the “Rules” Have Changed. (Brewer, 2012) To begin with, misinterpretations about CAP that says that you can only have 2 out of 3 is consistent and now there is a partition of the same system that could be a genuine network wait for that partition to come to an end so as to ensure that the system is always in a consistent state hence sacri?cing availability or you could do partial updates therefore clarifying consistency. So you can only have consistency or availability in the case of partition-tolerance but not both. On that note there is actually no way in which one could opt for “P” but it was always about how partitions should be handled and that focuses on the way of detecting partitions and how to behave while in a partition state and ?nally how the system should be brought back to its consistent state after an occurrence of a partition. These are not double decisions but rather an entire range of possible actions to choose from. It is about saying, in case of any failure, availability becomes more important hence choosing to accept temporary inconsistencies and putting strategies for cleaning up afterwards. Looking at it that way gives a clear picture on how databases like Cassandra ?t into this and how features like read/repair work to bring back consistency. There are also ways of trying to minimize the impact of partitions on consistency and availability and how to bring back consistency after a partition. Kleppmann ( 2015) in his paper reviewed some misunderstandings about what CAP means, including implications, irregularities and inexactness in how it is being de?ned, as he brings into light some challenges on how it is being formalized. CAP is frequently interpreted as a con?rmation that Databases that are consistent are more available but this requires a more careful reasoning in terms of trade-o?s particularly in systems that are practically being used. Kleppmann also proposed an alternative to cap by using a delay-sensitivity framework which analyses how operational latency is sensitive to network delays and hence enabling designers to have a better reasoning about trade-o?s between consistency guarantees and fault tolerance in networks.
3 Criticism and an Alternative Model to the CAP Theorem
However, the CAP theorem is often criticized for being too simple and mostly deceptive. More than a decade after its release, Eric Brewer acknowledges that the CAP theorem oversimpli?ed the available options in the occasion of a network partition. According to Brewer, the CAP theorem prohibits only minute parts of the design space: perfect availability and also consistency in the occurrence of partitions, which are always uncommon. System designers have a large of options for dealing and improving from network partitions. The goal of every system must be to take full advantage of both consistency and availability that are sensible to speci?c applications. The fundamental thought is that if a client does write on to one side of the partition, any of the reads that are taken to the other side of the same partition cannot perhaps be able to identify the write that was recently done. That is how one is faced with the option of either responding to the reads which have potentially stale data or keep waiting (maybe forever) so as to get response from the other side hence causing a big compromise on availability. While the majority of distributed systems are extensively operational, and may get millions of requests in their lifespan, CAP expresses the need for us to be careful because there is a possibility of hitting one of the three critical conditions and it is sensible to understand the way in which the system will not be able to meet either Consistency or Availability. It is possible to relax both consistency and availability from the strong requirements imposed by CAP in order to get systems that are useful. In fact, the main aim of CAP is to ensure that any systems designed relaxes on one or both of the guarantees and the responsibility is on the designer to outline when and how these guarantees should occur. (Mehra, 2017) explored more on vital thoughts in the in the Data Engineering ?eld and where it is today. CAP is applied in systems that are distributed and are able to store state. This enables practitioners to have adequate knowledge of the trade-o?s when they are designing shared Data systems that are networked hence in?uencing the design of most Distributed Data Systems. The use of relaxed ACID properties in the CAP process will help improve optimization methods using techniques that are preferred in the optimization literature. This will help to bridge the gap between the traditional ACID theory and the CAP theorem. Traditional ACID properties not dropped but instead weakened so as to ensure optimization of the CAP properties. All systems should work as if both CAP properties and ACID properties are implemented. This optimization process is mostly important in mobile databases where disconnections are more frequent. It is most appropriate in distributed databases such as Electronic Health Records which are based in di?erent hospital locations because the risk of disconnection increases when the participating locations are many. (Lars Rasmus et al 2014)
3.1 NoSQL Databases
In this NoSQL community, the CAP theorem has been used as the explanation for giving up consistency. Since most NoSQL systems usually don’t allow transactions that are crossing the node’s border line, then consistency is used only in replicas. For that reason, the CAP theorem provides a reason for giving up
consistent replicas, replacing this objective with “eventual consistency.” With the relaxed concept, one only guarantees that all replicas will eventually come to the same state when connectivity to the network has been re-established. At least, that’s the predictable wisdom. Many current distributed databases, including the ones that are mostly trapped under the ‘NoSQL’ net, ?nd pleasure in o?ering availability and partition tolerance over strong consistency the reason behind it being that short periods of application not functioning well are not so problematic as compared to short periods of being unavailable. The NoSQL movement has used the CAP theorem as an argument in opposing traditional ACID (atomicity, consistency, isolation, and durability) databases, which always put into priority consistency and partition tolerance at the outlay of a potentially low availability. Recently, Brewer (2012) modi?ed the CAP theorem outlining that all the CAP properties are truly more or less continuous with the possibility of being optimized, bringing them down against each other. Practically, there is a possibility of an application area to both high availability and consistency despite being in the presence of network partitions. With the start of the NoSQL community, and growth of interest in consistent Data Stores, the CAP theorem has faced some criticism. The main issue is that CAP theorem is as a result of the errors that target network failures only. It is mostly the premature sacri?ce of consistency as a solution to network errors that has brought up questions by database community members such as Stonebraker. CAP assures that there are some situations whereby the system has to either give up consistency or availability. This situation is called a critical condition. It does not explain further how likely this condition is since both consistency and availability are very strong guarantees and they only hold if the requirements are met in all of the operations. A single read that is inconsistent or a write that is unavailable make consistency or availability invalid until that critical condition is eventually met. Most NoSQL systems have all used the CAP theorem to give an explanation for why they make all the decisions they have to make either correctly or incorrectly.
4 How Facebook and Google relate to CAP
Facebook normally implements eventual consistency only. Messages take quite some time to move between servers hence these servers may have inconsistent outputs temporarily depending on the messages they see at that very moment. On the other hand, all servers will ultimately get into consistent states because the correct state is always de?ned by the interpretation of the messages in the order of the global timestamps instead of using the order which they arrived. The original signi?cance of containers is that they can be run on the laptop they deploy the same on the cloud. A ?eet of containers are run where there is a controlled way of upgrading them, sending tra?c to them, and scaling services depending on the number of containers that are used for running it so as to increase the capability as the load increases. The trade-o? that is selected often depends on the requirements of a speci?c product. Google’s main focus is availability hence relaxing consistency. The output always depends on which state the server is while responding to the request since there are di?erent inconsistent states on di?erent servers. Google is presently running a project called Kubernetes which is open source. Kubernetes has simpli?ed the process
used to build applications on top of clusters of containers including those using the docker format which t is very popular. (Harris, 2015) The ?nal outcome is fundamentally higher speed for the whole industry, which for users means more choices, more services, and more attractive things every day. Current systems have achieved advanced performance, lesser latency and almost up-time in all the data centers across the entire globe. These systems are more complex as compared to the single networked ones. Being able to understand the complexity in these systems and making an appropriate trade o? is very important. (Nazrul, 2018). Brewer, now being the vice president dealing with infrastructure at Google, explained why the work he is doing on the application containers is as big as cloud computing and the way the CAP theorem is still holding up ever since its inception.
4.1 Amazon’s Dynamo-eventual consistency
Most Real-time systems decide to relax availability in the case where consistency is of the highest signi?cance. Amazon’s Dynamo, relax consistency so as to maintain the highest degree of availability. In order to appreciate the modern Distributed Database System design, it is vital to know the circumstance in which the system was built. Amazon initially designed the Dynamo in order to serve data to the main functions of its electronic commerce shopping cart. The CAP theorem is not purely a case of “pick two”, since while Amazon’s Dynamo always sacri?ces consistency for availability; it does it using eventual consistency and not the entire non-existence of consistency.
4.2 Other examples of Systems which use eventual consistency model
Eventual Consistency is a precise way of de?ning weak consistency whereby it is widely used most popular systems such as Facebook, airline booking systems as well as ATMs in Banks. Ryan ?nds an exciting story on Facebook and decides to share it with Linda asking her to check it out on her wall whereby she doesn’t ?nd it but after waiting for a minute and checking back again she ?nds the story. This is because of more than a billion active users on Facebook and huge amounts of data which is generated at each given time. Facebook works with stale data and it also uses eventual consistency model which o?ers an alternative reducing the load so as to improve availability. When the majority of the seats are vacant in an airline booking system, it is acceptable to rely on data that is somewhat out of date because availability is more critical. When the plane is almost getting full, accuracy is of high priority so as to avoid overbooking. This is where consistency becomes a critical factor. An ATM is even so often disconnected from the banks and still is able to give out money. The system is always available but sometimes inconsistent. It chooses to always be available and can give money up to a certain limit hence hiding its inconsistency. Then it later chooses a way to pay compensation for the errors hence preventing mistakes altogether.
CAP Theorem is very signi?cant in many ?elds mostly where there is need to make a trade-o? based on a speci?c case. These concepts are di?erent and each has a reason as to why there shall be a tradeo?. That is why the system have been broken down and are mostly focusing on concepts like “Eventual Consistency”, which explains that consistency has changed from being absolute to relative up to the extent of the application’s requirement, and the CAP theorem. It is achievable to have the systems which are highly available, partition tolerant and provide a degree of consistency but depending on whether the degree is appropriate for that speci?c workload is another a?air. Partition tolerance is not something that can be assumed because it is compulsory in distributed systems so it is not possible to choose it.
A Critique of the CAP Theorem