How do these database management systems practically behave during a network partition?

I am looking into deploying a database management system, replicated across regions (various data centers across a country). I am currently looking into the following candidates:

  • MongoDB (NoSQL, CP system)
  • CockroachDB (SQL, CP system)
  • Cassandra (NoSQL, AP system)

How do those three behave during a network partition between nodes? Let's assume we deploy all of them in 3-node clusters.

What happens if the 2 secondary/follower nodes become separated from their leader during a network failure?

Will MongoDB and Cockroach block reads during a network partition? If so, for the entire duration of the partition or only during leader election (Cockroach)?

Will Cassandra allow reads during a network partition?



Solution 1:[1]

The answer for all three is, in theory, the same: it's up to the application making the read request. You can choose either availability (the read succeeds but could be out of date) or consistency (the read generally fails). The details vary among the three, as does the degree to which each database is able to actually honor the guarantees it makes.

Cassandra

Cassandra in theory: Cassandra reads and writes specify how many nodes need to acknowledge the request for it to be considered successful. This lets you tune consistency, availability, and throughput per workload. For strong consistency in an N-node cluster, the read and write acknowledgement counts must add up to at least N+1. In your 3-node example, you could require all 3 nodes to ack a write and only 1 to ack a read. In that case, writes can't be accepted during any network partition, but reads can still be served from a single node without sacrificing consistency. Or you could require 3 nodes for a read and only 1 for a write, reversing which side stays available. More commonly, applications require a majority for both reads and writes: 2 nodes each in this case. This means both reads and writes can fail during a network partition, but among the strongly consistent configurations it tends to give the best overall performance. It's also common to just require 1 ack for all queries and live with some inconsistency.
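Concretely, with the DataStax Python driver this per-query tuning looks roughly like the sketch below. It's only an illustration, assuming the cassandra-driver package, a keyspace replicated to all 3 nodes (replication factor 3), and placeholder contact points, keyspace, and table names:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Placeholder contact points and keyspace for the 3-node cluster.
    cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    session = cluster.connect("app")

    # Write at ALL: every replica must ack, so writes fail during any partition...
    write = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.ALL,
    )
    session.execute(write, (1, "alice"))

    # ...which lets reads at ONE stay available while still seeing every
    # acknowledged write.
    read = SimpleStatement(
        "SELECT name FROM users WHERE id = %s",
        consistency_level=ConsistencyLevel.ONE,
    )
    row = session.execute(read, (1,)).one()

    # The common middle ground: QUORUM (2 of 3) on both reads and writes,
    # which still satisfies reads + writes >= N + 1.
    quorum = SimpleStatement(
        "SELECT name FROM users WHERE id = %s",
        consistency_level=ConsistencyLevel.QUORUM,
    )
    session.execute(quorum, (1,))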

Cassandra in practice: You're going to have to live with some inconsistency regardless. Cassandra generally doesn't pass the Jepsen test suite's checks for inconsistent writes; under heavy load and a network partition you're likely to end up with some lost or inconsistent data even when you request otherwise.

MongoDB

MongoDB in theory: MongoDB has a primary node and secondary nodes. If you enable secondary reads, you can get data that is out of date. If you don't, read attempts go only to the primary node, so if you're cut off from it, those reads will fail until MongoDB recovers.
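With PyMongo, that choice is made via read preference and read concern, which can be set per collection or per operation. Here's a rough sketch; the connection string, database, and collection names are placeholders:

    from pymongo import MongoClient, ReadPreference
    from pymongo.read_concern import ReadConcern

    # Placeholder URI for a 3-member replica set.
    client = MongoClient("mongodb://10.0.0.1,10.0.0.2,10.0.0.3/?replicaSet=rs0")
    db = client["app"]

    # Consistency: read only from the primary, and only majority-committed data.
    # These reads fail while no reachable primary exists.
    strict = db.get_collection(
        "users",
        read_preference=ReadPreference.PRIMARY,
        read_concern=ReadConcern("majority"),
    )
    strict.find_one({"_id": 1})

    # Availability: let secondaries answer, accepting possibly stale data.
    relaxed = db.get_collection(
        "users",
        read_preference=ReadPreference.SECONDARY_PREFERRED,
        read_concern=ReadConcern("local"),
    )
    relaxed.find_one({"_id": 1})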

MongoDB in practice: Historically, MongoDB has not done well when its consistency is tested: its earlier versions used a replication protocol that is considered fundamentally flawed, leading to stale and dirty reads even when full consistency was requested. As of 2017, it tentatively seemed like those issues had been fixed with a new protocol. Of the three, Mongo is the one I haven't worked with directly, so I'll leave it at that.

CockroachDB

CockroachDB in theory: By default, CockroachDB chooses consistency. If you're lucky, some reads in the first 9 seconds of a network partition will hit the node that acquired a 9-second lease on all the data needed to serve the request. As long as the nodes can't establish a quorum, they can't create new leases, so eventually all reads start failing as no one node can be confident that the other two nodes aren't accepting new writes. However, Cockroach allows "bounded staleness reads" that can be served without a lease. Queries of the form SELECT code FROM promo_codes AS OF SYSTEM TIME with_max_staleness('10s') will continue to succeed for 10-19 seconds into a network partition.
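Since CockroachDB speaks the PostgreSQL wire protocol, both kinds of read can be issued with an ordinary Postgres driver. A rough psycopg2 sketch, assuming an insecure test cluster and reusing the promo_codes example from above (connection details are placeholders):

    import psycopg2

    # Placeholder connection to one node of the 3-node cluster.
    conn = psycopg2.connect(
        "postgresql://root@10.0.0.1:26257/defaultdb?sslmode=disable"
    )
    conn.autocommit = True  # each statement runs as its own implicit transaction

    with conn.cursor() as cur:
        # Default, strongly consistent read: needs a valid lease, so it starts
        # failing once the old lease expires and there's no quorum to renew it.
        cur.execute("SELECT code FROM promo_codes")
        print(cur.fetchall())

        # Bounded staleness read: can be served without a lease, as long as the
        # node's local copy is no more than 10 seconds stale.
        cur.execute(
            "SELECT code FROM promo_codes "
            "AS OF SYSTEM TIME with_max_staleness('10s')"
        )
        print(cur.fetchall())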

CockroachDB in practice: CockroachDB brought in Aphyr, the researcher behind the Jepsen analyses mentioned above, early on in its development process. It now runs nightly Jepsen tests simulating a network partition under load and verifying consistency, so it's unlikely to violate its consistency guarantee in that particular way.

Summary

All three databases make an effort to support choosing either consistency or availability. Reads in "consistent mode" will start failing during a network partition until a majority of nodes reestablish communication with each other. Reads in "availability mode" will be less likely to fail during a network partition, but there's a risk you're reading from one isolated node while the other two have reestablished communication with each other and started accepting new writes. Of the three databases, Cassandra has the most flexibility for specifying this behavior per-query, while CockroachDB has the most reliable guarantee of consistency.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 histocrat