The core idea behind NoSQL is not to replace SQL, but to redesign consistency strategies for distributed environments. This article focuses on CAP, BASE, and eventual consistency, explaining how they address massive data volume, high concurrency, and network partitions. Keywords: NoSQL, CAP, eventual consistency.
Technical Specification Snapshot
| Parameter | Details |
|---|---|
| Domain | Distributed Databases / NoSQL Theory |
| Language | Chinese |
| Key Protocols/Models | CAP, BASE, Eventual Consistency |
| Core Dependencies | Distributed replicas, network communication, replication mechanisms, session semantics |
| Representative Systems | HBase, Cassandra, DynamoDB, CouchDB, Riak |
The foundation of NoSQL design begins with distributed trade-offs
The three foundational pillars of NoSQL typically refer to CAP, BASE, and eventual consistency. These are not isolated concepts. They are different perspectives on the same class of problems: once data is distributed across multiple nodes, the system must make trade-offs among consistency, availability, and fault tolerance.
Relational databases tend to favor strongly consistent transactions, which makes them suitable for strict scenarios such as finance and order processing. NoSQL systems are more commonly used for massive read/write workloads, elastic scaling, and multi-replica deployments, so they place greater emphasis on availability and partition tolerance rather than maintaining strong consistency at every moment.
A minimal model for understanding NoSQL trade-offs
- Single-node database: prioritize strong consistency
- Distributed database: prioritize fault tolerance and scalability
- Once a network partition occurs: choose between C and A
This model illustrates the starting point of NoSQL theory: once you enter a distributed environment, network failures are no longer exceptions. They become a default assumption.
CAP theory states that distributed systems cannot fully achieve all three guarantees at once
CAP stands for Consistency, Availability, and Partition Tolerance. Its central conclusion is that when a network partition occurs in a distributed system, the system cannot fully satisfy C, A, and P at the same time. At most, it can prioritize two of them.
Consistency means that read operations always return the most recent write result. Availability means that every request receives a response within a bounded time. Partition tolerance means that the system as a whole must continue operating even when network communication between nodes is interrupted.
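These definitions can be made concrete with a small sketch. The snippet below is illustrative only (not any real database's API): it contrasts how a CP-leaning node and an AP-leaning node might answer a read request while a partition is in effect.

```python
# Sketch: how a CP node and an AP node answer reads during a partition.

class PartitionError(Exception):
    pass

def read_cp(local_value, partitioned):
    """CP choice: refuse to answer rather than risk returning stale data."""
    if partitioned:
        raise PartitionError("cannot confirm latest value; rejecting request")
    return local_value

def read_ap(local_value, partitioned):
    """AP choice: always answer, accepting that the value may be stale."""
    return local_value  # may lag behind writes on the other side of the partition

print(read_ap("val_0", partitioned=True))  # responds, possibly with stale data
try:
    read_cp("val_0", partitioned=True)
except PartitionError as e:
    print("CP node:", e)                   # refuses to respond
```

The CP node sacrifices availability (some requests fail) to preserve consistency; the AP node sacrifices consistency (some reads are stale) to preserve availability.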
[Figure: the three CAP properties drawn as a triangle, a three-way constraint highlighting that when a network partition occurs, a distributed system cannot guarantee both strong consistency and continuous availability at the same time.]
The three typical CAP paths have clear engineering implications
A CA system emphasizes consistency and availability, but it effectively weakens partition tolerance and is usually closer to a single-node or strongly centralized architecture. Traditional relational databases are often seen as examples of this category, although strictly speaking they rely on stable networks and centralized deployment assumptions.
A CP system emphasizes consistency and partition tolerance. When a partition occurs, the system would rather reject some requests than return stale data. Systems such as HBase, Bigtable, and Neo4j are closer to this design approach.
An AP system emphasizes availability and partition tolerance. It allows temporary data inconsistency, but the system continues to respond to requests. Cassandra, Riak, CouchDB, and DynamoDB are commonly viewed as representative AP systems.
```python
# Pseudocode: write propagation between two replicas under CAP
replica_m1 = "val_0"
replica_m2 = "val_0"

replica_m1 = "val_1"  # node M1 completes the local write first
network_ok = False    # a network partition separates M1 and M2

if network_ok:
    replica_m2 = replica_m1  # synchronize replicas when the network is healthy
else:
    pass  # during the partition, M2 keeps the old value: temporary inconsistency
```
This code demonstrates the most typical issue in an AP scenario: to keep the system responsive to every request, it may temporarily serve stale replica data.
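In practice, AP systems such as Cassandra and DynamoDB bound this staleness with tunable quorums: with N replicas, requiring W write acknowledgements and R read responses such that R + W > N guarantees that every read set overlaps the latest write set. The sketch below is a simplified illustration of that overlap, not a real client library.

```python
# Sketch: Dynamo-style quorum reads (R + W > N). Each replica holds a
# versioned value; a read consults R replicas and returns the newest version.

N = 3  # total replicas
W = 2  # a write must be acknowledged by W replicas
R = 2  # a read consults R replicas; R + W > N guarantees overlap

replicas = [{"version": 0, "value": "val_0"} for _ in range(N)]

def write(value, version):
    acks = 0
    for i in range(N):
        if acks >= W:
            break  # remaining replicas are updated asynchronously later
        replicas[i] = {"version": version, "value": value}
        acks += 1

def read():
    # Deliberately pick a different subset of R replicas than the write
    # touched; because R + W > N, at least one replica overlaps.
    responses = replicas[N - R:]
    return max(responses, key=lambda r: r["version"])["value"]

write("val_1", version=1)
print(read())  # "val_1": the quorum overlap surfaces the latest write
```

Tuning R and W lets operators slide the same system between AP-leaning (R = W = 1) and CP-leaning (R = W = N) behavior per request.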
The BASE model is NoSQL’s practical revision of ACID
BASE stands for Basically Available, Soft State, and Eventual Consistency. It is not the opposite of ACID. Rather, in pursuit of distributed high availability, it pragmatically relaxes and restructures strongly consistent transactions.
ACID emphasizes Atomicity, Consistency, Isolation, and Durability, which makes it suitable for transaction-critical scenarios. But under massive concurrency and cross-node replication, fully enforcing ACID can be very expensive. The BASE approach is to keep the service alive first, then gradually converge state through asynchronous replication.
The three components of BASE correspond to different tolerance mechanisms
Basically Available means that local failures do not cause a global outage. For example, if 1 out of 10 nodes fails, the system can still provide most of its services. The goal is not 100% perfection, but sustainable operation of the business as a whole.
Soft State means that data state in the system is allowed to remain temporarily unsynchronized. In other words, intermediate state may be unstable for a period of time, as long as it can eventually converge to the correct result.
Eventual Consistency means that updates are not immediately visible on all replicas, but after some time, all replicas will become consistent. This is the execution target of BASE and the default replication semantics in many NoSQL systems.
```python
# Pseudocode: a message queue carrying the system from soft state
# to eventual consistency
account_a = 100
account_b = 50
transfer_queue = []

account_a -= 20            # deduct funds from account A first
transfer_queue.append(20)  # the transfer enters the queue; the system is in a soft state

if transfer_queue:
    money = transfer_queue.pop(0)
    account_b += money     # after asynchronous delivery, the accounts are consistent again
```
This code shows that soft state is not an error. It is an acceptable intermediate state in an asynchronous system before convergence completes.
Eventual consistency is the most important execution mechanism in AP systems
Eventual consistency is a special case of weak consistency. It does not guarantee that all reads will immediately see the latest value after a write completes, but it does guarantee that if no new writes continue to overwrite the value, the system will eventually converge to the same result after some period of time.
This period is commonly called the inconsistency window. Its size depends on factors such as replication delay, network quality, system load, and replica count. The shorter the window, the less likely users are to notice inconsistency. The longer the window, the higher the cost of business-level compensation.
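The inconsistency window can be modeled very simply as replication lag: a write becomes visible on a replica only after the propagation delay elapses. The snippet below is a minimal sketch with an illustrative delay value, not a measurement of any real system.

```python
# Sketch: the inconsistency window modeled as replication lag. A write made
# at write_time becomes visible on the replica only after replication_delay.

write_time = 0.0         # moment the primary accepts the write
replication_delay = 0.3  # illustrative lag (network + apply time), in seconds

def read_replica(now, primary_value, old_value):
    """The replica serves the old value until the write has propagated."""
    if now - write_time < replication_delay:
        return old_value      # inside the inconsistency window
    return primary_value      # window has closed; replicas have converged

print(read_replica(0.1, "val_1", "val_0"))  # "val_0": still inconsistent
print(read_replica(0.5, "val_1", "val_0"))  # "val_1": converged
```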
[Figure: CA, CP, and AP shown as separate branches representing different product design priorities, emphasizing that databases are not simply "better" or "worse" but make strategic choices among consistency, availability, and partition tolerance based on business goals.]
Eventual consistency can be divided into five common semantics
Causal consistency requires operations with causal relationships to become visible in order. Read-your-writes consistency guarantees that after the same client performs a write, it will see the new value in subsequent reads. Session consistency limits that guarantee to a single session.
Monotonic read consistency requires that once a client has read a newer value, it will not later read an older one. Monotonic write consistency guarantees that write operations issued by the same client take effect in order. Otherwise, program behavior becomes difficult to reason about.
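Read-your-writes and monotonic reads are often enforced on the client side with a session that remembers the highest version it has written or observed, rejecting any replica that is older. The sketch below is a simplified illustration of that idea; the `Session` class and version scheme are hypothetical, not a specific product's API.

```python
# Sketch: read-your-writes and monotonic reads via a client-side session
# that tracks the minimum version it is willing to accept.

class Session:
    def __init__(self):
        self.min_version = 0  # lowest version this client may accept

    def write(self, store, key, value):
        version = store.get(key, (0, None))[0] + 1
        store[key] = (version, value)
        self.min_version = version  # future reads must see this write

    def read(self, store, key):
        version, value = store[key]
        if version < self.min_version:
            raise RuntimeError("replica too stale for this session; retry elsewhere")
        self.min_version = version  # monotonic reads: versions never go backwards
        return value

store = {"k": (0, "old")}
s = Session()
s.write(store, "k", "new")
print(s.read(store, "k"))  # "new": read-your-writes holds within the session
```

A stale replica that still holds version 0 would make `read` raise, forcing the client to retry against a fresher replica instead of silently violating the session guarantee.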
Developers should choose consistency levels based on business requirements
Business scenarios such as user feeds, comments, like counters, and recommendation streams can usually tolerate short-term inconsistency, so they are better suited to AP plus eventual consistency. Critical paths such as account balances, inventory deduction, and payment ledgers are better suited to CP systems or stronger transactional mechanisms.
In real engineering practice, the common strategy is not to choose one extreme exclusively, but to layer consistency by data type: core transaction paths pursue strong consistency, while peripheral caches, search indexes, and timeline systems accept eventual consistency. This approach balances performance and reliability.
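Layering consistency by data type can be made explicit in code, in the spirit of Cassandra-style per-request consistency levels. The mapping and tier names below are illustrative, not any particular product's configuration.

```python
# Sketch: routing each data type to a consistency tier. Unclassified data
# defaults to the stronger guarantee as a safe fallback.

CONSISTENCY_BY_DATA = {
    "account_balance": "STRONG",    # core transaction path
    "inventory":       "STRONG",
    "user_feed":       "EVENTUAL",  # peripheral, staleness-tolerant
    "like_counter":    "EVENTUAL",
    "search_index":    "EVENTUAL",
}

def consistency_for(data_type):
    return CONSISTENCY_BY_DATA.get(data_type, "STRONG")

print(consistency_for("user_feed"))        # EVENTUAL
print(consistency_for("account_balance"))  # STRONG
```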
FAQ
1. Why must distributed systems take partition tolerance seriously?
Because network failures, node disconnections, and data center instability are normal events in distributed systems, not rare exceptions. As long as a system runs across multiple nodes, it must assume that partitions will occur.
2. Does BASE mean data consistency can be ignored?
No. BASE does not abandon consistency. It changes the target from “immediate consistency” to “consistency after some period of time.” The system still needs replication, compensation, and conflict resolution mechanisms.
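One widely used convergence mechanism is last-write-wins (LWW), where each write carries a timestamp and conflicting replicas converge on the newer one; Cassandra, for example, resolves cell conflicts this way. A minimal sketch of the merge rule:

```python
# Sketch: last-write-wins (LWW) conflict resolution between two replicas.

def merge_lww(replica_a, replica_b):
    """Each replica holds (timestamp, value); the newer write wins."""
    return replica_a if replica_a[0] >= replica_b[0] else replica_b

a = (105, "val_from_a")  # written at logical time 105
b = (103, "val_from_b")  # concurrent write at logical time 103
print(merge_lww(a, b))   # both replicas converge to (105, 'val_from_a')
```

LWW is simple but discards the losing write, so stores that cannot tolerate silent loss use richer mechanisms such as vector clocks or application-level merges instead.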
3. Which business scenarios are suitable for eventual consistency?
It is suitable for scenarios that are more sensitive to availability, throughput, and latency, and that can tolerate temporarily stale data, such as social feeds, log aggregation, recommendation systems, distributed caches, and content delivery.
Summary
The three pillars of NoSQL reveal the core reality of distributed databases: CAP defines the boundary of constraints, BASE provides the engineering compromise, and eventual consistency turns high-availability systems into workable replication semantics. Understanding these three concepts is essential for making sound database and architecture decisions.
Core takeaway: This article reconstructs the core theory behind NoSQL and systematically explains the definitions, trade-offs, and typical scenarios of CAP, BASE, and eventual consistency, helping developers understand why distributed databases make engineering choices among consistency, availability, and partition tolerance.