Redis Cluster is Redis's native distributed solution. It addresses single-node capacity limits, the inability to scale writes horizontally, and failover processes that otherwise depend on manual intervention. This article covers a comparison of high-availability modes, hash slot routing, a 6-node deployment, and troubleshooting.
Technical Specifications at a Glance
| Parameter | Value |
|---|---|
| Core Topic | Redis Cluster |
| Languages | Bash, INI |
| Protocols / Mechanisms | Gossip, primary-replica replication, Raft-like election |
| Redis Version | 5.0.14 |
| Deployment Topology | 3 masters, 3 replicas |
| Operating System | OpenEuler 24 |
| Service Port | 6379 |
| Cluster Bus Port | 16379 |
| Core Dependencies | gcc, make, tar, zlib-devel |
Redis high availability evolves in layers
Common Redis high-availability patterns include primary-replica replication, Sentinel, and Redis Cluster. These are not mutually exclusive. Instead, they represent different capability tiers for different scales.
Primary-replica replication solves data backup and read scaling, but write traffic still concentrates on the primary node. Sentinel adds automatic failover, but it does not provide true sharding. Redis Cluster provides distributed sharding, automatic failover, and horizontal scaling at the same time.
The three modes serve different purposes
| Mode | Core Capability | Main Limitation | Best Fit |
|---|---|---|---|
| Primary-replica replication | Read/write separation, data backup | Failover depends on manual intervention | Small workloads |
| Sentinel | Automatic primary failover | Write scaling remains limited | Medium-sized workloads |
| Redis Cluster | Sharding, scaling, automatic failover | Multi-key operations are limited, and only database 0 is supported | Large-scale, high-concurrency workloads |
```bash
# You must use -c when connecting to a cluster
redis-cli -h 192.168.10.101 -p 6379 -c
# After writing data, the client automatically redirects to the node that owns the target slot
set test_key hello_cluster
get test_key
```
These commands verify whether the client supports automatic cluster routing.
Redis Cluster uses hash slots for native sharding
Redis Cluster maintains exactly 16,384 hash slots. Every key is hashed with CRC16, the result is taken modulo 16384, and the resulting slot is owned by a specific primary node.
This design removes the need for application-side shard routing and turns scale-out data movement into slot migration instead of full database migration. Only primary nodes serve slot reads and writes. Replica nodes handle replication and failover takeover.
Hash slots and cluster communication form the basis of stability
The routing formula is CRC16(key) mod 16384. Nodes exchange status, slot distribution, and primary-replica role information through the Gossip protocol, which makes Redis Cluster fundamentally a decentralized topology.
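The routing formula can be reproduced outside Redis. Below is a minimal pure-Bash sketch of CRC16 (the XModem variant with polynomial 0x1021, which is the CRC Redis Cluster uses). It is illustrative only: it assumes single-byte ASCII key characters and ignores hash tags, whereas the real implementation hashes only the substring inside `{...}` when a key contains one.

```shell
# Pure-Bash CRC16 (XModem variant, polynomial 0x1021) — the CRC Redis Cluster uses.
# Sketch only: ignores hash tags ({...}) and assumes ASCII key characters.
crc16() {
  local data=$1 crc=0 i bit byte
  for ((i = 0; i < ${#data}; i++)); do
    printf -v byte '%d' "'${data:i:1}"        # character -> byte value
    crc=$(( (crc ^ (byte << 8)) & 0xFFFF ))
    for ((bit = 0; bit < 8; bit++)); do
      if (( crc & 0x8000 )); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo "$crc"
}

key="test_key"
echo "slot for $key: $(( $(crc16 "$key") % 16384 ))"
```

On a live cluster you can cross-check the result with `redis-cli cluster keyslot test_key`.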
When a primary node becomes unreachable and more than half of the primary nodes mark it as FAIL, its associated replica nodes start a promotion vote. The replica that receives a majority becomes the new primary. After the original primary recovers, it typically rejoins as a replica.
```ini
bind 127.0.0.1 192.168.10.101
protected-mode yes
port 6379
daemonize yes
appendonly yes
# Enable cluster mode
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
cluster-require-full-coverage no
```
This configuration defines the minimum required conditions for running Redis Cluster.
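Since all six nodes need the same configuration apart from the bind address, the per-node files can be generated from a template. A small sketch, using the article's sample IP range; the output directory is a temporary one for illustration, so adjust paths for a real deployment:

```shell
# Generate one minimal cluster config per node IP (sketch; paths are illustrative).
outdir=$(mktemp -d)
for ip in 192.168.10.101 192.168.10.102 192.168.10.103 \
          192.168.10.104 192.168.10.105 192.168.10.106; do
  cat > "$outdir/redis-$ip.conf" <<EOF
bind 127.0.0.1 $ip
protected-mode yes
port 6379
daemonize yes
appendonly yes
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
cluster-require-full-coverage no
EOF
done
echo "generated $(ls "$outdir" | wc -l) config files in $outdir"
```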
A 6-node deployment is the minimum high-availability unit for production
A 3-master, 3-replica deployment is the recommended baseline. The reason is straightforward: you need at least three primary nodes to form a stable majority, which ensures reliable failure detection and primary-replica failover.
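The majority requirement can be made concrete with a little arithmetic. In the sketch below, `quorum` is a hypothetical helper, not a Redis command:

```shell
# Minimum number of masters that must agree before a peer is marked FAIL.
quorum() { echo $(( $1 / 2 + 1 )); }

echo "3 masters: $(quorum 3) must agree"  # one master can fail, decisions still pass
echo "2 masters: $(quorum 2) must agree"  # a single failure already blocks any majority
```

With three masters, losing one still leaves two, enough to agree; with only two, a single failure makes majority agreement impossible.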
The sample environment uses OpenEuler 24. Node IPs range from 192.168.10.101 to 192.168.10.106. All nodes use port 6379, and the Redis version is 5.0.14.
Every node must complete baseline initialization first
First disable the firewall and SELinux, then install build dependencies, extract the source package, and complete the installation. All six machines must use a consistent configuration. Otherwise, cluster creation can fail because of mismatched node capabilities.
```bash
# Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Install build dependencies
dnf -y install gcc* zlib-devel tar make

# Compile and install Redis
tar -zxvf redis-5.0.14.tar.gz
cd redis-5.0.14
make
make PREFIX=/usr/local/redis install
ln -s /usr/local/redis/bin/* /usr/local/bin/
cd utils
./install_server.sh
```
This script completes Redis compilation, installation, symlink creation, and service initialization.
Provide the full node list at cluster creation time
After all nodes are started, run the cluster creation command from any node. `--cluster-replicas 1` tells Redis to automatically assign one replica to each primary, resulting in a 3-master, 3-replica topology.
```bash
# Create a 3-master, 3-replica cluster in one step
redis-cli --cluster create --cluster-replicas 1 \
  192.168.10.101:6379 \
  192.168.10.102:6379 \
  192.168.10.103:6379 \
  192.168.10.104:6379 \
  192.168.10.105:6379 \
  192.168.10.106:6379
```
This command automatically assigns slots and establishes primary-replica replication relationships.
Cluster scaling must revolve around slot migration
There are two common ways to add a node: `cluster meet` introduces the node to the cluster directly, while `redis-cli --cluster add-node` is better suited for standardized operational workflows. After a node joins, you must rebalance slots if the node is expected to carry traffic.
If you add a replica node, first join it to the cluster, then bind it to the target primary with `cluster replicate`. Before you remove a primary node, you must migrate its slots away. Otherwise, the removal fails because the node still owns hash slots.
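The removal order can be sketched as a two-step runbook. The node IDs below are placeholders (read the real ones from `cluster nodes`), and the commands are printed rather than executed because they require a live cluster:

```shell
# Runbook sketch: drain a master's slots, then remove it.
# <source-id> and <target-id> are placeholders taken from `cluster nodes` output;
# the slot count 5461 assumes an even three-way split.
steps=(
  "redis-cli --cluster reshard 192.168.10.101:6379 --cluster-from <source-id> --cluster-to <target-id> --cluster-slots 5461 --cluster-yes"
  "redis-cli --cluster del-node 192.168.10.101:6379 <source-id>"
)
printf '%s\n' "${steps[@]}"
```

Running `del-node` before the reshard fails precisely because the node still owns slots.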
Common operations should be documented as a fixed runbook
```bash
# View node and cluster status
redis-cli -h 192.168.10.101 -p 6379 cluster nodes
redis-cli --cluster check 192.168.10.101:6379
redis-cli -h 192.168.10.101 -p 6379 cluster info

# Add a node and trigger rebalancing
redis-cli --cluster add-node 192.168.10.108:6379 192.168.10.101:6379
redis-cli --cluster rebalance --cluster-threshold 1 --cluster-use-empty-masters 192.168.10.101:6379
```
These commands cover three high-frequency operational tasks: cluster inspection, scale-out onboarding, and slot rebalancing.
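For monitoring scripts, the `cluster nodes` output is line-oriented and easy to parse: the third field holds the node's flags. A sketch against a sample; the node IDs and lines below are illustrative, not captured from a real cluster:

```shell
# Count masters and replicas from (sample) `cluster nodes` output.
# Fields: id, addr@busport, flags, master-id, ping, pong, epoch, state, slots.
sample='a1 192.168.10.101:6379@16379 myself,master - 0 0 1 connected 0-5460
a2 192.168.10.102:6379@16379 master - 0 0 2 connected 5461-10922
a3 192.168.10.103:6379@16379 master - 0 0 3 connected 10923-16383
a4 192.168.10.104:6379@16379 slave a1 0 1 4 connected
a5 192.168.10.105:6379@16379 slave a2 0 1 5 connected
a6 192.168.10.106:6379@16379 slave a3 0 1 6 connected'

masters=$(awk '$3 ~ /master/' <<< "$sample" | wc -l)
replicas=$(awk '$3 ~ /slave/' <<< "$sample" | wc -l)
echo "masters=$masters replicas=$replicas"
```

The same pattern works against live output: replace the sample with `$(redis-cli -h <host> -p 6379 cluster nodes)`.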
Most common failures relate to residual state and network connectivity
The error `Slot 0 is already busy` during cluster creation usually means a node still has stale slot metadata or its nodes-6379.conf file was not cleaned up. In this case, simply restarting the service usually does not help. You must flush the data and reset the cluster state.
Hanging at `Waiting for the cluster to join` is commonly caused by port 16379 not being open, nodes never having exchanged `meet` messages, or an incorrect `bind` configuration. In addition to the service port, Redis Cluster depends on the cluster bus port for node coordination.
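The bus port is always the client port plus 10000, so opening only 6379 is never enough. A quick sketch; the `firewall-cmd` line is printed rather than executed, since the sample environment disables firewalld entirely:

```shell
# The cluster bus port is the data port plus a fixed offset of 10000.
bus_port() { echo $(( $1 + 10000 )); }

port=6379
echo "data port: $port, bus port: $(bus_port $port)"
# If firewalld stays enabled instead of being disabled, both ports must be open:
echo "firewall-cmd --permanent --add-port=$port/tcp --add-port=$(bus_port $port)/tcp"
```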
```bash
# Clean up residual cluster state
redis-cli -h <node-IP> -p 6379 flushall
redis-cli -h <node-IP> -p 6379 cluster reset
rm -f /var/lib/redis/6379/nodes-6379.conf
redis-server /etc/redis/6379.conf
```
These commands fix slot conflict issues caused by residual node state.
In production, recoverability matters more than peak performance
Enable AOF persistence, distribute primary and replica nodes across racks, and expose only ports 6379 and 16379 to trusted network ranges. For scaling, follow this sequence: add the node, configure replication, then rebalance. This helps avoid unstable routing during migration.
At a minimum, monitoring should cover node health, slot distribution, replication lag, command latency, and memory utilization. If replica nodes need to serve read traffic, send the `READONLY` command after connecting; it applies per connection.
FAQ
Why is a 3-master, 3-replica topology the minimum recommendation for Redis Cluster?
Because failure detection and new primary election depend on majority voting. With fewer than three primary nodes, network jitter or a single-node failure significantly reduces cluster availability and the accuracy of failure decisions.
Why must the client use -c when connecting to a Redis Cluster?
Because cluster data is distributed across different primary nodes by slot. The -c option allows the client to automatically follow MOVED/ASK redirections. Without it, you will frequently see errors or only access data from a single node.
Why must you clear slots before removing a primary node?
If a primary node still owns slots, it is still responsible for data routing. Removing it directly breaks slot integrity and leaves the cluster unhealthy. You must migrate or clear its slots before deletion.
Summary
This article systematically reconstructs the core principles and implementation workflow of Redis Cluster. It covers a comparison of three high-availability modes, the 16,384-hash-slot model, Gossip-based communication, failover mechanisms, and the deployment, scaling, troubleshooting, and production practices for a 3-master, 3-replica cluster. It is well suited for development and operations teams that need to quickly build a horizontally scalable Redis cluster.