Kubernetes StatefulSets Explained
Why StatefulSets Exist
In Kubernetes, Pods (and the containers inside them) are ephemeral by design.
A Pod can be:
- Restarted
- Rescheduled to another node
- Deleted and recreated
For many applications, this is perfectly fine. Stateless services (APIs, workers, frontends) don’t care which Pod handles a request.
However, some applications do care:
- Databases must keep their data
- Replicas may have leader/follower roles
- Nodes in a cluster may need predictable identities
This is exactly the problem StatefulSets were created to solve.
What Is a StatefulSet?
A StatefulSet is a Kubernetes workload API object designed to manage stateful applications by providing strong guarantees around identity, storage, and lifecycle.
Unlike a Deployment:
- Pods are not interchangeable
- Each Pod has a long-lived identity
- Each Pod gets its own persistent storage
Typical use cases:
- PostgreSQL, MySQL, MongoDB
- Kafka, ZooKeeper, Elasticsearch
- Any replicated system with internal state
Core Guarantees Provided by StatefulSets
1. Stable Pod Identity
Each Pod created by a StatefulSet gets:
- A stable Pod name
- A stable hostname
- A stable network identity
Naming pattern:
<statefulset-name>-<ordinal>
Example:
mysts-0
mysts-1
mysts-2
If mysts-1 crashes or the node dies:
- Kubernetes recreates mysts-1
- With the same name
- With the same identity
This is critical for clustered systems that reference peers by name.
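To make the naming concrete, here is a minimal sketch of a StatefulSet that would produce the Pods mysts-0, mysts-1, and mysts-2. Only the StatefulSet name and the replica count come from the example above; the `app: mysts` label, the container name, and the image are assumptions chosen for illustration.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysts                # Pods are named mysts-0, mysts-1, mysts-2
spec:
  serviceName: mysts         # governing Service, assumed to exist under this name
  replicas: 3
  selector:
    matchLabels:
      app: mysts             # hypothetical label; must match the Pod template labels
  template:
    metadata:
      labels:
        app: mysts
    spec:
      containers:
        - name: db                            # hypothetical container name
          image: example.com/my-database:1.0  # hypothetical image; substitute your own
```

The ordinal suffix is assigned by the StatefulSet controller, so nothing in the manifest lists the individual Pod names.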
2. Stable and Isolated Persistent Storage
Each Pod:
- Gets its own PersistentVolumeClaim (PVC)
- Is always re-attached to the same volume after restart
- Does not share storage with other replicas
Important implications:
- Data isolation is guaranteed
- One Pod ≠ shared data
- Volumes outlive Pods
If a Pod is deleted:
- The PVC and PersistentVolume remain
- Data is preserved until you delete the PVC yourself (this is the default behavior)
This prevents accidental data loss.
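The retention behavior described above can also be written out explicitly. As a sketch, and assuming a Kubernetes version recent enough to support the `persistentVolumeClaimRetentionPolicy` field (check your cluster's version), the following fragment spells out the values that match this behavior:

```yaml
# Fragment of a StatefulSet spec (not a complete manifest).
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # keep the PVCs when the StatefulSet itself is deleted
    whenScaled: Retain    # keep the PVCs of Pods removed by a scale-down
```

Setting either value to `Delete` would make Kubernetes remove the PVCs automatically, which gives up the protection described here.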
3. Ordered Pod Lifecycle
By default, StatefulSets create and remove Pods one at a time, in ordinal order. Each Pod must be Running and Ready before the next one is started.
Creation order:
mysts-0 → mysts-1 → mysts-2
Deletion / scale-down order:
mysts-2 → mysts-1 → mysts-0
Why this matters:
- Databases often need a primary before replicas
- Some systems require graceful shutdown in reverse order
- Helps avoid split-brain scenarios during startup and shutdown
Deployments do not offer this guarantee.
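The ordering is controlled by the `podManagementPolicy` field of the StatefulSet spec. The fragment below is a sketch showing the default value; it is not a complete manifest.

```yaml
# Fragment of a StatefulSet spec (not a complete manifest).
spec:
  replicas: 3
  # OrderedReady (the default): create mysts-0, wait until it is Running and
  # Ready, then create mysts-1, then mysts-2. Scale-down removes Pods in
  # reverse order.
  podManagementPolicy: OrderedReady
  # The alternative value, Parallel, launches or terminates all Pods at once
  # and gives up the ordering guarantee.
```

If your application does its own peer discovery and does not depend on startup order, `Parallel` can speed up scaling, at the cost of the guarantee described above.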
PersistentVolumeClaim Templates Explained
StatefulSets define storage using volumeClaimTemplates.
Example:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
What Kubernetes does:
- Creates one PVC per Pod
- Names them predictably (e.g. data-mysts-0)
- Automatically re-attaches them after restarts
As long as:
- The StatefulSet name stays the same
- The claim template is unchanged
➡️ The Pod will always see the same data.
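For the claim to be useful, the Pod template has to mount it. The sketch below assumes a hypothetical container named `db`, a hypothetical image, and a hypothetical mount path `/var/lib/data`. The one real requirement is that the `volumeMounts` name matches the claim template name (`data`).

```yaml
# Fragment of a StatefulSet spec (not a complete manifest).
spec:
  template:
    spec:
      containers:
        - name: db                            # hypothetical container name
          image: example.com/my-database:1.0  # hypothetical image
          volumeMounts:
            - name: data                      # must match the claim template name
              mountPath: /var/lib/data        # hypothetical path; use your app's data directory
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```

With this in place, mysts-0 mounts the claim data-mysts-0, mysts-1 mounts data-mysts-1, and so on.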
Static vs Dynamic Provisioning (Why It Matters)
Static Provisioning
- PersistentVolumes are created manually
- PVC must match an existing PV
- Common in on-prem clusters
Risk:
- If no PV matches → Pod stays Pending
Dynamic Provisioning
- PVC automatically triggers PV creation
- Managed by a StorageClass
- Preferred in cloud environments
Most modern clusters rely on dynamic provisioning.
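As a sketch of dynamic provisioning, a claim template can name a StorageClass, and the provisioner behind that class creates a volume for each PVC. The class name `fast-ssd` is hypothetical, and the provisioner shown (the AWS EBS CSI driver) is only an example; substitute whatever your cluster provides.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                         # hypothetical class name
provisioner: ebs.csi.aws.com             # example only; depends on your cluster
volumeBindingMode: WaitForFirstConsumer  # provision the volume once the Pod is scheduled
---
# Fragment of a StatefulSet spec that references the class above
# (not a complete manifest).
spec:
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        storageClassName: fast-ssd
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```

If `storageClassName` is omitted, the cluster's default StorageClass is used, provided one is configured.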
Headless Services and Stable DNS
StatefulSets are usually paired with a Headless Service:
clusterIP: None
This enables stable DNS records:
mysts-0.mysts.default.svc.cluster.local
mysts-1.mysts.default.svc.cluster.local
Why this is important:
- Each Pod is addressable directly
- No load-balancing in between
- Ideal for database replication and clustering
Without a headless Service, these per-Pod DNS records are not created, and peers lose their stable addresses.
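A headless Service that would yield the DNS names shown above could look like the following sketch. The Service name and namespace follow from those DNS names; the selector label, port name, and port number are assumptions for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysts          # must match the StatefulSet's spec.serviceName
  namespace: default
spec:
  clusterIP: None      # headless: DNS resolves to Pod addresses, no virtual IP
  selector:
    app: mysts         # hypothetical label; must match the Pod labels
  ports:
    - name: db         # hypothetical port name
      port: 3306       # assumed port (MySQL's default); adjust for your application
```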
What StatefulSets Are Not Designed For
StatefulSets are not a universal solution.
They are ❌ NOT suitable for:
- Stateless APIs
- Horizontally scalable web apps
- Shared writable storage across replicas
If your replicas must share the same data:
- You need ReadWriteMany (RWX) volumes
- Or a different architecture entirely
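If shared storage really is required, a single claim with the ReadWriteMany access mode can be mounted by several Pods. The sketch below assumes a hypothetical StorageClass named `shared-nfs` whose backend supports RWX; many block-storage backends do not.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data             # hypothetical claim name
spec:
  storageClassName: shared-nfs  # hypothetical; the backend must support RWX
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```

A claim like this would typically be referenced from a Deployment's Pod template rather than from volumeClaimTemplates, since the template mechanism always creates one claim per Pod.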
StatefulSet vs Deployment (Mental Model)
| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod identity | Random | Stable |
| Storage | Optional | Per-Pod PVC |
| Scaling order | Parallel | Ordered |
| Use case | Stateless | Stateful |
When Should You Use a StatefulSet?
Use a StatefulSet when:
- Each replica needs its own persistent data
- Pod names and identity matter
- Startup or shutdown order matters
Use a Deployment when:
- Pods are interchangeable
- Data lives outside the Pod
- Fast scaling is more important than order
Key Takeaway
StatefulSets are about predictability:
- Predictable names
- Predictable storage
- Predictable lifecycle
They trade flexibility for strong guarantees, which is exactly what stateful systems need.
Happy clustering 🚀