Startup, Liveness, and Readiness Probes in Kubernetes
Kubernetes provides startup, liveness, and readiness probes to continuously monitor container health.
These probes are a core building block for self-healing systems and allow Kubernetes to make
intelligent decisions during the entire lifecycle of a container: from startup, through steady state,
to failure and recovery.
When used correctly, probes dramatically reduce downtime and protect users from unstable services.
What Kubernetes Means by “Healthy”
A running container is not necessarily a healthy one.
Kubernetes distinguishes between multiple states:
- The container has not started correctly
- The container is running but not yet ready
- The container is running but stuck
- The container is healthy and serving traffic
Each probe targets exactly one of these states, which is why they should never be treated as interchangeable.
Why Probes Are Essential in Real Systems
Consider the following real-world scenarios:
- A web API starts, but database migrations are still running
- A JVM service is alive, but all worker threads are blocked
- A cache temporarily becomes unavailable
- An application is overloaded but could recover without a restart
Without probes, Kubernetes only knows whether the container's process is running, so it cannot tell these situations apart.
With probes, Kubernetes can restart, isolate, or wait — depending on the situation.
Types of Kubernetes Probes
Startup Probe — “Did the application start?”
The startup probe answers a very specific question:
Has the application finished starting up?
Key characteristics:
- Runs only during container startup
- Temporarily disables liveness and readiness probes
- Failure beyond the threshold → container restart
- Success → liveness and readiness probes are enabled
This probe is critical for slow-starting applications, such as:
- JVM-based services
- Applications performing schema migrations
- Services loading large datasets into memory
Without a startup probe, liveness probes may kill the container before it ever becomes ready.
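As a minimal sketch, a startup probe for a slow-starting service could look like the fragment below. The `/healthz` path and port `8080` are assumptions made for illustration, not values from any particular application. The useful number is the startup budget, which is roughly `failureThreshold` multiplied by `periodSeconds`.

```yaml
# Fragment of a container spec (placed under spec.containers[] in a Pod template).
# The path and port are hypothetical; use whatever your application exposes.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10      # probe every 10 seconds
  failureThreshold: 30   # tolerate up to 30 consecutive failures
  # Startup budget: 30 x 10 s = 300 s. If the probe has not succeeded by then,
  # the container is killed and handled according to the Pod's restartPolicy.
```

Sizing the budget to the worst-case startup time lets the liveness probe stay strict without endangering slow starts.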
Liveness Probe — “Is the application still making progress?”
The liveness probe determines whether the application is irreversibly broken.
Important details:
- Runs periodically for the rest of the container's lifetime, once any startup probe has succeeded
- Failure threshold exceeded → container restart
- Designed to detect:
  - Deadlocks
  - Infinite loops
  - Unrecoverable internal errors
A liveness probe should be cheap and reliable.
If it fails, Kubernetes assumes the only safe recovery strategy is a restart.
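A liveness probe along these lines might be configured as in the sketch below. The `/livez` endpoint and port `8080` are hypothetical; the assumption is that the handler only reports on the process itself and never calls a database or another service.

```yaml
# Fragment of a container spec (placed under spec.containers[]).
# /livez is a hypothetical endpoint that checks only in-process state.
livenessProbe:
  httpGet:
    path: /livez
    port: 8080
  periodSeconds: 10     # check every 10 seconds
  timeoutSeconds: 1     # a slow answer counts as a failure
  failureThreshold: 3   # three consecutive failures trigger a restart
```

Keeping the handler free of external calls means a dependency outage cannot cause a wave of restarts.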
Readiness Probe — “Should traffic be sent right now?”
The readiness probe controls traffic flow, not container lifecycle.
Key behavior:
- Runs periodically for the whole container lifetime, once any startup probe has succeeded
- Failure → the Pod is marked not ready and removed from Service endpoints
- Container keeps running
- Success → traffic is restored automatically
Readiness probes are ideal for handling temporary failures, such as:
- Database connection loss
- Dependency timeouts
- Warm-up phases after restarts
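The temporary failures listed above are what a readiness probe is meant to absorb. In the sketch below, `/readyz` and port `8080` are hypothetical, and the endpoint is assumed to report whether the application's dependencies are currently reachable.

```yaml
# Fragment of a container spec (placed under spec.containers[]).
# /readyz is a hypothetical endpoint that may check dependencies such as a database.
readinessProbe:
  httpGet:
    path: /readyz
    port: 8080
  periodSeconds: 5      # check often so traffic shifts quickly
  failureThreshold: 3   # three consecutive failures mark the Pod not ready
  successThreshold: 2   # two consecutive successes before traffic returns
```

Setting `successThreshold` above 1 is allowed for readiness probes and can reduce flapping when a dependency recovers unevenly.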
How Probes Work Together
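Probes are not evaluated in a fixed sequence; only the startup probe gates the other two:
- Startup probe (runs first; liveness and readiness are held back until it succeeds)
- Liveness probe (container survival; runs periodically afterwards)
- Readiness probe (traffic eligibility; runs periodically afterwards, independently of liveness)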
This separation allows Kubernetes to:
- Avoid killing containers too early
- Restart containers only when necessary
- Protect users from unstable instances
Visual Flow of a Startup Probe
During startup:
- Kubernetes checks whether a startup probe exists
- If defined, it runs instead of liveness/readiness
- Failure threshold exceeded → restart
- Success → normal lifecycle continues
Visual Flow of a Liveness Probe
Key takeaway:
Liveness failures indicate an unrecoverable state and trigger a restart.
Visual Flow of a Readiness Probe
Key takeaway:
Readiness failures do not kill containers — they only stop traffic.
Common Probe Configuration Options
| Field | Description |
|---|---|
| initialDelaySeconds | Delay before first execution |
| periodSeconds | Interval between checks |
| timeoutSeconds | Probe execution timeout |
| failureThreshold | Failures before action |
| successThreshold | Successes required to recover |
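For orientation, the fragment below sets every field from the table explicitly to its Kubernetes default. The `tcpSocket` check on port `5432` is only an illustrative choice of probe mechanism, not a recommendation.

```yaml
# Fragment of a container spec; each field is set to its default value.
readinessProbe:
  tcpSocket:
    port: 5432               # hypothetical port; succeeds if a TCP connection opens
  initialDelaySeconds: 0     # default 0: start probing right away
  periodSeconds: 10          # default 10, minimum 1
  timeoutSeconds: 1          # default 1, minimum 1
  failureThreshold: 3        # default 3, minimum 1
  successThreshold: 1        # default 1; must be 1 for liveness and startup probes
```

With these defaults, a container is acted upon after three consecutive failed checks spaced ten seconds apart.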
Common Mistakes and Anti‑Patterns
- Using database checks in liveness probes
- Reusing the same endpoint for all probes
- Making probes too slow or expensive
- Forgetting startup probes for slow applications
These mistakes often cause restart loops or unnecessary downtime.
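One way to avoid these mistakes is sketched in the Pod manifest below. The Pod name, image, port, and the three endpoint paths are hypothetical; the assumption is that `/livez` checks only the process itself while `/readyz` also checks dependencies such as the database.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo                                # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/demo-app:1.0    # hypothetical image
      ports:
        - containerPort: 8080
      startupProbe:            # gives a slow start up to 30 x 10 s = 300 s
        httpGet:
          path: /startupz
          port: 8080
        periodSeconds: 10
        failureThreshold: 30
      livenessProbe:           # cheap, in-process check only; no database call
        httpGet:
          path: /livez
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:          # dependency checks belong here, not in liveness
        httpGet:
          path: /readyz
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
```

With this split, a database outage only takes the Pod out of rotation; it does not restart a container that would come back just as unable to reach the database.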
Summary
| Probe | Question Answered | Action on Failure |
|---|---|---|
| Startup | Did startup finish? | Restart container |
| Liveness | Is it still healthy? | Restart container |
| Readiness | Can it receive traffic? | Remove Pod from Service endpoints |
Proper probe design enables Kubernetes to:
- Self-heal automatically
- Minimize user impact
- Run production workloads safely