Lab · Intermediate
Lab: Probe Semantics Under Load
Probes are not decoration. Practice designing probes that prevent silent outages and avoid restart storms.
Prerequisites
What you should have before you begin.
Workloads · Reliability · Operations
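- A Kubernetes cluster and a namespace you can deploy to
- kubectl installed and pointed at that cluster
- Basic familiarity with readiness, liveness, and startup probes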
Lab text
Follow the sequence. Change one thing at a time.
Goal
You will learn probe discipline: make readiness represent serving ability, make liveness conservative, and use startupProbe to protect warm-up.
- Separate readiness from liveness.
- Avoid dependency checks in liveness.
- Tune timeouts using measured latency.
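One way to express these three goals in a container spec is sketched below. The endpoint paths /readyz and /livez, port 8080, the output file name, and every numeric value are assumptions made for illustration, not values taken from this lab; substitute your application's endpoints and your own measured numbers. The script only writes the fragment to a file so you can read it or diff it against your current spec.

```shell
#!/usr/bin/env bash

# Hypothetical output location for this sketch.
out="${TMPDIR:-/tmp}/probe-sketch.yaml"

cat > "$out" <<'EOF'
# Fragment of a container spec. Endpoints, port, and numbers are assumptions.
startupProbe:            # protects warm-up; other probes wait until it succeeds
  httpGet:
    path: /livez
    port: 8080
  periodSeconds: 5
  failureThreshold: 24   # up to 24 * 5s = 120s allowed for startup
readinessProbe:          # traffic gate; may include dependency checks
  httpGet:
    path: /readyz
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 3
livenessProbe:           # conservative; local process health only
  httpGet:
    path: /livez
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 6    # about 60s of consecutive failures before a restart
EOF

echo "wrote $out"
```

The design choice to notice is that readiness and liveness hit different endpoints: a failing dependency can take the pod out of rotation without ever triggering a restart.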
Scenario
A workload behaves correctly in quiet conditions, but under load the liveness probe fails and restarts the pod. Stability collapses when you need it most.
Your job is to redesign probes so the system degrades safely.
Investigate the current probe configuration
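Read the pod spec and correlate the probe settings with the pod's events.
- Probe failures are recorded as Warning events with reason Unhealthy, so they should be visible in the Events section of kubectl describe.
- Distinguish timeouts (the probe got no answer within timeoutSeconds) from explicit failure status codes (the application answered and reported a problem). Timeouts under load usually point at the probe budget; explicit codes point at what the endpoint checks.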
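The timeout versus status-code distinction can be sketched as a small classifier over event messages. The function names are hypothetical, and the matched substrings reflect wording commonly seen in kubelet probe events; treat them as assumptions and adjust them to what your cluster actually prints. The sample input lines are made up for the sketch.

```shell
#!/usr/bin/env bash

# Classify one probe-failure event message as "timeout", "status", or "other".
# The substrings below are assumptions about kubelet message wording.
classify_probe_failure() {
  case "$1" in
    *"context deadline exceeded"*|*"Client.Timeout"*|*"timed out"*) echo "timeout" ;;
    *"statuscode:"*) echo "status" ;;
    *) echo "other" ;;
  esac
}

# Read messages on stdin, one per line, and print a count per class.
summarize_probe_failures() {
  while IFS= read -r line; do
    classify_probe_failure "$line"
  done | sort | uniq -c
}

# Illustrative input only; these two messages are invented examples.
printf '%s\n' \
  'Liveness probe failed: Get "http://10.0.0.5:8080/livez": context deadline exceeded' \
  'Readiness probe failed: HTTP probe failed with statuscode: 503' \
  | summarize_probe_failures

# Against a real pod (NS and POD are placeholders you must set), the input
# could come from the event stream instead:
#   kubectl get events -n "$NS" \
#     --field-selector "involvedObject.name=$POD,reason=Unhealthy" \
#     -o jsonpath='{range .items[*]}{.message}{"\n"}{end}' | summarize_probe_failures
```

A summary dominated by timeouts suggests the probe is starved under load; a summary dominated by status codes suggests the endpoint itself is checking too much.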
kubectl
shell
kubectl get pod <pod> -n <ns> -o yaml | rg -n "readinessProbe|livenessProbe|startupProbe|timeoutSeconds|failureThreshold|periodSeconds"
kubectl describe pod <pod> -n <ns>
Resolution patterns
Fix the semantics before tuning numbers.
- Move dependency checks into readiness (traffic gate).
- Add startupProbe if warm-up is slow or bursty.
- Make liveness check only what must be true for the process to keep running; avoid deep calls into downstream dependencies.
- Tune timeouts to observed p99 under expected load, not under idle conditions.
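The last pattern, tuning to observed p99, can be sketched as a small calculation. It assumes you already have probe-endpoint latency samples in whole milliseconds, one per line, collected under expected load (for example from a load-test tool or application metrics). The function names are hypothetical, the percentile uses the nearest-rank method, and the headroom factor of 2 is an assumption, not a Kubernetes rule.

```shell
#!/usr/bin/env bash

# Print the p99 (nearest-rank) of integer latency samples read from stdin.
p99_ms() {
  sort -n | awk '{ v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 99 + 99) / 100)   # integer form of ceil(0.99 * NR)
      print v[rank]
    }'
}

# Suggest a timeoutSeconds value: p99 times a headroom factor, rounded up
# to whole seconds, never below 1. The default factor of 2 is an assumption.
suggest_timeout_seconds() {
  local p99="$1" factor="${2:-2}"
  local secs=$(( (p99 * factor + 999) / 1000 ))
  if [ "$secs" -lt 1 ]; then secs=1; fi
  echo "$secs"
}

# Example with synthetic samples 1..100 ms: the p99 is 99 ms.
p99="$(seq 1 100 | p99_ms)"
echo "p99=${p99}ms suggested timeoutSeconds=$(suggest_timeout_seconds "$p99")"
```

Run the same calculation on samples taken at idle and under load; if the two suggestions differ, the loaded one is the number to configure.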
Canonical link
Canonical URL: /labs/probe-semantics-under-load