Canonical Texts
kubectl Debugging Workflows That Actually Matter
kubectl is not a spellbook. It is an instrument. Learn a small number of workflows that reliably surface truth when the cluster is stressed.
Authored as doctrine; evaluated as systems craft.
Doctrine
Under pressure, operators thrash because they lack a sequence. They run random commands, apply random YAML, and destroy the causal thread.
Kubblai doctrine: evidence first; smallest change; explicit verification; memorialize the lesson.
- Prefer `describe` + events before logs when the failure might be platform-level.
- Treat `rollout status` as a gate; don’t assume progress.
- Use `auth can-i` to prove RBAC instead of arguing about it.
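The "events before logs" habit can be sketched as a small helper that pulls only the events referencing one pod, rather than every event in the namespace. The function name `pod_events` and its argument order are hypothetical; the `involvedObject.*` field selectors and `--sort-by` are standard `kubectl get events` options.

```shell
# pod_events NS POD: print events that reference one pod, oldest first.
# Hypothetical helper; read-only, and scoped to a single object.
pod_events() {
  local ns="$1" pod="$2"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```

Called as `pod_events payments api-7d9f` (both names are placeholders), it should show scheduling, image pull, and probe failures for that pod before you reach for container logs.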
The minimal workflow
A sequence that covers most incidents without expanding blast radius.
```shell
kubectl get pods -n <ns> -o wide
kubectl describe pod <pod> -n <ns>
kubectl get events -n <ns> --sort-by=.lastTimestamp | tail -n 40
# --previous reads the last terminated container; drop it if the pod has not restarted
kubectl logs <pod> -n <ns> --previous --all-containers=true
```
Rollouts and reversibility
Rollouts are controlled change. Make them explicit.
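One way to treat `rollout status` as a gate is to bound it with `--timeout` and stop on failure. The sketch below assumes a Deployment; the function name `rollout_gate` and the 120s default are illustrative choices, not values from this text. It deliberately does not roll back on its own, so `rollout undo` stays a human decision.

```shell
# rollout_gate NS DEPLOY [TIMEOUT]: wait for a rollout to finish, or fail loudly.
# On failure it prints the revision history and returns non-zero; it never undoes anything.
rollout_gate() {
  local ns="$1" deploy="$2" timeout="${3:-120s}"
  if kubectl rollout status "deploy/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout complete: $deploy"
  else
    echo "rollout did not complete within $timeout: $deploy" >&2
    kubectl rollout history "deploy/$deploy" -n "$ns"
    return 1
  fi
}
```

In a pipeline, `rollout_gate shop web 5m || exit 1` (placeholder names) would block later steps until the rollout has either finished or visibly failed.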
```shell
kubectl rollout status deploy/<name> -n <ns>
kubectl rollout history deploy/<name> -n <ns>
# rollback posture (use deliberately):
kubectl rollout undo deploy/<name> -n <ns>
```
Targeted YAML inspection
You rarely need the entire object. You need the governing fields: requests, probes, selectors, and references. Extract them and keep the investigation narrow.
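A structural way to extract governing fields, sketched under assumptions, is `-o jsonpath`. The helper name `pod_governing_fields` is hypothetical, and the exact rendering of nested objects such as `.resources` varies between kubectl versions, so treat the output format as unspecified.

```shell
# pod_governing_fields NS POD: one line per container with name, resources, and probes.
# Hypothetical helper; fields that are not set print as empty columns.
pod_governing_fields() {
  local ns="$1" pod="$2"
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\t"}{.readinessProbe}{"\t"}{.livenessProbe}{"\n"}{end}'
}
```

This keeps the query to a single object and a handful of fields, which fits the goal of a narrow investigation.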
```shell
# rg is ripgrep; grep -nE with the same pattern works where rg is not installed
kubectl get pod <pod> -n <ns> -o yaml | rg -n "resources:|requests:|limits:|readinessProbe|livenessProbe|startupProbe|envFrom|secretKeyRef|configMapKeyRef"
```
RBAC proof
When permission is denied, prove it with an exact sentence: this identity, this verb, this resource, this namespace.
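The "exact sentence" can be sketched as a helper that asks the API server about one subject, verb, resource, and namespace. The name `can_i` is hypothetical. It relies on the `--as` impersonation flag, which only works if your own identity is allowed to impersonate, so a non-zero exit can mean "denied" or "the check itself failed".

```shell
# can_i NS SUBJECT VERB RESOURCE: check one exact permission for one subject.
# Hypothetical helper. Assumption: the caller holds impersonation rights for --as;
# without them, the DENIED branch also fires when the check itself errors.
can_i() {
  local ns="$1" subject="$2" verb="$3" resource="$4"
  if kubectl auth can-i "$verb" "$resource" -n "$ns" --as="$subject" >/dev/null 2>&1; then
    echo "ALLOWED: $subject can $verb $resource in $ns"
  else
    echo "DENIED: $subject cannot $verb $resource in $ns"
    return 1
  fi
}
```

For a service account, the subject takes the form `system:serviceaccount:<ns>:<name>`, for example `can_i shop system:serviceaccount:shop:deployer get pods` with placeholder names.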
```shell
kubectl auth whoami
kubectl auth can-i <verb> <resource> -n <ns>
```
Field notes
kubectl can become the incident when you list the world during API saturation. Prefer targeted queries. Prefer read-only diagnosis during systemic failure.
When the control plane is unhealthy, reduce churn. ‘Trying harder’ is not a strategy.
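A targeted, read-only query might look like the sketch below: one namespace, one label selector, a server-side field selector, and a bounded request time. The helper name `pods_not_running` and the 10s timeout are illustrative assumptions.

```shell
# pods_not_running NS SELECTOR: list only pods that are not in the Running phase.
# Hypothetical helper; filtering happens server-side and the request is time-bounded.
pods_not_running() {
  local ns="$1" selector="$2"
  kubectl get pods -n "$ns" -l "$selector" \
    "--field-selector=status.phase!=Running" \
    --request-timeout=10s
}
```

Compared with listing every pod across all namespaces, a call such as `pods_not_running shop app=web` (placeholder values) asks the API server for far less while it is already saturated.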
Canonical Link
Canonical URL: /library/kubectl-debugging-workflows-that-actually-matter
Related Readings
Advanced Disciplines
Debugging the Control Plane Under Pressure
The control plane fails quietly, then all at once. Debugging it requires you to reduce churn, read saturation signals, and avoid write amplification.
Canonical Texts
Incident Response as a Trial of Faith
Incidents reveal the true governance of your platform: who can act, what can be changed, and whether your system can recover with discipline.
Rites & Trials
Incident Doctrine for Platform Teams
Platform incidents are governance incidents. The doctrine must define authority, evidence, safe mitigations, and how memory becomes guardrail.
Canonical Texts
Observability for People Who Actually Carry the Pager
If observability does not change decisions during an incident, it is decoration. Signal must be tied to failure modes and owned by the people who respond.
Governance & Power
RBAC and the Governance of Power
RBAC is the cluster’s constitution. Poorly written, it becomes silent catastrophe during incident response.