Reference

Lexicon

Definitions that hold in production. Each term includes: what it is, how it fails, what to inspect, and where to read next.

Terms

15 entries · concise and operational

Entries

Readable, copyable, and linked into the rest of the shrine.

Term

Pod

The smallest schedulable unit in Kubernetes.

Definition

A Pod is a wrapper around one or more containers that share networking and (optionally) storage. The scheduler places pods onto nodes; kubelet makes them real. Most operational symptoms begin at the pod layer: probes, resource pressure, image pulls, and container exits.

In practice

  • A pod is ephemeral by design; treat it as cattle, not a pet.
  • Readiness gates traffic; liveness controls restart behavior.
  • Pod status is testimony: look at conditions, container states, and events.
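
Reading that testimony can be sketched in code. The following is a minimal illustration, assuming the pod has been fetched as JSON (for example with kubectl get pod <pod> -n <ns> -o json) and parsed into a dict; the function name summarize_pod and the sample object are hypothetical, not part of any Kubernetes client.

```python
from typing import Any


def summarize_pod(pod: dict[str, Any]) -> dict[str, Any]:
    """Condense a Pod object (parsed JSON) into the fields worth reading first."""
    status = pod.get("status", {})
    # Conditions such as PodScheduled, Initialized, ContainersReady, Ready.
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    containers = []
    for cs in status.get("containerStatuses", []):
        state = cs.get("state", {})
        # Normally exactly one of waiting/running/terminated is set.
        current = next(iter(state), "unknown")
        last = cs.get("lastState", {}).get("terminated", {})
        containers.append({
            "name": cs.get("name"),
            "state": current,
            "reason": state.get(current, {}).get("reason"),
            "restarts": cs.get("restartCount", 0),
            "last_exit_reason": last.get("reason"),
            "last_exit_code": last.get("exitCode"),
        })
    return {"phase": status.get("phase"), "conditions": conditions, "containers": containers}


# Hypothetical sample, trimmed to the fields used above.
sample = {
    "status": {
        "phase": "Running",
        "conditions": [
            {"type": "PodScheduled", "status": "True"},
            {"type": "Ready", "status": "False"},
        ],
        "containerStatuses": [{
            "name": "app",
            "restartCount": 4,
            "state": {"waiting": {"reason": "CrashLoopBackOff"}},
            "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
        }],
    }
}

summary = summarize_pod(sample)
```

In this sample the phase alone says Running, while the conditions and last container state point at a memory kill; that gap is why the phase or STATUS column is not enough.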

What to inspect

  • kubectl get pod -n <ns> -o wide
  • kubectl describe pod <pod> -n <ns>
  • kubectl logs <pod> -n <ns> --previous

Common mistakes

  • Using liveness probes to model dependency readiness (causes restart storms).
  • Ignoring events and reading only the ‘STATUS’ column.
  • Treating a pod restart as a fix instead of a symptom.

Related reading

Term

Deployment

A rollout controller for stateless workloads.

Definition

A Deployment manages ReplicaSets and performs rolling updates toward a declared pod template. It is the default mechanism for safe, staged change—if probes and surge math reflect reality.

In practice

  • Deployments converge via ReplicaSet churn; watch conditions and rollout status.
  • maxSurge/maxUnavailable are capacity and risk decisions, not defaults.
  • Rollback is a bounded operation: it restores a previous pod template, not external state; data migrations can be one-way doors.
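
The surge math can be worked through before a rollout. This is a rough sketch, assuming the documented rounding for percentages (maxSurge rounds up, maxUnavailable rounds down) and the 25% defaults; the helper names resolve and rollout_bounds are hypothetical, and edge cases the controller handles (such as both values resolving to zero) are ignored.

```python
from typing import Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a percentage string like '25%' into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        pct = int(value[:-1])
        # Integer arithmetic avoids floating-point rounding surprises.
        if round_up:
            return -(-replicas * pct // 100)
        return replicas * pct // 100
    return int(value)


def rollout_bounds(replicas: int, max_surge="25%", max_unavailable="25%"):
    """Return (peak total pods, minimum available pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)               # rounds up
    unavailable = resolve(max_unavailable, replicas, round_up=False)  # rounds down
    return replicas + surge, replicas - unavailable


peak, floor = rollout_bounds(10)  # defaults: 25% / 25%
```

For 10 replicas with the defaults, the peak is 13 pods and the floor is 8. The peak is what the cluster needs headroom for; the floor is what serving capacity must survive on.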

What to inspect

  • kubectl rollout status deploy/<name> -n <ns>
  • kubectl describe deploy/<name> -n <ns>
  • kubectl get rs -n <ns> --sort-by=.metadata.creationTimestamp

Common mistakes

  • Shipping without readiness probes and calling it ‘safe’.
  • Over-surge rollouts that exceed headroom and create Pending storms.
  • Treating success as ‘pods are Running’ instead of verifying serving health.

Related reading

Term

Service

A stable network identity backed by endpoints.

Definition

A Service selects pods (endpoints) and provides stable DNS and virtual IP routing. If endpoints are empty, routing cannot work—regardless of DNS, ingress, or client code.

In practice

  • Selectors must match labels exactly; one mismatch yields zero endpoints.
  • Readiness gates whether a pod becomes an endpoint.
  • Ports must align: the Service port, its targetPort, and the port the container actually listens on.
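
The selector and readiness rules above can be modeled in a few lines. This is a simplified sketch of equality-based selection only; the functions selects and endpoints and the sample pods are hypothetical, and options such as publishNotReadyAddresses are left out.

```python
def selects(selector: dict, labels: dict) -> bool:
    """Equality-based selection: every selector key must be present with the same value."""
    return all(labels.get(k) == v for k, v in selector.items())


def endpoints(selector: dict, pods: list) -> list:
    """Pods that match the selector AND report Ready; only these receive traffic."""
    return [p["name"] for p in pods if selects(selector, p["labels"]) and p["ready"]]


# Hypothetical pods; names and labels are illustrative only.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}, "ready": True},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}, "ready": False},
    {"name": "api-1", "labels": {"app": "api"}, "ready": True},
]

good = endpoints({"app": "web"}, pods)   # web-2 matches but is not Ready
typo = endpoints({"app": "Web"}, pods)   # label values are case-sensitive
```

The second call returns an empty list: a single wrong character in the selector produces zero endpoints, which looks like a network failure from the client side.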

What to inspect

  • kubectl get svc,ep,endpointslices -n <ns>
  • kubectl describe svc <svc> -n <ns>
  • kubectl get pods -n <ns> --show-labels

Common mistakes

  • Assuming DNS is broken when endpoints are empty.
  • Misaligned targetPort causing silent connection failures.
  • Using overly broad selectors that accidentally pick the wrong pods.

Related reading

Term

Ingress

External HTTP routing into the cluster.

Definition

Ingress defines HTTP routing rules, but it is only as real as the ingress controller that implements it. Many ‘ingress problems’ are actually service endpoint problems, DNS problems, or controller health issues.

In practice

  • Ingress is a contract + an implementation (controller).
  • Debug from the edge inward: controller → service → endpoints → pods.
  • TLS and DNS failures often masquerade as routing failures.

What to inspect

  • kubectl get ingress -A
  • kubectl describe ingress <ing> -n <ns>
  • kubectl logs -n <ingress-ns> deploy/<controller> --tail=200

Common mistakes

  • Changing ingress YAML repeatedly without checking controller logs.
  • Ignoring service endpoints (routing cannot work without them).
  • Mistaking DNS resolution failures for ingress routing problems.

Related reading

Term

Control Plane

The governing system that stores intent and drives convergence.

Definition

The control plane is the API server, persistence (etcd), controllers, and scheduler. If it is slow or failing admission, everything else becomes unreliable: rollouts stall, controllers thrash, and recovery actions fail.

In practice

  • Treat API latency and admission health as first-order signals.
  • Backpressure and rate limits are real failure modes.
  • Many incidents are control-plane incidents wearing workload masks.
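
One way a client can respect backpressure is to space out its retries. The sketch below shows capped exponential backoff with full jitter; it is a generic technique rather than anything the source prescribes, and the function name backoff_delays and its default values are assumptions chosen for illustration.

```python
import random


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0, rng=None):
    """Yield one sleep duration per retry: capped exponential with full jitter."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter spreads retries so clients do not hit the API in lockstep.
        yield rng.uniform(0, ceiling)


# A seeded generator keeps this example reproducible; real callers would omit it.
delays = list(backoff_delays(8, rng=random.Random(0)))
```

A caller would sleep for each yielded delay between attempts. Scripts that poll the API in a tight loop during an incident add load at exactly the moment the control plane has the least to spare.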

What to inspect

  • kubectl get --raw '/readyz?verbose'
  • kubectl get events -A --sort-by=.lastTimestamp | tail -n 50

Common mistakes

  • Assuming workload failures are app-only during API latency spikes.
  • Ignoring webhook timeouts until deploys stop.
  • Saturating the API with list/watch loops during an incident.

Related reading

Term

Namespace

A logical boundary for names and policy.

Definition

A namespace scopes names (most objects), RBAC bindings, quotas, and many policies. Namespaces are a governance tool, not a security boundary by themselves; isolation requires policy, identity discipline, and network posture.

In practice

  • Use namespaces to express ownership and blast radius.
  • Apply quotas and policy baselines per namespace.
  • Keep exceptions explicit and reviewed.

What to inspect

  • kubectl get ns
  • kubectl get resourcequota,limitrange -n <ns>
  • kubectl auth can-i --list -n <ns>

Common mistakes

  • Treating namespaces as tenant isolation without enforcement.
  • Allowing unowned namespaces to persist indefinitely.
  • Mixing unrelated workloads that should not share failure domains.

Related reading

Term

Node

A machine that runs pods under kubelet control.

Definition

A node provides compute, memory, and local runtime state. Scheduling is placement onto nodes; reliability depends on node health, pressure signals, and correct capacity reporting.

In practice

  • Treat nodes as failure domains.
  • Watch pressure conditions and eviction behavior.
  • Separate node pools by workload class and risk.

What to inspect

  • kubectl get nodes -o wide
  • kubectl describe node <node>
  • kubectl top nodes

Common mistakes

  • Ignoring node pressure until evictions cascade.
  • Assuming all nodes are interchangeable when topology differs.
  • Mixing sensitive and noisy workloads on the same pool without governance.

Related reading

Term

kubelet

The node agent that makes pods real.

Definition

kubelet watches the API for pod assignments, pulls images, sets up volumes, and reports status. Many incidents present as ‘pods failing’ while the root cause is kubelet health, container runtime issues, or node pressure.

In practice

  • Know where kubelet logs live in your environment.
  • Correlate pod failures with node pressure and runtime errors.
  • Treat kubelet as critical infrastructure.

What to inspect

  • kubectl describe node <node>
  • kubectl get events -A --sort-by=.lastTimestamp | tail -n 50
  • kubectl describe pod <pod> -n <ns>

Common mistakes

  • Restarting workloads repeatedly instead of fixing node/runtime.
  • Ignoring image pull and volume mount errors at the node layer.
  • Assuming node NotReady is always a network issue.

Related reading

Term

Scheduler

The control-plane component that decides placement.

Definition

The scheduler chooses a node for each pod based on resource requests and constraints: taints/tolerations, affinity, topology, and priorities. Quotas are enforced earlier, at admission, so a quota rejection never reaches the scheduler. Under scarcity, scheduling becomes governance.

In practice

  • Requests must be honest; they drive placement.
  • Constraints must be satisfiable; avoid accidental impossibility.
  • Define priority/preemption posture before scarcity arrives.
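
The filtering side of placement can be sketched as a feasibility check. This is a heavily simplified model, assuming only CPU and memory requests and NoSchedule taints reduced to bare keys; the Node and PodSpec classes and the feasible function are hypothetical and ignore affinity, topology, scoring, and preemption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int    # allocatable minus already-requested CPU, in millicores
    free_mem_mi: int   # same for memory, in MiB
    taints: set = field(default_factory=set)  # NoSchedule taint keys (simplified)


@dataclass
class PodSpec:
    cpu_m: int
    mem_mi: int
    tolerations: set = field(default_factory=set)


def feasible(pod: PodSpec, nodes: list) -> dict:
    """Map each node to None if the pod fits, else to the reason it was filtered out."""
    verdicts = {}
    for n in nodes:
        if not n.taints <= pod.tolerations:
            verdicts[n.name] = "untolerated taint"
        elif pod.cpu_m > n.free_cpu_m:
            verdicts[n.name] = "insufficient cpu"
        elif pod.mem_mi > n.free_mem_mi:
            verdicts[n.name] = "insufficient memory"
        else:
            verdicts[n.name] = None
    return verdicts


# Hypothetical cluster: every node rejects the pod for a different reason.
nodes = [
    Node("a", free_cpu_m=500, free_mem_mi=1024),
    Node("b", free_cpu_m=4000, free_mem_mi=8192, taints={"gpu"}),
    Node("c", free_cpu_m=4000, free_mem_mi=512),
]
verdicts = feasible(PodSpec(cpu_m=1000, mem_mi=1024), nodes)
```

When every node maps to a reason, the pod stays Pending. The real scheduler reports comparable per-node reasons in the pod's events, which is the testimony to read before concluding the cluster is broken.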

What to inspect

  • kubectl describe pod <pod> -n <ns>
  • kubectl get events -n <ns> --sort-by=.lastTimestamp | tail -n 40
  • kubectl get nodes

Common mistakes

  • Using anti-affinity defaults that waste capacity.
  • Over-constraining placement until pods can never schedule.
  • Treating Pending as ‘the cluster is broken’ instead of reading scheduler testimony.

Related reading

Term

Reconciliation

The control-loop discipline that closes drift.

Definition

Reconciliation is the continuous process of comparing declared intent to observed state and taking repeatable steps toward convergence. It is the core operational model of Kubernetes controllers.

In practice

  • Measure time-to-converge, not just current state.
  • Design controllers to be idempotent and backpressure-aware.
  • Treat drift as an incident precursor.
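
The compare-and-converge loop can be shown with plain dictionaries. This is a toy sketch, assuming state is a mapping from object name to spec; reconcile and apply_actions are hypothetical names, and apply_actions stands in for real API calls.

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Return the actions that move observed state toward desired state.

    Idempotent: when observed already equals desired, the result is empty.
    """
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions: list, desired: dict, observed: dict) -> dict:
    """Apply actions to a copy of observed state (stand-in for real API calls)."""
    state = dict(observed)
    for verb, name in actions:
        if verb == "delete":
            state.pop(name)
        else:
            state[name] = desired[name]
    return state


desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
observed = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}

first = reconcile(desired, observed)
converged = apply_actions(first, desired, observed)
second = reconcile(desired, converged)  # nothing left to do
```

The second pass returning no actions is the property to test for: a loop that keeps emitting writes after convergence is the status spam described under common mistakes.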

What to inspect

  • kubectl get <kind> -o yaml
  • kubectl describe <kind> <name>
  • kubectl get events -A --sort-by=.lastTimestamp | tail -n 50

Common mistakes

  • Assuming ‘eventual consistency’ excuses persistent drift.
  • Building reconcile loops that thrash the API with status spam.
  • Not instrumenting controllers and then guessing under pressure.

Related reading

Term

ConfigMap

Configuration data injected into workloads.

Definition

ConfigMaps store non-secret configuration and can be mounted as files or injected as env vars. Operationally, the important questions are: how config changes are rolled out, how defaults are controlled, and how misconfigurations are detected quickly.

In practice

  • Version config changes like code.
  • Prefer explicit keys and validation over implicit defaults.
  • Treat config rollout as a change with a rollback story.
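
Versioning config like code implies being able to tell whether it changed. One common pattern, shown here as a sketch and not as something the source prescribes, is to compute a stable checksum of the ConfigMap data and record it alongside the workload so a config change is visible as a rollout. The function name config_checksum and the sample keys are hypothetical.

```python
import hashlib
import json


def config_checksum(data: dict) -> str:
    """Stable SHA-256 over ConfigMap data; key order does not affect the result."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


v1 = config_checksum({"LOG_LEVEL": "info", "TIMEOUT_S": "30"})
v1_reordered = config_checksum({"TIMEOUT_S": "30", "LOG_LEVEL": "info"})
v2 = config_checksum({"LOG_LEVEL": "debug", "TIMEOUT_S": "30"})
```

Identical data yields the same checksum regardless of key order, and any value change yields a different one, which gives a rollback a concrete version to return to.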

What to inspect

  • kubectl get configmap -n <ns>
  • kubectl describe configmap <cm> -n <ns>
  • kubectl get deploy/<name> -n <ns> -o yaml | rg -n "configMap"

Common mistakes

  • Embedding huge config blobs (creates API/storage debt).
  • Changing config without correlating to rollout behavior.
  • Mixing secrets into ConfigMaps.

Related reading

Term

Secret

Sensitive values distributed to workloads.

Definition

Secrets are API objects used to distribute sensitive values. Their safety depends on encryption at rest, RBAC, audit, node compromise assumptions, and how workloads consume and rotate credentials.

In practice

  • Minimize secret material in-cluster; prefer workload identity when possible.
  • Make rotation a designed workflow.
  • Audit access continuously.

What to inspect

  • kubectl auth can-i get secrets -n <ns>
  • kubectl get secret -n <ns>
  • kubectl describe sa <sa> -n <ns>

Common mistakes

  • Treating base64 as encryption.
  • Overbroad RBAC that makes secrets effectively public.
  • No rotation plan; credentials become permanent liabilities.
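
The base64 point is easy to demonstrate. A minimal sketch, using a hypothetical value as it would appear under .data in a Secret object:

```python
import base64

# Hypothetical value; this is what a Secret's .data field would hold.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Anyone who can read the object can reverse it; no key is involved.
decoded = base64.b64decode(encoded).decode("utf-8")
```

Decoding needs no key, so the only real protections are who can read the object (RBAC) and how it is stored (encryption at rest).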

Related reading

Term

ServiceAccount

A workload identity for in-cluster API access.

Definition

Service accounts represent workload identity within a namespace. Their permissions come from RBAC bindings. Mis-scoped bindings are a common source of both outages (forbidden) and security incidents (overgrant).

In practice

  • Bind permissions to the smallest scope possible.
  • Separate build/deploy identities from runtime identities.
  • Audit bindings regularly.

What to inspect

  • kubectl get sa -n <ns>
  • kubectl get rolebinding,clusterrolebinding -A -o wide | rg -n "<sa-name>" || true

Common mistakes

  • Using the default service account unintentionally.
  • Granting cluster-wide privileges for a namespace-only need.
  • Embedding long-lived credentials when workload identity exists.

Related reading

Term

RBAC

Authorization rules for the Kubernetes API.

Definition

RBAC defines who can do what to which resources, in which namespaces. RBAC errors are deterministic and should be diagnosed with exact subject/verb/resource tests.

In practice

  • Treat RBAC as governance, not convenience.
  • Use templates and review to avoid bespoke sprawl.
  • Make break-glass explicit and audited.
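
Because RBAC decisions are deterministic, the matching logic can be sketched directly. This is a simplified model covering only verbs, API groups, and resources with the * wildcard; rule_allows and allowed are hypothetical helpers, and resourceNames, subresources, and non-resource URLs are ignored.

```python
def rule_allows(rule: dict, verb: str, group: str, resource: str) -> bool:
    """True if a single rule covers the request; '*' matches anything."""
    def covers(permitted, wanted):
        return "*" in permitted or wanted in permitted
    return (covers(rule["verbs"], verb)
            and covers(rule["apiGroups"], group)
            and covers(rule["resources"], resource))


def allowed(rules: list, verb: str, group: str, resource: str) -> bool:
    """RBAC is additive: any matching rule grants access; there are no deny rules."""
    return any(rule_allows(r, verb, group, resource) for r in rules)


# The core API group is the empty string.
pod_reader = [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]}]
wildcard = [{"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}]

can_get = allowed(pod_reader, "get", "", "pods")
can_delete = allowed(pod_reader, "delete", "", "pods")
can_read_secrets = allowed(wildcard, "get", "", "secrets")
```

The wildcard rule quietly grants read access to secrets, which is why wildcard and aggregated rules deserve review even when no binding mentions secrets by name.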

What to inspect

  • kubectl auth whoami
  • kubectl auth can-i <verb> <resource> -n <ns>
  • kubectl get rolebinding,clusterrolebinding -A

Common mistakes

  • Fixing forbidden by granting cluster-admin.
  • Ignoring aggregated roles and wildcard rules.
  • Letting break-glass become permanent access.

Related reading

Term

Probe

Health checks that gate traffic and restarts.

Definition

Probes are contracts. Readiness controls whether a pod receives traffic; liveness controls restarts; a startup probe, when defined, holds both off until it succeeds. Incorrect probes are a frequent cause of self-inflicted incidents.

In practice

  • Use readiness to represent serving ability.
  • Use liveness only for irrecoverable deadlock, not transient slowness.
  • Use startupProbe for slow initialization.
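
Probe settings translate into time budgets that can be checked with arithmetic. The sketch below uses the Kubernetes defaults for periodSeconds, failureThreshold, and initialDelaySeconds; the Probe class and both helper functions are hypothetical, and the results are approximations because probe timeouts and scheduling jitter add slack.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    period_seconds: int = 10        # Kubernetes default
    failure_threshold: int = 3      # Kubernetes default
    initial_delay_seconds: int = 0  # Kubernetes default


def startup_budget(p: Probe) -> int:
    """Approximate seconds a container has to start before a startupProbe gives up."""
    return p.initial_delay_seconds + p.failure_threshold * p.period_seconds


def liveness_tolerance(p: Probe) -> int:
    """Approximate seconds of continuous failure before liveness triggers a restart."""
    return p.failure_threshold * p.period_seconds


slow_start = startup_budget(Probe(failure_threshold=30))  # 30 * 10 = 300 seconds
default_liveness = liveness_tolerance(Probe())            # 3 * 10 = 30 seconds
```

With default liveness settings and no startup probe, a container that takes longer than roughly 30 seconds to answer is restarted before it ever serves; raising the startup probe's failureThreshold buys time without loosening liveness.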

What to inspect

  • kubectl describe pod <pod> -n <ns>
  • kubectl get pod <pod> -n <ns> -o yaml | rg -n "readinessProbe|livenessProbe|startupProbe"

Common mistakes

  • Aggressive liveness that kills slow startups under load.
  • Readiness that depends on fragile external systems without justification.
  • Probes with timeouts too small for real-world latency.

Related reading