The Scheduler Under Scarcity: Priority, Preemption, and Hard Choices

When capacity is insufficient, the scheduler becomes governance. Priority and preemption encode institutional values: who runs, who waits, and who is displaced.

Authored as doctrine; evaluated as systems craft.

Doctrine

Scarcity is inevitable. The question is whether scarcity is managed by policy or by outage. Kubernetes gives you tools—priority classes, preemption, quotas—but it will not decide your values.

Kubblai doctrine: write scarcity decisions down. Do not let them emerge from panic.

  • Define workload tiers and assign priority intentionally.
  • Protect control-plane-adjacent services (DNS, ingress, telemetry).
  • Document preemption posture and test it under load.

Priority classes are not labels

Priority is a mechanism, not a decoration. It influences a pod's place in the scheduling queue, whether it may preempt lower-priority pods, and how the kubelet ranks it for eviction under node pressure. In multi-tenant clusters, it becomes a political boundary.

A mature platform defines priority classes with governance: who can use them, and what they imply.

  • Restrict high priorities with RBAC and admission policies.
  • Require justification for ‘critical’ tiers; audit usage.
  • Avoid priority inflation; it collapses the hierarchy.
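
The governance points above can be sketched as data plus a few checks. This is a minimal illustration, not a reference implementation: the tier names, values, namespace allow-lists, and the helpers `may_use` and `top_tier_share` are all hypothetical. Only the PriorityClass fields (`value`, `globalDefault`, `preemptionPolicy`, `description`) and the one-billion ceiling on user-defined values come from Kubernetes itself.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

# Kubernetes reserves values above one billion for built-in system classes.
MAX_USER_PRIORITY = 1_000_000_000


@dataclass(frozen=True)
class Tier:
    name: str                           # hypothetical tier name
    value: int                          # PriorityClass value; higher is scheduled first
    allowed_namespaces: FrozenSet[str]  # hypothetical allow-list; "*" means any namespace
    may_preempt: bool                   # may pods in this tier displace others?
    description: str


# Illustrative tiers only; names, values, and namespaces are assumptions.
TIERS = [
    Tier("platform-critical", 900_000,
         frozenset({"kube-system", "ingress", "telemetry"}), True,
         "DNS, ingress, telemetry. Requires written justification."),
    Tier("tenant-standard", 100_000, frozenset({"*"}), False,
         "Default for tenant services. Never displaces others."),
    Tier("batch-best-effort", 1_000, frozenset({"*"}), False,
         "Interruptible work. First to be displaced."),
]


def priority_class_manifest(tier: Tier) -> dict:
    """Render a tier as a scheduling.k8s.io/v1 PriorityClass object."""
    if tier.value > MAX_USER_PRIORITY:
        raise ValueError(f"{tier.name}: value exceeds the user-defined maximum")
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": tier.name},
        "value": tier.value,
        "globalDefault": False,
        "preemptionPolicy": "PreemptLowerPriority" if tier.may_preempt else "Never",
        "description": tier.description,
    }


def may_use(tier: Tier, namespace: str) -> bool:
    """The kind of rule an admission policy might enforce for a tier."""
    return "*" in tier.allowed_namespaces or namespace in tier.allowed_namespaces


def top_tier_share(pods_per_tier: Dict[str, int], top_tier: str) -> float:
    """Fraction of pods claiming the top tier; a rising share hints at inflation."""
    total = sum(pods_per_tier.values())
    return pods_per_tier.get(top_tier, 0) / total if total else 0.0
```

Auditing `top_tier_share` over time is one way to notice priority inflation before the hierarchy collapses. The threshold that counts as "too many critical pods" is a policy decision the sketch deliberately leaves open.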

Preemption: the operational cost

Preemption can restore liveness for critical workloads, but it also creates churn: evictions, restarts, cache misses, and cascading dependency failures.

The Order treats preemption as an emergency posture, not a default mode of operation.

  • Test preemption in staging with realistic workloads and PodDisruptionBudgets (PDBs).
  • Monitor disruption costs: restart storms and latency spikes.
  • Define which tenants/workloads are allowed to displace others.
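
To make the displacement cost concrete, here is a deliberately simplified, single-node model of victim selection. It is a sketch under stated assumptions, not the kube-scheduler's algorithm: it considers only CPU, ignores PDBs, graceful termination, and node ranking, and the `Pod` type and `pick_victims` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pod:
    name: str
    priority: int
    cpu_millis: int  # CPU request in millicores


def pick_victims(node_cpu_millis: int, running: List[Pod],
                 incoming: Pod) -> Optional[List[Pod]]:
    """Return the pods to evict so `incoming` fits on the node.

    Returns [] if it already fits and None if no set of strictly
    lower-priority pods frees enough CPU. Toy model: CPU only, one node.
    """
    free = node_cpu_millis - sum(p.cpu_millis for p in running)
    if free >= incoming.cpu_millis:
        return []

    # Only strictly lower-priority pods may be displaced. Take the lowest
    # priority first, and larger pods first within a priority, so that
    # fewer pods are disrupted.
    candidates = sorted(
        (p for p in running if p.priority < incoming.priority),
        key=lambda p: (p.priority, -p.cpu_millis),
    )

    victims: List[Pod] = []
    for pod in candidates:
        victims.append(pod)
        free += pod.cpu_millis
        if free >= incoming.cpu_millis:
            return victims
    return None
```

Even in this toy form the cost is visible: every name in the returned list is a restart, a cold cache, and possibly a dependent service that now fails. Running such a model against a snapshot of real requests is one cheap way to estimate disruption before testing it under load.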

Fairness and quota under stress

Under scarcity, quotas become enforcement. Without quotas, one tenant’s churn can starve others. With quotas, you must manage exceptions and avoid hard failures during incidents.

Governance means making exceptions explicit and time-bound.

  • Use ResourceQuota and LimitRange to encode fairness.
  • Provide a reviewed path for temporary quota increases.
  • Instrument the reasons pods stay Pending; an unschedulable pod is a governance signal.
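
An explicit, time-bound exception can be modeled as a record with an expiry and an approver, applied on top of the base quota. The sketch below is illustrative only: `QuotaException`, `effective_cpu_quota`, and `admit` are hypothetical names, it tracks CPU alone, and a real platform would express the result through ResourceQuota objects rather than an in-process check.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class QuotaException:
    tenant: str
    extra_cpu_millis: int
    expires_at: datetime  # the exception lapses on its own, with no cleanup step
    approved_by: str      # an exception without a reviewer is not governance


def effective_cpu_quota(tenant: str, base_quota: Dict[str, int],
                        exceptions: List[QuotaException], now: datetime) -> int:
    """Base quota plus any exceptions for this tenant that have not expired."""
    bonus = sum(
        e.extra_cpu_millis
        for e in exceptions
        if e.tenant == tenant and now < e.expires_at
    )
    return base_quota[tenant] + bonus


def admit(tenant: str, used: int, requested: int, base_quota: Dict[str, int],
          exceptions: List[QuotaException], now: datetime) -> bool:
    """Would this request keep the tenant within its effective quota?"""
    limit = effective_cpu_quota(tenant, base_quota, exceptions, now)
    return used + requested <= limit
```

Because the expiry is part of the record, the exception ends without anyone remembering to revoke it, which is the property "time-bound" is meant to guarantee.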

Operator posture

Scarcity events should be run as incidents: classify, stabilize, decide, and preserve memory. The goal is to restore controlled scheduling, not to ‘get pods running’ at any cost.

A scarcity incident often reveals capacity lies: resource requests that are inflated or missing, anti-affinity defaults that prevent packing, and oversubscription assumptions that no longer hold.

  • Freeze non-essential rollouts during scarcity.
  • Identify the bottleneck (CPU, memory, GPU, ephemeral storage) and address it explicitly.
  • Afterward, publish the capacity doctrine and enforce discipline around resource requests.
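
Identifying the bottleneck can start with simple arithmetic: compare summed requests with allocatable capacity for each resource and name the tightest one. This is a minimal sketch with hypothetical helper names and made-up numbers; it treats the cluster as one pool, so it will not show fragmentation across nodes.

```python
from typing import Dict, Tuple


def request_ratios(requested: Dict[str, float],
                   allocatable: Dict[str, float]) -> Dict[str, float]:
    """Requested / allocatable for each resource that has capacity."""
    return {
        resource: requested.get(resource, 0) / capacity
        for resource, capacity in allocatable.items()
        if capacity > 0
    }


def bottleneck(requested: Dict[str, float],
               allocatable: Dict[str, float]) -> Tuple[str, float]:
    """Return the resource with the highest request ratio and that ratio."""
    ratios = request_ratios(requested, allocatable)
    if not ratios:
        raise ValueError("no allocatable capacity reported")
    resource = max(ratios, key=ratios.get)
    return resource, ratios[resource]


# Made-up cluster totals: CPU in millicores, memory and storage in GiB.
allocatable = {"cpu": 10_000, "memory": 64, "gpu": 8, "ephemeral-storage": 500}
requested = {"cpu": 9_000, "memory": 30, "gpu": 8, "ephemeral-storage": 100}

tightest, ratio = bottleneck(requested, allocatable)
```

With these numbers the tightest resource is the GPU pool, even though CPU looks nearly as full. Adding CPU nodes would not have let a single pending GPU pod run, which is why the bottleneck is worth naming before acting.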