Advanced Disciplines
The Scheduler Under Scarcity: Priority, Preemption, and Hard Choices
When capacity is insufficient, the scheduler becomes governance. Priority and preemption encode institutional values: who runs, who waits, and who is displaced.
Authored as doctrine; evaluated as systems craft.
Doctrine
Scarcity is inevitable. The question is whether scarcity is managed by policy or by outage. Kubernetes gives you tools—priority classes, preemption, quotas—but it will not decide your values.
Kubblai doctrine: write scarcity decisions down. Do not let them emerge from panic.
- Define workload tiers and assign priority intentionally.
- Protect control-plane-adjacent services (DNS, ingress, telemetry).
- Document preemption posture and test it under load.
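One way to write the tiers down is to keep them as data and generate the PriorityClass objects from that data. The following is a minimal sketch, not a prescribed layout: the tier names, the numeric values, and the helper functions are hypothetical. The only Kubernetes facts it relies on are the PriorityClass fields (value, globalDefault, preemptionPolicy, description) and the ceiling of 1,000,000,000 for user-defined classes.

```python
from dataclasses import dataclass

# Largest value Kubernetes accepts for a user-defined PriorityClass;
# values above this are reserved for the built-in system classes.
MAX_USER_PRIORITY = 1_000_000_000


@dataclass(frozen=True)
class Tier:
    name: str          # hypothetical tier name
    value: int         # becomes PriorityClass.value
    may_preempt: bool  # becomes PriorityClass.preemptionPolicy
    description: str


# Hypothetical tiers; the names and numbers are illustrative only.
TIERS = [
    Tier("platform-critical", 900_000, True, "DNS, ingress, telemetry"),
    Tier("tenant-production", 500_000, True, "customer-facing services"),
    Tier("tenant-batch", 100_000, False, "retryable batch work"),
]


def validate(tiers):
    """Raise ValueError if two tiers collide or a value exceeds the user range."""
    values = [t.value for t in tiers]
    if len(set(values)) != len(values):
        raise ValueError("two tiers share a priority value; the hierarchy is ambiguous")
    for t in tiers:
        if t.value > MAX_USER_PRIORITY:
            raise ValueError(f"{t.name}: value {t.value} exceeds the user-defined range")


def to_manifest(tier):
    """Render one tier as a PriorityClass manifest (a plain dict)."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": tier.name},
        "value": tier.value,
        "globalDefault": False,
        "preemptionPolicy": "PreemptLowerPriority" if tier.may_preempt else "Never",
        "description": tier.description,
    }
```

Keeping the tiers in one reviewed file means a change to the hierarchy is a change request, not a side effect of someone editing a Deployment.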
Priority classes are not labels
Priority is a mechanism, not a decoration. It affects scheduling order, eviction, and preemption decisions. In multi-tenant clusters, it becomes a political boundary.
A mature platform defines priority classes with governance: who can use them, and what they imply.
- Restrict high priorities with RBAC and admission policies.
- Require justification for ‘critical’ tiers; audit usage.
- Avoid priority inflation; it collapses the hierarchy.
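The governance rules above can be expressed as two small checks: one that decides whether a namespace may use a priority class, and one that reports how pods are distributed across tiers so inflation is visible. This is a sketch of the decision logic only. The allowlist, the namespace names, and both function names are assumptions; in a real cluster the check would sit behind an admission mechanism of your choosing.

```python
from collections import Counter

# Hypothetical allowlist: priority class name -> namespaces permitted to use it.
ALLOWED = {
    "platform-critical": {"kube-system", "platform"},
    "tenant-production": {"team-a", "team-b"},
}
# Hypothetical tiers that any namespace may use without review.
UNRESTRICTED = {"tenant-batch"}


def admit(namespace, priority_class):
    """Return (allowed, reason) for a pod requesting a priority class."""
    if priority_class in UNRESTRICTED:
        return True, "unrestricted tier"
    allowed = ALLOWED.get(priority_class)
    if allowed is None:
        return False, f"unknown priority class {priority_class!r}"
    if namespace in allowed:
        return True, "namespace is allowlisted"
    return False, f"namespace {namespace!r} may not use {priority_class!r}"


def tier_share(pod_priority_classes):
    """Fraction of pods in each tier; a growing top tier signals inflation."""
    counts = Counter(pod_priority_classes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}
```

Returning a reason alongside the decision makes denials auditable, which is the point of requiring justification for the critical tiers.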
Preemption: the operational cost
Preemption can restore liveness for critical workloads, but it also creates churn: evictions, restarts, cache misses, and cascading dependency failures.
The Order treats preemption as an emergency posture, not a default mode of operation.
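- Test preemption in staging with realistic workloads and PodDisruptionBudgets (PDBs); the scheduler honors PDBs only on a best-effort basis when it selects victims.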
- Monitor disruption costs: restart storms and latency spikes.
- Define which tenants/workloads are allowed to displace others.
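To reason about disruption cost before a staging test, it can help to model who would be displaced. The sketch below is a deliberately simplified, single-node, CPU-only model with hypothetical names; it is not the kube-scheduler's algorithm, which also weighs multiple nodes, PDBs, affinity, and graceful termination. It only captures the core rule that victims must have strictly lower priority than the pending pod.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    priority: int
    cpu_millis: int  # CPU request in millicores


def pick_victims(pending, running, free_cpu_millis):
    """Return the pods to evict so `pending` fits, or None if it cannot fit.

    Simplified model: one node, CPU only, lowest priority evicted first.
    """
    if pending.cpu_millis <= free_cpu_millis:
        return []  # fits without displacing anyone

    # Only strictly lower-priority pods may be displaced. Within a priority
    # level, take larger pods first so fewer workloads are disturbed.
    candidates = sorted(
        (p for p in running if p.priority < pending.priority),
        key=lambda p: (p.priority, -p.cpu_millis),
    )

    victims = []
    freed = free_cpu_millis
    for pod in candidates:
        victims.append(pod)
        freed += pod.cpu_millis
        if freed >= pending.cpu_millis:
            return victims
    return None  # even evicting every candidate would not make room
```

Running this against a snapshot of requests gives a rough count of restarts a single high-priority arrival could trigger, which is a useful number to have before the incident rather than during it.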
Fairness and quota under stress
Under scarcity, quotas become enforcement. Without quotas, one tenant’s churn can starve others. With quotas, you must manage exceptions and avoid hard failures during incidents.
Governance means making exceptions explicit and time-bound.
- Use ResourceQuota and LimitRange to encode fairness.
- Provide a reviewed path for temporary quota increases.
- Instrument the reasons pods remain Pending; an unschedulable pod is a governance signal, not only a capacity signal.
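A time-bound exception is easiest to enforce when the expiry is part of the record itself. The following sketch assumes a hypothetical exception record and a CPU-only quota; none of these names are Kubernetes API objects. It computes the quota that should be in force at a given moment, so an expired exception falls away without anyone remembering to revoke it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class QuotaException:
    namespace: str
    extra_cpu_millis: int
    expires_at: datetime  # exceptions are never open-ended
    approved_by: str      # who reviewed the request
    reason: str           # why it was granted


def effective_cpu_quota(namespace, base_cpu_millis, exceptions, now):
    """Base quota plus every unexpired exception granted to the namespace."""
    active = [
        e for e in exceptions
        if e.namespace == namespace and e.expires_at > now
    ]
    return base_cpu_millis + sum(e.extra_cpu_millis for e in active)
```

A reconciler could apply the computed value to the namespace's ResourceQuota on a schedule; that wiring is left out here because it depends on how your platform manages quota objects.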
Operator posture
Scarcity events should be run as incidents: classify, stabilize, decide, and preserve memory. The goal is to restore controlled scheduling, not to ‘get pods running’ at any cost.
A scarcity incident often reveals capacity lies: inaccurate requests, inherited anti-affinity defaults, and unexamined oversubscription assumptions.
- Freeze non-essential rollouts during scarcity.
- Identify the bottleneck (CPU, memory, GPU, ephemeral storage) and address it explicitly.
- Afterward, publish the capacity doctrine and enforce discipline on resource requests.
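Identifying the bottleneck can start with simple arithmetic: compare what is requested against what is allocatable, per resource, and look at the highest ratio. The sketch below assumes you have already aggregated both numbers across the cluster or node pool; the function name and the resource keys are hypothetical.

```python
def find_bottleneck(allocatable, requested):
    """Return (resource, ratio) for the most heavily requested resource.

    Both arguments map a resource name to a quantity in the same unit.
    """
    ratios = {
        resource: requested.get(resource, 0) / capacity
        for resource, capacity in allocatable.items()
        if capacity > 0
    }
    if not ratios:
        raise ValueError("no allocatable resources to compare")
    worst = max(ratios, key=ratios.get)
    return worst, ratios[worst]
```

Because the ratio is computed from requests, it reflects what the scheduler sees rather than actual usage. A large gap between the two is exactly the kind of capacity lie the post-incident review should surface.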
Canonical Link
Canonical URL: /library/the-scheduler-under-scarcity-priority-preemption-and-hard-choices
Related Readings
Advanced Disciplines
Capacity, Bin Packing, and the Lies We Tell the Scheduler
The scheduler is not a magician. It places pods based on the numbers you give it. When those numbers are lies, placement becomes a slow-motion incident.
Advanced Disciplines
The Scheduler and the Ethics of Placement
Placement is policy made physical. When you schedule, you are allocating failure domains, cost, and contention.
Advanced Disciplines
Taints, Tolerations, and the Law of Affinity
Affinity is desire; taints are refusal. Together they define where work may live and where it must never settle.
Advanced Disciplines
Cluster Autoscaling and the Economics of Expansion
Adding nodes is not ‘scale.’ It is a controlled expansion of failure domains, cost, and operational surface area.
Governance & Power
The Cost of Tenant Illusions in Shared Clusters
Shared clusters promise efficiency. Without real isolation, they deliver shared outages: quota fights, RBAC mistakes, policy coupling, and security ambiguity.