Advanced Disciplines

Resource Requests, Limits, and Scheduling Tradeoffs

Requests are promises. Limits are constraints. Misusing either creates clusters that lie about capacity and workloads that fail when load arrives.

Authored as doctrine; evaluated as systems craft.

Doctrine

The scheduler places pods based on requests. If requests are dishonest, scheduling is dishonest, and every downstream system inherits the lie: autoscaling, capacity planning, and reliability posture.

Kubblai doctrine: treat requests as economic inputs. Treat limits as risk constraints. Do not set them as rituals.

  • Requests drive placement; keep them aligned to observed steady-state usage.
  • Limits shape runtime failure modes (throttling, OOM) and must be tested under load.
  • Headroom is a policy decision; scarcity becomes governance.
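One way to keep a request tied to observed steady-state is to derive it from usage samples rather than from habit. The sketch below is a minimal illustration, not a recommendation: the function names, the 90th-percentile choice, and the 10% headroom are all assumptions, and the samples are expected to come from whatever metrics pipeline you already trust.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of integers (pct is 1..100)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Ceiling division in integer math avoids floating-point surprises.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def suggest_cpu_request_millicores(usage_millicores, pct=90, headroom_percent=10):
    """Hypothetical helper: chosen percentile of observed usage plus headroom.

    The percentile and headroom are policy inputs, not constants of nature.
    """
    base = percentile(usage_millicores, pct)
    return -(-base * (100 + headroom_percent) // 100)  # round up


# Ten illustrative samples in millicores; the 500 is a single burst.
samples = [100, 120, 110, 130, 500, 115, 125, 105, 118, 122]
request = suggest_cpu_request_millicores(samples)
```

A percentile keeps one burst from inflating the request; whether that burst deserves capacity is the headroom decision, made explicitly.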

QoS classes and eviction reality

Kubernetes assigns each pod a QoS class (Guaranteed, Burstable, or BestEffort) based on the requests and limits of its containers. Under node pressure, QoS influences eviction order. This is not a theoretical detail; it is how outages propagate.

If you operate multi-tenant clusters, QoS and quotas are part of your containment posture.

  • BestEffort is fragile under pressure; it should be intentional.
  • Burstable is common; tune with awareness of eviction behavior.
  • Guaranteed workloads are expensive; reserve them for real commitments.
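The class assignment is mechanical, which makes it easy to check before a manifest ships. The following is a simplified sketch of the rules, assuming a hypothetical input shape (a list of per-container dicts) and comparing quantities as plain strings, so "1" and "1000m" would be treated as different. It also ignores init containers and other details of the real implementation.

```python
def qos_class(containers):
    """Approximate the QoS class for a pod.

    containers: list of dicts with optional 'requests' and 'limits' dicts
    keyed by 'cpu' and 'memory'. This input shape is an assumption made
    for illustration; it is not the Kubernetes API object.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            req = requests.get(resource)
            lim = limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # When only a limit is set, the request defaults to the limit.
            effective_req = req if req is not None else lim
            if lim is None or effective_req != lim:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


sidecar_breaks_guarantee = qos_class([
    {"requests": {"cpu": "500m", "memory": "256Mi"},
     "limits": {"cpu": "500m", "memory": "256Mi"}},
    {"requests": {"cpu": "100m"}},  # one loose container downgrades the pod
])
```

The last example is the common surprise: a single sidecar without matching limits moves the whole pod to Burstable.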

CPU throttling and tail collapse

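CPU limits cause throttling. A limit is enforced as a quota of CPU time per scheduling period; a container that spends its quota early is paused until the next period begins. Under bursty load, throttling increases latency, which can cause readiness flapping, which can cause retry storms, which can cause incidents. This chain is common.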

If you set CPU limits, test under the load profile you claim to support.

  • Watch p95/p99 latency when CPU is constrained.
  • Avoid probe endpoints that share the same constrained thread pools as heavy work.
  • Tune limits alongside concurrency and queueing posture.
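The latency cost of a CPU limit can be estimated with arithmetic before any load test. The sketch below models a deliberately simple case: one thread, starting at a period boundary with a full quota, with nothing else running in the container. It assumes the default 100 ms CFS period; the function names are illustrative.

```python
CFS_PERIOD_US = 100_000  # default CFS period of 100 ms, in microseconds


def cfs_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time the container may consume per period under the given limit."""
    return limit_millicores * period_us // 1000


def wall_clock_us(cpu_work_us, limit_millicores, period_us=CFS_PERIOD_US):
    """Wall-clock time for one thread to finish cpu_work_us of CPU work.

    Simplifying assumptions: the work starts at a period boundary with a
    full quota, and no other thread in the container competes for it.
    """
    if cpu_work_us <= 0:
        return 0
    # A single thread cannot use more than one period's worth of CPU per period.
    per_period = min(cfs_quota_us(limit_millicores, period_us), period_us)
    if per_period <= 0:
        raise ValueError("limit too small to make progress in this model")
    # Periods in which the quota is fully spent before the work completes.
    exhausted_periods = (cpu_work_us - 1) // per_period
    remaining = cpu_work_us - exhausted_periods * per_period
    return exhausted_periods * period_us + remaining


# 200 ms of CPU work under a 500m limit: 50 ms of progress per 100 ms period.
throttled = wall_clock_us(200_000, 500)
unthrottled = wall_clock_us(200_000, 2000)
```

In this model the 500m limit stretches 200 ms of work to 350 ms of wall-clock time. Real services run many threads against a shared quota, so the effect on p99 is usually worse than the single-thread estimate.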

Memory limits and OOM behavior

Memory limits are enforced by killing: a container that exceeds its limit is terminated and reported as OOMKilled. If you set a limit below real peaks, you get OOMKilled during warm-up, cache fill, or rare load spikes. If you set no limit, you risk node-level pressure and eviction cascades.

Right-sizing requires measurement and humility: you are not guessing; you are budgeting.
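Budgeting can be made literal: take the measured peak, add an explicit margin, and check any proposed limit against it. This is a minimal sketch under stated assumptions. The helper names and the 20% headroom are hypothetical, and the parser handles only plain bytes and the binary suffixes Ki, Mi, and Gi, which is a subset of the quantity syntax Kubernetes accepts.

```python
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_memory(quantity):
    """Parse a memory quantity such as '512Mi' into bytes (subset of the syntax)."""
    for suffix, multiplier in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * multiplier
    return int(quantity)  # plain bytes


def budget_memory_limit(peak_samples_bytes, headroom_percent=20):
    """Hypothetical helper: observed peak plus an explicit headroom margin."""
    if not peak_samples_bytes:
        raise ValueError("measure first; there is nothing to budget from")
    peak = max(peak_samples_bytes)
    return -(-peak * (100 + headroom_percent) // 100)  # round up


def limit_covers_peak(limit_bytes, peak_samples_bytes):
    """True if the proposed limit is at least the highest observed peak."""
    return limit_bytes >= max(peak_samples_bytes)


# Illustrative peaks: steady state, cache fill, and a rare spike.
peaks = [parse_memory(q) for q in ("300Mi", "410Mi", "512Mi")]
limit = budget_memory_limit(peaks)
```

Unlike a CPU request, the budget here starts from the maximum, not a percentile: memory overruns are not throttled, so the rare spike is the one that matters.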

Field notes

The fastest way to ruin an otherwise healthy cluster is to ‘improve efficiency’ by trimming requests or squeezing limits without measurement. The system will obey you. The incident will be yours.

Cost doctrine is not simply density. It is stable density under SLO constraints.