Governance & Power
The Cost of Tenant Illusions in Shared Clusters
Shared clusters promise efficiency. Without real isolation, they deliver shared outages: quota fights, RBAC mistakes, policy coupling, and security ambiguity.
Authored as doctrine; evaluated as systems craft.
Doctrine
Namespaces are not isolation; they are a naming boundary. Multi-tenancy is a portfolio of controls: identity, policy, quotas, network boundaries, and operational procedures that prevent one tenant’s mistake from becoming everyone’s incident.
Kubblai doctrine: do not sell ‘isolation’ when you have only names.
- Define the tenant model: who owns namespaces, quotas, policy, and budgets.
- Treat quota as economics: resource usage is a cost, and that cost needs an owner and a budget.
- Instrument tenant-level blast radius: errors, latency, and capacity consumption.
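One way to make the tenant model and its blast radius concrete is to roll namespace-level signals up to the tenant that owns them. The sketch below is illustrative only: the `Tenant` record, its fields, and the shape of `samples` are assumptions made for this example, not part of any Kubernetes API, and the numbers would come from whatever monitoring the platform already runs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tenant:
    """Hypothetical tenant record: one owner, several namespaces, one CPU budget."""
    name: str
    owner: str
    namespaces: List[str] = field(default_factory=list)
    cpu_budget_cores: float = 0.0


def tenant_blast_radius(tenants: List[Tenant], samples: Dict[str, dict]) -> Dict[str, dict]:
    """Aggregate per-namespace samples into a per-tenant report.

    `samples` maps namespace -> {"requests": int, "errors": int, "cpu_cores": float}.
    Namespaces with no sample are counted as zero.
    """
    report = {}
    for t in tenants:
        requests = sum(samples.get(ns, {}).get("requests", 0) for ns in t.namespaces)
        errors = sum(samples.get(ns, {}).get("errors", 0) for ns in t.namespaces)
        cpu = sum(samples.get(ns, {}).get("cpu_cores", 0.0) for ns in t.namespaces)
        report[t.name] = {
            "owner": t.owner,
            # Guard against division by zero for idle tenants or unset budgets.
            "error_rate": errors / requests if requests else 0.0,
            "budget_used": cpu / t.cpu_budget_cores if t.cpu_budget_cores else 0.0,
        }
    return report
```

The point of the roll-up is that dashboards and alerts speak in tenants, not namespaces, so an incident can be attributed to an owner without a lookup step.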
Noisy neighbors are governance failures
When a tenant floods the API with churn, creates too many objects, or triggers eviction storms, that is a governance failure, not a moral failing. The platform must prevent it by design.
In shared clusters, fairness is policy, not goodwill.
- Use ResourceQuota and LimitRange to encode fairness.
- Use priority classes and preemption consciously; document which tenants may preempt others, and which must absorb scarcity.
- Use admission to reject pathological objects (huge env vars, giant ConfigMaps).
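The quota and admission bullets above can be sketched as plain manifest builders plus a size check. This is a minimal illustration, not a reference implementation: the object names (`tenant-quota`, `tenant-defaults`), the two byte limits, and the function names are hypothetical choices for this example. In a real cluster the check would sit behind a validating admission webhook or a policy engine; here it is only the decision function.

```python
MAX_CONFIGMAP_BYTES = 256 * 1024  # hypothetical platform limit
MAX_ENV_VALUE_BYTES = 4 * 1024    # hypothetical platform limit


def resource_quota(namespace, cpu, memory, pods, configmaps):
    """Build a ResourceQuota manifest capping compute requests and object counts."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        "spec": {"hard": {
            "requests.cpu": cpu,
            "requests.memory": memory,
            "pods": str(pods),
            "count/configmaps": str(configmaps),
        }},
    }


def limit_range(namespace, default_cpu, default_memory):
    """Build a LimitRange so containers without explicit values get defaults."""
    defaults = {"cpu": default_cpu, "memory": default_memory}
    return {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "tenant-defaults", "namespace": namespace},
        "spec": {"limits": [{
            "type": "Container",
            "default": dict(defaults),
            "defaultRequest": dict(defaults),
        }]},
    }


def admission_violations(obj):
    """Return a list of reasons to reject `obj`; an empty list means allow."""
    problems = []
    if obj.get("kind") == "ConfigMap":
        data = obj.get("data") or {}
        size = sum(len(k.encode()) + len(v.encode()) for k, v in data.items())
        if size > MAX_CONFIGMAP_BYTES:
            problems.append(f"ConfigMap data is {size} bytes, limit {MAX_CONFIGMAP_BYTES}")
    if obj.get("kind") == "Pod":
        for container in obj.get("spec", {}).get("containers", []):
            for env in container.get("env") or []:
                value = env.get("value") or ""
                if len(value.encode()) > MAX_ENV_VALUE_BYTES:
                    problems.append(
                        f"env {env.get('name')} in container {container.get('name')} is too large"
                    )
    return problems
```

Returning reasons rather than a boolean keeps the rejection message useful to the tenant, which matters when the policy is the platform's only voice.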
Identity boundaries and RBAC reality
RBAC is easy to misconfigure at scale. ‘View’ roles become write roles through aggregation, when a ClusterRole carrying the aggregation label contributes write verbs. Service accounts become human proxies. Break-glass becomes permanent access.
Tenant isolation requires identity posture: least privilege, audit, and structured exceptions.
- Avoid cluster-admin by default. Make break-glass explicit and time-bound.
- Standardize role templates; avoid bespoke RBAC per team.
- Audit role bindings continuously; treat drift as an incident precursor.
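A continuous audit of the kind described above can be reduced to a comparison between live bindings and an approved baseline. The sketch below assumes the bindings have already been fetched and decoded into dictionaries shaped like RoleBinding and ClusterRoleBinding objects. The `platform.example/expires` annotation is a hypothetical convention for marking time-bound break-glass access, not a Kubernetes feature.

```python
EXPIRY_ANNOTATION = "platform.example/expires"  # hypothetical break-glass marker


def binding_key(binding):
    """Identity of a binding: kind, namespace, role name, and sorted subjects."""
    subjects = tuple(sorted(
        (s["kind"], s.get("namespace", ""), s["name"])
        for s in binding.get("subjects") or []
    ))
    return (
        binding["kind"],
        binding["metadata"].get("namespace", ""),
        binding["roleRef"]["name"],
        subjects,
    )


def audit_bindings(live, approved):
    """Return (finding, binding name) pairs for drift and unbounded cluster-admin."""
    approved_keys = {binding_key(b) for b in approved}
    findings = []
    for b in live:
        name = b["metadata"]["name"]
        if binding_key(b) not in approved_keys:
            findings.append(("drift", name))
        grants_admin = (
            b["kind"] == "ClusterRoleBinding"
            and b["roleRef"]["name"] == "cluster-admin"
        )
        annotations = b["metadata"].get("annotations") or {}
        if grants_admin and EXPIRY_ANNOTATION not in annotations:
            findings.append(("unbounded-cluster-admin", name))
    return findings
```

Keying on role and subjects rather than on the binding's name means a renamed binding with the same grant is not reported, while a familiar name with a new subject is.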
Network policy is not a checkbox
A NetworkPolicy enforces communication boundaries only when the cluster's CNI plugin actually implements it, and only when services are designed with explicit trust boundaries.
In shared clusters, ambiguous connectivity becomes a security incident waiting for a timestamp.
- Define a default-deny posture for sensitive namespaces.
- Document ingress/egress paths; treat exceptions as artifacts with owners.
- Test policy with real traffic flows, not assumptions.
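As a starting point for the default-deny bullet above, the sketch below builds a deny-all NetworkPolicy and reports which sensitive namespaces lack one. The function names and the policy name `default-deny-all` are assumptions for this example. The check only inspects declared policy; it says nothing about whether the CNI enforces it, which is why testing with real traffic remains a separate step.

```python
def default_deny(namespace):
    """Build a NetworkPolicy that selects every pod and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        # Empty podSelector selects all pods; no ingress/egress rules means deny.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def is_default_deny(policy):
    """True if the policy selects all pods and permits nothing in either direction."""
    spec = policy.get("spec", {})
    return (
        spec.get("podSelector") == {}
        and set(spec.get("policyTypes", [])) == {"Ingress", "Egress"}
        and not spec.get("ingress")
        and not spec.get("egress")
    )


def namespaces_missing_default_deny(sensitive_namespaces, policies):
    """Return the sensitive namespaces that have no deny-all policy, sorted."""
    covered = {p["metadata"]["namespace"] for p in policies if is_default_deny(p)}
    return sorted(set(sensitive_namespaces) - covered)
```

Running `namespaces_missing_default_deny(["payments", "billing"], [default_deny("payments")])` returns `["billing"]`, which is the kind of finding that belongs in the same report as RBAC drift.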
Operational posture
Shared clusters multiply incident complexity. You need runbooks that include tenant communication, blast radius estimation, and safe throttling mechanisms.
If you cannot pause one tenant’s chaos without pausing the cluster, you do not have multi-tenancy.
- Maintain tenant-level dashboards and budgets.
- Provide ‘pause’ mechanisms for pathological controllers or workloads.
- Treat tenancy as a product: documentation, expectations, and enforcement.
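One possible shape for a per-tenant ‘pause’ is to scale that tenant's Deployments to zero while recording the previous replica count, so the action is reversible. The sketch below only computes the patch bodies; applying them is left to whatever client the platform uses. The annotation key and function names are hypothetical, and a controller that reconciles replica counts (an autoscaler, a GitOps tool) may undo the change, so a real mechanism would need to account for that.

```python
PAUSE_ANNOTATION = "platform.example/replicas-before-pause"  # hypothetical key


def pause_patches(deployments, tenant_namespaces):
    """Return (namespace, name, patch) for each running Deployment the tenant owns."""
    patches = []
    for d in deployments:
        ns = d["metadata"]["namespace"]
        replicas = d["spec"].get("replicas", 1)  # unset replicas defaults to 1
        if ns in tenant_namespaces and replicas > 0:
            patches.append((ns, d["metadata"]["name"], {
                "metadata": {"annotations": {PAUSE_ANNOTATION: str(replicas)}},
                "spec": {"replicas": 0},
            }))
    return patches


def resume_patches(deployments, tenant_namespaces):
    """Return patches that restore the replica counts recorded by pause_patches."""
    patches = []
    for d in deployments:
        ns = d["metadata"]["namespace"]
        saved = (d["metadata"].get("annotations") or {}).get(PAUSE_ANNOTATION)
        if ns in tenant_namespaces and saved is not None:
            patches.append((ns, d["metadata"]["name"], {
                # None serializes to null, which removes the key in a JSON merge patch.
                "metadata": {"annotations": {PAUSE_ANNOTATION: None}},
                "spec": {"replicas": int(saved)},
            }))
    return patches
```

Because the selection is by namespace ownership, other tenants' workloads are never touched, which is the property the test in the paragraph above asks for.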
Canonical Link
Canonical URL: /library/the-cost-of-tenant-illusions-in-shared-clusters