Sacred Systems
CNI as the Nervous System of the Cluster
Your CNI is not plumbing. It is a distributed system with its own control plane, performance ceiling, and failure modes.
Authored as doctrine; evaluated as operations.
Doctrine
Networking is where most clusters become honest. Latency, MTU, conntrack, DNS, and policy enforcement all surface here.
Kubblai doctrine: choose a CNI as an institutional decision. Then operate it like one.
Dataplane tradeoffs
Overlay vs routed models, iptables vs eBPF, kube-proxy modes, and service handling all change performance and debuggability.
A fast dataplane that you cannot debug under stress is a false economy.
- Overlay simplifies addressing, but encapsulation headers reduce the usable MTU and can hide pod traffic from underlay tooling.
- eBPF can improve performance and visibility, but increases kernel coupling.
- Service handling (iptables, IPVS, or eBPF) changes failure signatures.
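The MTU cost of an overlay is simple arithmetic, and it is worth writing down before it surfaces as stalled large transfers. The sketch below is illustrative only: the function name `pod_mtu` and the table are hypothetical, and the overhead values assume an IPv4 underlay (20 bytes for IP-in-IP, 50 bytes for VXLAN). Check the numbers against your own CNI and underlay before relying on them.

```python
# Assumed per-packet encapsulation overhead, in bytes, on an IPv4 underlay.
# These are illustrative defaults, not values read from any CNI.
ENCAP_OVERHEAD_IPV4 = {
    "none": 0,    # routed model: no encapsulation
    "ipip": 20,   # one extra outer IPv4 header
    "vxlan": 50,  # outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14)
}


def pod_mtu(node_mtu: int, encap: str) -> int:
    """Return the largest pod MTU that fits inside node_mtu after encapsulation."""
    if encap not in ENCAP_OVERHEAD_IPV4:
        raise ValueError(f"unknown encapsulation: {encap!r}")
    mtu = node_mtu - ENCAP_OVERHEAD_IPV4[encap]
    if mtu <= 0:
        raise ValueError("node MTU is too small for this encapsulation")
    return mtu


# Example: compare the models on a standard 1500-byte underlay.
# for encap in ENCAP_OVERHEAD_IPV4:
#     print(encap, pod_mtu(1500, encap))
```

If the pod interface MTU is left above this value, large packets are fragmented or dropped on the underlay, which tends to show up as requests that work for small payloads and hang for large ones.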
Operational realities
Many ‘application’ incidents in Kubernetes are networking incidents in disguise: DNS flaps, conntrack exhaustion, asymmetric routing, or policy drops.
If you cannot explain the packet path, you are operating blind.
- Document the packet path for a request.
- Know how to capture flow logs.
- Measure DNS latency and error rates.
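Measuring DNS latency and error rates does not need special tooling to start. The following is a minimal sketch, assuming you run it from inside a pod and that the system resolver (`socket.getaddrinfo`) is representative of what your applications use. The name `measure_dns` and the injectable `resolve` parameter are hypothetical, added here so the probe can be exercised without a live resolver.

```python
import socket
import statistics
import time
from typing import Callable, Dict, Optional


def _system_resolve(name: str) -> object:
    # Uses the libc resolver, so search domains and any local cache apply.
    return socket.getaddrinfo(name, None)


def measure_dns(
    name: str,
    attempts: int = 20,
    resolve: Callable[[str], object] = _system_resolve,
) -> Dict[str, Optional[float]]:
    """Resolve `name` repeatedly; report error rate and latency of successes."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    latencies_ms = []
    errors = 0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            errors += 1
            continue
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "attempts": float(attempts),
        "error_rate": errors / attempts,
        "p50_ms": statistics.median(latencies_ms) if latencies_ms else None,
        "max_ms": max(latencies_ms) if latencies_ms else None,
    }


# Example (assumes the default cluster domain):
# print(measure_dns("kubernetes.default.svc.cluster.local"))
```

A probe like this measures the whole resolver path, including search-domain expansion and any node-local cache, so treat its numbers as what the application experiences rather than as a benchmark of the DNS server itself.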
Policy enforcement and performance
NetworkPolicy is only as real as your enforcement. Enforcement has cost. At scale, naive policy sets can cause CPU pressure or rule explosion.
Budget policy like you budget CPU: with measurement and restraint.
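To make "rule explosion" concrete, here is a back-of-the-envelope model. It is a deliberate simplification and does not describe how any specific CNI programs its dataplane: it assumes a naive enforcer that expands a policy into one entry per selected pod, per peer IP, per port, and compares that with an enforcer that matches peers as a single set or identity. All names (`PolicyShape`, `naive_rule_count`, `set_based_rule_count`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyShape:
    selected_pods: int  # pods the policy applies to
    peer_pods: int      # pods allowed as sources
    ports: int          # allowed destination ports


def naive_rule_count(p: PolicyShape) -> int:
    """Assumed model: one entry per (selected pod, peer IP, port)."""
    return p.selected_pods * p.peer_pods * p.ports


def set_based_rule_count(p: PolicyShape) -> int:
    """Assumed model: peers collapse into one set, so one entry per (pod, port)."""
    return p.selected_pods * p.ports


# Example: 200 selected pods, 500 allowed peers, 3 ports.
# shape = PolicyShape(selected_pods=200, peer_pods=500, ports=3)
# print(naive_rule_count(shape), set_based_rule_count(shape))
```

The absolute numbers matter less than the shape of the growth: under the naive assumption the count scales with the product of both pod populations, which is why a policy set that is harmless in a small cluster can become a CPU problem in a large one. Measure what your own enforcement actually programs before trusting either estimate.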
Canonical Link
Canonical URL: /library/cni-as-the-nervous-system-of-the-cluster