Sacred Systems

Services, Service Discovery, and Traffic Flow

A Service is stable naming plus an endpoint set. When the endpoint set is wrong—or withheld by readiness—traffic becomes myth. Prove endpoints first.

Authored as doctrine; evaluated as systems craft.

Doctrine

Services exist to give stable addressing over ephemeral pods. Kubernetes achieves this with selectors and an endpoint set. There is no magic beyond that: if endpoints are empty, routing cannot happen.

Kubblai doctrine: never debug ingress or DNS before you have proven endpoints.

  • Selectors define membership; labels satisfy membership.
  • Readiness controls eligibility; not-ready endpoints are withheld.
  • Ports must align: service port, targetPort, and container listener.
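The first two rules can be checked in one step: read the Service's selector, then list the pods that satisfy it along with their READY column. This is a minimal sketch; the function name `svc_pods` is hypothetical, and it assumes `kubectl` is pointed at the right cluster.

```shell
# Print the pods a Service's selector matches, with their Ready status.
# Usage: svc_pods <namespace> <service>
svc_pods() {
  local ns="$1" svc="$2" sel
  # Render .spec.selector as "k1=v1,k2=v2," using a Go template.
  sel="$(kubectl get svc "$svc" -n "$ns" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}')"
  sel="${sel%,}"   # drop the trailing comma
  if [ -z "$sel" ]; then
    echo "service $svc has no selector (or could not be read)" >&2
    return 1
  fi
  echo "selector: $sel"
  # Pods listed here are members; the READY column shows eligibility.
  kubectl get pods -n "$ns" -l "$sel" -o wide
}
```

If `svc_pods <ns> <svc>` prints the selector but no pods, membership is the problem. If it prints pods that are not Ready, eligibility is.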

EndpointSlices and readiness gating

EndpointSlices are the modern truth source for service backends. If they list no endpoints, suspect a selector mismatch. If they list endpoints marked not ready, readiness gating is withholding them.

If they are populated, you can test service reachability locally with port-forwarding to isolate ingress and DNS from the equation.

kubectl

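# Service, legacy Endpoints, and EndpointSlices side by side
kubectl get svc,ep,endpointslices -n <ns>
# Selector, ports, and the endpoints the Service currently resolves to
kubectl describe svc <svc> -n <ns>
# Pods the selector should match, with Ready status and pod IPs
kubectl get pods -n <ns> -l app=<label> -o wide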
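To prove endpoints for one Service rather than the whole namespace, the slices can be filtered by the `kubernetes.io/service-name` label and reduced to address plus ready condition. The sketch below also wraps the port-forward test. The function names are hypothetical, the two-second wait is a crude assumption, and the probe assumes the backend speaks plain HTTP on `/`. Note that `kubectl port-forward svc/...` tunnels to a single pod behind the Service, so it proves a backend and its port mapping, not cluster IP routing.

```shell
# List each endpoint address behind a Service with its ready condition.
# Usage: svc_endpoints <namespace> <service>
svc_endpoints() {
  local ns="$1" svc="$2"
  kubectl get endpointslices -n "$ns" \
    -l "kubernetes.io/service-name=$svc" \
    -o jsonpath='{range .items[*].endpoints[*]}{.addresses[0]}{"\t"}{.conditions.ready}{"\n"}{end}'
}

# Count endpoints whose ready condition is true.
ready_count() {
  svc_endpoints "$1" "$2" | awk '$2 == "true" { n++ } END { print n + 0 }'
}

# Forward a local port to the Service and send one HTTP request through it.
# Usage: probe_service <namespace> <service> <service-port> [local-port]
probe_service() {
  local ns="$1" svc="$2" port="$3" lport="${4:-8080}" pf_pid rc
  kubectl port-forward -n "$ns" "svc/$svc" "$lport:$port" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # crude wait for the tunnel to come up; adjust as needed
  curl -sS --max-time 5 "http://127.0.0.1:$lport/"
  rc=$?
  kill "$pf_pid" 2>/dev/null
  wait "$pf_pid" 2>/dev/null
  return "$rc"
}
```

A `ready_count` of zero with a non-empty `svc_endpoints` listing points at readiness. An empty listing points at the selector.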

kube-proxy and the cost of abstraction

Service routing is implemented by kube-proxy (iptables/ipvs) or equivalent mechanisms. You rarely need to debug kube-proxy itself—but you do need to remember that service routing depends on node networking health and consistent endpoint data.

When service routing fails intermittently, investigate endpoint churn, readiness flapping, and node-level partitions before you rewrite YAML.
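A rough way to look for those three causes, assuming probe failures are recorded as events with reason `Unhealthy` (the kubelet's usual behaviour). The function names are hypothetical.

```shell
# Stream EndpointSlice changes for one Service; frequent updates suggest churn.
# Usage: watch_endpoints <namespace> <service>
watch_endpoints() {
  kubectl get endpointslices -n "$1" -l "kubernetes.io/service-name=$2" --watch
}

# Recent probe failures in the namespace, oldest first.
# Usage: probe_failures <namespace>
probe_failures() {
  kubectl get events -n "$1" --field-selector reason=Unhealthy --sort-by=.lastTimestamp
}

# Node status; a NotReady node can make routing look intermittent.
node_status() {
  kubectl get nodes -o wide
}
```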

Common failure modes

Most service failures are misalignments, not deep platform problems.

  • Selector mismatch (one label off).
  • Pods not Ready (readiness probe path or port wrong, or readiness gated on an unavailable dependency).
  • Port mismatch (service points at a port the pod never listens on).
  • Namespace mismatch (service cannot target pods in another namespace).
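Port mismatch is the easiest of these to miss by eye. A sketch that prints both sides for comparison follows. The function names are hypothetical. Keep in mind that `containerPort` is only what the pod spec declares, so the process may still listen elsewhere, and `targetPort` may be a port name rather than a number.

```shell
# Show each Service port and the targetPort it forwards to.
# Usage: svc_ports <namespace> <service>
svc_ports() {
  local ns="$1" svc="$2"
  kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.port}{" -> "}{.targetPort}{"\n"}{end}'
}

# Show the containerPorts declared by pods matching a label selector.
# Usage: pod_ports <namespace> <label-selector>
pod_ports() {
  local ns="$1" sel="$2"
  kubectl get pods -n "$ns" -l "$sel" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].ports[*].containerPort}{"\n"}{end}'
}
```

Because both commands take `-n`, running them against the same namespace also rules out the namespace mismatch.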

Field notes

Services become unreliable when readiness becomes theatre. If readiness lies, traffic will follow the lie. Treat readiness as an SLO-aligned gate: serving ability, not process existence.

When in doubt: test the service directly. Then test ingress. Then test DNS.
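The later two steps can be sketched as follows, under loud assumptions: the ingress is reachable over plain HTTP at an address you supply, the cluster domain is the default `cluster.local`, and the `busybox:1.36` image can be pulled. The function names are hypothetical.

```shell
# Request the ingress address directly, setting the Host header the rule expects.
# Usage: ingress_check <ingress-address> <host>
ingress_check() {
  local addr="$1" host="$2"
  curl -sS --max-time 5 -H "Host: $host" "http://$addr/"
}

# Resolve the Service name from a throwaway pod inside the cluster.
# Usage: dns_check <namespace> <service> [cluster-domain]
dns_check() {
  local ns="$1" svc="$2" domain="${3:-cluster.local}"
  kubectl run dns-check --rm -i --restart=Never --image=busybox:1.36 -n "$ns" \
    -- nslookup "$svc.$ns.svc.$domain"
}
```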