localhost is not localhost

Preface: anyone who has any level of networking experience should probably move on. This has no new information for you. This post is also not intended for someone looking to learn container networking.

As mentioned in the first post on this blog, it is not intended as a "how to" or "getting started" guide for any technology, but rather simply a collection of concepts, quirks, and edge cases found along the journey of the wholesale technological shift that is DevOps and associated philosophies. I felt obligated to write this after seeing some colleagues spend a few hours fighting against a container networking issue that was solved with a simple "have you tried quad zero?"

The tl;dr of this post is "quad zero is a shotgun, and sometimes that's a good thing".

And that brings us to the title of the post - when localhost is not actually localhost.

From a traditional systems perspective, "localhost" is an alias for "the system I am on".

But with today's world of system virtualization, what exactly is the system you are on? Is it the virtualized system which your terminal is currently running in? Is it the hypervisor which manages the cluster of VMs on which you are a single tenant? Or is it the physical hardware which actually runs your servers?

As we cloud engineers like to say, "the cloud is nothing more than someone else's computer" - and like one of my colleagues likes to add, "... in Reston, VA".

For the most part, virtualization platforms have kept up in ensuring that the common assumptions of system networking carry over into virtualized environments - however, as any reader of the posts on this blog would know, the deeper you get into cloud and virtualized networking, the more you find edge cases and situations where traditional network philosophies don't directly translate.

At this point, seasoned container network and DevNetOps engineers are probably pulling their hair out - see preface above.

However, I have found that even many seasoned Ops and DevOps engineers have trouble understanding how container networks operate. While they are essentially just traditional networks on a more granular scale, the notion of having networks and subnetworks within, rather than above, the traditional *nix userspace is sometimes difficult to grasp.

Container orchestrators create a bridge network between the host OS and the orchestrator's overlay network which allows internal-orchestrator addressing and service discovery, while also enabling a NAT'ed net path through the host OS interface(s).

This means that when binding ports to make a container accessible from the host through the bridge network, the container still respects its internal addressing and networking constraints.
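To make that concrete, here is a minimal sketch in Go (assuming a stock Docker-style bridge network - the interface names and addresses in the comments are illustrative) that lists the interfaces visible from inside a container:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// Inside a typical bridge-networked container this will show two
	// interfaces: "lo" (127.0.0.1) and something like "eth0" with a
	// bridge-allocated address such as 172.17.0.2. The host's own
	// interfaces are nowhere to be seen - the container only knows
	// about its own little network.
	ifaces, err := net.Interfaces()
	if err != nil {
		panic(err)
	}
	for _, iface := range ifaces {
		addrs, _ := iface.Addrs()
		fmt.Printf("%s: %v\n", iface.Name, addrs)
	}
}
```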

Anyone who has worked with containers (or virtualized environments in general) long enough will immediately know that "localhost" is relegated to the container / VM, which is independent of "localhost" on the host system / VM. This is by design - virtualized environments should theoretically have no concept or awareness of their host systems or architecture, and should operate as logically independent "traditional" systems within a traditional network topology.

This concept is generally easy enough to grasp. Where I have seen issues is when container beginners try to address a container from the host, or vice versa.

This is when "localhost" becomes an ambiguous concept which takes a bit of network knowledge to understand.

localhost is simply an entry in a hosts file which points to the system's loopback address - typically 127.0.0.1 for IPv4 and ::1 for IPv6.
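A quick way to see this for yourself - a minimal Go sketch, nothing container-specific about it:

```go
package main

import (
	"fmt"
	"log"
	"net"
)

func main() {
	// "localhost" resolves via the hosts file (or the resolver's
	// built-in default) to the loopback addresses - typically
	// 127.0.0.1 and/or ::1. Run this inside a container and it
	// resolves to the *container's* loopback, not the host's.
	addrs, err := net.LookupHost("localhost")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(addrs)
}
```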

When that system is a container, that means localhost is the localhost of the container, not of the host. This is a basic concept of container networking, but it must be stated for the following reason: when running a web server, many servers will default to attaching to localhost.

This means that the web or app server will be up - exec'ing into the container will elicit all the expected results - and yet it will be inaccessible from the host or through an exposed orchestrator service.

The issue at hand is that to the container and the service within it, it is attaching to its localhost, with no concept of the host. It never touches the container's bridge network interface, since it is following the loopback path that it is aware of.
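Here is a minimal Go sketch of that failure mode, assuming a server that defaults to loopback - the port number and the -p mapping mentioned in the comments are purely illustrative:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from inside the container")
	})

	// Bound to the container's loopback only. `curl localhost:8080`
	// from *inside* the container (e.g. after exec'ing in) works,
	// but traffic arriving on the bridge interface - which is what a
	// published port like `-p 8080:8080` delivers - never reaches it.
	log.Fatal(http.ListenAndServe("127.0.0.1:8080", nil))
}
```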

This is where the quad-zero "shotgun" comes in. For the uninitiated, quad zero is 0.0.0.0: written as a route, 0.0.0.0/0 is the default route that matches any destination no more specific route covers, and written as a bind address, 0.0.0.0 is the wildcard meaning "every interface on this system".

In a container, where IPs are dynamically allocated, are second-class citizens of the network, and change frequently, the route in and out of the container can change at a moment's notice.

Binding to quad zero is the most assured way to get traffic in and out of a container: the server attaches to every interface the container has, including the bridge network interface facing the host, so all ingress and egress traffic is handled properly.
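The same sketch as above, with the quad-zero fix applied (again, the port and address are illustrative):

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from inside the container")
	})

	// 0.0.0.0 is the wildcard: listen on every interface the
	// container has, loopback *and* the bridge-attached eth0.
	// Now a published port or an orchestrator service in front of
	// the container can actually reach the server.
	log.Fatal(http.ListenAndServe("0.0.0.0:8080", nil))
}
```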

Admittedly, the title of the post is a bit clickbait-y. localhost is still localhost, and loopback is still loopback. Maybe the title should have been "quad zero is the shotgun we all need" - either way, if your application server is clearly running in your container but still not accessible from your cluster service, make sure your server is binding to quad zero (0.0.0.0), and not attaching to the "localhost" loopback alias.

last updated 2022-08-20