aws iam container escape exploit

Containers should be completely isolated from their host environment and should be unable to access resources on their host without explicit configuration to do so.

However, because of how AWS injects IAM credentials into an instance via the Metadata API, pods are able to assume the IAM role associated with the node on which they are running. Preventing this requires overriding the route tables of the host or control plane or, within K8S, utilizing the optional KIAM add-on.

The AWS Metadata API - http://169.254.169.254 - is available from within instances to retrieve data about the system, but it also returns temporary credentials for the IAM role attached to the instance itself. Unless otherwise configured, aws CLI commands will query the metadata API to get an STS token and use it to execute authenticated calls to AWS as the IAM role attached to the instance.
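
The credential lookup described above can be sketched in a few lines of Python. This is a minimal illustration rather than the aws CLI's actual implementation, and it assumes the instance answers plain unauthenticated GET requests on the credentials path. The function name fetch_role_credentials and its injectable opener parameter are hypothetical, added only so the logic can be exercised without a real metadata endpoint.

```python
import json
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest/meta-data"
CREDS_PATH = "/iam/security-credentials/"


def fetch_role_credentials(opener=urllib.request.urlopen, timeout=2):
    """Return (role_name, credentials) for the IAM role attached to the instance.

    `opener` is a hypothetical injection point for this sketch; by default the
    request goes to the real metadata address.
    """
    # Step 1: the bare credentials path lists the name of the attached role.
    with opener(METADATA_BASE + CREDS_PATH, timeout=timeout) as resp:
        role_name = resp.read().decode().strip().splitlines()[0]

    # Step 2: appending the role name returns a JSON document containing
    # AccessKeyId, SecretAccessKey, Token, and Expiration.
    with opener(METADATA_BASE + CREDS_PATH + role_name, timeout=timeout) as resp:
        credentials = json.loads(resp.read().decode())

    return role_name, credentials
```

Nothing in this sketch depends on where the caller runs, which is the core of the problem: any process that can reach the address can make the same two requests.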

The security vulnerabilities of the metadata API have been well documented. Rather than running as a binary or process on the system, which could be moved, deleted, or limited based on user / group ownership, the metadata API exposes critical instance data and security credentials over an HTTP API on a link-local address. This leaves systems vulnerable to process - and in this case, container - escape exploits.

An escape exploit enables a virtualized machine, process, or container to break out of the virtualized environment and interact with the host system itself. Containers provide a layer of abstraction on top of a host system to logically separate applications, but also to sandbox processes and networks on hosts.

Consider an API running on an AWS instance which makes a network request to a user-specified URL and returns the data to the user. Without proper firewalls in place, the user would be able to enter the metadata URL and retrieve data about the system(s) running the API.
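
As a rough illustration of the kind of check such an API would need, the sketch below rejects URLs whose host is a literal link-local address, the range that contains 169.254.169.254. The function name targets_link_local is hypothetical. The check is deliberately incomplete: it does not resolve DNS names, follow redirects, or catch alternative encodings of the address, so it should be read as a starting point and not as a complete defense.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_link_local(url):
    """Return True if the URL's host is a literal link-local IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        # 169.254.0.0/16 (and fe80::/10 for IPv6) are the link-local ranges.
        return ipaddress.ip_address(host).is_link_local
    except ValueError:
        # The host is a DNS name or another non-literal form; this sketch
        # does not resolve it, so a name pointing at the metadata address
        # would slip through.
        return False
```

An API handler could call targets_link_local(user_url) and refuse the request when it returns True, alongside the network-level controls discussed later.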

This vulnerability is tantamount to enabling users to run arbitrary shell commands on a server, except in this case, it effectively enables the user to run arbitrary commands against an entire cloud instance / account.

Docker creates a default bridge network with the host to enable outbound communication and, in swarm mode, a separate overlay network for container-to-container communication. The default bridge network utilizes the route table of the host; however, the Docker daemon must be specifically configured to utilize unique DNS servers if desired.

Since the aws binary uses the metadata API for IAM role binding, and because requests to the instance metadata API are handled at the hypervisor level, requests to the metadata API - even from within a sandboxed container - will be answered as if the caller were on the instance itself.

This is effectively a container escape exploit which enables processes within a container to execute actions in the AWS account in which the cluster is running, assuming the IAM role which is attached to the node running the container.

These actions are not limited to the pod, deployment, node, or even VPC, but extend to the entire AWS account, because IAM roles span all other resources and services in an account.

This exploit does not require any specific configuration of the host or container to be enabled. In actuality, it is the default configuration which enables the exploit, and additional configuration must be implemented to disable the vulnerability.

Even with the most restrictive IAM role, this can give processes running within pods contextual information about the AWS account which runs the cluster. An attacker could use this information to learn more about the cloud environment, craft a targeted attack against a vulnerable server outside of the affected node(s), or even jump from one machine to another and attack systems from inside the network.

A less-strict IAM role attached to the node would further enable an attacker to escape the container and execute actions against instances inside and outside the cluster, all by simply using the aws CLI as if they were an administrator in the system.

For example, with an IAM role attached to a private instance which enables ec2:AllocateAddress and ec2:AssociateAddress, an attacker would be able to create and attach an EIP to an instance and effectively make a once-private server public. With additional rights, an attacker would be able to start, stop, or even terminate servers in an account.
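
One way to check whether a node role carries this kind of risk is to scan its policy document for the actions named above. The sketch below is a simplified illustration: the function name allowed_risky_actions and the RISKY_ACTIONS tuple are hypothetical, and the check only looks at the Action field of Allow statements. It ignores Deny statements, NotAction, conditions, and resource scoping, so it can over-report.

```python
from fnmatch import fnmatchcase

# The two actions from the example above; extend as needed.
RISKY_ACTIONS = ("ec2:AllocateAddress", "ec2:AssociateAddress")


def allowed_risky_actions(policy, risky=RISKY_ACTIONS):
    """Return the risky actions matched by any Allow statement in `policy`."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    found = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        patterns = statement.get("Action", [])
        if isinstance(patterns, str):  # Action may be a string or a list
            patterns = [patterns]
        for pattern in patterns:
            for action in risky:
                # Action names are matched case-insensitively and may use
                # the * and ? wildcards.
                if fnmatchcase(action.lower(), pattern.lower()):
                    found.add(action)
    return sorted(found)
```

Passing the parsed JSON of a policy attached to the node role returns the subset of risky actions it appears to grant, for example through a broad pattern such as ec2:*.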

Ultimately, the most robust solution is to use a host-based firewall such as iptables or ufw to drop all outbound connections from the instance to the metadata API. While this may cause issues with some scripts which make use of the metadata API for configuration / contextual information, all data from the metadata API could otherwise be injected by an administrator to ensure scripts / applications operate properly, without the vulnerabilities associated with the API.

This approach would ensure all connections to the metadata API are blocked.
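
A minimal sketch of the firewall rules, expressed as iptables argument lists built in Python, is shown below. The function name metadata_drop_rules is hypothetical, and the choice of chains is an assumption: OUTPUT covers processes on the host itself, while FORWARD is included on the assumption that traffic from bridged containers is forwarded by the host and does not pass through OUTPUT. The sketch only builds the commands; it does not run them.

```python
METADATA_IP = "169.254.169.254"


def metadata_drop_rules(chains=("OUTPUT", "FORWARD")):
    """Build iptables commands that drop packets destined for the metadata API."""
    return [
        # -I inserts at the top of the chain so the rule is evaluated first.
        ["iptables", "-I", chain, "-d", METADATA_IP, "-j", "DROP"]
        for chain in chains
    ]
```

Each list can be passed to subprocess.run(rule, check=True) by a process running as root. Rules added this way do not survive a reboot unless they are saved by whatever persistence mechanism the host uses.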

When running vanilla Docker or another orchestrator such as Swarm, a reverse proxy can be launched on the instance(s) to intercept requests to the metadata API and return a modified response when the credentials endpoint is accessed.
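
The filtering half of such a proxy could look roughly like the sketch below, which uses only the Python standard library. The names is_blocked, MetadataFilter, and serve, the listen address, and port 8181 are all hypothetical, and returning a 403 is just one possible "modified response". Redirecting container traffic to the proxy (for example with a NAT rule) is outside the scope of the sketch and is assumed to be configured separately.

```python
import posixpath
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

UPSTREAM = "http://169.254.169.254"


def is_blocked(path):
    """Return True for paths under <version>/meta-data/iam, where credentials live."""
    # Drop the query string, decode escapes, and collapse "//" and ".." so
    # trivially obfuscated paths are still recognised.
    raw = unquote(path.split("?", 1)[0])
    clean = posixpath.normpath("/" + raw.lstrip("/"))
    parts = clean.split("/")[1:]  # parts[0] is the API version, e.g. "latest"
    return parts[1:3] == ["meta-data", "iam"]


class MetadataFilter(BaseHTTPRequestHandler):
    def do_GET(self):
        if is_blocked(self.path):
            self.send_error(403, "credentials endpoint blocked")
            return
        try:
            # Everything else is passed through to the real metadata API.
            with urllib.request.urlopen(UPSTREAM + self.path, timeout=2) as upstream:
                status, body = upstream.status, upstream.read()
        except urllib.error.HTTPError as err:
            self.send_error(err.code)
            return
        except urllib.error.URLError:
            self.send_error(502, "metadata API unreachable")
            return
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="0.0.0.0", port=8181):
    """Run the filter until interrupted (address and port are placeholders)."""
    HTTPServer((host, port), MetadataFilter).serve_forever()
```

Only GET is handled here; a production proxy would also need to decide how to treat other methods and headers, and to avoid intercepting its own upstream requests.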

When running a K8S cluster, the kiam add-on will intercept all IAM calls from within pods and replace the node IAM role with the role specified for the requesting pod(s). kiam requires separate kiam nodes and a kiam agent pod running on each node; however, it provides a very robust IAM proxying, modification, and firewalling mechanism.

However, the need for an entire cluster add-on just to protect an AWS account from a container escape exploit is a bit untenable, and it would be better if the metadata API were replaced with a system-level binary / executable. This would ensure that only authorized system administrators could access the metadata service, and if required, user groups and file permissions could be utilized to further limit access to the service.

last updated 2022-08-20