Docker containers are the latest craze taking the world by storm. They give software vendors more control over how their software is executed, which reduces the amount of work that software hosters are responsible for. By shifting the burden of figuring out environment requirements onto the software vendor, certain critical decisions that improve security can be made once and then distributed to every end user. This lowers the cost of running more stable and secure software, because users no longer have to think about the intricacies of security and management, something they rarely take the time to invest in.
Docker containers have a number of different security mechanisms. I won’t go into detail on all of them here. If you’re interested in learning more, read the Docker security documentation page.
In the Linux kernel, each process has a set of capability flags that the kernel checks when the process makes certain privileged syscalls. Processes running as root automatically get certain capabilities assigned to them.
Some example capabilities:
- CAP_NET_BIND_SERVICE: Enables processes to bind to ports < 1024. By default, non-root processes can’t bind to these reserved ports. Dropping this capability prevents even root processes from binding to these ports.
- Many more are listed on the capabilities man page.
According to the principle of least privilege, running with fewer capabilities reduces the attack surface of a given piece of software.
Docker Compose files are a popular way to distribute an entire service stack to users. With Compose you describe one or more Docker containers in a YAML-based format. More information is available in the official docs. A little-used feature lets you specify which capabilities your service requires.
For example, this is the configuration that I use for running NGINX on my server:
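A minimal sketch of such a configuration follows. It is illustrative rather than an exact copy: the service name `web`, the official `nginx` image, and the port mappings are placeholders, and the capability names `CHOWN`, `NET_BIND_SERVICE`, `SETGID`, and `SETUID` are assumed to be the ones matching the behaviours described next.

```yaml
# Sketch only: service name, image, and port mappings are assumptions.
services:
  web:
    image: nginx
    ports:
      - "80:80"
      - "443:443"
    cap_drop:
      - ALL               # start from an empty capability set
    cap_add:
      - CHOWN             # change ownership of files such as access logs
      - NET_BIND_SERVICE  # bind to ports below 1024 (80 and 443)
      - SETGID            # change group when switching to the worker account
      - SETUID            # change user when switching to the worker account
```

Other images, or other NGINX configurations, may need a different list, so treat the four entries as a starting point to verify rather than a fixed answer.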
In this example, I use an explicit whitelist of capabilities instead of the default list that Docker provides, and enable only the minimal capabilities that are required. This list lets NGINX modify file permissions (for access logs), bind to ports 80 and 443, and change the process user account. The default whitelist is available in the Docker source code here. Compared with that default, this reduces the attack surface that a malicious actor can leverage.
Docker Compose is fully self-contained and doesn’t require users to make any changes to their environment before they start using it. Together, Docker Compose and capabilities are a low-cost way to start reducing the attack surface of an application. Every service owner should attempt to run their application with --cap-drop ALL, selectively enable capabilities until the application works, and then vend that list as a best practice.
Capabilities are a cheap way to begin improving security, but they only gate a fixed, coarse-grained set of privileged kernel operations, which makes fine-grained security control impossible. This is where mandatory access control and AppArmor shine. For distributions that support it (such as Ubuntu), AppArmor is an opt-in security model that lets you whitelist and/or blacklist what a specific program may do, such as which files it can access, which capabilities it can use, and what kind of network access it has. For example, you could confine a containerized application to a handful of file paths and limit the network access it is allowed. Docker supports running containers with specific AppArmor profiles. While this requires more work on the user’s side, security-conscious service vendors could vend an AppArmor profile along with their service for users to install. I plan to go into more detail on this in the future.
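As a rough illustration, attaching a profile from a Compose file could look like the fragment below. The profile name `docker-nginx` is hypothetical, as are the service name and image, and the profile is assumed to have been loaded on the host already (for example with `apparmor_parser`) before the container starts.

```yaml
# Sketch only: "docker-nginx" is a hypothetical profile name.
services:
  web:
    image: nginx
    security_opt:
      - apparmor=docker-nginx   # profile must already be loaded on the host
```

This corresponds to passing `--security-opt apparmor=docker-nginx` to `docker run`.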
Anybody who builds a Docker container should leverage the security model that Docker provides by running with the least privileges and capabilities possible, and then include that configuration in the files they vend, such as Docker Compose files. By doing this, all of your end users can take advantage of a slightly reduced attack surface with only minimal effort on your side. Capabilities are in no way foolproof, and one should never believe that they will significantly reduce the attack surface, but they are better than nothing.