You don’t have enough static analysis

Introduction

Pretty much every programming language out there has tools that statically analyze your source code and detect different problems. These problems range from simple things, like ensuring consistent casing for variable names in Java, to ruthlessly enforcing method-length limits in Ruby. If you’ve ever used one of these tools, they may seem overbearing and not worth the hassle, but they quickly prove their value once your application grows larger, gains multiple developers, or becomes business critical and can’t afford outages caused by trivial mistakes. Static analysis tools are a very low-cost way to improve the quality of a codebase.
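
As one concrete example of the kind of rule these tools enforce (this snippet is my own illustration, not from the post), RuboCop's configuration file can cap method length across a Ruby codebase:

# .rubocop.yml – fail the lint run when a method grows past 15 lines
Metrics/MethodLength:
  Max: 15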

Continue reading “You don’t have enough static analysis”

Structured and auditable changes to infrastructure

Note: I’m going to use AWS services for most of my examples in this post, but that’s only because I’m most familiar with them. The patterns below are not limited to AWS and can be applied to any cloud provider or self-hosted environment where similar capabilities exist.

Introduction

Every service depends on some amount of supporting infrastructure. This includes virtual servers (EC2 or otherwise), storage (e.g. S3, DynamoDB), load balancing, and so on; essentially, any resource your service uses that is not your direct business logic can be considered infrastructure. If you use continuous integration and change control for your business logic, then why would you not apply the same rules to your infrastructure?

Allowing and requiring developers to make changes through the UI introduces the risk that someone makes a mistake and brings down your production service. Continuing from my last post about infrastructure names, you could also make a mistake in any of your regional clones.
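
As a sketch of what the alternative looks like in practice (the template and bucket name below are illustrative, not from the original post), defining a resource in an AWS CloudFormation template means the change can go through the same code review and CI pipeline as your business logic:

AWSTemplateFormatVersion: '2010-09-09'
Description: Illustrative example of infrastructure defined as reviewable code
Resources:
  UploadsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-service-uploads  # hypothetical name
      VersioningConfiguration:
        Status: Enabled
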
Continue reading “Structured and auditable changes to infrastructure”

Vending Software Good Practices – Docker Security

Docker containers are the latest craze taking the world by storm. They give software vendors more control over how their software is executed, reducing the amount of work that the people hosting that software are responsible for. By shifting the burden of figuring out environment requirements onto the software vendor, certain critical decisions that improve security can be made once, and only once, and then distributed to end users. This lowers the cost barrier to more stable and secure software, since users no longer have to think about the intricacies of security and management, something they rarely take the time to invest in.

Docker containers have a number of different security mechanisms. I won’t go into detail on all of them here; if you’re interested in learning more, make sure to read the Docker security documentation page.

Capabilities

In the Linux kernel, each process has a set of capability flags that the kernel checks when the process makes certain privileged syscalls. Processes running as root automatically get certain capabilities assigned to them.

Some example capabilities:

  • CAP_NET_BIND_SERVICE – Enables a process to bind to ports < 1024. By default, non-root processes can’t bind to these reserved ports. Dropping this capability prevents even root processes from binding to them.
  • Even more are listed on the capabilities man page

According to the principle of least privilege, running with fewer capabilities reduces the attack surface of a given piece of software.
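
If you’re curious which capabilities a container actually ends up with, one way to check (a sketch; the alpine image and libcap’s capsh tool are my choices, not from the original post) is to read the effective capability bitmask and decode it:

# Print the effective capability bitmask of the process inside a default container
docker run --rm alpine grep CapEff /proc/self/status

# Decode a bitmask into capability names (capsh ships with libcap);
# the hex value below is only an example of what the previous command might print
capsh --decode=00000000a80425fb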

Docker compose.yml

Docker Compose files are a popular way to vend an entire service stack to users. With them you can describe one or more Docker containers in a YAML-based format; more information is available in the official docs. A little-used feature lets you specify exactly which capabilities your service requires.

For example, this is the configuration that I use for running NGINX on my server:

nginx:
  image: nginx:1.9.10
  cap_drop:
    - ALL
  cap_add:
    - CHOWN
    - DAC_OVERRIDE
    - NET_BIND_SERVICE
    - SETGID
    - SETUID

In this example, I whitelist capabilities explicitly instead of using the default list that Docker provides, enabling only the minimal set that is required. This list lets NGINX modify file ownership and permissions (for access logs), bind to ports 80 and 443, and change the process user account. The default whitelist is available in the Docker source code here. Compared to that default, we’re reducing the attack surface that a malicious actor can leverage.

Docker Compose is fully self-contained and doesn’t require the user to make any changes to their environment to start using it. Compose files plus capabilities are a low-cost way to start reducing the attack surface of an application. Every service owner should attempt to run their application with --cap-drop ALL, selectively enable capabilities until the application works, then vend that list as a best practice.
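
A minimal sketch of that iterative process on the command line (the image and the particular capabilities shown are just examples, not a recommendation from the post):

# Start from nothing: drop every capability and see what breaks
docker run --rm --cap-drop ALL nginx:1.9.10

# Add capabilities back one at a time until the service starts cleanly
docker run --rm --cap-drop ALL \
  --cap-add CHOWN --cap-add SETGID --cap-add SETUID \
  --cap-add NET_BIND_SERVICE \
  nginx:1.9.10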

AppArmor/Security Profiles

Capabilities are a cheap way to begin improving security, but they can only restrict a limited subset of kernel syscalls, so fine-grained security control is impossible with them alone. This is where mandatory access control and AppArmor come in. For distributions that support it (such as Ubuntu), AppArmor is an opt-in security model that enables you to whitelist and/or blacklist specific syscalls, along with the parameters of those syscalls. For example, you could configure a Docker container application to only be able to open TCP connections to specific IP ranges and ports. Docker supports running containers with specific AppArmor profiles. While this requires more work on the user’s side, security-conscious service vendors could vend an AppArmor profile along with their service that users could install. I plan to go into more detail on this in the future.
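
As a rough sketch of what a vended profile could look like (the profile name, rules, and paths below are my illustrative assumptions, not a tested policy), a vendor might ship something like this for users to load with apparmor_parser and reference when starting the container:

# /etc/apparmor.d/docker-nginx-example (hypothetical profile)
#include <tunables/global>

profile docker-nginx-example flags=(attach_disconnected,mediate_deleted) {
  #include <abstractions/base>

  # Only the capabilities NGINX actually needs
  capability chown,
  capability dac_override,
  capability net_bind_service,
  capability setgid,
  capability setuid,

  # Allow serving over TCP, deny raw sockets
  network inet tcp,
  deny network raw,

  # Read-only web content, write access only for logs
  /usr/share/nginx/html/** r,
  /var/log/nginx/** w,
}

The container would then be started with something like docker run --security-opt apparmor=docker-nginx-example so that the kernel enforces the profile.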

Conclusion

Anybody who builds a Docker container should leverage the security model that Docker provides by running with the fewest privileges and capabilities possible, then include that configuration in the vended artifacts, like Docker Compose files. By doing this, all of your end users will be able to take advantage of a reduced attack surface with only minimal effort on your side. Capabilities are in no way fool-proof, and one should never assume that they alone make an application safe, but they’re better than nothing.

Dynamic AWS resource discovery for one-click region spin-ups

Disclaimer: At the time of this article’s writing, I work at Amazon, but not in AWS. This article is based on my own research and ideas and is not the official position of Amazon. This article is not intended as marketing material for AWS, only as some architectural patterns for you to use if you do leverage AWS.

AWS provides a number of different resources that you can use to build services, including S3 buckets, SQS queues, etc. When you create a new instance of a resource, you must pick a name that usually has to be unique within a given namespace. Depending on your naming scheme, you may also end up embedding resource names in code or configuration files. This makes spinning up new regions difficult, because now you have to update configuration with names for every stage and region you might use. That may not seem like a big deal, but consider that you may have tens of different SQS queues, S3 buckets, etc. for each region and stage. This begins to combinatorially explode: you now have (# regions) * (# stages) * (# resources) different configuration definitions, which results in a lot of boilerplate.
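
To make the problem concrete, here’s a sketch of the kind of static, per-stage and per-region configuration this leads to (the stages, regions, and resource names are made up for illustration):

# Hypothetical configuration that must grow with every new region or stage
prod:
  us-east-1:
    orders_queue: orders-prod-us-east-1
    uploads_bucket: uploads-prod-us-east-1
  eu-west-1:
    orders_queue: orders-prod-eu-west-1
    uploads_bucket: uploads-prod-eu-west-1
beta:
  us-east-1:
    orders_queue: orders-beta-us-east-1
    uploads_bucket: uploads-beta-us-east-1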

But what if there was a better way?

Continue reading “Dynamic AWS resource discovery for one-click region spin-ups”

Fast development environments

Setting up new hosts file entries for every different web site that you develop is hard. This workflow lets you completely automate it. The first thing you’ll want to do is set up a wildcard DNS record that points to your host. This lets you dynamically set up new development websites without having to create new DNS records for each one of them. I created a fake internal-only TLD on my local network’s DNS server that automatically returns the IP address of my development VM for any query to *.devvm. If you don’t have access to that, you could re-use an actual domain and automatically forward something like *.dev.technowizardry.net to the VM. For example, I have an ASUS RT-AC68U router on my personal network, so I SSH’d to the router, typed vi /etc/dnsmasq.conf, then appended:
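
The exact line I added is in the full post, but a wildcard entry in dnsmasq generally looks like the following (the IP address here is a placeholder for the development VM, not the value from the post):

# Resolve any *.devvm query to the development VM
address=/devvm/192.168.1.50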

Continue reading “Fast development environments”