In my previous post series, I described how I installed my Kubernetes Home Lab using Calico and MetalLB. This worked great up until I started installing smart home software that expected to be able to do local network discovery. For example, Home Assistant and my Sonos control software both attempted to do subnet-local discovery using mDNS or broadcast packets. This did not work because the pods were running on the 192.168.4.0/24 subnet, while all of my physical devices were on 192.168.2.0/24.
This prevented Home Assistant from discovering any devices and had to be fixed.
Next up in the series, we’re going to manually configure all of the network settings to get our flat-network home lab. The flat network will not use any packet encapsulation, and all pods and services will be fully routable to and from the existing network.
As detailed in the previous post, I wanted a so-called flat network because packet encapsulation tunnels IP packets inside of other IP packets and creates a separate IP network that runs on top of my existing network. I wanted all nodes, pods, and services to be fully routable on my home network. Additionally, I had several Sonos speakers and other smart-home devices that I wanted to control from my k8s cluster, which required pods that ran on the same subnet as my other software.
Install CNI Plugin
The CNI (Container Network Interface) plugin is responsible for configuring the network adapter that each Kubernetes pod has. Each pod usually gets its own network namespace, isolated from the host’s main network adapter, so without a CNI plugin no pod could make any network calls. For more information, check out cni.dev or the K8s documentation.
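To give a concrete idea of what a CNI plugin consumes, here is a minimal, purely illustrative configuration for the reference bridge plugin. The name, bridge, and addresses are placeholders, and this is not the configuration we end up using later in the series:

{
  "cniVersion": "0.4.0",
  "name": "homelab",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": false,
  "ipam": {
    "type": "host-local",
    "subnet": "192.168.4.0/24",
    "gateway": "192.168.4.1"
  }
}

The container runtime hands this file to the plugin binary, which then creates and addresses the pod’s network interface.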
I recently set up a Kubernetes cluster home lab the hard way and wanted to share it with you. I set up the home lab so I could run my smart home software and learn more about different Kubernetes networking technologies.
This blog post is broken up into several sections. Feel free to skip directly to the section that applies to you.
When I started, I already had a few things in place:
I was already using Rancher as a UI to manage my Kubernetes clusters on my dedicated servers
I wanted a fully flat network; that means no packet encapsulation. (Packet encapsulation tunnels IP packets inside of other IP packets and creates a separate IP network that runs on top of my existing network.) I wanted all nodes, pods, and services to be fully routable on my home network. Additionally, I had several Sonos speakers and other smart-home devices that I wanted to control from my k8s cluster, which required pods that ran on the same IP network.
Alternatives
Docker Desktop and WSL2 are both great for developing Docker projects with the Docker CLI, but when you try to run Kubernetes you’ll quickly run into networking issues. WSL2 and Docker Desktop can’t easily expose services to the rest of your network because they use NAT’d network adapters (GitHub microsoft/WSL#4150). This means you can’t expose nodes or pods as devices on the network; they will always be NAT’d to the host’s IP address. That fails my flat-network requirement.
This project is still a work in progress, so this article serves as an introduction to the problem space and walks through how the code works.
In the past when I wrote different web applications, I used Ruby on Rails combined with the HAML template language. HAML is my favorite way to write HTML because it is an abstract representation of an HTML DOM combined with a hint of Python syntax.
Being an abstract representation means that it doesn’t have to directly correspond to what the resulting HTML looks like. This decoupling lets a HAML render engine reorganize the code into something cleaner and simpler.
Indentation can be handy when developing, but why waste the space when running in production? You could just delete all the spacing in the source code and check that in, but then your code is harder to read. Can we have the best of both worlds?
The current state of the world
I’ve started experimenting with the new .NET Core framework a lot because I like the framework and C# as a language. Unfortunately, HAML isn’t directly supported; the default render engine in ASP.NET MVC is just a low-level HTML renderer, which has the same problems highlighted above.
Instead, I wanted to see if I could build my own solution and how far I could push it with performance optimizations. Can we precompile the template into partial HTML streams? Can we optimize the HTML to be more friendly to Gzip? For example, <a class="foo bar" /> and <a class="bar foo" /> are semantically equivalent, so the classes can be ordered consistently and Gzip can compress them more efficiently.
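As a sketch of that idea (illustrative code, not the actual engine), the attribute renderer could sort class names before emitting them so that every element with the same set of classes serializes to the same bytes:

using System;
using System.Linq;

static class AttributeNormalizer
{
    // Illustrative helper: "bar foo" and "foo bar" both normalize to "bar foo",
    // so repeated elements become byte-identical and compress better under Gzip.
    public static string NormalizeClasses(string classValue)
    {
        var classes = classValue
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .OrderBy(c => c, StringComparer.Ordinal);
        return string.Join(" ", classes);
    }
}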
Fair warning, this will be prototype code and not ready for production quite yet.
Adding C# to the mix
I found a previous attempt at this called NHaml. There was quite a bit of work done on it, but it did not support .NET Core and seemed coupled to ASP.NET. I ended up borrowing the parsing logic (with modifications) and writing my own rendering engine.
But first, let’s see some results:
!!!
%html{ lang: 'en' }
  %head
    %title Hello world
    %meta{ charset: 'utf-8' }
    %meta{ content: 'width=device-width, initial-scale=1.0, maximum-scale=1.0', name: 'viewport' }
  %body
    .page-wrap{ class: DateTime.Now.ToString("yyyy"), d: 'bar', a: 'foo' }
      = DateTime.Now.ToString("yyyy-mm-dd")
      %h1= new Random().Next().ToString()
      %p= model.ToString()
      .content-pane.container
        - if (true)
          - if (1 > 0)
            %div really true
            %div Is True
          - else
            %div wat
        - if (false)
          %div Is False
    .modal-backdrop.in
This gets compiled into the following class, and the cached class is called for subsequent executions.
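The generated source is longer than this, so here is a trimmed, hypothetical sketch of the kind of class the engine emits; the class name and TextWriter plumbing are illustrative rather than the engine’s exact output:

// Hypothetical sketch of a generated template class -- not the engine's exact output.
using System;
using System.IO;

public class CompiledHamlTemplate
{
    // Static markup is pre-concatenated into string literals at compile time;
    // only the dynamic expressions are evaluated on each render.
    public void Render(TextWriter writer, object model)
    {
        writer.Write("<!DOCTYPE html>\n<html lang=\"en\"><head><title>Hello world</title>");
        writer.Write("<meta charset=\"utf-8\">");
        writer.Write("<meta content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0\" name=\"viewport\">");
        writer.Write("</head><body><div class=\"page-wrap ");
        writer.Write(DateTime.Now.ToString("yyyy"));
        writer.Write("\" a=\"foo\" d=\"bar\">");
        writer.Write(DateTime.Now.ToString("yyyy-mm-dd"));
        writer.Write("<h1>");
        writer.Write(new Random().Next().ToString());
        writer.Write("</h1><p>");
        writer.Write(model.ToString());
        writer.Write("</p>");
        // ... the conditionals and remaining elements follow the same pattern ...
        writer.Write("</body></html>");
    }
}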
At first, Elasticsearch may appear to be schemaless since you can add new fields any time you want, but every field in a document must match the mapping.
Dynamic Templates reduce boilerplate
How many times have you opened up a mapping file and seen something like this, where the same type definition is repeated over and over again?
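The original example is not shown here, so the following is an illustrative mapping of the shape being described, with made-up field names where every string field repeats the same keyword definition:

{
  "mappings": {
    "properties": {
      "first_name":  { "type": "keyword" },
      "last_name":   { "type": "keyword" },
      "city":        { "type": "keyword" },
      "country":     { "type": "keyword" },
      "status":      { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}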
It’s super easy to refactor this into an alternative where by default all string values are mapped as keyword, except for the specific field listed as “text”.
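A sketch of that refactoring using dynamic_templates, carrying over the made-up field names from above: any new string field defaults to keyword, and only description is declared explicitly as text.

{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keyword": {
          "match_mapping_type": "string",
          "mapping": { "type": "keyword" }
        }
      }
    ],
    "properties": {
      "description": { "type": "text" }
    }
  }
}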
For new fields, Elasticsearch can automatically identify which type to use, but it can be wrong or do unexpected things. For example, I’ve seen Elasticsearch accidentally identify a decimal value as a long because the first value to go into the index did not have any decimal points; every later document that did have a decimal point then failed to be indexed because it did not match. This is especially important if you have fields with a wide range of values (for example, user-controlled fields), because you can’t predict whether the first value is going to look like a number or a date when it should always be treated as a string.
MySQL and PostgreSQL can be a bit of a black box if you don’t take the time to configure metrics. How do you identify which queries are slow and need to be optimized? MySQL has the slow query log, but that requires a time threshold and only captures queries that run for longer than N seconds. What if you want to identify the most common queries, even if they are fast?
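As one example of closing that gap on the PostgreSQL side, the pg_stat_statements extension keeps per-statement counters, so you can rank queries by how often they run rather than how slowly. This assumes the extension is installed and enabled, and column names vary slightly between versions:

-- Most frequently executed statements, fast or not
SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;

MySQL’s performance_schema can answer the same question without relying on the slow log’s time threshold.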
Ruby on Rails recently launched support for compiling static assets, such as JavaScript, using Webpack. Among other things, Webpack is more powerful at JS compilation when compared to the previous Rails default of Sprockets. Integration with Rails is provided by the Webpacker gem. Several features that I was interested in leveraging were tree shaking and support for the NPM package repository. With Sprockets, common JS libraries such as ReactJS had to be imported using Gems such as react-rails or classnames-rails. This added friction to adding new dependencies and upgrading to new versions of dependencies.
A couple of my projects used react-rails to render React components on the server side using the legacy Sprockets system. This worked well, but I wanted to migrate to Webpacker to easily upgrade to the newest versions of React and React Bootstrap (previously I imported these using the reactbootstrap-rails gem, but it stopped being maintained with the launch of Webpacker). However, migrating React components to support Webpack required changes to every single file: adding ES6-style imports, file moves/renames, and scoping changes. This would have been too large to do all at once. What if there was a way to slowly migrate the JS code from Sprockets to Webpack, making components on either side available to the other?
Pretty much every programming language out there has tools that statically analyze your source code and detect different problems. These problems can range from simple things like ensuring that you have consistent casing for variable names in Java to ruthlessly enforcing method limits in Ruby. If you’ve ever used one of these tools, they may seem overbearing and not worth the hassle, but they will soon prove their value once your application becomes larger, has multiple developers, or is business critical and can’t afford outages caused by trivial mistakes. Static analysis tools are a super-low cost solution for improving the quality of a code-base.
Note: I’m going to use AWS services for most of my examples in this post, but that’s just because I’m most familiar with them; the patterns below are not limited to AWS and can be applied to any cloud provider or self-hosted environment where similar services exist.
Introduction
Every service has some amount of supporting infrastructure. This includes any virtual servers (EC2 or other), storage (e.g., S3, DynamoDB), load balancing, etc.; basically, any resource that your service uses that is not your direct business logic could be considered infrastructure. If you use continuous integration and change control for your business logic, why would you not apply the same rules to your infrastructure?
Docker containers are the latest craze taking the world by storm. They enable software vendors to have more control over how their software is executed, reducing the amount of work that the people hosting the software are responsible for. By shifting the burden of figuring out environment requirements onto the software vendor, certain critical decisions that help improve security can be made once, and only once, and distributed to end users. This lowers the cost barrier to more stable and secure software, since users no longer have to think about the intricacies of security and management, something they rarely take the time to invest in.
Docker containers have a number of different security mechanisms. I won’t go into detail on all of them; if you’re interested in learning more, make sure to read the Docker security documentation page.
Capabilities
In the Linux kernel, each process has a set of capability flags that the kernel checks when the process makes certain privileged syscalls. Processes running as root automatically get certain capabilities assigned to them.
Some example capabilities:
CAP_NET_BIND_SERVICE – Enables processes to bind to ports below 1024. By default, non-root processes can’t bind to these reserved ports. Dropping this capability prevents even root processes from binding to them.
According to the principle of least privilege, running with fewer capabilities reduces the attack surface of a given piece of software.
Docker compose.yml
Docker compose files are a popular way to vendor an entire service stack to users. With them, you can describe one or more Docker containers in a YAML-based format. More information is available in the official docs. A little-used feature lets you specify which capabilities your service requires.
For example, this is the configuration that I use for running NGINX on my server:
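The original compose snippet did not carry over here, so the following is a sketch of what such a service definition looks like, with the capability list inferred from the description below; the image tag and ports are illustrative:

version: "3"
services:
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
      - "443:443"
    cap_drop:
      - ALL
    cap_add:
      - CHOWN             # adjust ownership of the access/error log files
      - NET_BIND_SERVICE  # bind to ports 80 and 443
      - SETUID            # switch worker processes to an unprivileged user
      - SETGID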
In this example, I enable a whitelist of capabilities instead of using the default list that Docker provides, enabling only the minimal capabilities that are required. This list enables NGINX to modify file permissions (for access logs), bind to ports 80 and 443, and change the process user account. The default whitelist is available in the Docker source code here. Compared to that default, we’re reducing the attack surface that a malicious actor can leverage.
Docker compose is fully self-contained and doesn’t require the user to make any changes to their environment to start using it. Docker compose and capabilities are a low-cost way to start reducing the attack surface of an application. Every service owner should try running their application with --cap-drop ALL, selectively enable capabilities until the application works, and then vend that list as a best practice.
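On the command line, that iterative workflow looks something like this (the image name is just an example):

# Drop everything, then add back one capability at a time until the app works
docker run --cap-drop ALL --cap-add NET_BIND_SERVICE nginx:latest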
AppArmor/Security Profiles
Capabilities are a cheap way to begin improving security, but they can only restrict a limited subset of kernel syscalls, making fine-grained security control impossible. This is where mandatory access control systems such as AppArmor come in. For distributions that support it (such as Ubuntu), AppArmor is an opt-in security model that enables you to whitelist and/or blacklist specific syscalls, along with the parameters of those syscalls. For example, you could configure a Docker container application to only be able to open TCP connections to specific IP ranges and ports. Docker supports running containers with specific AppArmor profiles. While this requires more work on the user’s side, security-conscious service vendors could vend an AppArmor profile along with their service that users could install. I plan to go into more detail on this in the future.
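To make that concrete, here is a minimal, hypothetical profile sketch; the profile name and paths are placeholders, and a real profile needs more rules than this. Once loaded with apparmor_parser, a container can be attached to it with docker run --security-opt apparmor=docker-nginx.

#include <tunables/global>

profile docker-nginx flags=(attach_disconnected,mediate_deleted) {
  #include <abstractions/base>

  # Allow TCP over IPv4; other network access is not granted by this profile
  network inet tcp,

  # Read configuration, write logs
  /etc/nginx/** r,
  /var/log/nginx/** rw,

  # Explicitly deny writes to kernel tunables
  deny /proc/sys/** w,
}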
Conclusion
Anybody who builds a Docker container should leverage the security model that Docker provides by running with the least privileges and capabilities, then include that configuration in the vendored configuration, like Docker compose files. By doing this, all of your end users will be able to take advantage of a somewhat reduced attack surface with only minimal effort on your side. Capabilities are in no way fool-proof, and one should never believe that they alone will significantly reduce the attack surface, but they’re better than nothing.