The one where Rancher ruined my birthday

Artwork by Sami Lee.

Other titles:

  • You were supposed to bring balance to Kubernetes, Rancher, not destroy it
  • Et tu, Rancher?

I’ve been maintaining my own dedicated servers for around 7 years now as a way to learn and improve my skills and to have a place to run my various web sites, mail servers, and even this blog. Over the years the hardware has changed, and I’ve moved from hosting Rails applications directly on the OS to Docker and finally to Kubernetes. The skills I’ve picked up along the way have helped me in my professional career, so it’s definitely been worth it, but maintaining this server has had its massive pain points, times where I’ve just had to walk away and leave things broken for days until I finally fixed the issues.

I selected Rancher several years ago (at least 3 or 4 years ago, I’d estimate) when I finally moved to Kubernetes. I liked how it automatically provisioned my clusters, managed networking, and provided a nice UI, and it was also widely recommended online. Things worked reasonably well, but after adopting Rancher and Kubernetes, every 6-12 months something would break massively and I’d have to painstakingly rebuild the entire Kubernetes cluster. Many times I told myself that if it broke again I’d swear off Rancher entirely, but that never happened because I eventually got everything working again.

After upgrading to Rancher v2.6.3, which launched just yesterday, and finding that all of my clusters had been removed from Rancher, I hit my breaking point.

Continue reading “The one where Rancher ruined my birthday”

Home Lab – Using the bridge CNI with Systemd

This entry is part 7 of 8 in the series Home Lab

After running my home lab for a while, I’ve started switching to a more up-to-date Linux distribution (instead of RancherOS). I’m currently testing Ubuntu Server, which uses systemd. systemd-networkd is responsible for managing the network interface configuration, and its behavior differs enough from NetworkManager’s that we need to update the Home Lab bridge CNI setup to handle it.

Previously, the CNI created a bridge network adapter when the first container started up, but this causes problems with systemd: systemd-resolved (the DNS resolver component) would eventually fail to make DNS queries, and networkd ended up duplicating the IP address on both eth0 (the actual uplink adapter) and cni0 because we were copying it over.
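The post walks through the actual changes, but as a rough sketch of the direction involved, assuming you let systemd-networkd create and own the bridge at boot instead of having the CNI copy eth0’s address onto it, the networkd units might look something like this (file names and the DHCP choice are illustrative, not necessarily what the post uses):

```ini
# Illustrative systemd-networkd sketch (not necessarily this post's exact config):
# let networkd create the cni0 bridge and move the host's uplink onto it, so the
# CNI no longer has to copy eth0's IP and resolved keeps a consistent view.

# /etc/systemd/network/10-cni0.netdev
[NetDev]
Name=cni0
Kind=bridge

# /etc/systemd/network/20-eth0.network
[Match]
Name=eth0

[Network]
Bridge=cni0

# /etc/systemd/network/30-cni0.network
[Match]
Name=cni0

[Network]
DHCP=yes
```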

Continue reading “Home Lab – Using the bridge CNI with Systemd”

Upgrading Longhorn from Helm 2 in Rancher 2.6 the hard-way

Long ago, I installed Longhorn onto my Kubernetes cluster using Helm 2. Eventually Helm 3 was released and helm 2to3 was made available. However, I was not able to use helm 2to3 because Rancher didn’t deploy Tiller in the way that the CLI expected, and Rancher did not provide an upgrade mechanism to handle this either. Then Rancher 2.6 was released, which entirely dropped Helm 2 support, and I was stuck with a cluster where Longhorn was deployed but not managed by a working Helm installation.

This blog post outlines how you can recover Longhorn and upgrade it to Helm 3 without deleting all your volumes. This guide isn’t specific to Rancher 2.6.
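The full steps are in the post, but the general shape of the trick, shown here only as a sketch (the namespace, release name, and resource kinds below are assumptions), is to label and annotate the existing objects so that Helm 3 will adopt them, then run a normal upgrade:

```bash
# Hypothetical sketch of adopting existing resources into a Helm 3 release.
# The post's exact steps may differ; names below are assumptions.
NS=longhorn-system
RELEASE=longhorn

# Mark the formerly Helm 2-managed objects so Helm 3 is willing to take ownership.
for kind in deployment daemonset service configmap serviceaccount; do
  for name in $(kubectl -n "$NS" get "$kind" -o name); do
    kubectl -n "$NS" label "$name" app.kubernetes.io/managed-by=Helm --overwrite
    kubectl -n "$NS" annotate "$name" \
      meta.helm.sh/release-name="$RELEASE" \
      meta.helm.sh/release-namespace="$NS" --overwrite
  done
done

# A normal Helm 3 upgrade/install then adopts the objects instead of recreating them,
# leaving the Longhorn volumes in place.
helm upgrade --install "$RELEASE" longhorn/longhorn --namespace "$NS"
```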

Continue reading “Upgrading Longhorn from Helm 2 in Rancher 2.6 the hard-way”

Why is Kubernetes opening random ports?

Kubernetes automatically exposes certain services as ports on your hosts and may unintentionally expose private services. Here’s how to fix that.

I recently responded to the Log4j vulnerability. If you’re not aware, Log4j is a very popular Java logging library used in many Java applications. There was a vulnerability where malicious actors could remotely take control of your computer by submitting a specially crafted request parameter that gets logged by Log4j.

This situation was not ideal since I was running several Java applications on my servers, so I decided to use Nmap to port scan my dedicated server and see what ports were open. I ended up finding a number of ports I didn’t expect because several of my Kubernetes Service instances were being exposed as node ports.

In this post, I outline the problem with Kubernetes’ default strategy for services and how to avoid exposing ports that you don’t need.
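As a preview of the kind of change involved (the post may take a different approach), one way to avoid the extra node ports on LoadBalancer services, assuming a reasonably recent Kubernetes release and a load balancer such as MetalLB that doesn’t rely on them, is to disable the automatic NodePort allocation. The Service below is a hypothetical example, not one from my cluster:

```yaml
# Hypothetical Service that opts out of the automatic per-node port.
apiVersion: v1
kind: Service
metadata:
  name: my-internal-app        # name is an assumption for illustration
spec:
  type: LoadBalancer
  allocateLoadBalancerNodePorts: false   # don't open a port on every node
  selector:
    app: my-internal-app
  ports:
    - port: 80
      targetPort: 8080
```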

Continue reading “Why is Kubernetes opening random ports?”

Picking a mortgage for data engineers using Python

Over the past year I helped a few people pick mortgages while buying their homes by visualizing the different mortgage options they were offered by different companies. In a seller’s market, like where I live, you only get a few days to choose between a number of mortgages that all offer different fees, points, and interest rates, all of which influence the monthly payment you make.

Given all this data, how do you compare the different options and decide which one to go with? The lowest monthly payment isn’t always the best option.
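To give a flavor of the kind of comparison the post builds up, here is a minimal sketch with made-up numbers (not the post’s actual data or code) using the standard fixed-rate amortization formula:

```python
# Minimal sketch: compare loan offers (rate, points, fees) by monthly payment
# and by total cost over a chosen horizon. All numbers are illustrative.

def monthly_payment(principal, annual_rate, years=30):
    """Standard fixed-rate amortization formula."""
    r = annual_rate / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)

def total_cost(principal, annual_rate, points_pct, fees, months=60):
    """Upfront costs (points + fees) plus payments over the first `months` months."""
    upfront = principal * points_pct / 100 + fees
    return upfront + monthly_payment(principal, annual_rate) * months

offers = [
    {"name": "Lender A", "rate": 0.0300, "points": 1.0, "fees": 1500},
    {"name": "Lender B", "rate": 0.0325, "points": 0.0, "fees": 800},
]

principal = 400_000
for o in offers:
    pay = monthly_payment(principal, o["rate"])
    cost = total_cost(principal, o["rate"], o["points"], o["fees"])
    print(f'{o["name"]}: ${pay:,.0f}/month, ${cost:,.0f} total over 5 years')
```

With these made-up numbers, Lender A has the lower monthly payment but costs more over five years once the point and fees are included, which is exactly why the lowest payment isn’t automatically the winner.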

Continue reading “Picking a mortgage for data engineers using Python”

Home Lab: Part 6 – Replacing MACvlan with a Bridge

In previous posts, I leveraged the macvlan CNI to forward packets between containers and the rest of my network; however, I ran into several issues rooted in the fact that macvlan traffic bypasses several parts of the host’s IP stack, including conntrack and iptables. This conflicted with how Kubernetes expects to handle routing and meant we had to bypass and modify iptables chains to get it to work.

While I got it to work, there was simply too much wire bending involved and I wanted to investigate alternatives to see if anything was able to fit my requirements better. Let’s consider the bridge CNI.
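For context, the bridge CNI plugin is configured with a small JSON file; a minimal sketch might look like the following (the subnet and the host-local IPAM here are placeholders for illustration, not the series’ final configuration):

```json
{
  "cniVersion": "0.4.0",
  "name": "homelab",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": false,
  "ipMasq": false,
  "ipam": {
    "type": "host-local",
    "ranges": [[{ "subnet": "192.168.2.0/24" }]]
  }
}
```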

Continue reading “Home Lab: Part 6 – Replacing MACvlan with a Bridge”

Home Lab: Part 5 – Problems with asymmetrical routing

In the previous post (DHCP IPAM), we successfully got our containers running with macvlan + DHCP. I additionally installed MetalLB and everything seemingly worked; however, when I tried to retroactively add this to my existing Kubernetes home lab cluster already running Calico, I was not able to access the MetalLB service. All connections were timing out.

A quick Wireshark packet capture of the situation exposed this problem:

The SYN packet from my computer made it to the container (LB IP 192.168.6.2), but the responding SYN/ACK packet came back with a source address of 192.168.2.76 (the pod’s network interface). This wouldn’t work: my computer ignored the SYN/ACK because it didn’t belong to any active flow.

Continue reading “Home Lab: Part 5 – Problems with asymmetrical routing”

Home Lab: Part 4 – A DHCP IPAM

In the previous post, we ended up abusing subnets and routing to get Calico to exist on the correct subnet, but what if we could get rid of Calico’s duplicate IPAM system and just depend on our existing DHCP server to handle reservations? In this post, we’re going to prototype a cluster that uses DHCP + layer 2 Linux bridging to avoid the complications outlined in Part 3.

The official CNI documentation describes two plugins that could be relevant.

With dhcp plugin the containers can get an IP allocated by a DHCP server already running on your network.

https://www.cni.dev/plugins/current/ipam/dhcp/

This avoids the overlapping-IPAM problems of the previous solution and means that the DHCP server already running on my network would be responsible for handing out IP addresses directly to the containers.
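For reference, wiring the dhcp IPAM into a CNI config is mostly a one-line change; the sketch below uses assumed names and a bridge main plugin rather than the exact setup from this post:

```json
{
  "cniVersion": "0.4.0",
  "name": "homelab-dhcp",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": {
    "type": "dhcp"
  }
}
```

One design note from the CNI documentation: because DHCP leases must be renewed for the lifetime of the container, the dhcp plugin requires a companion daemon running on each host (started with `/opt/cni/bin/dhcp daemon`) to acquire and renew leases on the containers’ behalf.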

Continue reading “Home Lab: Part 4 – A DHCP IPAM”

Home Lab: Part 3 – Networking Revisited

The Problem

In my previous post series, I described how I installed my Kubernetes Home Lab using Calico and MetalLB. This worked great up until I started installing smart home software that expected to be able to do local network discovery. For example, Home Assistant and my Sonos control software both attempted to do subnet local discovery using mDNS or broadcast packets. This did not work because the pods were running on a 192.168.4.0/24 subnet, but all of my physical devices were on 192.168.2.0/24.

This prevented Home Assistant from discovering any devices and had to be fixed.

Continue reading “Home Lab: Part 3 – Networking Revisited”

Home Lab: Part 2 – Networking Setup

Next up in the series, we’re going to manually configure all of the network settings to get our flat network home lab. Our flat network should not use any packet encapsulation, and all pods and services should be fully routable to and from the existing network.

As detailed in the previous post, I want a so-called flat network because packet encapsulation tunnels IP packets inside of other IP packets and creates a separate IP network that runs on top of my existing network. I wanted all nodes, pods, and services to be fully routable on my home network. Additionally, I had several Sonos speakers and other smart-home devices that I wanted to control from my k8s cluster, which required pods running on the same subnet as those devices.

Install CNI Plugin

The CNI (Container Network Interface) plugin is responsible for configuring the network adapter that each Kubernetes pod has. Each pod usually gets a separate network namespace isolated from the host’s main network adapter, so without a CNI plugin, no pod could make any network calls. For more information, check out cni.dev or the K8s documentation.

Continue reading “Home Lab: Part 2 – Networking Setup”