This post continues from the previous post in the series where I walked through the decision process on what energy monitor system to use and how to install Brultech GEM Monitor. I ended with the hardware physically installed and all Current Transformers (CTs) connected.
In this post, I continue from that point and walk through the network and software configuration defining each circuit size.
Previously in my Home Lab series, I described how my home lab Kubernetes clusters runs with a DHCP CNI–all pods get an IP address on the same layer 2 network as the rest of my home and an IP from DHCP. This enabled me to run certain software that needed this like Home Assistant which wanted to be able to do mDNS and send broadcast packets to discover device.
However, not all pods actually needed to be on the same layer 2 network and lead to a few situations where I ran out of IP addresses on the DHCP server and couldn’t connect any new devices until reservations expired:
I also had a circular dependency where the main VLAN told clients to use a DNS server that was running in Kubernetes. If I had to reboot the cluster, my Kubernetes cluster could get stuck starting because it tried to query a DNS server that wasn’t started yet (For simplicity, I use DHCP for everything instead of static config).
In this post, I explain how I built a new home lab cluster with K3s and used Multus to run both Calico and my custom Bridge+DHCP CNI so that only pods that need layer 2 access get access.
In my previous post where I outlined challenges that I’ve encountered with Rancher. As part of the feedback to that I ended up having to rebuild one of my clusters. I took that time to try out RKE2 and K3s for my home lab. In this home lab, I use a custom CNI based on the official Bridge and DHCP IPAM CNIs (Read more) to enable my smart home software (HomeAssistant) to communicate with other devices on the same Layer 2 domain.
However, it seems that if you try to spin up a RKE2 cluster on a host with a Bridge interface setup (See here) then it’ll get stuck during provisioning and you won’t be able to download a Kube Config from Rancher Server because Rancher thinks it’s offline. I reported this issue initially here.
In this blog post, I explain more about the problem and how to directly connect to the cluster to install a working CNI such that Rancher will correctly start.
How to make your system robust against your worst nightmare–your future self
In this post, I talk about some strategies that I’ve learned to simplify class structures in Java services that load and persist data into data stores like DynamoDB or RDS at the same time making the codebase safer.
As always, my opinions are my own.
At Amazon, I ended up joining two teams that were suffering under the technical debt. Each time, I was asked to spend some time understanding why the products were unstable and users were encountering frequent bugs. In one system, responsible for managing critical metadata about products in the catalog, was experiencing problems where users were reporting that they’d randomly lose data.
A service that was losing client data is a terrible service and caused users to lose trust in this system. Note that some details of this story have been modified for confidentiality reasons. Let’s dive in.
For the last 5+ years, I’ve been tracking my various aspects of my personal health using Samsung Health. It helps track weight, calories, heart rate, stress, and exercise and stores all of it in the app.
However, the app only gives some basic high level charts and insights. Luckily, it enables you to export your personal data into CSV files that you can then import into your tool of choice and perform any kind of analytics. In this post, I’m going to show how to export it all, then load it into Zeppelin and some sample Pandas queries that’ll enable you to start building more complex queries yourself.
Ever wondered where the energy is going in your house and know exactly when and which circuit is consuming the most electricity? How much is your air conditioning unit costing you each month in kWh?
Home energy monitors are devices that you can use to monitor how much energy you’re using at any given point in time. You can use them to figure out how much each device or circuit you’re using overnight vs the day. If you have differing energy costs at the day vs night, you can use them to ensure devices run at lower cost time of day, you can use it to as part of a smart home automation to automatically notify you when your washing machine is done, or even identify when you need to upgrade a circuit because your server room is pulling too much.
In this post, I’m going to walk through the different products I considered for a project at a friend’s house, pros and cons, and how to order the appropriate equipment.
I recently helped my friends configure their CenturyLink Gigabit fiber service so they can use their own hardware instead of the provided hardware. This gave them a lot of flexibility in how the network is configured, however CenturyLink requires you to enable PPPoE and use 6RD to use IPv6 instead of natively supporting IP packets, you have to jump through hoops. I’m sure there’s some reason why their network works like that, but I figured I’d document what needs to be done and explain how it works.
In addition to my home lab K8s cluster, I have two dedicated servers that I run in the cloud running a separate Kubernetes cluster. This cluster runs my production servers, like this blog, Postfix, DNS, etc. I wanted to add a VPN between my home network and my prod k8s network for two reasons:
All data should be encrypted between these networks. While I use HTTPS when possible, some traffic like DNS isn’t encrypted
My servers outside the NAT should be able to access servers running behind my NAT. I run a Prometheus instance at home that I want my primary Prometheus instance to be able to scrape. Using a VPN can help bypass the NAT and firewall on my router so it can scrape. Additionally, I wanted to be able to access pods directly from my home as needed.
I came across a number of guides for basic Wireguard VPN tunnel configurations which were fine, but they didn’t describe how to solve some of the more advanced issues like BGP routing for MetalLB or how to encrypt traffic to the host itself.
For example, since I have more than one host in my cluster, if I use MetalLB to announce an IP, the Wireguard instance on my router won’t know which host to forward traffic to because it uses the destination IP to pick the encryption key. This results in Wireguard sending traffic possibly to the wrong host.
This blog post will explain everything you need to know to configure a Wireguard VPN that doesn’t suffer from these limitations.
You were supposed to bring balance to Kubernetes, Rancher, not destroy it
et tu? Rancher?
I’ve been maintaining my own dedicated servers for around 7 years now as a way to learn and improve skills and have a place to run my different web sites, mail servers, even this blog. Over the years the hardware has changed and I’ve moved from hosting Rails applications directly on the OS to Docker and finally Kubernetes. I’ve learned a lot of skills that eventually helped me in my professional career at my job that it’s definitely been worth it, but maintaining this server has had its massive pain points where I’ve just had to walk away and leave stuff broken for days until I finally fix the issues.
I selected Rancher several years ago (at least 3 or 4 years ago I’d estimate) when I finally moved to Kubernetes. I liked how it automatically provisioned my clusters, managed networking, and provided a nice UI. It was also reasonably recommended by the internet. Things worked reasonably well, but after adopting Rancher and Kubernetes, every 6-12 months I’d end up having something massively break and I’d have to rebuild the entire Kubernetes cluster painstakingly and many times I’d tell myself if it broke, then I’d just swear off Rancher entirely, but it never happened because I eventually got everything working.
After upgrading to Rancher v2.6.3 that just launched yesterday and finding that all my clusters were removed from Rancher, I hit my breaking point.
After I’ve had time to run my home lab for a while, I’ve started switching to a more up to date Linux distribution (instead of RancherOS.) I’m currently testing Ubuntu Server which leverages Systemd. Systemd-networkd is responsible for managing the network interface configuration and it differs in behavior compared to NetworkManager enough that we need to update the Home Lab Bridge CNI to handle it.
Previously the CNI was creating a bridge network adapter when the first container started up, but this causes problems with systemd because resolved (the DNS resolver component) was eventually failing to make DNS queries and networkd was duplicating IP addresses on both eth0 (the actual uplink adapter) and on cni0 because we were copying it over.