Split Horizon DNS with external-dns and cert-manager for Kubernetes

There were a few services I ran that I wanted to access from both inside and outside my home network. From inside my home network, I wanted to route directly to the service, but from outside I needed to route traffic through a proxy that would then route into my home lab. Additionally, I wanted to support SSL on all my services for security, using cert-manager.

Since my IPv4 addresses differ inside my network vs outside, I need to use split-horizon DNS to return the correct answer to each DNS query. Split-horizon DNS means that DNS on one horizon (inside the network) returns different results than it does outside the network.
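To make the idea concrete, here is a minimal sketch of the decision a split-horizon resolver makes. It is not how external-dns or any particular DNS server implements it, and every name and address in it (`INTERNAL_NETWORK`, `RECORDS`, `service.example.com.`, the IP ranges) is a hypothetical placeholder:

```python
import ipaddress

# Hypothetical values for illustration only; not the author's real network.
INTERNAL_NETWORK = ipaddress.ip_network("192.168.1.0/24")

RECORDS = {
    "service.example.com.": {
        "internal": "192.168.1.50",  # address of the service itself on the LAN
        "external": "203.0.113.10",  # public address of the proxy
    }
}


def resolve(name: str, client_ip: str) -> str:
    """Return the A record that matches the horizon the client is on."""
    client = ipaddress.ip_address(client_ip)
    horizon = "internal" if client in INTERNAL_NETWORK else "external"
    return RECORDS[name][horizon]


if __name__ == "__main__":
    print(resolve("service.example.com.", "192.168.1.23"))  # client on the LAN
    print(resolve("service.example.com.", "198.51.100.7"))  # client on the internet
```

The same name yields a different answer depending only on where the query comes from, which is the whole point of the technique.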


Best Practices for Java testing with JUnit

JUnit is a popular testing library for Java applications, and I used it extensively when working at Amazon on the numerous Java applications and services there. However, I came across a number of anti-patterns and opportunities to improve the quality of the test code. This post introduces many of the tricks and patterns that I’ve learned and shared with my coworkers, and now want to share.

Another library to know is Mockito, which I use extensively in JUnit test cases and will also reference below.

These are all real things that I’ve seen developers do.


How to build a useful service data change audit log

If you’ve got a service that lets clients make changes to entities, then you probably want an audit log that tracks who makes what changes.

I decided to write this post because I frequently saw teams at Amazon not thinking through these considerations. Some of the guidance does focus on AWS IAM, but a lot of it is practical for any type of audit log.

Important aspects to an audit log:

  • Who made the change?
  • When did they make the change?
  • Where did they make the change?
  • What did they do?
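As a rough sketch, the four questions above could map onto one record per change. The field names, the `make_entry` and `to_json` helpers, and the example values are all assumptions for illustration, not the schema from the post:

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditLogEntry:
    actor: str      # who: for example an IAM principal or a user ID
    timestamp: str  # when: ISO 8601 timestamp in UTC
    source: str     # where: for example a source IP or the calling tool
    action: str     # what: the operation that was performed
    entity_id: str  # what: the entity that was changed
    changes: dict   # what: field name -> [old value, new value]


def make_entry(actor: str, source: str, action: str, entity_id: str,
               changes: dict, now: Optional[datetime] = None) -> AuditLogEntry:
    """Build an entry, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return AuditLogEntry(actor, now.isoformat(), source, action, entity_id, changes)


def to_json(entry: AuditLogEntry) -> str:
    """Serialize the entry as one JSON line, ready to append to a log."""
    return json.dumps(asdict(entry), sort_keys=True)


if __name__ == "__main__":
    entry = make_entry(
        actor="arn:aws:iam::123456789012:user/alice",  # hypothetical principal
        source="203.0.113.7",
        action="UpdateProduct",
        entity_id="product-42",
        changes={"title": ["Old title", "New title"]},
    )
    print(to_json(entry))
```

Recording both the old and new value for each field is a design choice made here so that a reader of the log can see what a change actually did without consulting the data store.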

Best Practices for working with Google Guice

Google Guice is a dependency injection library for Java, and I frequently used it on a number of Java services. Compared to Spring, I liked how simple it was and how narrowly it focused on just dependency injection. However, I often saw developers using it in incorrect or non-ideal patterns that increased boilerplate or were just wrong.

These are all recommendations that I’ve accumulated over several years of working at Amazon, watching engineers (and sometimes myself) improperly leverage Google Guice.


Domain names actually end with a period and why that might subtly break your system

It’s not DNS, it’s never DNS. It was DNS.

DNS is the protocol that converts domain names like “technowizardry.net” into the IP address of the server that will respond, like “144.217.181.222”. In DNS, domain names are actually supposed to end with a period. For example, the domain name of this website is not “www.technowizardry.net”, but actually “www.technowizardry.net.” Notice the period at the end.

Where does this come from? If you look at a DNS packet in a packet capture, you’ll see that each query looks something like this:

The queried domain starts right where I’ve highlighted in the above picture. Domain names are split into parts at each period. In this example, I have 3 separate domain parts: [“www”, “technowizardry”, “net”]. The byte sequence looks like:
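As a rough sketch of that encoding (this is not the author's capture), standard DNS wire format writes each part as one length byte followed by the part's bytes, and ends the name with a zero-length root label. That final zero byte is where the trailing period comes from. The helper name `encode_qname` is made up for illustration:

```python
def encode_qname(name: str) -> bytes:
    """Encode a domain name as it appears in the question section of a DNS query."""
    # A trailing period only marks the root, so it adds no extra label.
    labels = [label for label in name.rstrip(".").split(".") if label]
    encoded = bytearray()
    for label in labels:
        raw = label.encode("ascii")
        if len(raw) > 63:
            raise ValueError(f"label too long: {label!r}")
        encoded.append(len(raw))  # length prefix
        encoded += raw            # the label itself
    encoded.append(0)             # zero-length root label terminates the name
    return bytes(encoded)


if __name__ == "__main__":
    wire = encode_qname("www.technowizardry.net.")
    print(wire)
    print(wire.hex(" "))
```

With or without the trailing period, the name encodes to the same bytes, because the root label is always present on the wire.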


Accurate, Local Home Energy Monitoring: Part 2 – Network Config

This post continues from the previous post in the series, where I walked through the decision process for choosing an energy monitoring system and how to install the Brultech GEM monitor. I ended with the hardware physically installed and all Current Transformers (CTs) connected.

In this post, I continue from that point and walk through the network and software configuration, including defining each circuit size.


Kubernetes: A hybrid Calico and Layer 2 Bridge+DHCP network using Multus

Previously in my Home Lab series, I described how my home lab Kubernetes cluster runs with a DHCP CNI: all pods get an IP address from DHCP on the same layer 2 network as the rest of my home. This enabled me to run software that needs this, like Home Assistant, which wants to do mDNS and send broadcast packets to discover devices.

However, not all pods actually needed to be on the same layer 2 network, and this led to a few situations where I ran out of IP addresses on the DHCP server and couldn’t connect any new devices until reservations expired:

My DHCP IP pool completely out of addresses to give to clients

I also had a circular dependency where the main VLAN told clients to use a DNS server that was running in Kubernetes. If I had to reboot the cluster, it could get stuck while starting because it tried to query a DNS server that hadn’t started yet. (For simplicity, I use DHCP for everything instead of static config.)

In this post, I explain how I built a new home lab cluster with K3s and used Multus to run both Calico and my custom Bridge+DHCP CNI so that only pods that need layer 2 access get access.


How to gain access to a RKE2 cluster without Rancher when the CNI doesn’t work

In my previous post, I outlined challenges that I’ve encountered with Rancher. As a result of the feedback on that post, I ended up having to rebuild one of my clusters. I took that time to try out RKE2 and K3s for my home lab. In this home lab, I use a custom CNI based on the official Bridge and DHCP IPAM CNIs (Read more) to enable my smart home software (HomeAssistant) to communicate with other devices on the same Layer 2 domain.

However, it seems that if you try to spin up an RKE2 cluster on a host with a Bridge interface set up (See here), then it’ll get stuck during provisioning and you won’t be able to download a Kube Config from Rancher Server because Rancher thinks the cluster is offline. I reported this issue initially here.

In this blog post, I explain more about the problem and how to directly connect to the cluster to install a working CNI such that Rancher will correctly start.


Defensive Coding: Stop using your storage models everywhere

How to make your system robust against your worst nightmare–your future self

In this post, I talk about some strategies that I’ve learned to simplify class structures in Java services that load and persist data into data stores like DynamoDB or RDS, while at the same time making the codebase safer.

As always, my opinions are my own.

At Amazon, I ended up joining two teams that were suffering under technical debt. Each time, I was asked to spend some time understanding why the products were unstable and why users were encountering frequent bugs. One system, responsible for managing critical metadata about products in the catalog, was experiencing problems where users reported that they’d randomly lose data.

A service that loses client data is a terrible service, and this caused users to lose trust in the system. Note that some details of this story have been modified for confidentiality reasons. Let’s dive in.


Plot your health with Samsung Health and Pandas

Artwork by Sami Lee.

For the last 5+ years, I’ve been tracking various aspects of my personal health using Samsung Health. It helps track weight, calories, heart rate, stress, and exercise and stores all of it in the app.

However, the app only gives some basic high-level charts and insights. Luckily, it lets you export your personal data into CSV files that you can then import into your tool of choice to perform any kind of analytics. In this post, I’m going to show how to export it all, load it into Zeppelin, and run some sample Pandas queries that’ll enable you to start building more complex queries yourself.
