How to build a useful service data change audit log

If you’ve got a service that provides clients with the ability to make changes to those entities, then you probably want an audit log that tracks who makes what changes.

I decided to write this post because I frequently saw teams at Amazon not thinking through these considerations. Some of the guidance does focus on AWS IAM, but a lot of it is practical for any type of audit log.

Important aspects to an audit log:

  • Who made the change?
  • When did they make the change?
  • Where did they make the change?
  • What did they do?
Continue reading “How to build a useful service data change audit log”

Best Practices for working with Google Guice

Google Guice is a dependency injection library for Java and I frequently used it on a number of Java services. Compared to Spring, I liked how simple and narrow focused on just dependency injection it was. However, I often times saw developers using it in incorrect or non-ideal patterns that increased boilerplate or were just wrong.

These are all recommendations that I’ve accumulated over several years at working at Amazon watching engineers and sometimes myself improperly leverage Google Guice.

Continue reading “Best Practices for working with Google Guice”

Domain names actually end with a period and why that might subtly break your system

It’s not DNS, it’s never DNS. It was DNS.

DNS is the protocol that converts domain names like “technowizardry.net” into the IP address of the server that will respond like “144.217.181.222”. In DNS, domain names actually are supposed to end with a period. For example, the URL of this website is not “www.technowizardry.net”, but it’s actually “www.technowizardry.net.” Notice the period at the end.

Where does this come from? If you look at a DNS packet in a packet capture, you’ll see that each query looks something like this:

The queried domain starts right where I’ve highlighted in the above picture. Domain names are separated by each period. In this example, I have 3 separate domain parts: [“www”, “technowizardry”, “net”]. The byte sequence looks like:

Continue reading “Domain names actually end with a period and why that might subtly break your system”

Accurate, Local Home Energy Monitoring: Part 2 – Network Config

This post continues from the previous post in the series where I walked through the decision process on what energy monitor system to use and how to install Brultech GEM Monitor. I ended with the hardware physically installed and all Current Transformers (CTs) connected.

In this post, I continue from that point and walk through the network and software configuration defining each circuit size.

Continue reading “Accurate, Local Home Energy Monitoring: Part 2 – Network Config”

Kubernetes: A hybrid Calico and Layer 2 Bridge+DHCP network using Multus

Previously in my Home Lab series, I described how my home lab Kubernetes clusters runs with a DHCP CNI–all pods get an IP address on the same layer 2 network as the rest of my home and an IP from DHCP. This enabled me to run certain software that needed this like Home Assistant which wanted to be able to do mDNS and send broadcast packets to discover device.

However, not all pods actually needed to be on the same layer 2 network and lead to a few situations where I ran out of IP addresses on the DHCP server and couldn’t connect any new devices until reservations expired:

My DHCP IP pool completely out of addresses to give to clients

I also had a circular dependency where the main VLAN told clients to use a DNS server that was running in Kubernetes. If I had to reboot the cluster, my Kubernetes cluster could get stuck starting because it tried to query a DNS server that wasn’t started yet (For simplicity, I use DHCP for everything instead of static config).

In this post, I explain how I built a new home lab cluster with K3s and used Multus to run both Calico and my custom Bridge+DHCP CNI so that only pods that need layer 2 access get access.

Continue reading “Kubernetes: A hybrid Calico and Layer 2 Bridge+DHCP network using Multus”

How to gain access to a RKE2 cluster without Rancher when the CNI doesn’t work

In my previous post where I outlined challenges that I’ve encountered with Rancher. As part of the feedback to that I ended up having to rebuild one of my clusters. I took that time to try out RKE2 and K3s for my home lab. In this home lab, I use a custom CNI based on the official Bridge and DHCP IPAM CNIs (Read more) to enable my smart home software (HomeAssistant) to communicate with other devices on the same Layer 2 domain.

However, it seems that if you try to spin up a RKE2 cluster on a host with a Bridge interface setup (See here) then it’ll get stuck during provisioning and you won’t be able to download a Kube Config from Rancher Server because Rancher thinks it’s offline. I reported this issue initially here.

In this blog post, I explain more about the problem and how to directly connect to the cluster to install a working CNI such that Rancher will correctly start.

Continue reading “How to gain access to a RKE2 cluster without Rancher when the CNI doesn’t work”

Defensive Coding: Stop using your storage models everywhere

How to make your system robust against your worst nightmare–your future self

In this post, I talk about some strategies that I’ve learned to simplify class structures in Java services that load and persist data into data stores like DynamoDB or RDS at the same time making the codebase safer.

As always, my opinions are my own.

At Amazon, I ended up joining two teams that were suffering under the technical debt. Each time, I was asked to spend some time understanding why the products were unstable and users were encountering frequent bugs. In one system, responsible for managing critical metadata about products in the catalog, was experiencing problems where users were reporting that they’d randomly lose data.

A service that was losing client data is a terrible service and caused users to lose trust in this system. Note that some details of this story have been modified for confidentiality reasons. Let’s dive in.

Continue reading “Defensive Coding: Stop using your storage models everywhere”

Plot your health with Samsung Health and Pandas

Artwork by Sami Lee.

For the last 5+ years, I’ve been tracking my various aspects of my personal health using Samsung Health. It helps track weight, calories, heart rate, stress, and exercise and stores all of it in the app.

However, the app only gives some basic high level charts and insights. Luckily, it enables you to export your personal data into CSV files that you can then import into your tool of choice and perform any kind of analytics. In this post, I’m going to show how to export it all, then load it into Zeppelin and some sample Pandas queries that’ll enable you to start building more complex queries yourself.

Continue reading “Plot your health with Samsung Health and Pandas”

Accurate, Local Home Energy Monitoring: Part 1 – Hardware

Ever wondered where the energy is going in your house and know exactly when and which circuit is consuming the most electricity? How much is your air conditioning unit costing you each month in kWh?

Home energy monitors are devices that you can use to monitor how much energy you’re using at any given point in time. You can use them to figure out how much each device or circuit you’re using overnight vs the day. If you have differing energy costs at the day vs night, you can use them to ensure devices run at lower cost time of day, you can use it to as part of a smart home automation to automatically notify you when your washing machine is done, or even identify when you need to upgrade a circuit because your server room is pulling too much.

In this post, I’m going to walk through the different products I considered for a project at a friend’s house, pros and cons, and how to order the appropriate equipment.

Continue reading “Accurate, Local Home Energy Monitoring: Part 1 – Hardware”

CenturyLink Gigabit service on Mikrotik RouterOS with PPPoE and IPv6

I recently helped my friends configure their CenturyLink Gigabit fiber service so they can use their own hardware instead of the provided hardware. This gave them a lot of flexibility in how the network is configured, however CenturyLink requires you to enable PPPoE and use 6RD to use IPv6 instead of natively supporting IP packets, you have to jump through hoops. I’m sure there’s some reason why their network works like that, but I figured I’d document what needs to be done and explain how it works.

Continue reading “CenturyLink Gigabit service on Mikrotik RouterOS with PPPoE and IPv6”