Domain names actually end with a period and why that might subtly break your system

DNS is the protocol that converts domain names like “technowizardry.net” into the IP address of the server that will respond, like “144.217.181.222”. In DNS, domain names are supposed to end with a period. For example, the hostname of this website is not “www.technowizardry.net”, but actually “www.technowizardry.net.” Notice the period at the end.

Where does this come from? If you look at a DNS packet in a packet capture, you’ll see that each query looks something like this:

The queried domain starts right where I’ve highlighted in the above picture. Domain names are split into labels at each period. In this example, there are 3 labels: [“www”, “technowizardry”, “net”]. The byte sequence looks like:

0000  03 77 77 77 0e 74 65 63  68 6e 6f 77 69 7a 61 72   ·www·tec hnowizar
0010  64 72 79 03 6e 65 74 00                            dry·net· 

The first byte (0x03) means the first label is 3 bytes long. The next 3 bytes are the ASCII encoding of “www”. The next byte (0x0e) announces 14 bytes for “technowizardry”, then another 0x03 announces 3 bytes for “net”. Finally, a 0x00 means there are no more labels in the domain name: the sequence is null terminated, so once you hit a zero-length label, you’re done. That zero-length label is also the DNS root, and it is what the trailing “.” in the text form of the name stands for.
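
To make the byte layout concrete, here is a minimal Python sketch of that encoding. The function name encode_name is my own for illustration and is not part of any DNS library. It only handles plain ASCII labels and ignores DNS name compression.

```python
def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    # A trailing dot only marks the name as absolute. Drop it so that
    # "example.com" and "example.com." encode to the same bytes.
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".") if name else []  # "" is the bare root

    out = bytearray()
    for label in labels:
        raw = label.encode("ascii")
        # Each label is limited to 63 bytes and cannot be empty.
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label must be 1-63 bytes: {label!r}")
        out.append(len(raw))  # length prefix
        out += raw            # label bytes
    out.append(0)             # zero-length root label terminates the name
    return bytes(out)


wire = encode_name("www.technowizardry.net.")
print(wire.hex(" "))
```

The printed bytes should line up with the hex dump above, and the name produces the same bytes whether or not you pass the trailing dot, because the root label is always there on the wire.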

Why does this matter?

You might think: I’ve never had to type a dot at the end of a domain name, so why should I care? To find out, you first need to understand how a DNS query gets performed. When you type a domain name into your browser, it doesn’t directly turn what you typed into a DNS query and see what comes back. The process is even more complicated in modern browsers that combine the search bar with the address bar and have to figure out whether you mean to search or go to a domain name. I’ll ignore the search part for now.

On Linux, the browser or application will call a function, getaddrinfo, which is responsible for performing the DNS lookup. This function consults /etc/resolv.conf to decide how to perform the query. The file looks something like this:

nameserver 192.168.2.1
search us-west-2.technowizardry.net technowizardry.net
options ndots:1

In the above file, there are three configuration options:

  • nameserver – States where to send your queries. This often comes from DHCP
  • search – This is the DNS search list. Multiple domains can be specified. This states that a query for “example” may be translated into “example.us-west-2.technowizardry.net” or “example.technowizardry.net” when creating the query
  • options ndots – If the name contains at least this many dots (here, 1 or more), it’s assumed to be fully qualified and is tried as-is first. If it contains fewer (here, 0), the search list is tried first
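
Here is a rough Python sketch of how the search list and ndots combine into the list of names a resolver will try. The function candidate_queries is a hypothetical helper, not a real resolver API. It models the behavior described in the resolv.conf man page, and real resolvers stop at the first name that returns an answer and differ in the details.

```python
from typing import List


def candidate_queries(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Return the absolute names a stub resolver would try, in order."""
    # A trailing dot means "fully qualified": the search list is skipped.
    if name.endswith("."):
        return [name]

    as_is = name + "."
    searched = [f"{name}.{domain}." for domain in search]

    # Enough dots: assume it is already complete and try it as-is first.
    if name.count(".") >= ndots:
        return [as_is] + searched
    # Too few dots: walk the search list first, then fall back to as-is.
    return searched + [as_is]


# Values taken from the resolv.conf shown above.
search = ["us-west-2.technowizardry.net", "technowizardry.net"]
print(candidate_queries("example", search, ndots=1))
print(candidate_queries("technowizardry.net", search, ndots=1))
print(candidate_queries("example.", search, ndots=1))
```

The first call walks the search list before trying the bare name, the second tries the name as typed first, and the third produces exactly one query because of the trailing dot.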

Now, maybe you’re starting to see the risk. ndots controls whether or not the query is *assumed* to be fully qualified, i.e. complete. Most web users will type in a full domain name, and domains will always have at least one dot in them: facebook.com, google.com, etc. all have a dot, so users never notice an issue.

However, there are a few situations where this will break:

The Curse of Email Addresses

At my job, a coworker recently had to figure out how to validate an email address to give a user early feedback that it may be wrong. That coworker found the Apache Commons EmailValidator.java class, where the validation looked something like:

public boolean isValid(String email) {
    // ...
    if (email == null) {
        return false;
    }
    if (email.endsWith(".")) {
        return false;
    }
    // ...
}

Now this is interesting. If you give it an email that looks like “user@foo.”, it’s considered invalid. However, this check is wrong.

It’s technically valid to have an MX record (the DNS record type that denotes where to deliver email) on a Top Level Domain (TLD). Several TLDs actually have it (source):

.AI   =>   mail.offshore.AI.
.AS   =>   dca.relay.gdns.net.
.BJ   =>   mail6.domain-mail.com.
.CF   =>   mail.intnet.CF.
.DJ   =>   smtp.intnet.DJ.
      =>   relais2.intnet.DJ.
.DM   =>   mail.nic.DM.
.GP   =>   ns1.nic.GP.
      =>   ns34259.ovh.net.
      =>   manta.outremer.com.
.HR   =>   alpha.carnet.HR.
.IO   =>   mailer2.IO.
.KH   =>   ns1.dns.net.KH.
.KM   =>   mail1.comorestelecom.KM.
.MH   =>   imap.pwke.twtelecom.net.
.MQ   =>   mx1-mq.mediaserv.net.
.NE   =>   bow.rain.fr.
      =>   bow.intnet.NE.
.PA   =>   ns.PA.
.TD   =>   mail.intnet.TD.
.TT   =>   66-27-54-142.san.rr.com.
      =>   66-27-54-138.san.rr.com.
.UA   =>   mr.kolo.net.
.VA   =>   proxy2.urbe.it.
      =>   john.vatican.VA.
      =>   paul.vatican.VA.
      =>   lists.vatican.VA.
.WS   =>   mail.worldsite.WS.
.YE   =>   mail.yemen.net.YE.

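However, if I try to send an email to user@ai, what does my mail server do? It asks the system resolver to look up the MX record for “ai”. (Strictly speaking, getaddrinfo only returns addresses, but resolver routines such as res_search apply the same search-list rules.) Consulting the previous resolv.conf, since there are no dots, it’ll first try to query “ai.us-west-2.technowizardry.net” or “ai.technowizardry.net”, which is not what I expected.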

This kind of configuration is very common in corporate networks. They’ll specify a DNS search path of their corporate domain name so employees can type “www”. Users end up not being able to email this perfectly valid email account because their mail servers end up being misconfigured.

The generally accepted practice in this case (mentioned in the Stack Overflow answer) is to instead email “user@ai.”. Unfortunately, this email validator considers this to be invalid.

Technically, it should be legal to send email to both “user@example.com” and “user@example.com.”. However, due to this subtle behavior, we end up with non-compliant software configuration.
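
As an illustration of the alternative, here is a small Python sketch of an early-feedback check that tolerates a single trailing dot on the domain and normalizes it to the absolute form. The name parse_email is hypothetical, this is not how Apache Commons does it, and it is deliberately loose: it is not a full address validator. Whether to accept a trailing dot at all is a policy decision for your own system.

```python
from typing import Optional, Tuple


def parse_email(email: str) -> Optional[Tuple[str, str]]:
    """Return (local_part, absolute_domain), or None if clearly malformed."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return None

    # Accept one trailing dot: it just marks the domain as fully qualified.
    if domain.endswith("."):
        domain = domain[:-1]

    # Every remaining label must be non-empty and at most 63 bytes,
    # which rejects inputs like "user@foo.." or "user@.".
    labels = domain.split(".")
    if any(not 1 <= len(label) <= 63 for label in labels):
        return None

    # Hand back the absolute form so the lookup skips the search list.
    return local, domain + "."


print(parse_email("user@ai"))
print(parse_email("user@ai."))
print(parse_email("user@foo.."))
```

Both “user@ai” and “user@ai.” come back with the domain “ai.”, so whatever performs the MX lookup afterwards gets a name the search list can’t rewrite.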

The Curse of Kubernetes

In the previous example, we talked about an issue when ndots is set to 1, but what if we dial it up to 5? That’s exactly the case in Kubernetes.

By default, pods running in Kubernetes get a custom /etc/resolv.conf that looks like this:

nameserver 10.43.0.10 # kube-dns instance in the cluster
search mail.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

This is done because Kubernetes wants you to be able to query for other services using just a short name: “dovecot” -> “dovecot.mail.svc.cluster.local.” for a service in the same namespace (here, “mail”), or “mysql.datastore” -> “mysql.datastore.svc.cluster.local.” for a service in another namespace.

But what happens if you try to query for “www.google.com”? There are two dots there, and you and I know it’s fully qualified, but unfortunately getaddrinfo doesn’t: two is fewer than ndots:5, so the search list is tried first. It issues queries for:

  1. www.google.com.mail.svc.cluster.local.
  2. www.google.com.svc.cluster.local.
  3. www.google.com.cluster.local.
  4. www.google.com.

Everything but the last query results in an NXDOMAIN (the DNS response code for a name that does not exist). This happens for every name with fewer than five dots that you look up in a Kubernetes container, unless you explicitly query for “www.google.com.” with a trailing dot.

All of this results in a significant amount of DNS traffic flying around my cluster, and we can clearly see the amount of NXDOMAIN responses for cluster.local domains in the following graph. This causes increased latencies for service operations inside the cluster:

sum(increase(coredns_dns_responses_total{rcode="NXDOMAIN"}[1h])) by (rcode, zone)

Unfortunately, the search list is part of Kubernetes’ service discovery process and can’t be changed across the cluster.

Fixing Kubernetes

Short of adding a trailing dot to every name you look up, the way to avoid the increased DNS traffic in Kubernetes is to change the ndots configuration for each Pod:

apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: dns-example
spec:
  containers:
    - name: test
      image: nginx
  dnsConfig:
    options:
      - name: ndots
        value: "0"

Once I deployed this across a few key deployments, I saw a dramatic decrease in the number of DNS queries that my cluster sent.

The Curse of DNS Servers

Trailing dots are important when creating records in a DNS zone too. In BIND and many DNS configuration UIs, when you create a CNAME, it looks something like this:

google.technowizardry.net.	900	IN	CNAME	google.com.
^ Domain Name                   ^TTL            ^Type   ^ Target

A CNAME is a type of DNS alias. When I query google.technowizardry.net, it should tell the client to go look up google.com:

dig google.technowizardry.net

;; ANSWER SECTION:
google.technowizardry.net. 0    IN      CNAME   google.com.

However, on some servers, if I leave off the trailing dot in the CNAME target, I actually end up seeing:

dig google.technowizardry.net

;; ANSWER SECTION:
google.technowizardry.net. 0    IN      CNAME   google.com.technowizardry.net.

Thus, the trailing dot is again critical.
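
The rule behind this comes from the standard zone file format: any name that doesn’t end in a dot is relative and gets the zone’s origin appended. Here is a tiny Python sketch of that rule. The function qualify is a hypothetical helper, not part of BIND or any DNS library, and it ignores escaping and other zone file syntax.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name into an absolute name relative to origin."""
    if not origin.endswith("."):
        raise ValueError("origin must be an absolute name ending in '.'")
    # "@" is zone-file shorthand for the origin itself.
    if name == "@":
        return origin
    # A trailing dot means the name is already absolute: leave it alone.
    if name.endswith("."):
        return name
    # Otherwise the name is relative, so the origin is appended.
    if origin == ".":
        return name + "."
    return f"{name}.{origin}"


origin = "technowizardry.net."
print(qualify("google.com.", origin))  # target written with the trailing dot
print(qualify("google.com", origin))   # target written without it
```

The first call leaves the target alone, while the second turns it into a name under technowizardry.net, which is the surprising answer in the dig output above.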

Conclusion

In this post, I talked about the subtle assumption that everybody makes that domain names don’t need a trailing dot. While most users don’t have to type trailing dots normally because the DNS query library usually fixes the problem for you, it’s technically required and will cause some strange behavior if you’re not aware of it.

References

https://github.com/kubernetes/kubernetes/issues/45976

https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html
