Recursive DNS, Round Trip Times, Delegations & DNS Performance
DNS Traffic Management // May 3, 2012 // Tom Daly
Usually on the Dyn blog, you’ll hear us talking about the technology behind our authoritative DNS infrastructure, the benefits of IP anycast routing, the scale of our global infrastructure and more.
Often ignored, but still a major part of the DNS system, is the recursive DNS infrastructure, traditionally deployed by ISPs to serve their customers.
One way to think of the two pieces of the DNS is that the authoritative DNS (ADNS) is the Internet telephone book (a directory of DNS hostnames mapping to IP addresses) and the recursive DNS (RDNS) is like directory assistance, helping you look up an entry in the authoritative DNS.
There are tons of recursive DNS servers all over the world. In fact, on April 1st, 2012, Dyn’s IP anycast network communicated with nearly 3.2 million unique recursive DNS servers. Virtually every ISP runs them for its customers, enterprises run them to support their internal networks, and third-party DNS options such as Internet Guide operate all over the world.
These RDNS servers bridge the gap between the stub resolver software running on your Windows, Mac or Linux box and the authoritative DNS servers by traversing the DNS from the root down to the specific hostname being requested.
A single request from a stub resolver can translate into many DNS requests from the upstream recursive server as it chases the delegation of a domain, and the performance of the ADNS servers it queries along the way can drastically affect how long the answer takes to come back.
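To make the delegation chase concrete, here is a toy sketch in Python. It does no real network I/O: the server labels, the zone data and the 192.0.2.10 address are all made up for illustration, and a real recursive server also deals with caching, retries and glue records.

```python
# Toy delegation data (hypothetical): each server either refers the resolver
# to the nameserver for a child zone or holds the final answer.
ROOT = "root-server"
REFERRALS = {
    "root-server": {"com.": "com-server"},
    "com-server": {"example.com.": "ns1.example.com."},
}
ANSWERS = {
    "ns1.example.com.": {"www.example.com.": "192.0.2.10"},
}


def in_zone(hostname, zone):
    """True if hostname is the zone apex or sits below it on a label boundary."""
    return hostname == zone or hostname.endswith("." + zone)


def resolve(hostname, max_hops=10):
    """Chase the delegation from the root; return (answer, servers_queried)."""
    server = ROOT
    queried = []
    for _ in range(max_hops):
        queried.append(server)
        answers = ANSWERS.get(server, {})
        if hostname in answers:
            return answers[hostname], queried
        # No answer here, so follow the most specific matching referral.
        zones = [z for z in REFERRALS.get(server, {}) if in_zone(hostname, z)]
        if not zones:
            return None, queried
        server = REFERRALS[server][max(zones, key=len)]
    return None, queried
```

Calling resolve("www.example.com.") walks three servers in this toy data set, which is the point: one stub query fans out into several upstream queries, and each hop adds its own round trip.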
Most RDNS servers (approximately 80% of them) employ an optimization known as “round trip time (RTT) banding”. The RTT of a DNS query is the delay between the query being issued and the answer being received.
RDNS servers use this technique to determine which ADNS servers for a domain respond fastest: once an RDNS has queried each of the ADNS servers for a domain at least once, it can return to the fastest-responding server for subsequent queries. That’s why using a fast IP anycast DNS network is so important: slow unicast ADNS servers delay the population of the RTT table, and that can cause performance issues.
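The selection logic described above can be sketched in a few lines of Python. This is a deliberately simplified model, not how any particular resolver implements it: the RttTable class and its method names are hypothetical, it keeps only the most recent RTT per server, and it ignores the smoothing and banding that real implementations apply.

```python
class RttTable:
    """Track one RTT per authoritative server and prefer the fastest."""

    def __init__(self, servers):
        # None means the server has not been measured yet.
        self.rtt_ms = {server: None for server in servers}

    def record(self, server, rtt_ms):
        """Store the latest measured RTT for a server."""
        self.rtt_ms[server] = rtt_ms

    def pick(self):
        """Prime unmeasured servers first, then return the fastest one."""
        for server, rtt in self.rtt_ms.items():
            if rtt is None:
                return server
        return min(self.rtt_ms, key=self.rtt_ms.get)
```

With three servers measured at, say, 40 ms, 12 ms and 95 ms, pick() keeps returning the 12 ms server once the table is primed. Until then, every server gets a turn, which is why one slow server hurts even when its siblings are fast.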
Lame & Sideways Delegations
Lame delegations can also cause significant delays in DNS resolution, which hurts the end user’s experience. A lame delegation points at a nameserver that doesn’t answer authoritatively for the domain in question or doesn’t answer DNS queries at all. The DNS timeouts in many RDNS implementations can be set between two and ten seconds, creating delay for your customers while RTT banding tries to do its job.
Here’s an example of a domain with a lame delegation:
Delegation at .com:
example.com. 172800 IN NS ns1.example.com.
example.com. 172800 IN NS ns2.example.com.
example.com. 172800 IN NS internal-ns.example.com.
In this case, we’ll assume that ns1.example.com and ns2.example.com are the external nameservers for example.com and internal-ns.example.com is some kind of internal DNS server (not on the Internet) that got included in the delegation. When an RDNS is priming its RTT table (as described above), a query to internal-ns.example.com will time out (after 2 to 10 seconds), delaying the response to a stub client waiting on a request. What a huge hit to performance!
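A quick back-of-the-envelope model shows how much that one unreachable server costs. The sketch below assumes the RDNS tries servers one after another and waits out the full timeout before moving on; the 30 ms and 45 ms RTTs are invented for illustration, and the function name is hypothetical.

```python
# Illustrative RTTs in milliseconds; None means the server never answers.
RTT_MS = {
    "ns1.example.com.": 30,
    "ns2.example.com.": 45,
    "internal-ns.example.com.": None,
}


def query_delay_ms(order, rtt_ms, timeout_ms):
    """Total wait when trying servers in order until one of them answers."""
    total = 0
    for server in order:
        rtt = rtt_ms.get(server)
        if rtt is None:
            # Unreachable server: the full timeout is spent before moving on.
            total += timeout_ms
            continue
        return total + rtt
    return None  # nobody answered
```

If the lame server happens to be tried first with a 2 second timeout, the stub client waits 2030 ms instead of 30 ms; with a 10 second timeout it waits 10030 ms.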
Another DNS faux pas that can greatly affect performance is a sideways delegation, a situation where the NS records at the parent zone (e.g. .com) don’t match the NS records located at the apex of the zone in question (e.g. example.com). This leaves an RDNS unsure of where to find the authoritative data for the zone, and the behavior of the DNS in this situation is actually undocumented.
In our experience working with customers, the result is often inconsistent DNS performance, sometimes with small intermittent outages.
Here’s an example of a sideways delegation:
Delegation at .com:
example.com. 172800 IN NS ns1.example.com.
example.com. 172800 IN NS ns2.example.com.
NS records retrieved from ns1.example.com:
example.com. 172800 IN NS nsa.example.com.
example.com. 172800 IN NS nsb.example.com.
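Spotting a mismatch like this one comes down to comparing two sets of names. Here is a minimal sketch, assuming you have already collected the NS names from the parent zone and from the zone apex (for example, by hand with dig); the function names are hypothetical and nothing here queries the DNS.

```python
def normalize(name):
    """Lower-case a hostname and drop any trailing dot so names compare equal."""
    return name.rstrip(".").lower()


def compare_delegation(parent_ns, apex_ns):
    """Report how the parent's NS set differs from the zone apex's NS set."""
    parent = {normalize(n) for n in parent_ns}
    apex = {normalize(n) for n in apex_ns}
    return {
        "consistent": parent == apex,
        "only_at_parent": sorted(parent - apex),
        "only_at_apex": sorted(apex - parent),
    }
```

Fed the two NS sets from the example above, this reports ns1 and ns2 as present only at the parent and nsa and nsb as present only at the apex, so the delegation is not consistent.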
It’s these types of sharp, pointy, edgy details of the DNS that illustrate the need for DNS operations to be left to the experts. We’re constantly on the lookout for lame delegations, sideways delegations or any other issues that could affect the availability of our ADNS servers, so that the RDNS servers used by our clients’ customers around the world keep getting fast responses.
RTT banding is why consistently performing anycast DNS is so important on a worldwide scale. Don’t let these types of issues cause latency or downtime for your domain!