We usually use domain names, such as taoshu.in, to access the Internet, but computers communicate with each other using IP addresses. The core function of DNS is to store the mapping between domain names and IP addresses. This article explains how DNS works.

In DNS, the core concept is the domain name. A domain name organizes information hierarchically, much like the folders on a computer. Take Windows as an example: the desktop of the user Demo is located at C:\Users\Demo\Desktop. The leading C: indicates the C drive, which is the top level. The rest of the path is split by backslashes, with each level naming a folder. So the files on the desktop are saved in the Desktop folder, under the Demo folder, in the Users folder on the C drive.

The way the folders are organized has the following characteristics.

  • There is a top-level folder (in this case, the C drive)
  • A separator is used between folders to indicate the hierarchy
  • File names under different folders do not conflict with each other

Replace the C drive with the root, replace the backslash with a dot, and write the folder path backwards, and you get a domain name. Let’s take www.taoshu.in as an example to analyze the structure of a domain name. Strictly speaking, this name should be written as www.taoshu.in. in the DNS system, but in everyday use the trailing dot can be omitted. Let’s read it from right to left.

All domain names are subdomains of the root domain. The root domain is the trailing dot that is usually omitted. If we imagine the root domain as the C drive, then all domains are folders on the C drive. The direct subdomains of the root are called top-level domains (TLDs). The most common ones are com., net., and org., which belong to the category of generic top-level domains (gTLDs). As the Internet developed, each country was also assigned a so-called country code top-level domain (ccTLD). Generic TLDs are managed by international Internet organizations, while each ccTLD is managed by the country it belongs to. China’s ccTLD is cn.. When choosing a domain name, people generally pick a generic TLD such as com.. There are also some ccTLDs that have taken on special meanings and are widely used, such as .io, .tv, and .ai. This site uses in., the ccTLD of India, chosen for its connotation of happiness.

The domain names we usually register sit directly under a top-level domain and are called first-level domains. For example, this site’s taoshu.in. is a first-level domain under in.. Some ccTLDs also offer registration under a two-part suffix, such as com.cn.. Although example.com.cn contains three parts, it still counts as a first-level domain.

Once a domain name is registered, we can set up second-level or deeper domain names as needed. For example, www.taoshu.in. is a second-level domain name used for web services. If we want to provide mail service, we can set up mail.taoshu.in..
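The right-to-left reading described above can be sketched in a few lines of Python. This is only an illustration: the function name `ancestors` is made up for this example and is not part of any DNS library.

```python
def ancestors(name: str) -> list[str]:
    """Return the chain of domains from the root down to `name`."""
    # Drop the optional trailing dot, then split the name into labels.
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # every name is a subdomain of the root
    # Walk the labels from right to left, adding one level at a time.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(ancestors("www.taoshu.in"))
# ['.', 'in.', 'taoshu.in.', 'www.taoshu.in.']
```

Each entry in the result is a direct subdomain of the entry before it, which is the same relationship a folder has to its parent folder.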

DNS is a distributed system in which each layer only cares about its own direct subdomains, not about the subdomains of those subdomains. Take the root domain . as an example: its direct subdomains are top-level domains like com., net., and org.. So the root servers store only which name servers are responsible for each top-level domain, together with the addresses of those servers. These delegation entries are called NS records, and the file in which a server keeps its data is usually called a zone file. At the time of writing, the root zone contains records for 1489 top-level domains.

The other top-level domains are managed by their respective servers. For example, all records for first-level domains under in. are managed by India’s ccTLD servers, which we can view with the drill command.

➜ dns drill NS in.
in. 1268 IN NS ns1.registry.in.
in. 1268 IN NS ns2.registry.in.
in. 1268 IN NS ns3.registry.in.
in. 1268 IN NS ns4.registry.in.
in. 1268 IN NS ns5.registry.in.
in. 1268 IN NS ns6.registry.in.

By registering a domain name, I gain the right to write records on the name servers one level up. For example, once I register taoshu.in. and pay the fee, I can write information about taoshu.in. into the name servers of in. and specify which DNS servers are responsible for my domain. I currently use Cloudflare for resolution.

➜ dns drill NS taoshu.in.
taoshu.in. 3090 IN NS earl.ns.cloudflare.com.
taoshu.in. 3090 IN NS mona.ns.cloudflare.com.

Here you can see the two Cloudflare servers. Once the NS records for taoshu.in. are written to the in. servers, all subsequent management can be done in the Cloudflare dashboard.
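Each line of the drill output above has five fields: the owner name, the TTL in seconds, the class (IN for Internet), the record type, and the record data. As a rough sketch, the lines can be split into those fields in Python. The `Record` type and `parse_record` helper are names invented for this example.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str    # owner name the record belongs to
    ttl: int     # seconds a resolver may cache the record
    rclass: str  # record class; IN means Internet
    rtype: str   # record type, here NS
    data: str    # record data, here the host name of a name server


def parse_record(line: str) -> Record:
    # The first four fields are whitespace-separated; the rest is the data.
    name, ttl, rclass, rtype, data = line.split(None, 4)
    return Record(name, int(ttl), rclass, rtype, data.strip())


for line in (
    "taoshu.in. 3090 IN NS earl.ns.cloudflare.com.",
    "taoshu.in. 3090 IN NS mona.ns.cloudflare.com.",
):
    print(parse_record(line))
```

Note that the data of an NS record is the name of a name server, not an IP address, so a resolver needs a further lookup (or an accompanying address record) to reach that server.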

That covers the structure of domain names and how it relates to DNS servers. So how is the IP address for a given domain name actually found? Let’s walk through the query process.

The entire DNS system is divided into three parts.

  • A proxy server (stub resolver)
  • A recursive resolver
  • Multiple authoritative name servers

The servers mentioned so far are all authoritative servers. Authoritative means that they hold the definitive data: every DNS resolution result is ultimately based on the data stored on an authoritative server.

The proxy server, usually called a stub resolver, is not a real server. It is the DNS resolution component built into each computer’s operating system. The programs running on a computer do not handle DNS resolution themselves; they simply hand their requests to this component, which forwards them to a recursive resolver.

The recursive resolver is the server that actually performs the DNS queries. Using www.example.com as an example, the entire query process is as follows.


  1. The proxy server asks the recursive server for the A record (i.e., the IPv4 address) of the domain name
  2. The recursive server sends the query as-is to a root server
  3. The root server only keeps information about its direct subdomains and knows nothing about www.example.com, so it returns the NS records of the com. domain
  4. The recursive server sends the query to a name server of com., based on the returned NS records
  5. The com. name server likewise only records the name servers of example.com., so it returns those NS records
  6. The recursive server sends the query to a name server of example.com.
  7. In general, the name server of example.com. holds the record for www.example.com., so it returns the corresponding A record

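If the name server of example.com. only holds NS records for www.example.com., that is, if www.example.com. is delegated to yet another set of servers, the process continues one more level down. The server is called a recursive resolver because it works through the hierarchy layer by layer on the client’s behalf until it gets an answer.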
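The walk described in the steps above can be simulated with a small in-memory model. This is a sketch under simplifying assumptions, not how a real resolver is implemented: the `ZONES` table, the `ask` and `resolve` functions, and the address 192.0.2.1 (a documentation placeholder, not the real address of www.example.com) are all made up for illustration, and no network traffic is involved.

```python
# Toy data: each zone lists only its direct child zones and its own A records.
# The address 192.0.2.1 is a documentation placeholder, not a real answer.
ZONES = {
    ".": {"children": ["com."], "a": {}},
    "com.": {"children": ["example.com."], "a": {}},
    "example.com.": {"children": [], "a": {"www.example.com.": "192.0.2.1"}},
}


def ask(zone, name):
    """Simulate one query to the authoritative server of `zone`."""
    data = ZONES[zone]
    if name in data["a"]:
        return "A", data["a"][name]  # final answer
    for child in data["children"]:
        if name == child or name.endswith("." + child):
            return "NS", child  # referral to the child zone's servers
    return "NONE", None  # this server knows nothing about the name


def resolve(name):
    """Follow referrals from the root until an answer (or dead end)."""
    zone, path = ".", []
    for _ in range(10):  # hop limit so a broken table cannot loop forever
        kind, value = ask(zone, name)
        path.append((zone, kind, value))
        if kind != "NS":
            return value, path
        zone = value
    return None, path


answer, path = resolve("www.example.com.")
for step in path:
    print(step)
# ('.', 'NS', 'com.')
# ('com.', 'NS', 'example.com.')
# ('example.com.', 'A', '192.0.2.1')
```

Each server in the model answers only for its own level, and the `resolve` loop plays the role of the recursive server by following one referral after another.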

The servers exchange messages with each other over UDP port 53. DNS queries are not encrypted; all communication is in plaintext.
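To make the plaintext point concrete, here is a minimal sketch that builds the bytes of a DNS query by hand, following the classic wire format (a 12-byte header followed by one question). The function name `build_query` and the fixed query ID are arbitrary choices for this example. Nothing is sent over the network.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message for `name` (qtype 1 = A record)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1,
    # then ANCOUNT, NSCOUNT, ARCOUNT all zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        qname += bytes([len(raw)]) + raw  # length byte, then the label
    qname += b"\x00"  # zero-length root label ends the name
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question


packet = build_query("www.taoshu.in")
print(len(packet))          # 31
print(b"taoshu" in packet)  # True: the name is readable in the raw bytes
```

Sending these bytes to port 53 of a resolver with a UDP socket would be a complete query. Anyone on the network path can read the domain name directly from the packet, which is what plaintext means here.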

From the query process, we can see that every lookup starts at the root servers and then works down through the servers of each level. So how does the recursive server know the IP addresses of the root servers? The answer is that the addresses are hard-coded into the resolver software. There are a total of 13 logical root servers in the world.

Obviously, 13 servers are far from enough to handle the DNS lookups of today’s Internet. Anycast technology has therefore been used to deploy up to 1500 mirror servers around the world. Anycast uses the BGP protocol to announce the same root server IP address from different networks, so that a request to a root server is routed to the nearest mirror. Each mirror synchronizes the root zone data regularly. China hosts mirrors of five root servers: F, I, J, K, and L. Queries from users in China to these root servers can be answered without crossing international links. However, the servers in China are mirrors: they can only answer queries and cannot change the contents of the root zone.

Now to summarize the system design of DNS.

  • Domain names are hierarchical, similar to file paths, but separated by . and written in reverse order
  • The DNS is a distributed system, with only sub-domain information recorded at each level of the name servers
  • The DNS uses the UDP protocol for plaintext communication
  • The DNS contains three types of servers: proxy/recursive/authoritative
  • Root servers are deployed globally using Anycast

These are the main elements of the DNS system. DNS is one of the oldest protocols on the Internet and communicates in plaintext over UDP, which also makes it one of the most problematic in terms of security and privacy. The Internet is rife with DNS-based content censorship, behavioral monitoring, domain name hijacking, website blocking, denial-of-service attacks, and other abuses. Internet standards bodies have been working to address the various security issues with DNS. I will cover these later.