The Hypertext Transfer Protocol (HTTP) is today’s most widely used application layer protocol. It was drafted by Tim Berners-Lee at CERN in 1989 and has become the core of data transfer on the Internet. Over the past few years, HTTP/2 and HTTP/3 have updated the existing protocol to provide more secure and faster transfers. Most programming languages implement HTTP/1.1 and HTTP/2 in their standard libraries to meet the daily development needs of engineers, and the Go network library that we present today also implements both major versions of the HTTP protocol.
HTTP is an application layer protocol, and in general it uses TCP as the underlying transport layer protocol. HTTP/3, however, runs on QUIC, a new transport protocol built on top of UDP, which means that HTTP can now run over both TCP and UDP.
Before analyzing the internal implementation principles, let’s take a look at some of the design of the HTTP protocol and the relationship between the hierarchy and modules within the standard library.
Requests and responses
The most common concepts in the HTTP protocol are the HTTP request and response, which can be understood as messages passed between the client and the server: the client sends an HTTP request to the server, and the server receives the request, processes it, and returns the result to the client as an HTTP response.
Unlike binary protocols, HTTP/1.x is a text-based protocol: its headers are all text data. The first line of an HTTP request contains the method, path, and protocol version of the request, followed by multiple HTTP headers and the payload carried by the request.
HTTP responses have a similar structure, containing the protocol version, status code, response headers, and payload, so we won’t go into detail here.
HTTP currently runs mainly on TCP, which is a connection-oriented, reliable, byte-stream-based transport layer protocol. The data handed over by the application layer to TCP is not transmitted to the destination host as discrete messages; in some cases it is combined into a single data segment before being sent. Because TCP is byte-stream based, application layer protocols built on TCP all need to delineate the boundaries of their messages themselves.
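To make the layout of a request concrete, the following sketch parses a raw HTTP/1.1 request with the standard library's `http.ReadRequest`. The request text, the host `example.com`, and the helper name `parseRequest` are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// rawRequest is a made-up request: request line, headers, blank line, payload.
const rawRequest = "POST /items?id=7 HTTP/1.1\r\n" +
	"Host: example.com\r\n" +
	"Content-Type: text/plain\r\n" +
	"Content-Length: 5\r\n" +
	"\r\n" +
	"hello"

// parseRequest (hypothetical helper) parses a raw HTTP/1.x request and
// returns the method, path, protocol version, and payload.
func parseRequest(raw string) (method, path, proto, body string, err error) {
	req, err := http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
	if err != nil {
		return "", "", "", "", err
	}
	defer req.Body.Close()
	b, err := io.ReadAll(req.Body)
	if err != nil {
		return "", "", "", "", err
	}
	return req.Method, req.URL.Path, req.Proto, string(b), nil
}

func main() {
	method, path, proto, body, err := parseRequest(rawRequest)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(method, path, proto, body)
}
```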
There are two common ways to delimit messages: declaring the length of each message, or marking its end with a terminator. HTTP actually implements both of these solutions. In most cases HTTP adds Content-Length to the headers to indicate the length of the payload, and the recipient parses the headers to determine where the current HTTP request or response ends, which separates the different HTTP messages on the connection.
When HTTP uses the chunked transfer encoding, the header no longer contains Content-Length; instead, a chunk with a size of 0 serves as the terminator that marks the message boundary.
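As a rough illustration of length-based framing, the sketch below reads two responses that arrive back to back on one byte stream. The stream contents and the helper name `splitResponses` are assumptions for this example; `http.ReadResponse` uses Content-Length to know where each body ends.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// stream holds two made-up responses, concatenated as they might arrive
// on a single TCP connection.
const stream = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello" +
	"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"

// splitResponses (hypothetical helper) returns the body of every response
// found in the stream.
func splitResponses(s string) ([]string, error) {
	br := bufio.NewReader(strings.NewReader(s))
	var bodies []string
	for {
		// Stop when the stream has no more bytes.
		if _, err := br.Peek(1); err == io.EOF {
			return bodies, nil
		}
		resp, err := http.ReadResponse(br, nil)
		if err != nil {
			return nil, err
		}
		// The body reader stops after Content-Length bytes, leaving the
		// next message untouched in br.
		b, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		bodies = append(bodies, string(b))
	}
}

func main() {
	bodies, err := splitResponses(stream)
	fmt.Println(bodies, err)
}
```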
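A chunked message can be sketched the same way. The response text below is made up: each chunk is prefixed with its size in hexadecimal, and the zero-sized chunk ends the message. The helper name `readChunked` is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// chunkedResponse is a made-up response with two chunks and a terminator.
const chunkedResponse = "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n" +
	"5\r\nhello\r\n" + // chunk of 5 bytes
	"6\r\n world\r\n" + // chunk of 6 bytes
	"0\r\n\r\n" // zero-sized chunk: end of message

// readChunked (hypothetical helper) returns the decoded body and the
// ContentLength reported by the parsed response.
func readChunked(raw string) (string, int64, error) {
	resp, err := http.ReadResponse(bufio.NewReader(strings.NewReader(raw)), nil)
	if err != nil {
		return "", 0, err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", 0, err
	}
	return string(b), resp.ContentLength, nil
}

func main() {
	body, length, err := readChunked(chunkedResponse)
	fmt.Printf("%q %d %v\n", body, length, err)
}
```

Because the length is not known up front, the parsed response reports a ContentLength of -1.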
Go wraps both the HTTP client and server implementations in net/http, and to support better extensibility it introduces the net/http.RoundTripper and net/http.Handler interfaces. net/http.RoundTripper is used by the client: the caller passes a request as the argument and gets back the response to that request. net/http.Handler is mainly used by HTTP servers to respond to client requests.
The receiver of an HTTP request can implement the net/http.Handler interface, which contains the logic for processing HTTP requests. The handler uses net/http.ResponseWriter to build the HTTP response, writing the response headers and the payload.
Both the client and the server deal with a two-way exchange of HTTP requests and responses: the client constructs the request and waits for a response, and the server processes the request and returns a response. Both interfaces have more than one implementation in the standard library, and these implementations are layered on top of one another; the implementations of net/http.RoundTripper, for example, form such a hierarchy.
Each implementation of the net/http.RoundTripper interface contains the procedure for making a request to the remote server; the standard library also provides multiple implementations of net/http.Handler that offer different services for client HTTP requests.
The HTTP client package provides several important modules, including transactions and cookies. In this section, we analyze how the client is implemented, starting with an HTTP GET request and following the stages of building the request, transferring data, obtaining a connection, and waiting for the response. When we call net/http.Client.Get to send an HTTP request, the following steps are performed.
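A minimal sketch of a type that implements net/http.Handler might look like this. The `greeter` type and the `serveOnce` helper are hypothetical; `httptest.NewRecorder` stands in for a real connection so the example runs without a network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// greeter is a hypothetical handler type used only for this example.
type greeter struct{ name string }

// ServeHTTP makes greeter satisfy the http.Handler interface.
func (g greeter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8") // response header
	w.WriteHeader(http.StatusOK)                                // status code
	fmt.Fprintf(w, "hello, %s", g.name)                         // payload
}

// serveOnce runs a handler against an in-memory recorder.
func serveOnce(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := serveOnce(greeter{name: "gopher"})
	fmt.Println(code, body)
}
```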
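On the client side, a custom net/http.RoundTripper can wrap another one. The sketch below assumes a hypothetical `userAgentTransport` that sets a header and delegates the exchange to `http.DefaultTransport`; the test server simply echoes the header back.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// userAgentTransport is a hypothetical RoundTripper that sets a User-Agent
// header and delegates the actual exchange to the wrapped RoundTripper.
type userAgentTransport struct {
	base  http.RoundTripper
	agent string
}

func (t userAgentTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// RoundTrip should not modify the caller's request, so work on a clone.
	clone := req.Clone(req.Context())
	clone.Header.Set("User-Agent", t.agent)
	return t.base.RoundTrip(clone)
}

// fetchUserAgent returns the User-Agent that a local test server received.
func fetchUserAgent(agent string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.UserAgent())
	}))
	defer srv.Close()

	client := &http.Client{Transport: userAgentTransport{base: http.DefaultTransport, agent: agent}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	fmt.Println(fetchUserAgent("demo-client/1.0"))
}
```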
- call net/http.NewRequest to construct the request based on the method name, URL and request body.
- call net/http.Transport.RoundTrip to open an HTTP transaction, obtain a connection, and send the request.
- wait for the response in the net/http.persistConn.readLoop method of the HTTP persistent connection.
The HTTP client contains several important structures:
- net/http.Client is the HTTP client; the default client, net/http.DefaultClient, uses net/http.DefaultTransport.
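The first two steps are visible from user code: `Client.Get` is roughly equivalent to building the request with `http.NewRequest` and passing it to `Client.Do`, which eventually reaches the transport. The sketch below assumes a local test server and a hypothetical helper name `get`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// get (hypothetical helper) performs a GET in two explicit steps.
func get(client *http.Client, url string) (int, string, error) {
	// Step 1: construct the request from the method, URL, and (empty) body.
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	// Step 2: hand the request to the client, which passes it on to its
	// RoundTripper and waits for the response.
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(b), err
}

// demoGet runs get against a throwaway local server.
func demoGet() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "pong")
	}))
	defer srv.Close()
	return get(http.DefaultClient, srv.URL)
}

func main() {
	fmt.Println(demoGet())
}
```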
- net/http.Transport is an implementation of the net/http.RoundTripper interface, whose primary role is to support HTTP/HTTPS requests and HTTP proxies.
- net/http.persistConn encapsulates a persistent TCP connection and is the handle that we use to exchange messages with the remote server.
net/http.Client is the higher-level abstraction that handles some details of HTTP, including cookies and redirects, while net/http.Transport handles the underlying implementation details of the HTTP/HTTPS protocols, including connection reuse, building requests, and sending requests.
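The division of labor can be seen in a small sketch: cookies and redirects are configured on `http.Client`, not on the transport. The routes `/login` and `/home`, the cookie name, and the helper `loginAndFetch` are all made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"time"
)

// loginAndFetch (hypothetical helper) hits /login, which sets a cookie and
// redirects to /home; /home echoes the cookie value back.
func loginAndFetch() (body string, redirects int, err error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		http.SetCookie(w, &http.Cookie{Name: "session", Value: "abc", Path: "/"})
		http.Redirect(w, r, "/home", http.StatusFound)
	})
	mux.HandleFunc("/home", func(w http.ResponseWriter, r *http.Request) {
		c, err := r.Cookie("session")
		if err != nil {
			http.Error(w, "no cookie", http.StatusUnauthorized)
			return
		}
		io.WriteString(w, c.Value)
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	jar, err := cookiejar.New(nil)
	if err != nil {
		return "", 0, err
	}
	client := &http.Client{
		Jar:     jar, // stores cookies and attaches them to later requests
		Timeout: 5 * time.Second,
		// CheckRedirect is consulted before each redirect is followed.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			redirects = len(via)
			return nil
		},
	}
	resp, err := client.Get(srv.URL + "/login")
	if err != nil {
		return "", redirects, err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), redirects, err
}

func main() {
	fmt.Println(loginAndFetch())
}
```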
Constructing a request
net/http.Request represents a request received by an HTTP server or sent by an HTTP client. It contains fields for the method, URL, protocol version, headers, and request body of the HTTP request; in addition to these fields, it also holds a reference to an HTTP response (the Response field, which is only populated during client redirects).
net/http.NewRequest is the function provided by the standard library for creating requests. It validates the input parameters and assembles them into a new request structure.
The request assembly process is relatively simple: it checks and verifies the input method, URL, and payload. After the new net/http.Request structure is initialized, however, handling the payload is slightly more complex, because different payload types are wrapped into the io.ReadCloser type in different ways.
Opening a transaction
Once we have built the HTTP request using the standard library, we open an HTTP transaction to send the request and wait for a response from the remote server. After a chain of calls starting at net/http.Client.Do, we end up at the structure in which the standard library implements the underlying HTTP protocol: net/http.Transport.
net/http.Transport implements the net/http.RoundTripper interface and is the most important and most complex structure in the entire request process. It sends HTTP requests and waits for responses in net/http.Transport.RoundTrip. We can divide the execution of this function into two parts.
- find and execute a custom implementation of net/http.RoundTripper based on the scheme of the URL.
- fetch a persistent connection from the connection pool, or initialize a new one, and call the connection’s net/http.persistConn.roundTrip to make the request.
We can register net/http.RoundTripper implementations for different protocols by calling net/http.Transport.RegisterProtocol on the standard library’s net/http.Transport. When a request is made, the implementation registered for the scheme in the URL is selected in place of the default logic.
By default, HTTP requests are handled over net/http.persistConn persistent connections: the transport first obtains the connection used to send the request and then calls net/http.persistConn.roundTrip.
net/http.Transport.getConn is the method that obtains the connection used to send the request. It does so in two ways.
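As a sketch of scheme-based dispatch, the example below registers a RoundTripper for a made-up `mem` scheme that serves responses from a map. The `memTransport` type, the scheme name, and the URL are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// memTransport is a hypothetical RoundTripper that answers requests from an
// in-memory map of path to body, without touching the network.
type memTransport map[string]string

func (m memTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	body, ok := m[req.URL.Path]
	status := http.StatusOK
	if !ok {
		status = http.StatusNotFound
	}
	return &http.Response{
		StatusCode:    status,
		Status:        fmt.Sprintf("%d %s", status, http.StatusText(status)),
		Proto:         "HTTP/1.1",
		ProtoMajor:    1,
		ProtoMinor:    1,
		Header:        make(http.Header),
		Body:          io.NopCloser(strings.NewReader(body)),
		ContentLength: int64(len(body)),
		Request:       req,
	}, nil
}

// fetch registers the "mem" scheme on a fresh Transport and fetches url.
func fetch(url string) (string, error) {
	t := &http.Transport{}
	t.RegisterProtocol("mem", memTransport{"/greeting": "hello from memory"})
	client := &http.Client{Transport: t}

	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	fmt.Println(fetch("mem://store/greeting"))
}
```

Note that `RegisterProtocol` panics if the same scheme is registered twice on one transport, so the sketch creates a new `Transport` for each call.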
- call net/http.Transport.queueForIdleConn to wait in the queue for an idle connection.
- call net/http.Transport.queueForDial to wait in the queue for a new connection to be established.
Connections are a relatively expensive resource, and establishing a new connection before each HTTP request can consume more time and incur more overhead. Allocating and reusing resources through connection pooling can effectively improve the overall performance of HTTP requests, and most network library clients adopt a similar strategy for reusing resources.
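Connection reuse can be observed with the `net/http/httptrace` package. The sketch below sends sequential requests to a local test server and records whether each one received a reused connection; the helper name `reuseFlags` and the pool size are choices made for this example. The response body must be read to the end and closed before the connection can return to the pool.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// reuseFlags (hypothetical helper) sends n sequential requests and reports,
// for each one, whether the transport handed out a reused connection.
func reuseFlags(n int) ([]bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	}))
	defer srv.Close()

	transport := &http.Transport{MaxIdleConnsPerHost: 1}
	defer transport.CloseIdleConnections()
	client := &http.Client{Transport: transport}

	var flags []bool
	for i := 0; i < n; i++ {
		req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
		if err != nil {
			return nil, err
		}
		trace := &httptrace.ClientTrace{
			// GotConn fires once a connection has been obtained.
			GotConn: func(info httptrace.GotConnInfo) {
				flags = append(flags, info.Reused)
			},
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Drain and close the body so the connection becomes idle again.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return flags, nil
}

func main() {
	fmt.Println(reuseFlags(2))
}
```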
When we call net/http.Transport.queueForDial to try to establish a connection with the remote server, the standard library starts a new Goroutine internally to execute net/http.Transport.dialConnFor, which establishes the connection. In net/http.Transport.dialConn, which it eventually calls, we can find the calls into the net library that create the TCP connection.
After creating a new TCP connection, we also create two Goroutines in the background for the current connection to read data from or write data to the TCP connection, respectively.
Waiting for requests
A persistent connection handles a request in net/http.persistConn.roundTrip, which hands the HTTP request over for writing and then waits in a select statement for the response to come back.
Each HTTP request is written by the net/http.persistConn.writeLoop loop running in another Goroutine; the two Goroutines execute independently and communicate through channels. net/http.Request.write serializes the fields of the net/http.Request structure into the byte stream defined by the HTTP protocol.
When we call net/http.Request.write to write out the request, the data is actually written directly to the TCP connection through net/http.persistConnWriter, and the TCP stack takes care of sending the contents of the HTTP request to the target server.
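The pattern of a write loop, a read loop, and a round trip that waits in a select statement can be sketched in a few lines. This is a heavily simplified, hypothetical model (`miniConn`, a line-based echo protocol over `net.Pipe`), not the standard library's implementation.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"time"
)

// miniConn is a hypothetical, heavily simplified stand-in for a persistent
// connection: one goroutine writes requests, another reads responses.
type miniConn struct {
	conn    net.Conn
	writeCh chan string // requests handed to writeLoop
	readCh  chan string // responses produced by readLoop
}

func newMiniConn(c net.Conn) *miniConn {
	mc := &miniConn{conn: c, writeCh: make(chan string), readCh: make(chan string)}
	go mc.writeLoop()
	go mc.readLoop()
	return mc
}

// writeLoop writes every queued request to the connection.
func (mc *miniConn) writeLoop() {
	for line := range mc.writeCh {
		if _, err := fmt.Fprintf(mc.conn, "%s\n", line); err != nil {
			return
		}
	}
}

// readLoop reads response lines and forwards them to the waiting caller.
func (mc *miniConn) readLoop() {
	sc := bufio.NewScanner(mc.conn)
	for sc.Scan() {
		mc.readCh <- sc.Text()
	}
	close(mc.readCh)
}

// roundTrip queues the request, then waits for a response or a timeout.
func (mc *miniConn) roundTrip(req string, timeout time.Duration) (string, error) {
	mc.writeCh <- req
	select {
	case resp, ok := <-mc.readCh:
		if !ok {
			return "", errors.New("connection closed")
		}
		return resp, nil
	case <-time.After(timeout):
		return "", errors.New("timed out waiting for response")
	}
}

// demo runs one round trip against an in-process echo peer.
func demo() (string, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()
	go func() { // echo peer: answers each line with "echo: <line>"
		sc := bufio.NewScanner(server)
		for sc.Scan() {
			fmt.Fprintf(server, "echo: %s\n", sc.Text())
		}
	}()
	mc := newMiniConn(client)
	defer close(mc.writeCh)
	return mc.roundTrip("ping", time.Second)
}

func main() {
	fmt.Println(demo())
}
```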
Another read loop in a persistent connection, net/http.persistConn.readLoop, is responsible for reading data from the TCP connection and sending it to the caller of the HTTP request, but it is net/http.ReadResponse that is really responsible for parsing the HTTP protocol.
net/http.ReadResponse shows the general framework of the HTTP response structure, which contains the status code, protocol version, response headers, and so on. The response body is still parsed in the read loop net/http.persistConn.readLoop according to the HTTP headers.
The net/http package in the Go standard library provides a very easy-to-use interface that allows us to quickly build new HTTP services using the functionality provided by the standard library.
A minimal server needs to call only two functions provided by the standard library: net/http.HandleFunc to register handlers and net/http.ListenAndServe to listen for and handle requests. Most server frameworks include these two kinds of interfaces, one for registering handlers and one for handling external requests, which is a very common pattern, and we will describe how the standard library supports HTTP server implementations along these two dimensions.
An HTTP service is composed of a set of handlers that implement the net/http.Handler interface; it processes each HTTP request by selecting the appropriate handler based on the request route.
When we call net/http.HandleFunc directly to register a handler, the standard library uses the default HTTP multiplexer net/http.DefaultServeMux to handle requests, and the function calls net/http.ServeMux.HandleFunc directly.
net/http.ServeMux.HandleFunc converts the handler function to the net/http.Handler interface type and calls net/http.ServeMux.Handle to register it.
The route and its handler are registered in net/http.DefaultServeMux, which holds a hash table of net/http.muxEntry values that maps URLs to handlers; the HTTP server uses it to find the handler when processing a request.
The standard library provides net/http.ListenAndServe to listen for TCP connections and process requests. This function initializes an HTTP server, net/http.Server, with the given listen address and handler, and calls the server’s net/http.Server.ListenAndServe method.
net/http.Server.ListenAndServe listens for TCP connections at the corresponding address and processes client requests via net/http.Server.Serve.
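A minimal service along these lines might look like the sketch below. The greeting text, the `greeting` helper, and the listen address `:8080` are assumptions for illustration; the `main` function only registers a handler and starts listening.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

// greeting (hypothetical helper) builds the response text from the URL path.
func greeting(path string) string {
	return fmt.Sprintf("Hi there, I love %s!", strings.TrimPrefix(path, "/"))
}

// handler writes the greeting for the requested path.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, greeting(r.URL.Path))
}

func main() {
	// Register the handler for every path on the default multiplexer.
	http.HandleFunc("/", handler)
	// Listen on port 8080 (an arbitrary choice) and serve until an error occurs.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With the server running, a request to `http://localhost:8080/gophers` would be answered by `handler`.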
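The conversion relies on the `http.HandlerFunc` adapter type, which gives a plain function a `ServeHTTP` method. The sketch below registers the same function both ways on a private multiplexer; the routes and the helper `call` are made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// ping is an ordinary function with the handler signature.
func ping(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "pong")
}

// call (hypothetical helper) routes one in-memory request through a mux.
func call(path string) (int, string) {
	mux := http.NewServeMux()
	mux.HandleFunc("/ping", ping)                // registers the function directly
	mux.Handle("/ping2", http.HandlerFunc(ping)) // same effect, explicit conversion

	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(call("/ping"))
	fmt.Println(call("/ping2"))
	fmt.Println(call("/missing"))
}
```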
net/http.Server.Serve accepts external TCP connections in a loop and calls net/http.Server.newConn for each connection to create a new net/http.conn, which is the server-side representation of the HTTP connection.
After creating the server-side connection, the standard library starts a separate Goroutine for each connection and calls the net/http.conn.serve method in it to handle the requests on that connection.
The simplified connection handling process consists of three steps: reading the HTTP request, calling the handler to process the request, and finishing the request. Reading the request is done by net/http.conn.readRequest, which takes the HTTP request from the connection and constructs a net/http.response variable that implements the net/http.ResponseWriter interface; any data written to this structure is forwarded to the buffer it holds.
After parsing the HTTP request and initializing the net/http.ResponseWriter, we can call net/http.serverHandler.ServeHTTP to find a handler for the HTTP request.
If the current HTTP server does not have a handler set, the default net/http.DefaultServeMux is used to handle external HTTP requests.
net/http.ServeMux is a multiplexer for HTTP requests: it receives external HTTP requests, then matches and calls the most appropriate handler based on the requested URL.
After a series of function calls, the above process culminates in a call to net/http.ServeMux.match, which traverses the previously registered routing table and matches the request path against it according to specific rules.
If the path of the request matches an entry in the routing table, we call the handler stored in that entry; the business logic in the handler builds the response to the HTTP request via net/http.ResponseWriter and sends it back to the client over the TCP connection.
The Go HTTP standard library provides a very rich set of features. Many languages’ standard libraries provide only the most basic features, so implementing HTTP clients and servers often requires other open source frameworks, but many Go projects use the standard library directly to implement HTTP servers, which also indirectly illustrates the value of the Go standard library.
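The two halves of `ListenAndServe` can also be called separately from user code: `net.Listen` opens the listener and `Server.Serve` runs the accept loop. The sketch below assumes a loopback address with an OS-chosen port and a hypothetical helper `serveOnce`.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// serveOnce (hypothetical helper) starts a server on a free local port,
// sends it one request, and shuts it down again.
func serveOnce() (string, error) {
	// Listen on a TCP address; port 0 lets the OS choose a free port.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "served")
	})}
	// Serve accepts connections in a loop until the server is closed.
	go srv.Serve(ln)
	defer srv.Close()

	resp, err := http.Get("http://" + ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	fmt.Println(serveOnce())
}
```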
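The matching rules can be approximated as "exact match first, then the longest registered pattern that ends in a slash and is a prefix of the path". The sketch below is a hypothetical simplification with made-up routes, not the standard library's code.

```go
package main

import (
	"fmt"
	"strings"
)

// route is a hypothetical routing table entry: a pattern and a handler name.
type route struct {
	pattern string
	name    string
}

// match returns the name of the best route for path: an exact match wins,
// otherwise the longest pattern ending in "/" that prefixes the path.
func match(routes []route, path string) (string, bool) {
	for _, r := range routes {
		if r.pattern == path {
			return r.name, true
		}
	}
	best := -1
	for i, r := range routes {
		if strings.HasSuffix(r.pattern, "/") && strings.HasPrefix(path, r.pattern) {
			if best == -1 || len(r.pattern) > len(routes[best].pattern) {
				best = i
			}
		}
	}
	if best == -1 {
		return "", false
	}
	return routes[best].name, true
}

// demoRoutes is a made-up routing table.
var demoRoutes = []route{
	{"/", "root"},
	{"/static/", "static"},
	{"/api/users", "users"},
}

func main() {
	for _, p := range []string{"/api/users", "/static/css/site.css", "/other"} {
		name, ok := match(demoRoutes, p)
		fmt.Println(p, name, ok)
	}
}
```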