In reviewing the code that my colleagues have written to make external HTTP requests, I have rarely seen the request URL constructed in a standard (that is, correct and safe) way. What is standard practice, you ask? I probably can’t tell you exactly. However, I have a few simple criteria of my own.
- Protocol: Does the request work without a protocol (scheme)?
- Path: Does it stitch the path together correctly, with or without a / at the end?
- Query: Do the query parameters get encoded correctly?
What is a URL? The following structure is from the official Go documentation for the net/url package.
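For reference, the net/url documentation gives the general form as [scheme:][//[userinfo@]host][/]path[?query][#fragment]. Here is a minimal sketch of how those pieces come out of url.Parse. The sample URL and the names parts and split are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// parts holds the URL components discussed in this article.
// The type name is hypothetical, chosen for this sketch only.
type parts struct {
	Scheme, Host, Path, Query, Fragment string
}

// split parses raw with url.Parse and copies out the main components.
func split(raw string) (parts, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return parts{}, err
	}
	return parts{u.Scheme, u.Host, u.Path, u.RawQuery, u.Fragment}, nil
}

func main() {
	// Sample URL is an assumption, not taken from the original article.
	p, err := split("http://example.com:8080/v1/posts?page_no=1#top")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```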
Handling protocols correctly
The url package does not support URLs without a protocol (scheme), and since http also uses the url package internally for parsing, the following request is wrong.
The error is reported as follows.
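A minimal sketch of such a request, assuming the scheme-less address example.com (the helper name schemeError is made up for illustration). As far as I can tell, the client rejects the request before any connection is attempted, and the error text mentions an unsupported protocol scheme.

```go
package main

import (
	"fmt"
	"net/http"
)

// schemeError issues a GET for rawURL and returns the resulting error, if any.
// With a scheme-less URL such as "example.com" the request is expected to fail.
func schemeError(rawURL string) error {
	resp, err := http.Get(rawURL)
	if err != nil {
		return err
	}
	resp.Body.Close()
	return nil
}

func main() {
	// "example.com" has no scheme, so http.Get should return an error.
	if err := schemeError("example.com"); err != nil {
		fmt.Println("request failed:", err)
	}
}
```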
I don’t know if this counts as a bug in the url package, so I won’t speculate here, but I always make a habit of dealing with it up front in the following way.
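A minimal sketch of that habit: check for :// and prepend a scheme if it is missing. Defaulting to http:// is my assumption here, and the function name ensureScheme is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureScheme prepends "http://" when addr contains no "://".
// The choice of "http://" as the default is an assumption for this sketch.
func ensureScheme(addr string) string {
	if !strings.Contains(addr, "://") {
		return "http://" + addr
	}
	return addr
}

func main() {
	fmt.Println(ensureScheme("example.com"))
	fmt.Println(ensureScheme("https://example.com"))
}
```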
Note that I’m judging here by :// and not by https://. There are several reasons for this.
- Simplicity. No need to check for both scheme prefixes.
- No need to worry about case. The protocol (scheme) part of the URL is case-insensitive, so http://example.com is equivalent to the same URL with its scheme written in upper case. If you insist on checking the prefix, you should also write it as follows.
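Roughly, a prefix check that respects case-insensitivity has to lower-case the input and test both prefixes. This is only a sketch, and the function name hasHTTPScheme is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// hasHTTPScheme reports whether addr starts with "http://" or "https://",
// ignoring the case of the scheme.
func hasHTTPScheme(addr string) bool {
	lower := strings.ToLower(addr)
	return strings.HasPrefix(lower, "http://") || strings.HasPrefix(lower, "https://")
}

func main() {
	for _, addr := range []string{"HTTP://example.com", "https://example.com", "example.com"} {
		fmt.Println(addr, hasHTTPScheme(addr))
	}
}
```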
This is too cumbersome to write! This is still the way to write the
Proper handling of paths
The part of http://example.com/path/to/file.txt that looks like /path/to/file.txt is called the Path. Understanding and constructing paths correctly is where many errors happen.
The first and foremost problem is naming. Many people assume that API requests are only ever sent to paths directly under /, like http://example.com/v1/posts, an API where the /v1/posts part is fixed and the preceding http://example.com comes from the configuration file. So they name this part host (or even host_port). At first glance I thought I could only put the example.com part there (because it’s called host, hostname or host_port).
So suppose I want to test a proxied API someday and the prefix changes, for example to http://example.com/proxy/v1/posts. Then the configuration file should now say http://example.com/proxy for this setting. Is this still called a host?
Why am I bothering with this, you ask? I didn’t want to, I didn’t even think this kind of stuff could be a dispute for us. I thought everyone was following the specs.
Where does this scenario come up, you ask? All over the place, for example when Grafana sends requests to data sources in Server mode (as opposed to Direct).
So what’s a good name for it? I’ve seen:
To have or not to have the final /
Because their code takes the configured value and then appends (yes, with +) the API path:
- If the configuration is http://example.com, then you get http://example.com/v1/posts.
- If the configuration is http://example.com/, then you get http://example.com//v1/posts.
See? They simply can’t handle whether it ends with / or not, and at one point they even verbally asked you not to include the final /.
The root cause of this is that their URLs are spliced together by hand.
Not all servers automatically turn // into /, so errors are inevitable.
So how to do it? Use the path package. (This package conflicts with our common variable name path, which is a bit unpleasant.)
There is a corresponding package called filepath. The main difference between these two packages is that the former applies to slash-separated paths, while the latter applies to operating system paths. For example, / separates paths on Linux and \ separates paths on Windows. This is clearly stated in the package documentation.
Package path implements utility routines for manipulating slash-separated paths.
The path package should only be used for paths separated by forward slashes, such as the paths in URLs. This package does not deal with Windows paths with drive letters or backslashes; to manipulate operating system paths, use the path/filepath package.
Package filepath implements utility routines for manipulating filename paths in a way compatible with the target operating system-defined file paths.
The filepath package uses either forward slashes or backslashes, depending on the operating system. To process paths such as URLs that always use forward slashes regardless of the operating system, see the path package.
How do I use the path package? Just one function:
func Join(elem ...string) string
Join joins any number of path elements into a single path, separating them with slashes. Empty elements are ignored. The result is Cleaned. However, if the argument list is empty or all its elements are empty, Join returns an empty string.
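Here is a minimal sketch of how path.Join can be applied to a configured base URL. Note that it is applied to the Path field of a parsed URL rather than to the whole URL string, because Clean would otherwise collapse the // after the scheme. The function name joinAPIPath and the sample values are assumptions of mine.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// joinAPIPath appends apiPath to the path of baseURL using path.Join,
// so a trailing slash on baseURL does not produce "//" in the result.
func joinAPIPath(baseURL, apiPath string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	// Join only the path component; the scheme and host stay untouched.
	u.Path = path.Join(u.Path, apiPath)
	return u.String(), nil
}

func main() {
	bases := []string{
		"http://example.com",
		"http://example.com/",
		"http://example.com/proxy",
		"http://example.com/proxy/",
	}
	for _, base := range bases {
		full, err := joinAPIPath(base, "/v1/posts")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(base, "=>", full)
	}
}
```

With this approach the configuration value can carry a prefix such as /proxy, with or without a final /, and the result should come out the same.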
If you are using Go 1.19 or later, the url library already comes with this capability (url.JoinPath and URL.JoinPath), see: net/url: add
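A short sketch of the built-in variant, assuming a toolchain where url.JoinPath is available. The wrapper name apiURL and the /proxy base are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// apiURL joins elems onto base using url.JoinPath from the standard library.
func apiURL(base string, elems ...string) (string, error) {
	return url.JoinPath(base, elems...)
}

func main() {
	// Both spellings of the base, with and without the final slash.
	for _, base := range []string{"http://example.com/proxy", "http://example.com/proxy/"} {
		full, err := apiURL(base, "v1", "posts")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(base, "=>", full)
	}
}
```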
Path encoding problem
Joining with path.Join and then calling url.String() handles encoding issues automatically:
Many servers, or at least more modern backends, should now be able to handle unencoded characters correctly, and it is uncommon to use characters other than letters and digits in API paths. So the encoding problem is not particularly serious.
Do you think I wrote /v1/posts/new-posts wrong because I didn’t encode it myself? Sorry, no mistake. path.Join joins the path segments before encoding, and the url.String() method produces the final, encoded URL for transmission.
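To make the order of operations concrete, here is a minimal sketch with a segment that contains a space. The segment "hello world" and the function name joinAndEncode are my own, not from the code discussed above.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// joinAndEncode joins segments onto the base URL's path (unencoded)
// and lets URL.String produce the encoded form.
func joinAndEncode(base string, segments ...string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	// Path holds the decoded form; no manual escaping happens here.
	u.Path = path.Join(append([]string{u.Path}, segments...)...)
	// String escapes the path when it serializes the URL.
	return u.String(), nil
}

func main() {
	full, err := joinAndEncode("http://example.com/", "v1/posts", "hello world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(full)
}
```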
Handling query strings correctly
A query is simply the part of the URL that comes after the question mark. For example, in a URL that ends in ?page_no=1&a=b, the part page_no=1&a=b is called the query (query or query_string).
I’m sure everyone has seen someone else’s code that splices this query together by hand, or has written such code themselves (I’m no exception).
The following are the results.
It’s hard to read so much hardcoding. I don’t think you’ve seen many URLs with spaces, have you? The ones with Chinese characters are probably not very standardized either, right? I’m very bitter.
The following is what I think is a more standard and safe way to write it.
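A minimal sketch of building the query with url.Values instead of string concatenation. The function name withQuery, the base URL, and the parameter q are assumptions for illustration. Only page_no comes from the example above.

```go
package main

import (
	"fmt"
	"net/url"
)

// withQuery parses base, sets the given parameters on its query,
// and returns the encoded URL. Values.Encode sorts keys and escapes values.
func withQuery(base string, params map[string]string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query() // keeps any parameters already present in base
	for k, v := range params {
		q.Set(k, v)
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	full, err := withQuery("http://example.com/v1/posts", map[string]string{
		"page_no": "1",
		"q":       "hello world", // the space is escaped by Encode
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(full)
}
```

Spaces, non-ASCII characters and reserved characters such as & or = in values are escaped by Encode, so nothing needs to be hardcoded by hand.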
I don’t know if you have made similar mistakes, but I have made them many times before, which is why I wrote up this summary today.