Recently, while reading the Go 1.18 release notes, I noticed that the Title function in the strings and bytes standard library packages has been deprecated. Why is this?
Take the strings package as an example. strings.Title does the following: it maps every Unicode letter that begins a word to its Unicode title case.
The example is as follows.
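A minimal sketch of the basic behavior. The helper name titleWords and the sample inputs are my own choices for illustration, not taken from the release notes.

```go
package main

import (
	"fmt"
	"strings"
)

// titleWords is an illustrative wrapper around the deprecated strings.Title.
// The call still compiles and runs; deprecation only affects docs and linters.
func titleWords(s string) string {
	return strings.Title(s)
}

func main() {
	for _, s := range []string{"her royal highness", "loud noises"} {
		fmt.Printf("%q -> %q\n", s, titleWords(s))
	}
	// Prints:
	// "her royal highness" -> "Her Royal Highness"
	// "loud noises" -> "Loud Noises"
}
```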
The first letter of each word is converted to upper case (strictly speaking, to title case).
Everything looks fine, but strings.Title actually has two obvious flaws:
- Does not handle Unicode punctuation correctly.
- Does not take into account the capitalization rules of specific human languages.
Let’s get into the details.
For the first flaw, the example is as follows.
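A small sketch that reproduces this case. The second separator in the input is U+2024 (ONE DOT LEADER), written as an escape so it is visible in the source. The helper name dotLeaderTitle is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// dotLeaderTitle title-cases a string whose second separator is U+2024
// (ONE DOT LEADER), a Unicode punctuation character.
func dotLeaderTitle() string {
	a := strings.Title("go.go\u2024go")
	return a
}

func main() {
	// The ASCII '.' is treated as a word boundary, U+2024 is not,
	// so the last "go" keeps its lower-case 'g'.
	fmt.Println(dotLeaderTitle()) // Go.Go․go
}
```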
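The variable a ends up as “Go.Go․go”, but the expected result is “Go.Go․Go”. The second separator is the Unicode punctuation character U+2024 (ONE DOT LEADER), which strings.Title does not treat as a word boundary.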
Language specific rules
For the second problem, the code is as follows.
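A minimal sketch of the Dutch case. The helper name stdTitle is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// stdTitle is an illustrative wrapper around the deprecated strings.Title,
// which has no notion of the language the text is written in.
func stdTitle(s string) string {
	return strings.Title(s)
}

func main() {
	// Only the first letter is mapped to title case.
	fmt.Println(stdTitle("ijsland")) // Ijsland
}
```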
The Dutch word “ijsland” should be capitalized as “IJsland”, but strings.Title converts it to “Ijsland”.
This problem was reported back in 2013 in the issue “strings: Title function incorrectly handles word breaks”, which Rob Pike, one of the creators of the Go language, marked as unplanned.
Because of the Go 1 compatibility promise, this is “impossible” to fix: any fix would change the function's output, which makes it a breaking change.
However, there is another approach, which is the “deprecation” mentioned at the start of this article: mark the function as “Deprecated”. The Go documentation then collapses the entry and shows the deprecation notice explicitly, and it recommends using the golang.org/x/text/cases library directly to implement this functionality.
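As a rough illustration of what such a marking looks like: by Go convention, a doc comment paragraph that begins with “Deprecated:” flags the identifier as deprecated. The function capitalizeFirst below is hypothetical and is not part of the standard library. It only demonstrates the comment form.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalizeFirst is a hypothetical helper (not a standard library function)
// that maps only the first rune of s to title case.
//
// Deprecated: shown only to illustrate the convention. Documentation tools
// and linters look for a paragraph that starts with "Deprecated:".
func capitalizeFirst(s string) string {
	if s == "" {
		return s
	}
	r, size := utf8.DecodeRuneInString(s)
	return string(unicode.ToTitle(r)) + s[size:]
}

func main() {
	fmt.Println(capitalizeFirst("ijsland")) // Ijsland
}
```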
An example using the new x/text/cases package is as follows.
The example converts text under several language settings. The part to focus on is the calls such as cases.Lower(language.Und): each one takes a language tag and returns the converter that the program then calls.
Specifying the language in code lets the library apply the punctuation, word-boundary, and capitalization rules of each human language, instead of a one-size-fits-all rule.
But this new approach is also a new “trap”: it clearly introduces more complexity. As the good old saying goes, “less is more…”, so it is worth weighing the extra cost before using it.
Although Title is only a small function, it has raised a lot of problems. In essence, its design was subject to cognitive limitations.
In practice, the Title functions in strings and bytes are often misunderstood as functions that simply capitalize the first letter, which is not what they were designed for.
In the end this misunderstanding gives acceptable results in most cases despite the defects, but it is still a big problem for some special scenarios and for language support.