I’ve been following the development of Rust as a language, but I hadn’t actually used it. Recently I wanted to write something, so I implemented a basic module in both Rust and Go. This article presents some results of that exercise.

I have long experience with Go, so the Go parts of this article should be fairly accurate. I have also been following Rust for a long time, but the Rust code here is essentially the product of the last week or so, so my opinions about it may be biased.

This comparison is about implementing a string type optimized for short strings. When deserializing, the length of an incoming string is unknown, so the deserializer has to allocate a piece of memory to hold the content of each string it receives. In practice, however, most deserialized strings are short, and the resulting repeated allocation and freeing of memory can cost a lot of performance. The short-string optimization is to pre-allocate a buffer of a fixed length inside the string type: if the deserialized string fits into that buffer, it is stored there directly, with no extra allocation and no later free, which improves efficiency.

Since I am more familiar with Go, I first implemented this function using Go:

const prealloc_size = 20

type String struct {
  data [prealloc_size]byte // pre-allocated buffer for short strings
  buf  []byte              // the bytes actually in use
}

func (s *String) fill(r io.Reader, n uint) error {
  if n > prealloc_size {
    // Long string: allocate a dedicated buffer on the heap.
    s.buf = make([]byte, n)
  } else {
    // Short string: reuse the pre-allocated buffer, no allocation.
    s.buf = s.data[:n]
  }
  _, err := io.ReadFull(r, s.buf)
  return err
}

func (s *String) String() string {
  // Reinterpret buf as a string without copying, as strings.Builder does.
  return *(*string)(unsafe.Pointer(&s.buf))
}

This code is fairly straightforward, so I won’t dwell on it. The only thing worth mentioning is the String.String() method, which uses unsafe to reinterpret buf directly as a string and avoid an extra allocation and copy; for details, see the standard library’s strings.Builder.String() method.

Since we want to measure performance, we also need benchmark code. For brevity, the unit tests are not shown here.

const (
    shortStr = "short str"
    longStr  = "loooooooooonnnnnnnnnngggggggg string"
)
func BenchmarkShortString(b *testing.B) {
    b.StopTimer()
    var str String
    data := []byte(shortStr)
    r := bytes.NewReader(data)
    l := uint(len(data))
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        r.Reset(data)
        str.fill(r, l)
        str.String()
    }
}
func BenchmarkLongString(b *testing.B) {
    b.StopTimer()
    var str String
    data := []byte(longStr)
    r := bytes.NewReader(data)
    l := uint(len(data))
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        r.Reset(data)
        str.fill(r, l)
        str.String()
    }
}

The test results are as follows.

$ go test -bench . -benchmem
goos: darwin
goarch: amd64
pkg: github.com/googollee/rtnx/rtmp/amf
BenchmarkShortString-4      50000000            25.1 ns/op         0 B/op          0 allocs/op
BenchmarkLongString-4       20000000            67.0 ns/op        48 B/op          1 allocs/op
PASS
ok      github.com/googollee/rtnx/rtmp/amf  3.189s

As you can see, the short-string case really does not allocate in the benchmark and shows a significant performance advantage over the long-string case.

After that, I translated the code more or less directly to Rust, and ran into a number of problems along the way.

The first problem was, of course, ownership! The buf field in the Go struct can refer to two different things: either the internal data buffer or a separately allocated piece of memory. Rust requires all references to be organized into a tree of ownership, and an internal reference amounts to a node referencing itself, which can never form such a tree. So the field has to be split in two: one part records how many bytes of the internal buffer are in use when data is referenced, and the other owns a separately allocated buffer when one is needed. The first attempt is as follows.

pub struct String {
    data: [u8; 20],
    heap: Option<Box<[u8]>>,
    len: usize,
}

Originally, I wanted to use Rust generics to parameterize the length of the pre-allocated buffer. After fiddling for quite a while, it turned out that Rust did not yet support const generics, so the length 20 is hard-coded here. The good news is that the rest of the implementation does not depend on the literal 20 directly, so nothing else had to change.

The Rust community works around this by passing an array type as a type parameter, something like String<[u8; 20]>, in place of a constant parameter. There is now an official Rust proposal for const generics (https://github.com/rust-lang/rust/issues/44580).
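
To make that workaround concrete, here is a minimal sketch (ShortString and its bounds are invented for illustration, not the code used in this article): the constant length is smuggled in through an array type parameter.

pub struct ShortString<A: AsRef<[u8]> + AsMut<[u8]> + Default> {
    data: A,                  // e.g. [u8; 20]: the array type carries the length
    heap: Option<Box<[u8]>>,
    len: usize,
}

impl<A: AsRef<[u8]> + AsMut<[u8]> + Default> ShortString<A> {
    pub fn new() -> Self {
        ShortString { data: A::default(), heap: None, len: 0 }
    }

    // The pre-allocated capacity is recovered from the array itself.
    pub fn capacity(&self) -> usize {
        self.data.as_ref().len()
    }
}

// Usage: the "constant parameter" travels in the array type.
// let s: ShortString<[u8; 20]> = ShortString::new();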

In the String definition above, data is, as before, the pre-allocated buffer, and len records how many bytes of data are in use in the short-string case. Since the internal reference is gone, Rust’s ownership rules are no longer violated.

The next question is how heap should be defined. Rust requires every reference to point at an actual object; a reference variable with nothing behind it is not allowed, so a &T can never be empty. But I want heap to be empty most of the time, so Option<_> is used here: a type whose value is either None or Some value.

The Box<_> that follows indicates a value allocated on the heap. Unlike Go, Rust draws a very sharp line between heap and stack variables, whereas Go uses the compiler’s escape analysis to decide for itself whether a variable needs to go on the heap and be reclaimed by the GC later. This reflects a completely different mindset: Go tries to reduce the number of implementation details a programmer needs to know, providing a small but sufficient set of facilities to keep the language easy to use while still performing well; Rust exposes a great deal of detail and forces the programmer to follow its constraints in order to make sure the program is correct. Because Rust exposes those details, experienced users can make more aggressive optimizations based on them. Go, on the other hand, is often limited by what the compiler can do (and in many cases the programmer is not even aware that such an optimization is possible). For example, Go code often allocates a temporary buffer when reading from a connection.

func Handle(r io.Reader) {
  // ...
  var data [20]byte
  r.Read(data[:])
  // ...
}

Here data is the temporary buffer. In principle this memory is only used inside Handle, could live entirely on the stack, and would be freed automatically when Handle returns. But because escape analysis cannot prove that r.Read() does not retain the slice data[:], the compiler assumes that data escapes, allocates it on the heap, and leaves it for the GC to reclaim later.

Back to Rust: heap ends up defined as Option<Box<[u8]>>, i.e. either None or a [u8] allocated on the heap. The slice type [u8] is close to Go’s slice in both concept and implementation: a view into a contiguous block of bytes. Note that a bare [u8] cannot be created directly on the stack, because its length is not known at compile time.
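
A quick sketch of that last point (illustrative only): since a bare [u8] has no size known at compile time, it always lives behind some pointer type such as Box<[u8]> or &[u8].

fn main() {
    // Owned slice on the heap; the length is chosen at run time.
    let boxed: Box<[u8]> = vec![0u8; 16].into_boxed_slice();

    // Borrowed view into it; &[u8] is the closest analogue of a Go slice.
    let view: &[u8] = &boxed[..4];
    assert_eq!(view.len(), 4);

    // let direct: [u8] = ...; // does not compile: [u8] has no compile-time size
}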

Next is the fill() method.

pub fn fill(&mut self, r: &mut impl io::Read, n: usize) -> io::Result<()> {
    let buf = if n > self.data.len() {
        // Long string: allocate on the heap and keep ownership in self.heap.
        let mut v = Vec::with_capacity(n);
        unsafe { v.set_len(n) }; // contents are filled by read_full below
        self.heap = Some(v.into_boxed_slice());
        match self.heap.as_mut() {
            Some(b) => b,
            None => panic!("should not be here"),
        }
    } else {
        // Short string: reuse the pre-allocated buffer, no extra allocation.
        self.len = n;
        &mut self.data[0..n]
    };
    read_full(r, buf)
}

Rust guarantees by default that variables are initialized before use, but Vec::with_capacity() only allocates memory without initializing its contents. Since the buffer is about to be filled with real data anyway, unsafe { v.set_len(n) } is used to expose the uninitialized memory. Note that after v.into_boxed_slice() hands ownership of this memory to self.heap, we still need a way to get a reference to it back into the buf variable. The type of self.heap is Option<Box<[u8]>>, and one might expect to unwrap it in the order Option<Box<[u8]>> -> Box<[u8]> -> [u8] -> &mut [u8]. The problem is that taking the Box<[u8]> out of the Option would move it: a Box is an owning type with its own lifetime, not a reference. So the actual order is Option<Box<[u8]>> -> Option<&mut Box<[u8]>> -> &mut Box<[u8]> -> &mut [u8], and the first step is obtained via self.heap.as_mut().
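
The same chain, stripped out of fill() into a standalone sketch (names are illustrative):

fn main() {
    let mut heap: Option<Box<[u8]>> = Some(vec![0u8; 4].into_boxed_slice());

    {
        // Option<Box<[u8]>> -> Option<&mut Box<[u8]>> -> &mut Box<[u8]> -> &mut [u8]
        let buf: &mut [u8] = heap.as_mut().expect("heap was just set");
        buf[0] = 1;
    }

    assert_eq!(heap.unwrap()[0], 1);
}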

Since Rust has no direct counterpart of Go’s io.ReadFull(), here is one. (As @upsuper pointed out, the standard library does have something similar: std::io::Read::read_exact. I still use my own implementation here; a version based on read_exact is sketched after it for comparison.)

fn read_full(r: &mut impl io::Read, b: &mut [u8]) -> io::Result<()> {
    let mut i = 0;
    while i < b.len() {
        match r.read(&mut b[i..]) {
            Err(e) => return Err(e),
            // A read of 0 bytes means EOF; bail out rather than looping forever.
            Ok(0) => return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short read")),
            Ok(n) => i += n,
        }
    }
    Ok(())
}
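
For comparison, a version built on the read_exact method mentioned above could be as short as this (a sketch; read_full_std is a made-up name):

use std::io::{self, Read};

// Same contract as read_full above: fill b completely or return an error.
fn read_full_std(r: &mut impl Read, b: &mut [u8]) -> io::Result<()> {
    r.read_exact(b)
}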

io::Result<()> is a very Rust-flavored type: it is an alias for Result<(), io::Error>, i.e. a value that is either Ok(()) or Err(io::Error). The () type is Rust’s unit type, which can be thought of roughly as “no meaningful value here”; it stands in for a return value nobody needs to inspect. io::Error simply represents some kind of I/O error. Contrast this with the Go signature.

func ReadFull(r io.Reader, buf []byte) (n int, err error)

Obviously, Go’s type system cannot express “either an int or an error” syntactically; ReadFull() always returns both. Returning two values at once can confuse the caller: is this a normal return, or an error? Don’t assume that Go’s convention of returning the error last and checking it first makes this a non-issue; in fact, even the Go standard library does not avoid it. The io.Reader documentation itself says:

When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call. An instance of this general case is that a Reader returning a non-zero number of bytes at the end of the input stream may return either err == EOF or err == nil. The next Read should return 0, EOF.

Back to Rust: the simplest way to handle this kind of error is to call unwrap() directly (as shown later), while the match used here allows more fine-grained handling. Since match is an expression rather than a statement, the error can be handled in the following way.

let ret = match function() {
  Err(err) => {
    // handle err
    return Err(err);
  },
  Ok(r) => r,
};

Comparison with Go’s error handling.

ret, err := function()
if err != nil {
  // handle err
  return err
}

Rust’s error handling has the advantage of not scattering err variables all over the place, although the Go version is terser. Moreover, when you don’t need to do anything special with the error and just want to pass it up, Rust’s ? operator (which grew out of the try! macro) shortens the code to:

let ret = function()?;

This reads much better than Go. Even the error-handling designs expected for Go 2 are no better than this.
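
One caveat worth knowing: ? only compiles inside a function whose own return type can absorb the error, e.g. another Result. A minimal sketch (function() is a placeholder):

use std::io;

fn function() -> io::Result<u32> {
    Ok(42)
}

fn caller() -> io::Result<u32> {
    // On Err, `?` returns the error from caller() immediately;
    // on Ok, it unwraps the value.
    let ret = function()?;
    Ok(ret + 1)
}

fn main() {
    assert_eq!(caller().unwrap(), 43);
}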

Next is the benchmarking part. I’m not sure how Rust is meant to handle this side of the engineering workflow, but when I experimented, Cargo supported the bench command on the stable channel while relying on the test harness from nightly to actually work. The current Rust ecosystem is full of these features that are nominally available on stable but depend on nightly, which makes building a complete project painful. Compared with the rock-solid testing.B that ships with Go, this part of Rust does not feel like a carefully planned project that has been calling itself stable for three years.

In the end, the whole benchmark was written with the Criterion project, at the cost of the benchmark code living in a different directory from the String definition and having no way to reach String’s private methods. That is why fill() is declared pub above. The benchmark code is as follows.

#[macro_use]
extern crate criterion;
extern crate rtnx;
use criterion::Criterion;
use rtnx::rtmp::*;
fn string(c: &mut Criterion) {
    c.bench_function("short str", |b| {
        let mut str = String::new();
        let data: &'static str = "short str";
        b.iter(|| {
            str.fill(&mut data.as_bytes(), data.len()).unwrap();
            str.string();
        })
    });
    c.bench_function("long str", |b| {
        let mut str = String::new();
        let data: &'static str = "loooooooooooooooooonnnnnnnnnnnnng str";
        b.iter(|| {
            str.fill(&mut data.as_bytes(), data.len()).unwrap();
            str.string();
        })
    });
}

// Register and run the benchmark with Criterion's macros.
criterion_group!(benches, string);
criterion_main!(benches);

Rust implements Read for &[u8], so data.as_bytes() can be used directly as a reader. Since fill() takes &mut impl Read, &mut data.as_bytes() is passed here. The 'static annotation denotes the static lifetime, i.e. a string constant that lives for the whole program.
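
As a standalone illustration of &[u8] acting as a reader (a minimal sketch, independent of the benchmark code):

use std::io::Read;

fn main() {
    let data: &'static str = "short str";

    // &[u8] implements io::Read; reading advances the slice itself.
    let mut r = data.as_bytes();
    let mut buf = [0u8; 5];
    r.read_exact(&mut buf).unwrap();

    assert_eq!(&buf, b"short");
    assert_eq!(r, &b" str"[..]); // the reader has consumed "short"
}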

Test results.

$ cargo bench
    Finished release [optimized] target(s) in 0.27s
     Running target/release/deps/rtnx-b3fb2c92c52ff0ff
running 1 test
test rtmp::amf::string_test::test_string ... ignored
test result: ok. 0 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out
     Running target/release/deps/rtnx-b073d7725e43e2cb
running 1 test
test rtmp::amf::string_test::test_string ... ignored
test result: ok. 0 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out
     Running target/release/deps/string-e1abdea77166d91b
Gnuplot not found, disabling plotting
short str               time:   [8.8030 ns 8.8762 ns 8.9604 ns]                       
                        change: [-2.0804% +0.0430% +1.9924%] (p = 0.96 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe
long str                time:   [29.893 ns 30.005 ns 30.156 ns]                      
                        change: [-0.1391% +0.6297% +1.4001%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
Gnuplot not found, disabling plotting

Comparison with the results of previous Go tests.

language    short string    long string
Rust        8.8762 ns       30.005 ns
Go          25.1 ns         67.0 ns

Rust takes less than half the time of Go in both cases; in raw performance, Rust clearly beats Go.

There are a few things not mentioned above.

First, there is concurrency. Go has concurrency built in: go and chan alone give you efficient, low-overhead concurrency. Rust has no such support at the language level, but its flexible type system lets it provide the concurrency building blocks as a library, with the Future type; on top of that, tokio implements a set of goroutine-like concurrency mechanisms. Because Go’s type system is too simple, there is no way to implement this kind of concurrency as a library. This is something Rust does beautifully.

Regarding packages: compared to Go’s directory-based packages, Rust’s are full of unnecessary complexity. mod name introduces another compilation unit, name, which is either a file name.rs in the same directory or a file name/mod.rs in a subdirectory. The mod also creates a namespace called name, so to refer to symbols in it you either bring them in with use or prefix them with name::. This makes it extremely awkward to split the code of one package across several files. Basically, to split files and still let them refer to each other, you need to write code like the following.

mod part1;
mod part2;
pub use self::part1::*;
pub use self::part2::*;

This also causes the private symbols in part1 and part2 to be invisible to each other. The mod/use feature seems flexible, but in practice it ends up forcing the contents of a package to be written to a single file. This is much harder to understand and use than Go’s natural “one package, one directory” format. Rust 2018 also tries to improve this problem, as detailed in Path clarity.
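
To make the namespace rules above concrete, a tiny sketch (part1 and helper are made-up names): a symbol in another mod is reached either through use or through its name:: prefix.

mod part1 {
    pub fn helper() -> u32 {
        1
    }
}

// Either bring the symbol into scope with use...
use self::part1::helper;

fn f() -> u32 {
    // ...or spell out the namespace prefix each time.
    helper() + part1::helper()
}

fn main() {
    assert_eq!(f(), 2);
}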

Rust controls what goes into a compilation unit with attributes. For example, to compile a piece of code only during tests, you write:

#[cfg(test)]
mod tests {
  // ...
}

This way the code inside mod tests is only compiled for tests. Go makes this distinction with file name suffixes: for example, xxx_test.go is only compiled for tests by default. I find the Go approach more natural here: you can tell from the file names alone which files participate in which build. Likewise, xxx_unix.go/yyy_windows.go makes maintenance easier. With Rust, in the absence of proper tooling, you can only discover the compilation rules by reading the files one by one.

Speaking of tools, the quality of the Rust toolchain is dismal compared to Go’s. The benchmarking story above is one example. A large number of Rust tools are currently in preview or only available on nightly (it seems clippy has only recently reached stable preview status). The Rust community hopes that 2018 will bring a stable, productivity-focused release, Rust 2018, but with the year half over, some of the planned syntax work has still not been merged into stable. This is reminiscent of how C++98 was dragged out to C++03 and then to C++0x. The Rust community has set the release of the 2018 edition for October, and the projects currently seem to be on schedule. I hope we see a genuinely productive stable release by the end of the year, covering not just the language features but also the key parts of the toolchain.

Go’s current GOPATH model probably only works well for Google itself. The good news is that Go modules are already on the way, so we won’t have to put up with this hard-to-use feature much longer.

Rust and Go are two completely different languages. Go tries to optimize the development process, reduce the number of concepts exposed to programmers, and keep its interfaces stable. Rust, on the other hand, exposes every capability to the programmer as a choice; the language team focuses on the core compiler and on solidifying best practices into language features, leaving the toolchain to the community to refine.

I recommend that every Go programmer give Rust a try and experience how the compiler forces you to understand the details of your program. These details are exactly the complexity Go tries to hide, but understanding them is very helpful when writing programs, even Go programs. On the other hand, for companies that want to build a solid core, assemble a team of varying skill levels, leverage the strength of the existing community, and keep engineering complexity down, Go is by far the better choice.