The role of move semantics

Let's start with an example.

#include <iostream>

struct S {
  S() { std::cout << "S()\n"; }
  S(const S&) { std::cout << "S(const S&)\n"; }
  ~S() { std::cout << "~S()\n"; }
};

S foo() {
  return S();
}

int main() {
  foo();
}

Compile with the following command.

clang++ demo.cc -std=c++03 -fno-elide-constructors

The results are as follows.

S()
S(const S&)
~S()
~S()

Notice that there is one constructor call, one copy constructor call, and two destructor calls. Because foo returns by value, the program first constructs a temporary S object, then copies it into the function's return value, and finally destroys the temporary and the returned object in turn. If S is a very large object, this is a lot of overhead.

We modify the code as follows.

int main() {
  S s = foo();
}

The result after compiling and running is as follows.

S()
S(const S&)
~S()
S(const S&)
~S()
~S()

There is now one more copy construction: the value returned by the function is copied into s in main. That makes two unnecessary copies in a very small piece of code, which is clearly wasteful. Couldn't the object constructed in return S() initialize s in main directly?

With the introduction of move semantics in C++11, we can add a move constructor to S.

S(S&&) { std::cout << "S(S&&)\n"; }

Compile with the following command.

clang++ demo.cc -std=c++11 -fno-elide-constructors

The results are as follows.

S()
S(S&&)
~S()
S(S&&)
~S()
~S()

All of the earlier copy constructor calls have been replaced by move constructor calls. If we write the move constructor so that it takes over the resources of the source object directly and sets the source's own handles to null, it effectively “steals” those resources, turning an expensive copy into a cheap move.

For example, class S holds a pointer to a block of heap memory.

struct S {
  const char* data;
  int len;

  S(S&& other);  // move constructor, defined below
};

For its move constructor, we can have the following implementation.

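// std::exchange is declared in <utility> and requires C++14 or later.
S::S(S&& other) {
  // Take over the pointer and length, leaving `other` empty.
  data = std::exchange(other.data, nullptr);
  len = std::exchange(other.len, 0);
}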

In effect, the source object surrenders its resource and is left in an empty state that can still be safely destroyed.
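
To make the idea concrete, here is a minimal sketch of a type that really owns heap memory. The name Buffer and its members are hypothetical and chosen only for illustration; copying is disabled to keep the example short, and std::exchange assumes C++14 or later.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical owning buffer, used only for illustration.
struct Buffer {
  char* data;
  std::size_t len;

  // Allocate a private copy of `text`, including the terminating '\0'.
  explicit Buffer(const char* text)
      : data(new char[std::strlen(text) + 1]), len(std::strlen(text)) {
    std::memcpy(data, text, len + 1);
  }

  // Copying is disabled to keep the sketch short.
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;

  // Move constructor: take over the allocation and leave the source empty.
  Buffer(Buffer&& other) noexcept
      : data(std::exchange(other.data, nullptr)),
        len(std::exchange(other.len, 0)) {}

  // delete[] on a null pointer is a no-op, so a moved-from Buffer
  // can still be destroyed safely.
  ~Buffer() { delete[] data; }
};
```

After Buffer b(std::move(a)), b holds the pointer that a used to hold, and a is left with a null pointer and a length of zero. No characters are copied.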

Note, however, that the object constructed in return S() still does not initialize s in main directly; the copy constructor calls have merely become move constructor calls. Copy elision, discussed below, optimizes this further.

When the move constructor is called

In simple terms, the move constructor is called when a “temporary variable” needs to be copied. For example:

S s = S();

Compile with the following command.

clang++ demo.cc -std=c++11 -fno-elide-constructors

The results are as follows.

S()
S(S&&)
~S()
~S()

The S() to the right of the equals sign is a temporary that is destroyed at the end of the full expression that creates it. So instead of copying it, we can move from it and take over all of its resources, since nothing uses it afterwards.

Rvalues and std::move

This “temporary variable” is called an rvalue. Rvalue is one of the value categories, and a value category is a property of an expression. It is important to keep in mind that value categories belong to expressions, not to objects or declarations. For example:

int x = 42;

It makes no sense to ask for the value category of the line above, because it is a declaration, not an expression. The C++ standard defines an expression as follows.

An expression is a sequence of operators and operands that specifies a computation. An expression can result in a value and can cause side effects.

Here are a few examples.

42               // integer literal
"Hello, world!"  // string literal
nullptr          // nullptr literal
x = 42           // assignment expression
x                // id-expression naming a declared variable

These are all expressions. Note that they do not end with a semicolon; with one, they would become statements.

An expression can yield a value, cause side effects, or both. For example, 42, "Hello, world!" and x above each yield a value, while x = 42 also has a side effect. The value category describes a property of the expression that yields the value.

It is hard to give a definition of rvalue that is both precise and easy to understand, so for now we can think of it as a “temporary” value. Initializing an object from such an rvalue calls the move constructor when one is available, which gives us room to avoid unnecessary copies.

So can we move from something that is not an rvalue?

Suppose we have the following code.

S s1{};
s1.do_something();
S s2 = s1; // If s1 is no longer needed, can all its resources be passed on to s2?

The answer is yes: we can cast it to an rvalue reference.

S s2 = static_cast<S&&>(s1);

The standard library provides a library function std::move() to encapsulate this type conversion, so we can write it as follows.

S s2 = std::move(s1);

As we can see, std::move() does not “move” anything; it is just a cast that gives the move constructor a chance to be called. If the move constructor is itself implemented as a copy, then using std::move() brings no performance improvement at all.
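
The following sketch illustrates both points. my_move is a hypothetical stand-in that mirrors how std::move is commonly written (it is not the standard library's source), and CopyOnly is a made-up type that declares only a copy constructor, so a "move" from it silently falls back to a copy.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::move: nothing but a cast to an rvalue reference.
template <typename T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
  return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// A type that declares only a copy constructor, so it has no move constructor.
struct CopyOnly {
  int copies = 0;
  CopyOnly() = default;
  CopyOnly(const CopyOnly& other) : copies(other.copies + 1) {}
};

// Returns how many times the object was copied while being "moved".
int copies_after_move() {
  CopyOnly a;
  CopyOnly b = my_move(a);  // the rvalue binds to CopyOnly(const CopyOnly&)
  return b.copies;
}
```

Here copies_after_move() returns 1: the cast succeeded, but overload resolution could only find the copy constructor.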

Also note that the lifetime of s1 does not end when it is moved from. For standard library types, it enters what is called a valid but unspecified state: it can still be destroyed or assigned a new value, but its contents should not be relied on until then. Using a moved-from value is a common mistake in C++ programming, known as use after move.
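
As a small illustration with std::string (the function name reuse_after_move is hypothetical), the moved-from object may be given a new value and used again, while reading it before that point is what should be avoided.

```cpp
#include <string>
#include <utility>

std::string reuse_after_move() {
  std::string s1 = "a fairly long string that probably lives on the heap";
  std::string s2 = std::move(s1);  // s1 is now valid but unspecified

  // Reading s1 here would compile, but its contents must not be relied on.

  s1 = "fresh value";  // assignment has no preconditions, so this is fine
  return s1;           // s1 has a well-defined value again
}
```

Tools such as clang-tidy's bugprone-use-after-move check can flag reads that happen between the move and the reassignment.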

Value categories

Above we briefly introduced one of the value categories, the rvalue. Below we discuss all of the value categories in C++ in more detail.

Figure: All value categories in C++

The C++ value categories are easy to confuse: glvalue and rvalue partly overlap, because an xvalue is both. There are three primary value categories.

  • lvalue: locator value
  • prvalue: pure rvalue
  • xvalue: expiring value

The exact classification rules are complex; see the Value categories reference for details. In brief:

  • An expression that is just the name of a variable or function is an lvalue
  • All string literals, such as "Hello, world", are lvalues
  • All non-string literals, such as 0, true or nullptr, are prvalues
  • All unnamed temporary values, in particular objects returned by value, are prvalues
  • The result of std::move() is an xvalue
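
These rules can be checked with decltype((expr)), which yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue. The sketch below builds on that; the category template, the VALUE_CATEGORY macro and classify_examples are hypothetical names used only for illustration.

```cpp
#include <string>
#include <utility>

// Map the type produced by decltype((expr)) to the name of a value category.
template <typename T> struct category      { static const char* name() { return "prvalue"; } };
template <typename T> struct category<T&>  { static const char* name() { return "lvalue"; } };
template <typename T> struct category<T&&> { static const char* name() { return "xvalue"; } };

// The extra parentheses make decltype report the category of the expression
// rather than the declared type of a name.
#define VALUE_CATEGORY(expr) category<decltype((expr))>::name()

std::string classify_examples() {
  int x = 42;
  std::string out;
  out += VALUE_CATEGORY(x);             out += ' ';  // a named variable
  out += VALUE_CATEGORY(42);            out += ' ';  // a non-string literal
  out += VALUE_CATEGORY("Hello");       out += ' ';  // a string literal
  out += VALUE_CATEGORY(std::move(x));               // the result of std::move()
  return out;
}
```

classify_examples() returns "lvalue prvalue lvalue xvalue", matching the list above.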

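Since C++17, a prvalue is no longer a temporary object in its own right but an initializer for an object. A prvalue of complete type is converted to an xvalue of the same type (temporary materialization) only when a temporary object is actually needed. As a result, a prvalue can be returned and passed by value even if its type has no usable copy constructor or move constructor. For example: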

struct S {
  S() = default;
  S(const S&) = delete;
  S(S&&) = delete;
};

S foo() {
  return S(); // Legal since C++17.
}

void bar(S s) {}

int main() {
  bar(foo()); // Legal since C++17.
}

Rvalue references and overload resolution

To go with the concept of rvalues, the C++ standard introduced rvalue references.

int&& rvalue_ref = 42;

Rvalue references are rarely used on their own; they are best understood as a companion to the concept of rvalues. One point to note is that an rvalue reference can only bind to an rvalue. The table below shows how this affects overload resolution: the numbers give the order of preference among the overloads, and “no” means the overload cannot be called.

Call                      f(X&)  f(const X&)  f(X&&)  f(const X&&)
f(value)                  1      2            no      no
f(const_val)              no     1            no      no
f(X{})                    no     3            1       2
f(std::move(value))       no     3            1       2
f(std::move(const_val))   no     2            no      1

To summarize, a function taking an rvalue reference parameter (X&&) accepts only non-const rvalues, whereas a const lvalue reference (const X&) accepts any argument. For rvalues, however, const X& has the lowest priority, so it can be regarded as a fallback.
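
The first-choice column of the table can be reproduced with a small overload set. This is a sketch: each overload simply returns its own signature as a string, and g is a hypothetical function that offers only the const X& fallback.

```cpp
#include <string>
#include <utility>

struct X {};

// One overload per column of the table; each reports which one was chosen.
std::string f(X&)        { return "f(X&)"; }
std::string f(const X&)  { return "f(const X&)"; }
std::string f(X&&)       { return "f(X&&)"; }
std::string f(const X&&) { return "f(const X&&)"; }

// Only the const lvalue reference overload exists: it accepts every argument.
std::string g(const X&) { return "g(const X&)"; }
```

Given X value; and const X const_val{};, the call f(value) selects f(X&), f(const_val) selects f(const X&), f(X{}) and f(std::move(value)) select f(X&&), and f(std::move(const_val)) selects f(const X&&). All of the same arguments are accepted by g.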

The table also includes const rvalue references. They are syntactically valid but semantically almost meaningless, so they can be ignored, and in general we should not create them. Rvalue references exist to “move”, that is, to transfer resources, which requires modifying the source; making the reference const, i.e. immutable, defeats that purpose.

Copy elision

Copy elision is a compiler optimization that eliminates unnecessary copy or move operations. Its two best-known forms are RVO (Return Value Optimization) and NRVO (Named Return Value Optimization). Since C++17, elision in the RVO case is mandatory, that is, guaranteed by the standard; NRVO remains optional.

In a return statement, copy elision must occur when the operand is a prvalue of the same type as the return type of the function (ignoring CV qualification), generally referred to here as RVO.

S foo() {
  return S();  // RVO: guaranteed since C++17.
}

int main() {
  S s = foo();  // Copy elision: guaranteed since C++17.
}

Compile with the following command.

clang++ demo.cc -std=c++17

The output is as follows.

S()
~S()

The object constructed in return S() now initializes s in main directly, and no temporaries are created. This is why the earlier examples were compiled with -fno-elide-constructors: compilers perform this optimization by default, and the effect of move semantics is only visible with it turned off.

In the following case, elision is not guaranteed, but compilers generally perform it.

S foo() {
  S s{};
  return s;  // NRVO: not guaranteed, but compilers generally perform it.
}

int main() {
  S s = foo();  // Copy elision: guaranteed since C++17.
}

The output is as follows.

S()
~S()
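
Because NRVO is optional, it can be useful to count constructor calls instead of reading printed output. The sketch below uses hypothetical names (Counts, g_counts, Tracked, make_named, make_moved). It assumes a C++11 or later compiler, and the exact number of moves depends on the compiler and its flags.

```cpp
#include <utility>

// Hypothetical counters used only to observe which constructors run.
struct Counts {
  int copies = 0;
  int moves = 0;
};
Counts g_counts;

struct Tracked {
  Tracked() = default;
  Tracked(const Tracked&) { ++g_counts.copies; }
  Tracked(Tracked&&) noexcept { ++g_counts.moves; }
};

// NRVO candidate: if the compiler does not elide, the local is moved, not copied.
Tracked make_named() {
  Tracked t;
  return t;
}

// Wrapping the local in std::move() turns the operand into an xvalue,
// which rules out NRVO and forces at least one move.
Tracked make_moved() {
  Tracked t;
  return std::move(t);
}
```

With either function the copy constructor is never called. make_named() typically costs no moves at all when NRVO is applied, whereas make_moved() always costs at least one, which is why writing return std::move(local); is usually discouraged.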