C++ marks a type as immutable with the keyword const. The idea itself is easy to understand, but in C++ even simple concepts leave a lot to discuss. Let’s look at a problem.

Problem

We know that const can qualify a member function to indicate that the function cannot modify the object’s data. Suppose a class has a pointer member T *p, and we want a get() method that returns a reference to the object p points to. If get() is const-qualified, which type should it return: T& or const T&?

class C {
public:
    ??? get() const { return *p; }
private:
    T *p;
};

Many people would naturally answer const T&, because get() should not change the data. Indeed, many classes work this way. For example, the standard library’s sequence containers all provide a front() method that returns a reference to the first element, such as vector<int>::front().

vector<int> v = {1, 2, 3};
v.front() = 10; // ok: front() returns int&

const vector<int> cv = {1, 2, 3};
cv.front() = 10; // error: assignment of read-only location; front() returns const int&

You can see that the non-const version returns int&, while the const version returns const int&.
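
The two return types can also be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the alias Vec and the helper write_then_read_front() are names made up for this illustration.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

using Vec = std::vector<int>;  // hypothetical alias, for brevity

// decltype of a call expression yields the exact return type, reference included.
static_assert(std::is_same<decltype(std::declval<Vec &>().front()), int &>::value,
              "non-const vector: front() returns int&");
static_assert(std::is_same<decltype(std::declval<const Vec &>().front()), const int &>::value,
              "const vector: front() returns const int&");

// Hypothetical helper: writes through the non-const overload,
// then reads back through the const overload.
int write_then_read_front() {
    Vec v = {1, 2, 3};
    v.front() = 10;     // int&: assignment is allowed
    const Vec &cv = v;  // const view of the same vector
    return cv.front();  // const int&: read-only access
}
```

If either static_assert were wrong, the file would fail to compile, so the types are verified without running anything.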

Let’s look at another example. A standard library iterator, such as vector<int>::iterator, overloads the dereference operator operator*(). What is its return type?

vector<int> v = {1, 2, 3};
const auto i = v.begin();
*i = 10; // int &

It returns int& rather than const int&, even though i is const and the operator*() being called is a const member function.

Reference types, top-level const and low-level const

First of all, types in C++ can be divided into value types and reference types (types that refer to another object, such as pointers). For reference types there are two levels of const: top-level const and low-level const. Top-level const means that the variable itself is immutable.

int a, b;
int *const p = &a;
p = &b; // error: assignment of read-only variable
*p = 10; // ok

Low-level const means that the object the variable refers to cannot be modified through it.

int a, b;
const int *p = &a;
p = &b; // ok
*p = 10; // error: assignment of read-only location
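
The difference between the two levels can also be seen through type traits. This is a sketch under the assumption of a C++11 compiler; demo_const_levels() is a hypothetical helper. Note that std::is_const only looks at top-level const, which makes it a convenient probe here.

```cpp
#include <type_traits>

// std::is_const reports top-level const only.
static_assert(std::is_const<int *const>::value,
              "int *const: the pointer itself is const (top-level)");
static_assert(!std::is_const<const int *>::value,
              "const int *: the pointer itself is not const");

// Stripping the pointer layer exposes the low-level const.
static_assert(std::is_const<std::remove_pointer<const int *>::type>::value,
              "const int *: the pointee is const (low-level)");
static_assert(!std::is_const<std::remove_pointer<int *const>::type>::value,
              "int *const: the pointee is not const");

// Hypothetical helper exercising both kinds at run time.
int demo_const_levels() {
    int a = 1, b = 2;
    int *const top = &a;  // cannot be reseated, but can write through it
    *top = 10;
    const int *low = &a;  // can be reseated, but cannot write through it
    low = &b;
    return a + *low;      // a is now 10, low points to b
}
```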

When assigning or initializing variables, top-level const can be implicitly added or removed; low-level const can be implicitly added but not removed.

int a = 0;
int *p = &a;
int *const q = p; // int* -> int *const (top-level const added)
p = q; // int *const -> int* (top-level const dropped)

const int *cp = &a;
cp = p; // int* -> const int* (low-level const added)
p = cp; // error: invalid conversion from 'const int*' to 'int*'

If a member function of a class C is const-qualified, then its this pointer carries low-level const, i.e., const C *this. As a result, every member accessed through this, that is, every member the function can access, is seen as top-level const.

In the example at the beginning of this article, get() is const-qualified, so the type of p as seen inside get() is T *const. The compiler therefore doesn’t prevent us from modifying the object that a pointer member points to inside a const member function. So why do some classes prohibit it while others allow it?
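
The types described above can be confirmed inside a const member function. The sketch below assumes C++11; Holder and poke() are hypothetical names, and T is fixed to int to keep the example self-contained.

```cpp
#include <type_traits>

struct Holder {  // hypothetical class standing in for C, with T = int
    int *p;

    int poke() const {
        // In a const member function, this points to a const Holder.
        static_assert(std::is_same<decltype(this), const Holder *>::value,
                      "this has low-level const");
        // The member p is seen as int *const: only top-level const is added.
        static_assert(std::is_same<decltype((p)), int *const &>::value,
                      "p is top-level const, the pointee is not const");
        *p = 42;         // ok: the pointee is still a plain int
        // p = nullptr;  // error: p itself is const here
        return *p;
    }
};
```

Calling poke() on a const Holder compiles and changes the int that p points to, which is exactly the behavior the question is about.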

Reference type or value type

If a class has a pointer member T *p, then when we copy an object of the class, do we copy the pointer itself or the value it points to?

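class C {
public:
    // Two alternative designs are shown side by side; a real class picks one of each pair.
    C(const C &c) : p(c.p) { } // copies the pointer, or
    C(const C &c) : p(new T(*c.p)) {} // copies the pointee

    C &operator=(const C &c) { // copies the pointer
        if (&c == this) return *this;
        p = c.p;
        return *this;
    } // or
    C &operator=(const C &c) { // copies the pointee
        if (&c == this) return *this;
        delete p;
        p = new T(*c.p);
        return *this;
    }

private:
    T *p;
};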

C++ allows the developer to control what happens when an object is copied. We can copy just the pointer, so that the original and the copy point to the same object, or we can copy the value pointed to by the pointer, hiding from the user the fact that the class has reference members.

When we copy the value pointed to by the pointer, the class appears to be a value type. Take std::vector: its memory is dynamically allocated, and the vector object itself only holds a pointer to that memory. But when we copy a vector, all the elements it contains are copied. So to the user it is a value type.

Since it is a value type, there is only one level of const, the top-level const. So when a vector is const, vector::front() should also return a reference to const. The class is responsible for propagating the top-level const down to the low level.

When we copy only the pointer itself, the class looks like a reference type. Take vector::iterator, which contains a pointer to an element of the vector. Copying the iterator copies only the pointer, so the original and the copy point to the same element. So to the user it is a reference type.

So even if the iterator itself is const, operator*() does not return a reference to const, because the top-level const is not propagated to the low level. How, then, do we give an iterator low-level const? vector provides two classes, vector::iterator and vector::const_iterator. The latter always returns a reference to const, whether or not the iterator itself is const, because it has low-level const.
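
The four combinations of iterator kind and top-level const can be checked directly. This is a sketch assuming C++11 and std::vector<int>; the aliases It and CIt and the helper write_through_const_iterator() are hypothetical.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

using It  = std::vector<int>::iterator;        // no low-level const
using CIt = std::vector<int>::const_iterator;  // low-level const

// Top-level const on the iterator does not change what operator*() returns.
static_assert(std::is_same<decltype(*std::declval<It &>()), int &>::value, "");
static_assert(std::is_same<decltype(*std::declval<const It &>()), int &>::value, "");
// const_iterator always yields a reference to const.
static_assert(std::is_same<decltype(*std::declval<CIt &>()), const int &>::value, "");
static_assert(std::is_same<decltype(*std::declval<const CIt &>()), const int &>::value, "");

// Hypothetical helper showing the run-time consequences.
int write_through_const_iterator() {
    std::vector<int> v = {1, 2, 3};
    const auto it = v.begin();  // top-level const: it cannot be advanced
    *it = 10;                   // but it still writes to the element
    CIt cit = v.begin();
    ++cit;                      // a non-const const_iterator can be advanced
    // *cit = 20;               // error: the element is read-only through cit
    return v[0] + *cit;         // v[0] is 10, *cit is v[1]
}
```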

C++ is powerful

Back to the question at the beginning of this article. The answer is that whether to return const T& or T& depends on how we define the class. If the copy-control functions of class C copy (or move) the value pointed to by p, then the const get() should return const T&; if they just copy the pointer itself, it should return T&.

More generally, for a class that contains members of reference types (e.g., pointers, smart pointers) to be treated as a value type:

  • the copy-control functions need to copy the data referenced by the reference-type member
  • both const and non-const versions of the methods that access the referenced data should be provided

class C {
public:
    C(const T &t) : p(new T(t)) {}
    C(const C &c) : p(new T(*c.p)) {}
    ~C() { delete p; }
    C &operator=(const C &c) {
        if (&c == this) return *this;
        delete p;
        p = new T(*c.p);
        return *this;
    }

    T &get() { return *p; }
    const T &get() const { return *p; }
private:
    T *p;
};
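
To see the value-type design in use, here is a self-contained sketch assuming C++11. Box is a hypothetical stand-in for the class above with T fixed to int, and demo_value_semantics() is a made-up helper.

```cpp
#include <type_traits>
#include <utility>

class Box {  // hypothetical name; same design as the class above, T = int
public:
    explicit Box(int v) : p(new int(v)) {}
    Box(const Box &o) : p(new int(*o.p)) {}  // deep copy
    ~Box() { delete p; }
    Box &operator=(const Box &o) {
        if (&o == this) return *this;
        delete p;
        p = new int(*o.p);
        return *this;
    }

    int &get() { return *p; }
    const int &get() const { return *p; }
private:
    int *p;
};

// The top-level const of the object is propagated to the returned reference.
static_assert(std::is_same<decltype(std::declval<Box &>().get()), int &>::value, "");
static_assert(std::is_same<decltype(std::declval<const Box &>().get()), const int &>::value, "");

int demo_value_semantics() {
    Box a(1);
    Box b = a;          // b owns its own int
    b.get() = 2;        // modifies only b
    const Box &ca = a;
    // ca.get() = 3;    // error: the const overload returns const int&
    return ca.get() * 10 + b.get();  // a is still 1, b is 2
}
```

The copy and the original never share state, so a const Box can promise that its value does not change.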

Conversely, if it is treated as a reference type, then

  • the copy-control functions copy the reference-type member itself
  • low-level const should be made selectable in some way (e.g., through a template parameter)
  • for methods that access the referenced data, only the const version is required. If the class is instantiated without low-level const (T rather than const T), the referenced data may be modified.

template <typename T> class C {
public:
    C(T *p) : p(p) {}
    C(const C &c) : p(c.p) {}
    C &operator=(const C &c) {
        if (&c == this) return *this;
        p = c.p;
        return *this;
    }

    T &get() const { return *p; }
private:
    T *p;
};
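
The reference-type design can be exercised the same way. The sketch below assumes C++11; Ref is a hypothetical, trimmed copy of the template above (the compiler-generated copy operations already copy the pointer), and demo_reference_semantics() is a made-up helper. Low-level const is chosen through the template argument.

```cpp
#include <type_traits>
#include <utility>

template <typename T> class Ref {  // hypothetical name
public:
    explicit Ref(T *p) : p(p) {}
    T &get() const { return *p; }  // top-level const is not propagated
private:
    T *p;
};

// A const Ref<int> still hands out int&; Ref<const int> hands out const int&.
static_assert(std::is_same<decltype(std::declval<const Ref<int> &>().get()), int &>::value, "");
static_assert(std::is_same<decltype(std::declval<Ref<const int> &>().get()), const int &>::value, "");

int demo_reference_semantics() {
    int x = 1;
    const Ref<int> a(&x);
    Ref<int> b = a;         // copies the pointer: a and b refer to x
    b.get() = 5;            // the change is visible through a as well
    Ref<const int> ro(&x);  // low-level const selected via the template argument
    // ro.get() = 7;        // error: assignment of read-only location
    return a.get() + ro.get();  // both read x, which is now 5
}
```

This mirrors the iterator and const_iterator pair: one template, two instantiations that differ only in low-level const.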

Of course, if the class is a singleton such as some kind of manager, or is non-copyable, you don’t need to think about this so much; just handle it as needed.

Unlike in some other languages (Java, Go, Python), a C++ class can behave as either a value type or a reference type, depending on how the developer designs it. C++ expects user-defined types to be usable just like built-in types, so it provides mechanisms such as operator overloading and copy control. This makes C++ classes very powerful, and also complex. It requires us to understand these concepts, not just memorize the uses of const.