Property Reflection in C++

1 Preface

Languages such as Java and Go have built-in reflection, which lets a program obtain information about classes, methods, and properties at runtime. Reflection has many applications, the most common being data serialization. Without it, serialization relies either on code generation, as protobuf does, or on code written by hand line by line, which is pure manual labor.

C++ has no built-in reflection, and how to implement it is a frequent topic of discussion. Thanks to its strong template metaprogramming capability, it is fair to say that C++ offers some introspection at compile time: the standard library provides a collection of is_xxx type traits and facilities for extracting types in templates. It does not support dynamic reflection, however, because it lacks a runtime environment like that of Java or Go, and even the compile-time facilities are weak and hard to use.
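As a small illustration of the compile-time introspection mentioned above, here is a sketch built on the standard <type_traits> header. The types Point and Shape and the functions describe and CheckDescribe are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Point { int x; int y; };                    // hypothetical example type
struct Shape { virtual ~Shape() = default; };      // hypothetical polymorphic type

// Questions answered entirely by the compiler; nothing is looked up at runtime.
static_assert(std::is_class<Point>::value, "Point is a class type");
static_assert(std::is_standard_layout<Point>::value, "Point is standard-layout");
static_assert(std::is_polymorphic<Shape>::value, "Shape has a virtual function");
static_assert(!std::is_polymorphic<Point>::value, "Point has no virtual functions");

// Type extraction: strip the pointer from a type and inspect what is left.
static_assert(std::is_same<std::remove_pointer<int*>::type, int>::value, "int* -> int");

// Overload selection driven by a trait (tag dispatch, works in C++11).
inline std::string describeImpl(std::true_type)  { return "integral"; }
inline std::string describeImpl(std::false_type) { return "other"; }

template <typename T>
std::string describe() { return describeImpl(std::is_integral<T>{}); }

// Usage: the branch is chosen at compile time from the type alone.
inline void CheckDescribe() {
    assert(describe<int>() == "integral");
    assert(describe<double>() == "other");
}
```

Note that none of these traits can list the members of Point or look them up by name, which is exactly the gap a reflection mechanism would fill.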

There are a number of open source third-party reflection libraries for C++, and they are implemented in one of two ways.

Compile-time reflection: traverses classes, properties, and methods at compile time (with user-defined filtering), attaches static meta information to them, and saves it for a runtime API.
Runtime reflection: provides a management framework in which class, property, and method information is registered at runtime and later retrieved when needed.

Some time ago, while tracking down a problem in a 20-year-old C++ product, I found traces of runtime reflection in the code. The code never distilled the idea into an explicit reflection concept, though, and the implementation was scattered.

2 Case

Configuration management is one of the most commonly used capabilities in any system. The code in this case is a configuration client that interacts with a configuration center and must support at least the following functions.

Reading the configuration
Refreshing on configuration changes

This looks like a very simple requirement. But once every business-level configuration item of the product is managed through it, the volume of configuration data is large, and quality attributes such as real-time change notification and full synchronization within seconds raise the bar further. Let's look at the code first. To keep the discussion simple, the code has been heavily streamlined (the original implementation is too complex) so as to bring out the original design ideas.

[Figure: class diagram of the design]

The classes in the diagram are as follows.

Interface IConfig: provides the interface for initializing and updating the configuration.
ValueInfo: describes the meta information of each configuration item, such as its name, type, and offset (the offset is explained later). The name ValueInfo was not well chosen; ConfigItemMeta would have been better.
Abstract class ConfigBase: its key method, InitValueInfoMap, initializes the definitions of all configuration items; GetValueInfo returns the collection of all configuration items of the configuration object.
ConfigManager: interacts with the configuration center and manages all IConfig instances. It synchronizes configuration items from the configuration center to the IConfig instances at regular intervals (or synchronizes when it receives change events).
Configuration objects: ConfigA and ConfigB are concrete configuration classes. Each groups a set of related configuration items, which are its member variables.
ServiceLogic: to keep the description simple, a generic class ServiceLogic stands in for the business logic. Business logic classes aggregate concrete configuration objects and use their configuration items; one configuration object can be used by several business logic classes.

We can still make out the original authors' design intent.
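To make the roles in the class list more concrete, here is a minimal sketch of how ConfigManager can push changes to configuration objects while knowing only the IConfig interface. The method names Init, Update, Register, and OnConfigChanged, and the stand-in class RecordingConfig, are assumptions for illustration; the source only states that IConfig offers initialization and update.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical shape of the interface.
class IConfig {
public:
    virtual ~IConfig() = default;
    virtual void Init() = 0;
    virtual void Update(const std::string& name, const std::string& value) = 0;
};

// Talks to the configuration center (omitted here) and fans changes out to
// every registered configuration object without knowing its concrete type.
class ConfigManager {
public:
    void Register(IConfig* config) {
        assert(config != nullptr);
        config->Init();
        m_configs.push_back(config);
    }
    void OnConfigChanged(const std::string& name, const std::string& value) {
        for (IConfig* config : m_configs) config->Update(name, value);
    }

private:
    std::vector<IConfig*> m_configs;   // non-owning pointers
};

// Stand-in for ConfigA/ConfigB that only records what it was told.
class RecordingConfig : public IConfig {
public:
    void Init() override { initialized = true; }
    void Update(const std::string& name, const std::string& value) override {
        lastName = name;
        lastValue = value;
    }
    bool initialized = false;
    std::string lastName;
    std::string lastValue;
};
```

The manager depends only on the interface, so adding a new configuration class does not require touching the synchronization code.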

For the sake of abstraction, configuration synchronization and notification should not know the member variable names of the concrete ConfigA and ConfigB. Otherwise IConfig would have to be downcast to the concrete ConfigA and each configuration item assigned one by one. That has two drawbacks: it creates a two-way dependency, and the code is not generic, so it is pure manual work and hard to maintain. The GetValueInfo method plays the key role here by returning the configuration item definitions.
For the sake of performance, the business logic sees the concrete ConfigA and ConfigB and can read a configuration value directly, as in cfgAtr->m_userName, instead of querying a generic configuration table by configuration ID. The latter has two drawbacks: it involves a lookup, and it involves a type conversion.

The question is how to write a configuration value into the corresponding member variable given only the configuration definition. (In reality there is also the problem of multi-threaded concurrency; here we assume that a single thread handles both the business logic and the configuration updates.) In Java, updating the values through reflection would be easy. The original authors instead took advantage of the memory layout of C++ objects, as seen in the InitValueInfoMap method of each class.
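The performance argument can be illustrated with a small sketch that contrasts the two ways of reading a value. The key "TimeoutSeconds", the member m_timeoutSeconds, and both reader functions are hypothetical and not taken from the original code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Generic table: every read is a hash lookup plus a string-to-int conversion.
using ConfigTable = std::unordered_map<std::string, std::string>;

inline int ReadTimeoutFromTable(const ConfigTable& table) {
    auto it = table.find("TimeoutSeconds");   // hypothetical key
    assert(it != table.end());
    return std::stoi(it->second);
}

// Typed configuration object: a read is a plain member access.
struct ConfigA {
    std::string m_userName;
    int m_timeoutSeconds = 0;   // hypothetical configuration item
};

inline int ReadTimeoutFromObject(const ConfigA& cfg) {
    return cfg.m_timeoutSeconds;
}
```

Both functions return the same value, but the second one does no lookup and no parsing on the read path, which matters when business logic reads configuration far more often than it changes.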

void ConfigA::InitValueInfoMap()
{
    // Pseudocode reconstructed from the source. It could be wrapped in a helper
    // so that registering a member variable takes a single function call.
    ValueInfo* valueInfo1 = new ValueInfo;
    valueInfo1->name = "UserName";
    valueInfo1->valueType = ValueType::String;
    // Offset of the member relative to the IConfig subobject
    // (compilers may warn here; ignore that for now)
    valueInfo1->offset = (char*)&m_userName - (char*)static_cast<IConfig*>(this);
    (*m_valueInfos)[valueInfo1->name] = valueInfo1;

    // Registration of the other member variables in m_valueInfos is omitted
}

ValueInfo.offset records the offset of each configuration item (member variable) relative to the IConfig pointer.
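The comment in the pseudocode suggests wrapping registration in a function so that each member takes one call. A minimal sketch of such a helper follows; RegisterField, the Int value type, and m_retryCount are hypothetical, and the map stores ValueInfo by value rather than by pointer to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

enum class ValueType { String, Int };

struct ValueInfo {
    std::string name;
    ValueType valueType;
    std::ptrdiff_t offset;   // bytes from the IConfig subobject to the member
};

class IConfig {
public:
    virtual ~IConfig() = default;
};

class ConfigBase : public IConfig {
public:
    const std::map<std::string, ValueInfo>& GetValueInfo() const { return m_valueInfos; }

protected:
    // One call per member variable; `member` must be a member of *this.
    template <typename T>
    void RegisterField(const std::string& name, ValueType type, const T& member) {
        const char* base = reinterpret_cast<const char*>(static_cast<const IConfig*>(this));
        const char* field = reinterpret_cast<const char*>(&member);
        assert(field >= base);   // with single inheritance, members follow the base subobject
        m_valueInfos[name] = ValueInfo{name, type, field - base};
    }

private:
    std::map<std::string, ValueInfo> m_valueInfos;
};

class ConfigA : public ConfigBase {
public:
    ConfigA() { InitValueInfoMap(); }
    std::string m_userName;
    int m_retryCount = 0;   // hypothetical second configuration item

private:
    void InitValueInfoMap() {
        RegisterField("UserName", ValueType::String, m_userName);
        RegisterField("RetryCount", ValueType::Int, m_retryCount);
    }
};
```

Adding the recorded offset to the IConfig pointer of the same object yields the address of the registered member again, which is what the update path relies on.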

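void ConfigBase::Update(const std::string& name, const std::string& value)
{
    // Pseudocode: look up the registered meta information, then write through the offset
    auto it = GetValueInfo()->find(name);
    if (it == GetValueInfo()->end()) return; // unknown configuration item
    ValueInfo* info = it->second;
    // Other types need the matching cast and a conversion from the string value
    *((std::string*)((char*)static_cast<IConfig*>(this) + info->offset)) = value;
}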

By now it is fairly clear that this is in effect dynamic reflection: register first, then assign to member variable addresses based on the registration information. When ConfigManager receives a configuration change, it computes the address of the subclass member variable from the base class pointer and the recorded offset, and then assigns the value.

Another question: why record the offset rather than the member's address directly? Most likely for performance again. ConfigA can have multiple instances, and by the memory layout rules the offset of a member variable is the same in every instance, so the registration can be shared. In the actual code, ConfigBase::m_valueInfos is a (non-static) member variable of the class, but it points to a static global variable in ConfigBase.cpp. This suggests that the earliest design recorded member addresses and that recording offsets was a later performance optimization. A comment in the code also notes that InitValueInfoMap only needs to be called once, which reduced the time from X seconds to XX milliseconds.
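The register-then-assign flow, including the type conversion that the pseudocode only mentions in a comment, can be sketched as follows. This is a simplified stand-alone version under assumed names (OffsetIn, BuildValueInfos, a free function Update, m_retryCount); it measures offsets from the object itself rather than from an IConfig base.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

enum class ValueType { String, Int };

struct ValueInfo {
    ValueType valueType;
    std::ptrdiff_t offset;   // bytes from the start of the owning object
};

struct ConfigA {             // hypothetical configuration object
    std::string m_userName;
    int m_retryCount = 0;
};

using ValueInfoMap = std::map<std::string, ValueInfo>;

template <typename T>
std::ptrdiff_t OffsetIn(const ConfigA& obj, const T& member) {
    std::ptrdiff_t offset =
        reinterpret_cast<const char*>(&member) - reinterpret_cast<const char*>(&obj);
    assert(offset >= 0 && static_cast<std::size_t>(offset) < sizeof(ConfigA));
    return offset;
}

// Registration step: record type and offset for each configuration item.
inline ValueInfoMap BuildValueInfos(const ConfigA& obj) {
    ValueInfoMap infos;
    infos["UserName"] = {ValueType::String, OffsetIn(obj, obj.m_userName)};
    infos["RetryCount"] = {ValueType::Int, OffsetIn(obj, obj.m_retryCount)};
    return infos;
}

// Assignment step. Returns false for unknown names or values that do not parse.
inline bool Update(ConfigA& obj, const ValueInfoMap& infos,
                   const std::string& name, const std::string& value) {
    auto it = infos.find(name);
    if (it == infos.end()) return false;
    char* address = reinterpret_cast<char*>(&obj) + it->second.offset;
    switch (it->second.valueType) {
    case ValueType::String:
        *reinterpret_cast<std::string*>(address) = value;
        return true;
    case ValueType::Int:
        try {
            *reinterpret_cast<int*>(address) = std::stoi(value);
        } catch (const std::exception&) {
            return false;    // not a number, or out of range
        }
        return true;
    }
    return false;
}
```

The recorded ValueType is what makes the cast safe: the offset alone says where to write, not what lives there.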
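The point that offsets are identical across instances can be shown with a table that is built once and then applied to any object of the class. The names Offsets, BuildCount, and SetString are hypothetical; the original code is only described as keeping its table in a static global variable.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct ConfigA {             // hypothetical configuration object
    std::string m_userName;
};

using OffsetMap = std::map<std::string, std::ptrdiff_t>;

// Counts how often the table is built, for demonstration only.
inline int& BuildCount() {
    static int count = 0;
    return count;
}

// Built on first use from a throwaway instance, then reused for every object.
inline const OffsetMap& Offsets() {
    static const OffsetMap offsets = [] {
        ++BuildCount();
        ConfigA sample;
        OffsetMap m;
        m["UserName"] = reinterpret_cast<const char*>(&sample.m_userName) -
                        reinterpret_cast<const char*>(&sample);
        return m;
    }();
    return offsets;
}

inline void SetString(ConfigA& obj, const std::string& name, const std::string& value) {
    auto it = Offsets().find(name);
    assert(it != Offsets().end());
    *reinterpret_cast<std::string*>(reinterpret_cast<char*>(&obj) + it->second) = value;
}
```

Had the table stored absolute addresses instead, it would have to be rebuilt for every instance, which is consistent with the optimization history guessed at above.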

3 Extended reading

3.1 Offsets

Suppose there is a structure Point { int x; int y; }. The offsets of its member variables can be obtained as follows.

Method 1

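// Implementation-specific: assumes a pointer to data member is stored as a plain
// byte offset, as in the Itanium C++ ABI. Not portable.
int Point::* offsetX = &Point::x;
int Point::* offsetY = &Point::y;

long offsetX_ = reinterpret_cast<long>(*(void**)(&offsetX));
long offsetY_ = reinterpret_cast<long>(*(void**)(&offsetY));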

Method 2

// i.e. ((int)&((structure*)0)->member), the idiom behind the classic C offsetof macro
long offsetX = reinterpret_cast<long>(&(reinterpret_cast<Point*>(0))->x);
long offsetY = reinterpret_cast<long>(&(reinterpret_cast<Point*>(0))->y);
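Both methods lean on behavior the language does not guarantee. For standard-layout types such as Point, the standard offsetof macro from <cstddef> gives the same numbers portably. A short sketch, where ReadIntAt is a hypothetical helper added for the example:

```cpp
#include <cassert>
#include <cstddef>       // offsetof
#include <type_traits>

struct Point { int x; int y; };

// offsetof is only guaranteed to work for standard-layout types.
static_assert(std::is_standard_layout<Point>::value, "Point must be standard-layout");

// Reads the int stored `offset` bytes into the given Point.
inline int ReadIntAt(const Point& p, std::size_t offset) {
    assert(offset + sizeof(int) <= sizeof(Point));
    return *reinterpret_cast<const int*>(reinterpret_cast<const char*>(&p) + offset);
}
```

For example, ReadIntAt(p, offsetof(Point, y)) returns p.y.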

To understand the memory layout of C++ objects in more depth, the book “Inside the C++ Object Model” is recommended reading. Layout becomes very complicated once virtual functions, inheritance, and especially multiple inheritance come into play. Coming back to the case above, computing an offset relative to the base class (interface) through pointer casts is hard to follow, and raw pointer arithmetic is risky.
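One way to avoid raw offsets altogether is to register pointers to members, which the compiler type-checks and which work for any instance of the class. This is an alternative technique, not what the original code does; ConfigA, m_region, StringFields, and UpdateString are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

struct ConfigA {                       // hypothetical configuration object
    std::string m_userName;
    std::string m_region;
};

// A pointer to member names a field of any ConfigA without offsets or casts.
using StringField = std::string ConfigA::*;

inline const std::map<std::string, StringField>& StringFields() {
    static const std::map<std::string, StringField> fields = {
        {"UserName", &ConfigA::m_userName},
        {"Region", &ConfigA::m_region},
    };
    return fields;
}

// Returns false if no field is registered under `name`.
inline bool UpdateString(ConfigA& obj, const std::string& name, const std::string& value) {
    auto it = StringFields().find(name);
    if (it == StringFields().end()) return false;
    assert(it->second != nullptr);
    obj.*(it->second) = value;         // member access checked by the compiler
    return true;
}
```

The trade-off is that the table is typed per class and per member type, so a generic base class needs some form of type erasure on top of it.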

3.2 Runtime

Several open source libraries implement runtime reflection. They work in similar ways and differ mainly in how complete their implementations are.

Let's look at the first of them, RTTR, using its sample code below. Does the code feel similar in spirit to the case above? The difference is that in the case the property management is mixed into the configuration classes themselves, with no simple class property management framework extracted from it.

#include <rttr/registration>
using namespace rttr;

struct MyStruct { MyStruct() {}; void func(double) {}; int data; };

RTTR_REGISTRATION
{
    registration::class_<MyStruct>("MyStruct")
         .constructor<>()
         .property("data", &MyStruct::data)
         .method("func", &MyStruct::func);
}

These open source libraries share a common feature: a container or registration manager that registers and stores the various pieces of information about each type.

3.3 Compile-time

Since C++11, the language has had attributes, which give the compiler extra information for optimization or for generating specific code, and which also serve as hints to other developers. For example:

[[carries_dependency]] C++11, lets the compiler skip unnecessary memory fence instructions
[[noreturn]] C++11, the function does not return
[[deprecated]] C++14, warns that the entity is deprecated
[[fallthrough]] C++17, used in a switch to mark falling through to the next case without a break as intentional
[[nodiscard]] C++17, the return value of the function should not be ignored
[[maybe_unused]] C++17, tells the compiler that the entity may be unused, to avoid warnings
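A short sketch showing several of these attributes in use, assuming a C++17 compiler. The functions Fail, Weight, OldWeight, and Demo are invented for the example.

```cpp
#include <cassert>
#include <stdexcept>

// The compiler may assume control never returns from this function.
[[noreturn]] inline void Fail(const char* message) { throw std::runtime_error(message); }

// Callers that drop the result get a warning.
[[nodiscard]] inline int Weight(char grade) {
    int weight = 0;
    switch (grade) {
    case 'A':
        weight += 1;
        [[fallthrough]];          // intentional: 'A' also receives the 'B' increment
    case 'B':
        weight += 1;
        break;
    default:
        Fail("unknown grade");
    }
    return weight;
}

// Calling this function produces a deprecation warning with the given message.
[[deprecated("use Weight instead")]] inline int OldWeight(char grade) { return Weight(grade); }

// The parameter is deliberately unused; the attribute silences the warning.
inline int Demo([[maybe_unused]] int verbosity) {
    assert(Weight('B') == 1);
    return Weight('A');
}
```

All of this affects diagnostics and code generation only; none of the attributes can be queried by the program while it runs.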

Unfortunately, these are for the compiler only. C++ has no runtime environment, so they cannot be queried at runtime the way Java's Annotation and Go's Tag can. The takeaway, though, is that one could define custom attributes on methods or member variables and have the compiler generate code carrying the corresponding metadata for use at runtime.

Custom attributes need something to interpret them. A quick search shows that GitHub does have a suitable library: cppast, a wrapper library for parsing the C++ AST that is used together with Clang.

Static reflection was also proposed to the C++ standards committee in 2014, in the paper “Static reflection”. After a first pass through the document, the technical specification is laid out in Chapter 4:

Conceptual types of metadata objects
Conversion of the concepts into concrete C++ code
Reflection functions and operators

4 Conclusion

In this article, we recovered the original authors' design intent by reducing the configuration management code to its essentials: it uses C++ member variable offsets to update configuration items uniformly. That led to a walk-through of reflection in C++, covering both runtime and compile-time reflection. If you are considering a reflection framework, these are worth a look.
