In Python, most classes are used to generate objects: when you call such a class, it returns an instance of the class to the caller. For example, if you call a Student class that defines the behavior required of a student, you get a new instance of Student, which seems very natural.

One of the core ideas of object-oriented programming is inheritance: if we had several classes like Student and wanted to add some common functionality to all of them at once, we would create a base class and then modify the definitions of those classes to make them subclasses of that base class. A metaclass, as the name implies, is a “class that creates classes”; we use it when we want more control over the creation process of a series of classes.

This article is yet another blog post from the Chinese-language Internet that explains metaclasses in Python. Whether this is the first article you’ve read on the subject or you’ve arrived here from another one, I hope it opens up a new perspective on metaclasses for you.

Base classes define the generic functionality of a set of classes, while metaclasses control the creation of a set of classes.

The difference between Metaclass and Baseclass

Start with inheritance

To understand the meaning of Metaclass, let’s start with Inheritance. As a fundamental concept provided by almost every object-oriented programming language, the inheritance operation means that one object (subclass) acquires a set of characteristics from another object (base class).

For example.

from dataclasses import dataclass

@dataclass
class People(object):
    age: int

@dataclass
class Student(People):
    grade: int

In the above example, all People objects have the age property. The subclass Student, in addition to the age property, also has the grade property. This is inheritance.

>>> Student(11, 6)
Student(age=11, grade=6)

Starting with Python 2.3, the Python programming language uses a method known as the “C3 algorithm” to determine a class’s MRO (Method Resolution Order). The MRO of a class determines the inheritance order of the class and the order in which we should look up the base class when we call a method.

>>> Student.__mro__
(<class '__main__.Student'>, <class '__main__.People'>, <class 'object'>)

The MRO is an attribute of the class itself, not of an instance of the class.

>>> Student(11, 6).__mro__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Student' object has no attribute '__mro__'
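
The C3 ordering is easier to see when a class has more than one base. The following is only a minimal sketch: Person, Learner, Worker and WorkingStudent are hypothetical names that do not appear elsewhere in this article.

```python
# Hypothetical classes, used only to illustrate the C3 ordering.
class Person:
    def describe(self):
        return "Person"

class Learner(Person):
    def describe(self):
        return "Learner"

class Worker(Person):
    def describe(self):
        return "Worker"

class WorkingStudent(Learner, Worker):
    pass

# The MRO lists each class once, puts subclasses before their bases,
# and keeps the left-to-right order of the base list.
mro_names = [klass.__name__ for klass in WorkingStudent.__mro__]
print(mro_names)
# ['WorkingStudent', 'Learner', 'Worker', 'Person', 'object']

# Method lookup walks the MRO and stops at the first match.
print(WorkingStudent().describe())  # Learner
```

Swapping the base list to (Worker, Learner) would put Worker first in the MRO, so describe() would return "Worker" instead.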

Class creation

Sometimes we want to control what a class returns when it is called. The simplest example: we want a call to StudentFactory, a factory for the Student class, to return a Student object rather than a StudentFactory object. We can define it like this.

class StudentFactory(object):
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(Student)
        obj.__init__(*args, **kwargs)
        return obj

In the above example, we override the __new__ method of the StudentFactory class so that whenever we pass a student’s properties into the StudentFactory class, the class returns a newly created instance of Student.

>>> StudentFactory(11,6)
Student(age=11, grade=6)
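
It may not be obvious why the factory calls obj.__init__ by hand. When __new__ returns an object that is not an instance of the class being called, Python skips the automatic __init__ call, so the factory has to do it itself. The sketch below illustrates this; NoInitFactory is a hypothetical variant added only for comparison.

```python
from dataclasses import dataclass

@dataclass
class People:
    age: int

@dataclass
class Student(People):
    grade: int

class NoInitFactory:
    # Hypothetical variant that forgets the explicit __init__ call.
    def __new__(cls, *args, **kwargs):
        return object.__new__(Student)

class StudentFactory:
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(Student)
        obj.__init__(*args, **kwargs)  # initialise the Student by hand
        return obj

half_built = NoInitFactory(11, 6)
print(isinstance(half_built, Student))  # True
print(hasattr(half_built, "age"))       # False: __init__ never ran

complete = StudentFactory(11, 6)
print(complete.age, complete.grade)     # 11 6
```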

Similarly, we can construct a factory class for People to enable fast construction of People-type objects. Let’s put the two factory classes together.

class PeopleFactory(object):
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(People)
        obj.__init__(*args, **kwargs)
        return obj

class StudentFactory(object):
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(Student)
        obj.__init__(*args, **kwargs)
        return obj

The only difference between the two classes is the type passed to object.__new__ (People in one, Student in the other), which is really small. If there were not one or two but ten or twenty such factory classes, we would have to write a lot of repetitive code. Is there any way to generate similar factory classes in bulk? The answer is to use metaclasses.

Using metaclasses

The basic idea of using metaclasses is to first find the pattern shared by the classes that need to be built. In the example above, the factory classes for the dataclasses differ only in the type passed to object.__new__ inside __new__; everything else is the same. So the task for the metaclass we are going to build, MetaFactory, is clear.

  • Each concrete factory class that specifies MetaFactory as its metaclass must define a class attribute of its own; let’s call it _instance. This attribute should be the class that this particular factory produces. For example:

    class StudentFactory(metaclass=MetaFactory):
        _instance = Student  # StudentFactory is used to produce Student
    
  • When we call a concrete factory class (such as StudentFactory), we want to get back an object of the corresponding _instance type (Student). As we saw above, this is done by overriding the __new__ method of the StudentFactory class.

  • MetaFactory therefore needs to supply that overridden __new__ method to each factory class it creates.

At this point, the role of the metaclass MetaFactory is clear: it is responsible for generating concrete factory classes based on requirements. The “requirements” are the _instance objects that we define in each concrete factory class. A metaclass is a class that has the ability to generate a concrete class.

One small thing

Wait! We still have one small problem to solve. Above, we used object.__new__(Student) to create a Student object. A metaclass, however, needs to create classes rather than instances of classes. How should this be done?

We already know that there is a simple way to create classes.

class This_Is_A_Class(object):
    a = "xxx"
    def who_am_i(self):
        return "I am a class"

The class definition above, which we are already very familiar with, is actually made up of three parts.

  • The name of the class: This_Is_A_Class, a string
  • The base classes of the class: here just object. The list may be omitted, or it may contain several classes (multiple inheritance), so it is represented as an ordered, immutable sequence (a tuple)
  • The attributes of the class: both the variables defined in the class body (here a = "xxx") and its methods (who_am_i) are attributes of the class, because we can access them directly as class_name.attribute_name. The attributes form a set of key-value pairs, represented with Python’s dictionary type

In fact, Python provides us with another way to create classes. The above method is equivalent to.

This_Is_A_Class = type(
    "This_Is_A_Class",
    (object, ),
    {
        "a": "xxx",
        # Return the string, as the class statement version does.
        "who_am_i": lambda self: "I am a class",
    },
)

The built-in type(name, bases, namespaces) takes three arguments: name is a string giving the name of the class to be defined; bases is a tuple of base classes; namespaces is a dictionary holding the attributes of the class (its namespace). The call returns the new class itself, which we then assign to This_Is_A_Class.

Regardless of which of the above methods is used to define the class, the following code is executed with the same effect.

>>> This_Is_A_Class
<class '__main__.This_Is_A_Class'>

Both evaluate to the class itself (and not an instance of the class).
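
To check the equivalence a little more closely, here is a small sketch that builds the class both ways and compares the three ingredients. The names FromStatement and FromType are made up for this comparison so that the two classes can live side by side.

```python
# Built with an ordinary class statement.
class FromStatement(object):
    a = "xxx"

    def who_am_i(self):
        return "I am a class"

# Built by calling type(name, bases, namespace) directly.
FromType = type(
    "FromType",
    (object,),
    {"a": "xxx", "who_am_i": lambda self: "I am a class"},
)

for klass in (FromStatement, FromType):
    print(klass.__name__, klass.__bases__, klass.a, klass().who_am_i())
# FromStatement (<class 'object'>,) xxx I am a class
# FromType (<class 'object'>,) xxx I am a class

# Both are ordinary classes, that is, instances of type.
print(type(FromStatement) is type, type(FromType) is type)  # True True
```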

Factory metaclasses

Let’s return to the MetaFactory discussion from earlier. A metaclass must be able to return a concrete class, and it does so through its __new__ method. When Python executes the class statement of a concrete class that specifies a metaclass, it first runs the class body to collect the namespace and then calls the metaclass, whose __new__ method receives the following parameters.

__new__(cls, name, bases, namespaces)
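
To see when this call happens and what it receives, here is a rough sketch. TracingMeta and Example are hypothetical names; the metaclass only records its arguments and then lets type build the class as usual.

```python
calls = []

class TracingMeta(type):
    # Hypothetical metaclass that only records what it is given.
    def __new__(mcls, name, bases, namespace):
        # Skip the dunder entries Python adds (e.g. __module__).
        own_names = sorted(k for k in namespace if not k.startswith("__"))
        calls.append((name, bases, own_names))
        return super().__new__(mcls, name, bases, namespace)

class Example(metaclass=TracingMeta):
    marker = 1

    def method(self):
        return self.marker

# The metaclass has already run once, before any instance exists.
print(calls)  # [('Example', (), ['marker', 'method'])]
```

The three recorded values are the class name, the (here empty) tuple of bases, and the names collected from the class body.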

These are the parameters needed to call the type() function to create a new class. But the metaclass must change namespaces before creating the new class: it needs to add the corresponding __new__ method for the concrete class. The full MetaFactory definition looks like this.

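class MetaFactory(object):
    # Installed as __new__ on every concrete factory class.
    def factory_new(cls, *args, **kwargs):
        obj = object.__new__(cls._instance)
        obj.__init__(*args, **kwargs)
        return obj

    # Called once per factory class, when its class statement runs.
    def __new__(cls, name, bases, namespaces):
        # Every concrete factory must declare what it produces.
        assert namespaces["_instance"]
        namespaces["__new__"] = cls.factory_new
        return type(name, bases, namespaces)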

Here, __new__ is the constructor of the MetaFactory class itself, and factory_new is the __new__ function that MetaFactory installs on each concrete factory class. It is just like the __new__ function of the StudentFactory we implemented in the first half of the article, except that the hard-coded type is replaced with cls._instance to accommodate the different concrete types.

Note in particular that if the metaclass inherits from the type class, the last line of the __new__ method needs to be written as return type.__new__(cls, name, bases, namespaces). Otherwise the class that comes back is an instance of plain type rather than of your metaclass, which leads to some strange problems. Don’t ask me how I know this.
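
The difference can be observed directly. The sketch below is an illustration under assumptions, not the definition used in this article: PlainTypeCall, ViaTypeNew, FactoryA and FactoryB are hypothetical names, and factory_new is moved to module level to keep the example short.

```python
from dataclasses import dataclass

@dataclass
class People:
    age: int

def factory_new(cls, *args, **kwargs):
    obj = object.__new__(cls._instance)
    obj.__init__(*args, **kwargs)
    return obj

class PlainTypeCall(type):
    # Calls type(...) directly: the result forgets its metaclass.
    def __new__(mcls, name, bases, namespaces):
        namespaces["__new__"] = factory_new
        return type(name, bases, namespaces)

class ViaTypeNew(type):
    # Passes the metaclass along, so the result remembers it.
    def __new__(mcls, name, bases, namespaces):
        namespaces["__new__"] = factory_new
        return type.__new__(mcls, name, bases, namespaces)

class FactoryA(metaclass=PlainTypeCall):
    _instance = People

class FactoryB(metaclass=ViaTypeNew):
    _instance = People

print(type(FactoryA).__name__)  # type
print(type(FactoryB).__name__)  # ViaTypeNew
print(FactoryA(11).age, FactoryB(11).age)  # 11 11
```

Both factories still produce People objects, but only FactoryB is an instance of its metaclass, so only subclasses of FactoryB would pass through the metaclass again.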

The complete program code is as follows.

from dataclasses import dataclass

class MetaFactory(object):
    def factory_new(cls, *args, **kwargs):
        obj = object.__new__(cls._instance)
        obj.__init__(*args, **kwargs)
        return obj

    def __new__(cls, name, bases, namespaces):
        assert namespaces["_instance"]
        namespaces["__new__"] = cls.factory_new
        return type(name, bases, namespaces)

@dataclass
class People(object):
    age: int

@dataclass
class Student(People):
    grade: int

class PeopleFactory(metaclass=MetaFactory):
    _instance = People

class StudentFactory(metaclass=MetaFactory):
    _instance = Student

Let’s call StudentFactory and verify that.

>>> StudentFactory(11, 6)
Student(age=11, grade=6)

That’s so cool!

Summary

Metaclasses are classes that are used to generate concrete classes. In contrast to inheritance, which adds base functionality to a class, metaclasses control the class creation process. According to the Python documentation, ideas that have been explored with metaclasses include enumeration types, logging, interface checking, automatic delegation, automatic property creation, proxies, frameworks, and automatic resource locking/synchronization.