Descriptors are an advanced concept in Python and the basis for many of Python’s internal mechanisms, so this article will go into them in some depth.
Definition of a descriptor
The definition of a descriptor is simple: any Python object that implements at least one of the following methods is a descriptor.
__get__(self, obj, type=None)
__set__(self, obj, value)
__delete__(self, obj)
The meanings of the parameters of these methods are as follows.
self is the currently defined descriptor object instance.
obj is the instance of the object that the descriptor will act on.
type is the type of the object on which the descriptor acts (i.e., the class it belongs to).
Together, the above methods are known as the descriptor protocol. Python calls each method at a specific time and passes arguments according to the protocol; if we don’t define a method with the agreed-upon parameters, the call may fail.
The role of descriptors
Descriptors can be used to control attribute access, enabling features such as computed attributes, lazy loading of attributes, and attribute access control. Let’s start with a simple example.
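A minimal sketch of such an example follows. The class name Descriptor, the returned string, and the module-level log list are assumptions made for illustration; only Foo, x, and foo come from the surrounding text.

```python
log = []  # records which protocol methods ran (illustration only)


class Descriptor:
    """Hypothetical descriptor that records every read and write."""

    def __get__(self, obj, type=None):
        log.append(("__get__", obj, type))
        return "value from __get__"

    def __set__(self, obj, value):
        log.append(("__set__", obj, value))


class Foo:
    x = Descriptor()  # the descriptor instance is stored as a class variable


print(Foo.x)  # class access: __get__ receives obj=None and type=Foo

foo = Foo()
print(foo.x)  # instance access: __get__ receives obj=foo and type=Foo
```

Both print calls show the string returned by __get__, and log holds one entry per access.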
In the example we created an instance of the descriptor and assigned it to the x class variable of the Foo class. Now when you access Foo.x, you’ll see that Python automatically calls the __get__() method of the descriptor instance to which the attribute is bound.
Next, instantiate an object foo and access the x property through the foo object.
The corresponding methods defined by the descriptor are also executed.
If we try to assign a value to the x attribute of the foo object, the __set__() method of the descriptor is also called.
Similarly, if we define the __delete__() method in the descriptor, this method will be called when del foo.x is executed.
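The write and delete paths can be sketched in the same way. The Descriptor class and the log list below are hypothetical names used only for this illustration.

```python
log = []  # names of the protocol methods that were invoked


class Descriptor:
    def __get__(self, obj, type=None):
        log.append("__get__")
        return None

    def __set__(self, obj, value):
        log.append("__set__")

    def __delete__(self, obj):
        log.append("__delete__")


class Foo:
    x = Descriptor()


foo = Foo()
foo.x = 1  # routed to Descriptor.__set__(foo, 1)
del foo.x  # routed to Descriptor.__delete__(foo)
print(log)  # ['__set__', '__delete__']
```

Note that nothing is written to foo.__dict__ here, because this sketch's __set__ does not store the value anywhere.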
The descriptor is invoked by the dot operator (.) during attribute lookup, and it only takes effect when it is defined as a class variable.
If assigned directly to an instance property, the descriptor will not take effect.
If the descriptor is accessed indirectly with some_class.__dict__[descriptor_name], the protocol method of the descriptor is also not called, but the descriptor instance itself is returned.
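These two cases can be checked with a short sketch. The Descriptor class and the attribute name y are assumptions for illustration.

```python
class Descriptor:
    def __get__(self, obj, type=None):
        return "value from __get__"


class Foo:
    x = Descriptor()  # class variable: the protocol is active


foo = Foo()
foo.y = Descriptor()  # instance attribute: just an ordinary object

print(foo.x)                                      # value from __get__
print(isinstance(foo.y, Descriptor))              # True, __get__ was not called
print(isinstance(Foo.__dict__["x"], Descriptor))  # True, raw descriptor returned
```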
Types of descriptors
Descriptors can be further divided into two categories depending on the implemented protocol methods.
- If either the __set__() or the __delete__() method is implemented, the descriptor is a data descriptor.
- If only the __get__() method is implemented, the descriptor is a non-data descriptor.
The two behave differently during attribute lookup.
- A data descriptor always takes precedence over attributes in the instance dictionary.
- A non-data descriptor may be overridden by attributes defined in the instance dictionary.
In the example above we have shown the effect of a data descriptor. Next we remove the __set__() method to implement a non-data descriptor.
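A minimal sketch of the non-data case, assuming a hypothetical NonDataDescriptor class; Bar, bar, and y follow the names used in the text.

```python
class NonDataDescriptor:
    # Only __get__ is defined, so this is a non-data descriptor.
    def __get__(self, obj, type=None):
        return "value from __get__"


class Bar:
    y = NonDataDescriptor()


bar = Bar()
before = bar.y  # no "y" in bar.__dict__, so __get__ runs

bar.__dict__["y"] = "instance value"
after = bar.y  # the instance dictionary entry now wins

print(before)  # value from __get__
print(after)   # instance value
```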
As long as bar.__dict__ does not have an entry with key y, accessing bar.y behaves the same way as accessing foo.x.
But if we modify the __dict__ of the bar object directly by adding the y attribute to it, the object attribute will override the y descriptor defined in the Bar class, and accessing bar.y will no longer call the __get__() method of the descriptor.
In the data descriptor example above, access to the x attribute is always controlled by the descriptor, even if we modify foo.__dict__ directly.
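The data descriptor case can be verified with a sketch like the following. DataDescriptor is a hypothetical name, and its __set__ deliberately does nothing.

```python
class DataDescriptor:
    def __get__(self, obj, type=None):
        return "value from __get__"

    def __set__(self, obj, value):
        pass  # present only to make this a data descriptor


class Foo:
    x = DataDescriptor()


foo = Foo()
foo.__dict__["x"] = "instance value"  # written directly, bypassing __set__

print(foo.x)              # value from __get__
print(foo.__dict__["x"])  # instance value
```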
In the following we will describe how these two differences are implemented.
The key to descriptor-controlled attribute access is what happens between the execution of foo.x and the time the __get__() method is called.
How object properties are stored
In general, object attributes are saved in the __dict__ attribute.
- According to the Python documentation, object.__dict__ is a dictionary or other mapping object that stores the (writable) attributes of an object.
- Most custom objects, with the exception of some built-in Python objects, will have a __dict__ attribute.
- This attribute contains all the attributes defined for that object.
Let’s continue from the previous example.
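A short sketch of where the attributes end up. The Descriptor class and the name attribute are assumptions for illustration.

```python
class Descriptor:
    def __get__(self, obj, type=None):
        return "value from __get__"

    def __set__(self, obj, value):
        pass


class Foo:
    x = Descriptor()


foo = Foo()
foo.name = "instance attribute"

print(foo.__dict__)         # {'name': 'instance attribute'}
print("x" in Foo.__dict__)  # True: the descriptor lives in the class dictionary
print("x" in foo.__dict__)  # False
```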
When we access foo.x, how does Python determine whether to call the descriptor method or get the corresponding value from __dict__? The key role here is played by the dot operator (.).
How object properties are accessed
The lookup logic for the dot operator is located in the object.__getattribute__() method, which is called every time the dot operator is applied to an object. CPython implements this method in C. Let’s look at its equivalent Python version.
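The following sketch is adapted from the pure Python equivalent given in the official Descriptor HowTo Guide. The Data, NonData, and Demo classes at the end are hypothetical additions used only to exercise each branch.

```python
def find_name_in_mro(cls, name, default):
    """Search the class and its bases, in MRO order, for name."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return default


def object_getattribute(obj, name):
    """Approximate what object.__getattribute__(obj, name) does."""
    null = object()  # sentinel meaning "not found"
    objtype = type(obj)
    cls_var = find_name_in_mro(objtype, name, null)
    descr_get = getattr(type(cls_var), "__get__", null)
    if descr_get is not null:
        if (hasattr(type(cls_var), "__set__")
                or hasattr(type(cls_var), "__delete__")):
            return descr_get(cls_var, obj, objtype)  # data descriptor
    if hasattr(obj, "__dict__") and name in vars(obj):
        return vars(obj)[name]                       # instance variable
    if descr_get is not null:
        return descr_get(cls_var, obj, objtype)      # non-data descriptor
    if cls_var is not null:
        return cls_var                               # plain class variable
    raise AttributeError(name)


class Data:
    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        pass


class NonData:
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class Demo:
    a = Data()
    b = NonData()
    c = "class variable"


demo = Demo()
demo.__dict__.update(a="instance a", b="instance b")

print(object_getattribute(demo, "a"))  # data descriptor
print(object_getattribute(demo, "b"))  # instance b
print(object_getattribute(demo, "c"))  # class variable
```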
Understanding the above code, we can see that when we access object.name, the following procedure is performed in turn.
- First look up the name attribute on the class objtype that obj belongs to. If the corresponding class variable cls_var exists, try to get the __get__ attribute from the class of cls_var.
- If the __get__ attribute exists, cls_var is (at least) a non-data descriptor. If the class of cls_var also defines __set__ or __delete__, cls_var is a data descriptor. In that case, the __get__ method defined in the descriptor is called with the current object obj and the current object’s class objtype as arguments, the result of the call is returned, and the lookup is complete: the data descriptor completely overrides access to the object itself.
- If cls_var is a non-data descriptor (or possibly not a descriptor), Python tries to find the name attribute in the object’s dictionary __dict__ and returns the corresponding value if it is present.
- If the name attribute is not found in obj’s __dict__ and cls_var is a non-data descriptor, the __get__ method defined in the descriptor is called with the same arguments as above, and the result of the call is returned.
- If cls_var is not a descriptor, it is returned directly.
- If nothing is found in the end, an AttributeError exception is raised.
In the above process, when we look up the name attribute on the class objtype to which obj belongs and it is not found in objtype itself, Python searches the parent classes that objtype inherits from, in the order given by the objtype.__mro__ attribute.
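The search order can be inspected directly. Base and Child are hypothetical classes for this sketch.

```python
class Base:
    greeting = "hello from Base"


class Child(Base):
    pass


# The lookup walks the classes in exactly this order.
print([cls.__name__ for cls in Child.__mro__])  # ['Child', 'Base', 'object']
print(Child().greeting)                         # hello from Base
```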
Now we know that descriptors are called in the object.__getattribute__() method depending on different conditions, and this is how descriptors control access to attributes. If we override the object.__getattribute__() method, we can even cancel all descriptor calls.
In fact, the attribute lookup does not call object.__getattribute__() directly; the dot operator performs the attribute lookup via a helper function.
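A sketch of that helper, adapted from the pure Python equivalent in the official Descriptor HowTo Guide. The WithFallback and WithoutFallback classes are hypothetical and only demonstrate the two outcomes.

```python
def getattr_hook(obj, name):
    """Approximate the helper the dot operator uses for attribute lookup."""
    try:
        return obj.__getattribute__(name)
    except AttributeError:
        # Fall back only if the *class* defines __getattr__.
        if not hasattr(type(obj), "__getattr__"):
            raise
    return type(obj).__getattr__(obj, name)


class WithFallback:
    def __getattr__(self, name):
        return f"fallback for {name}"


class WithoutFallback:
    pass


print(getattr_hook(WithFallback(), "missing"))  # fallback for missing
```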
Therefore, if obj.__getattribute__() raises an AttributeError and the obj.__getattr__() method exists, that method will be executed. If the user calls obj.__getattribute__() directly, the fallback lookup mechanism of __getattr__() will be bypassed.
If you add this method to the
This behavior is only valid if the __getattr__() method is defined in the class to which the object belongs. Defining the __getattr__ method on the object itself, i.e. adding the attribute to obj.__dict__, has no effect, and the same applies to the __getattribute__() method.
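The difference between defining the hook on the instance and on the class can be sketched as follows. Foo and the lambda bodies are assumptions for illustration.

```python
class Foo:
    pass


foo = Foo()
# Stored in foo.__dict__ only; the lookup machinery never consults it.
foo.__getattr__ = lambda name: "from instance"

try:
    foo.missing
    instance_hook_used = True
except AttributeError:
    instance_hook_used = False

# Defined on the class, the hook takes effect.
Foo.__getattr__ = lambda self, name: "from class"

print(instance_hook_used)  # False
print(foo.missing)         # from class
```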
Python’s internal descriptors
In addition to some custom scenarios, Python’s own language mechanism makes extensive use of descriptors.
We won’t go into the specific effects of property, but the following is a common use of its syntactic sugar.
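A typical decorator-based usage looks roughly like this. The class name C and the backing attribute _x are assumptions for the sketch.

```python
class C:
    def __init__(self):
        self._x = None

    @property
    def x(self):
        """I'm the 'x' property."""
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x


c = C()
c.x = 5
print(c.x)  # 5
```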
property itself is a class that implements the descriptor protocol, and it can also be used in the following equivalent way.
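A sketch of the explicit form, using the getx, setx, and delx names that the text refers to; the class name C and the backing attribute _x are assumptions.

```python
class C:
    def __init__(self):
        self._x = None

    def getx(self):
        return self._x

    def setx(self, value):
        self._x = value

    def delx(self):
        del self._x

    # Build the descriptor explicitly instead of using decorators.
    x = property(getx, setx, delx, "I'm the 'x' property.")


c = C()
c.x = 3
print(c.x)  # 3
```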
In the above example property(getx, setx, delx, "I'm the 'x' property.") creates an instance of the descriptor and assigns it to x. The implementation of the property class is equivalent to the following Python code.
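The sketch below is adapted from the pure Python equivalent in the official Descriptor HowTo Guide and omits some details of the real built-in. The Temperature class is a hypothetical usage example.

```python
class Property:
    """Simplified pure Python stand-in for the built-in property."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: return the descriptor
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)


class Temperature:
    def __init__(self):
        self._celsius = 0.0

    @Property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        self._celsius = float(value)


t = Temperature()
t.celsius = 21
print(t.celsius)  # 21.0
```

Because no deleter was registered in this usage example, del t.celsius raises AttributeError.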
property stores the read, write, and delete functions in the dictionary of the descriptor instance, and then checks whether the corresponding function exists when a protocol method is called, which gives it control over reading, writing, and deleting the attribute.
Functions are descriptors too: every function object we define is a non-data descriptor instance.
The purpose of using descriptors here is to allow the functions defined in the class definition to become bound methods when called through the object.
A method call automatically passes the object instance as the first argument, which is the only difference between a method and a normal function. Usually we name this formal parameter self when defining the method. The class definition of the method object is equivalent to the following code.
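A sketch adapted from the pure Python equivalent in the official Descriptor HowTo Guide. The greet function is a hypothetical example.

```python
class MethodType:
    """Simplified stand-in for the built-in bound method type."""

    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        func = self.__func__
        obj = self.__self__
        return func(obj, *args, **kwargs)  # obj is prepended to the arguments


def greet(self, greeting):
    return f"{greeting}, {self}"


bound = MethodType(greet, "world")
print(bound("hello"))  # hello, world
```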
It takes a function func and an object obj in the initialization method and passes obj as the first argument to func when it is called.
Let’s take a practical example.
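A sketch of such an example. The class name D and the method body are assumptions; f follows the name used in the text.

```python
class D:
    def f(self, x):
        return x


d = D()

print(D.f is D.__dict__["f"])  # True: class access returns the plain function
print(type(d.f).__name__)      # method: instance access returns a bound method
print(D.f("any object", 3))    # 3: any object may be passed as self
print(d.f(3))                  # 3: d is passed as self automatically
```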
As you can see, when accessing f through the class, it behaves like a normal function that accepts any object as its self parameter; when accessing f through an instance, the result is a bound method, so the bound object is automatically passed as the first argument in the call. Creating a MethodType object when an attribute is accessed through an instance is exactly what a descriptor can achieve.
The concrete implementation of the function is as follows.
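A sketch of the __get__ logic, loosely following the official Descriptor HowTo Guide. Wrapping an existing callable in the constructor and forwarding through __call__ are assumptions added here so that the example can run; class D is a hypothetical usage.

```python
import types


class Function:
    def __init__(self, func):
        self.func = func  # hypothetical: wraps a real callable

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed through the class
        return types.MethodType(self, obj)  # accessed through an instance


class D:
    f = Function(lambda self, x: (self, x))


d = D()
print(D.f is D.__dict__["f"])  # True
print(d.f(1) == (d, 1))        # True
```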
Defining a function with def f() is equivalent to f = Function(), i.e. creating an instance of a non-data descriptor and assigning it to the name f.
When we access this attribute through the class, the call to the __get__() method returns the function object itself.
When we access this attribute through an object instance, the __get__() method is called to create a MethodType object initialized with the above function and object.
To recap, function objects have a __get__() method, which makes them non-data descriptor instances, so they can be converted to bound methods when accessed as attributes. The non-data descriptor converts the instance call obj.f(*args) into f(obj, *args) and the class call cls.f(*args) into f(*args).
classmethod is a variant implemented on top of function descriptors and is used as follows.
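A minimal usage sketch. The class name F and the method body are assumptions.

```python
class F:
    @classmethod
    def f(cls, x):
        return cls.__name__, x


print(F.f(3))    # ('F', 3)
print(F().f(3))  # ('F', 3)
```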
The equivalent Python implementation is as follows, which will be easy to understand with the background above.
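A simplified sketch based on the pure Python equivalent in the official Descriptor HowTo Guide. It leaves out details of the real built-in, and the class F is a hypothetical usage example.

```python
import types


class ClassMethod:
    """Simplified stand-in for the built-in classmethod."""

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, cls=None):
        if cls is None:
            cls = type(obj)
        return types.MethodType(self.f, cls)  # bind to the class, not the instance


class F:
    @ClassMethod
    def f(cls, x):
        return cls.__name__, x


print(F.f(3))    # ('F', 3)
print(F().f(3))  # ('F', 3)
```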
@classmethod returns a non-data descriptor that converts instance calls obj.f(*args) into f(type(obj), *args) and class calls cls.f(*args) into f(cls, *args).
The effect of the staticmethod implementation is that, whether we call it by instance or by class, we end up calling the original function.
The equivalent Python implementation is as follows.
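A simplified sketch based on the pure Python equivalent in the official Descriptor HowTo Guide. The class E is a hypothetical usage example.

```python
class StaticMethod:
    """Simplified stand-in for the built-in staticmethod."""

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        return self.f  # always hand back the undecorated function


class E:
    @StaticMethod
    def f(x):
        return x * 10


print(E.f(3))    # 30
print(E().f(3))  # 30
```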
Calls to the __get__() method return the original function object itself, which is stored in the descriptor instance’s __dict__, and therefore do not trigger further descriptor behavior for the function.
@staticmethod returns a non-data descriptor that converts instance calls obj.f(*args) into f(*args) and class calls cls.f(*args) into f(*args).