The previous article on Pin was a shallow introduction to what Pin is and why it is needed, but that is not enough to master the topic. This article sorts out the knowledge around Pin more systematically, hence the title “Rust Pin Advanced”.
Pin API Anatomy
To understand Pin in depth, it is essential to be familiar with all of its methods. Excluding the nightly API, Pin has 13 methods in total.
These methods can be divided into two broad categories.
- Pin<P> where P: Deref
- Pin<P> where P: DerefMut
As mentioned in the previous article, Pin is generally represented as Pin<P<T>> (P is the abbreviation for Pointer and T is the abbreviation for Type), so the content wrapped in Pin can only be a smart pointer (any type that implements the Deref trait can be called a smart pointer), and has no meaning for other ordinary types. Since &T and &mut T implement Deref and DerefMut respectively, Pin<&'a T> and Pin<&'a mut T> are considered special implementations of these two classes respectively.
At first glance, these 13 methods look a bit haphazard, but they are actually very well designed, to the point of symmetry. By function, they can be divided into 5 major categories, each of which is subdivided into 2 to 3 methods according to mutability or compliance with the T: Unpin restriction. Mutable versions end in mut, and unsafe versions, which do not require the T: Unpin restriction, contain unchecked.
| Functions | Methods | Remarks |
|---|---|---|
| Construct Pin | new() / new_unchecked() | Safe and unsafe versions, distinguished by whether the T: Unpin restriction is satisfied. |
| Convert Pin type | as_ref() / as_mut() | Converts &/&mut Pin<P<T>> to Pin<&/&mut T>. |
| Get the borrow of T inside Pin<P<T>> | get_ref() / get_mut() / get_unchecked_mut() | Consumes the Pin and returns the borrow of the T inside. There are two versions by mutability. Since &mut T is the “root of all evil”, get_mut is further split into safe and unsafe versions according to whether the T: Unpin restriction is satisfied. |
| Consume Pin ownership and get the pointer P inside | into_inner() / into_inner_unchecked() | Safe and unsafe versions, distinguished by whether the T: Unpin restriction is satisfied. Also, to avoid conflicts with P’s own into-style methods, these APIs are designed as associated functions that must be called as Pin::into_inner(pin), not pin.into_inner(). |
| Pin projection | map_unchecked() / map_unchecked_mut() | Usually used for Pin projection. |
Only two methods are left out of the table above, and both are relatively simple:
- Pin::set(): sets a new T value in Pin<P<T>>.
- Pin<&mut T>::into_ref(): converts Pin<&mut T> to Pin<&T>.
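To make the categories concrete, here is a minimal sketch that walks through the safe half of the API. It uses i32 only because that type is Unpin; the function name tour is made up for illustration.

```rust
use std::pin::Pin;

/// Walks through the safe methods using `i32`, which is `Unpin`.
fn tour() -> (i32, i32, i32) {
    let mut value = 1;

    // Construct: `Pin::new` is safe because `i32: Unpin`.
    let mut pinned: Pin<&mut i32> = Pin::new(&mut value);

    // Convert + borrow: `as_mut` reborrows as `Pin<&mut i32>`,
    // `get_mut` then yields `&mut i32`.
    *pinned.as_mut().get_mut() += 1;
    // `as_ref` reborrows as `Pin<&i32>`, `get_ref` yields `&i32`.
    let after_increment = *pinned.as_ref().get_ref();

    // `set` overwrites the pinned value in place.
    pinned.set(10);

    // `into_ref` turns `Pin<&mut i32>` into `Pin<&i32>`.
    let shared: Pin<&i32> = pinned.into_ref();
    let after_set = *shared.get_ref();

    // `into_inner` is an associated function: `Pin::into_inner(p)`.
    let boxed: Pin<Box<i32>> = Box::pin(5);
    let unboxed: Box<i32> = Pin::into_inner(boxed);

    (after_increment, after_set, *unboxed)
}

fn main() {
    assert_eq!(tour(), (2, 10, 5));
}
```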
It is worth noting that within each pair, new() and new_unchecked(), get_mut() and get_unchecked_mut(), into_inner() and into_inner_unchecked(), the implementations are actually identical; the only difference is that the safe version has the Unpin restriction.
Why should there be a distinction between safe and unsafe versions of the same code? To answer this question, we have to go back to the nature of Pin. The essence of Pin is to ensure that the memory address of T in Pin<P<T>> is not changed (i.e., T is not moved) under safe Rust unless T satisfies T: Unpin. Ensuring that the address of T does not change comes down to not exposing T or &mut T (“the root of all evil”). If you expose T, it can simply be moved; if you expose &mut T, the developer can call functions like std::mem::swap() or std::mem::replace() to move T. In addition, the boundary between safe and unsafe in Rust must be very clear and unambiguous. So as long as T: Unpin is not satisfied, any method that constructs Pin<P<T>> or exposes T or &mut T should be unsafe.
| | Satisfies T: Unpin | Does not satisfy T: Unpin |
|---|---|---|
| Construct Pin | safe | unsafe |
| Expose T | safe | unsafe |
| Expose &T | safe | safe |
| Expose &mut T | safe | unsafe |
For example, into_inner_unchecked() returns P, which indirectly exposes T and &mut T, because you can easily get T or &mut T with *P or &mut *P. Constructing a Pin<P<T>> amounts to promising to abide by the Pin contract, and this step clearly breaks that promise.
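The safe/unsafe split in the table can be tried out with a small sketch. NotUnpin is a hypothetical type that opts out of Unpin through PhantomPinned, and Box::pin is used only as a convenient safe way to obtain a pinned value of it.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical type that opts out of `Unpin`.
struct NotUnpin {
    value: u8,
    _pinned: PhantomPinned,
}

fn demo() -> (u8, u8, u8) {
    // `u8: Unpin`: constructing the Pin and exposing `&mut T` are both safe.
    let mut plain = 1u8;
    let pin_plain = Pin::new(&mut plain);
    *pin_plain.get_mut() = 2;

    // `NotUnpin: !Unpin`: `Pin::new(&mut x)` on it would not compile.
    let mut pinned: Pin<Box<NotUnpin>> = Box::pin(NotUnpin {
        value: 3,
        _pinned: PhantomPinned,
    });

    // Exposing `&T` is safe in both columns of the table.
    let seen = pinned.as_ref().get_ref().value;

    // Exposing `&mut T` needs the unsafe variant here;
    // `pinned.as_mut().get_mut()` would not compile.
    // SAFETY: we only overwrite a field and never move the value out.
    unsafe { pinned.as_mut().get_unchecked_mut().value = 4 };

    (plain, seen, pinned.value)
}

fn main() {
    assert_eq!(demo(), (2, 3, 4));
}
```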
Why is Pin::get_ref() safe? Because it returns &T, and there is no way to move T through it: functions like std::mem::swap() only accept &mut T, and the compiler rejects any attempt to move out of a &T. (Thanks again, rustc.) Types with interior mutability deserve a special mention. For example, with RefCell<T>, Pin<&mut RefCell<T>>.into_ref().get_ref() returns &RefCell<T>, and methods like RefCell::replace() can then take T out and move it. But that’s okay, because the contract of Pin<P<T>> is that the T inside P is not moved; here P is & and T is RefCell<T>, not the T inside RefCell<T>. This is fine as long as there is no additional Pin<&T> pointing to the T inside RefCell<T>, and you actually rule that out automatically when you construct the RefCell<T>, because the argument of RefCell::new() is value: T, which already moves T in.
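A small sketch of the interior-mutability case, assuming a hypothetical NotUnpin type: the pinned value is the RefCell, so safe code can still move the inner value out through &RefCell.

```rust
use std::cell::RefCell;
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical `!Unpin` type used only for illustration.
struct NotUnpin {
    tag: &'static str,
    _pinned: PhantomPinned,
}

impl NotUnpin {
    fn new(tag: &'static str) -> Self {
        NotUnpin { tag, _pinned: PhantomPinned }
    }
}

fn demo() -> (&'static str, &'static str) {
    // What is pinned is the `RefCell<NotUnpin>`, not the `NotUnpin` in it.
    let pinned: Pin<Box<RefCell<NotUnpin>>> =
        Box::pin(RefCell::new(NotUnpin::new("first")));

    // `get_ref` is safe and hands out `&RefCell<NotUnpin>`.
    let shared: &RefCell<NotUnpin> = pinned.as_ref().get_ref();

    // Interior mutability lets safe code move the inner value out.
    let moved_out: NotUnpin = shared.replace(NotUnpin::new("second"));
    let current = shared.borrow().tag;

    (moved_out.tag, current)
}

fn main() {
    assert_eq!(demo(), ("first", "second"));
}
```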
Similarly, Pin<&mut Box<T>> guarantees that Box<T> itself is not moved, not the T inside the Box. To ensure that the T inside Box<T> is not moved, just use Pin<Box<T>>.
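The difference can be sketched as follows, again with a hypothetical NotUnpin type. Because Box<T> is always Unpin, Pin<&mut Box<T>> gives no protection to the T behind the box.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical `!Unpin` type used only for illustration.
struct NotUnpin {
    id: u32,
    _pinned: PhantomPinned,
}

impl NotUnpin {
    fn new(id: u32) -> Self {
        NotUnpin { id, _pinned: PhantomPinned }
    }
}

fn demo() -> (u32, u32) {
    let mut boxed = Box::new(NotUnpin::new(1));

    // `Box<T>` is `Unpin`, so `Pin::new` and `get_mut` are both available...
    let pin_of_box: Pin<&mut Box<NotUnpin>> = Pin::new(&mut boxed);
    let inner_box: &mut Box<NotUnpin> = pin_of_box.get_mut();
    // ...and safe code can still move the `NotUnpin` out of the box.
    let old = std::mem::replace(&mut **inner_box, NotUnpin::new(2));

    // With `Pin<Box<T>>`, safe code can only reach `&NotUnpin`.
    let pinned: Pin<Box<NotUnpin>> = Box::pin(NotUnpin::new(3));
    let _only_shared: &NotUnpin = pinned.as_ref().get_ref();

    (old.id, boxed.id)
}

fn main() {
    assert_eq!(demo(), (1, 2));
}
```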
Pin additional attributes
#[fundamental]
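Types marked with the #[fundamental] attribute get relaxed treatment under the orphan rule: if P is considered a local type, Pin<P> is considered local as well. So you can write trait impls for Pin<P<T>> that the orphan rule would otherwise reject, for example a foreign trait for Pin<&mut T> where T is your own type.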
#[repr(transparent)]
The #[repr(transparent)] attribute gives Pin the same ABI layout as the pointer field inside it, which can be useful in FFI scenarios.
The #[repr(transparent)] attribute is now stable. This attribute allows a Rust newtype wrapper (struct NewType<T>(T);) to be represented as the inner type across Foreign Function Interface (FFI) boundaries.
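As a quick sanity check, the sketch below compares sizes of Pin wrappers with their inner pointers. Equal size is only a necessary consequence of #[repr(transparent)], not a proof of ABI compatibility, and Meters is a made-up newtype.

```rust
use std::mem::size_of;
use std::pin::Pin;

/// Hypothetical newtype, laid out exactly like its single field.
#[allow(dead_code)]
#[repr(transparent)]
struct Meters(f64);

fn same_size() -> bool {
    size_of::<Pin<Box<u64>>>() == size_of::<Box<u64>>()
        && size_of::<Pin<&u8>>() == size_of::<&u8>()
        && size_of::<Meters>() == size_of::<f64>()
}

fn main() {
    assert!(same_size());
}
```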
Traits implemented by Pin
Let’s take a look at what traits Pin implements that are of interest.
Unpin
Since Unpin is an auto trait, Pin<P<T>> is also Unpin if P: Unpin. Almost every P is Unpin, so Pin<P<T>> is almost always Unpin. This is important, especially when the T in question is a Future. It doesn’t matter whether your Future satisfies Unpin or not: once you wrap it in Pin<&mut ...>, the result is a Future that satisfies Unpin (because Pin<P> implements Future, as we’ll see later). Many asynchronous methods require your Future to satisfy Unpin before they can be called, and the Future returned by an async fn obviously does not satisfy Unpin, so you often need to pin the Future first, for example with the macro tokio::pin!().
Also, it needs to be emphasized again that
- Whether Pin itself is Unpin has nothing to do with whether T is Unpin; it depends only on P.
- Whether Pin can pin T has nothing to do with whether P is Unpin; it depends only on T.
The above two sentences are a bit confusing, but after you figure it out, you won’t be confused about many pin scenarios.
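The first point can be checked with a small sketch. It uses std::pin::pin! from the standard library as a stand-in for tokio::pin!(), and assert_unpin is a helper written just for this example.

```rust
use std::pin::{pin, Pin};

/// Compiles only if `T: Unpin`.
fn assert_unpin<T: Unpin>(_: &T) {}

fn demo() {
    let fut = async { 1 };
    // assert_unpin(&fut); // does not compile: async blocks are `!Unpin`

    // `Pin<&mut F>` is `Unpin` because `&mut F` is, whatever `F` is.
    let on_stack = pin!(fut);
    assert_unpin(&on_stack);

    // `Pin<Box<F>>` is `Unpin` because `Box<F>` is.
    let on_heap: Pin<Box<_>> = Box::pin(async { 2 });
    assert_unpin(&on_heap);
}

fn main() {
    demo();
}
```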
Deref and DerefMut
These two traits are critical to Pin. Only when Deref is implemented is Pin<P> a smart pointer, so that the developer can seamlessly call the methods of P. It is important to note that DerefMut is implemented for Pin<P<T>> only if T: Unpin is satisfied. This is because one of the responsibilities of Pin<P<T>> under Safe Rust is to not expose &mut T without satisfying T: Unpin.
In addition, with these two traits implemented you can dereference a Pin to get &T and &mut T, but this differs from get_ref() and get_mut(). Take &T as an example. Suppose we have let p = Pin::new(&t);. Dereferencing p with let r = &*p; yields a &T whose lifetime is that of the borrow of p itself (the &Pin<&T>), whereas p.get_ref() yields a &T whose lifetime equals the lifetime 'a carried by Pin<&'a T>.
Why is this the case? Let’s expand the syntactic sugar of dereferencing a smart pointer.
The code for Pin’s Deref implementation is Pin::get_ref(Pin::as_ref(self)), and Pin::as_ref() takes &self and returns Pin<&P::Target>, so the reference it produces is bounded by the borrow of the Pin. By comparison, you can see that the lifetime of the &T obtained by dereferencing is indeed different from that obtained by get_ref().
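The lifetime difference shows up directly in function signatures. In this sketch the two helper functions are made up for illustration; the commented-out variant is the one the compiler rejects.

```rust
use std::pin::Pin;

/// `get_ref` consumes `Pin<&'a T>` and returns `&'a T`,
/// so the result may outlive the `Pin` value itself.
fn via_get_ref<'a>(p: Pin<&'a String>) -> &'a String {
    p.get_ref()
}

/// Dereferencing goes through `&'p Pin<..>`, so the result is tied
/// to the borrow of the `Pin`, not to `'a`.
fn via_deref<'p, 'a>(p: &'p Pin<&'a String>) -> &'p String {
    &**p
}

// Does not compile: the returned reference would borrow the local `p`.
// fn bad<'a>(p: Pin<&'a String>) -> &'a String { &*p }

fn main() {
    let s = String::from("pinned");
    let from_get_ref = {
        let p = Pin::new(&s);
        via_get_ref(p) // `p` ends with this block, the reference does not
    };
    let p = Pin::new(&s);
    let from_deref = via_deref(&p);
    assert_eq!(from_get_ref.as_str(), "pinned");
    assert_eq!(from_deref.as_str(), "pinned");
}
```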
Another thing worth noting is that Pin::as_ref() and Pin::as_mut() dereference self.pointer, which actually calls its deref() or deref_mut() method. These two methods are implemented by P itself, so a “malicious implementation” could move T here. But such a “malicious implementation” is ruled out by Pin’s contract: the move is caused by your own implementation, not by using Pin.
The documentation for Pin::new_unchecked() makes a point of emphasizing this: By using this method, you are making a promise about the P::Deref and P::DerefMut implementations, if they exist. Most importantly, they must not move out of their self arguments: Pin::as_mut and Pin::as_ref will call DerefMut::deref_mut and Deref::deref on the pinned pointer and expect these methods to uphold the pinning invariants.
As an example, suppose we construct a Pin<Boz<Unmovable>>, where Boz is a pointer type with a “malicious” DerefMut implementation that moves the Unmovable away. Calling as_mut() dereferences the Boz and runs that deref_mut(), so the Unmovable is moved even though it was supposedly pinned in place.
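Such a scenario could be sketched like this. The layout of Boz (two heap slots) is an assumption made for illustration, not a reconstruction of any particular implementation, and the safety contract of new_unchecked is broken on purpose.

```rust
use std::marker::PhantomPinned;
use std::ops::{Deref, DerefMut};
use std::pin::Pin;

struct Unmovable {
    id: u32,
    _pinned: PhantomPinned,
}

/// Hypothetical pointer type with two heap slots; `Deref` exposes `current`.
struct Boz<T> {
    current: Box<T>,
    spare: Box<T>,
}

impl<T> Deref for Boz<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.current
    }
}

impl<T> DerefMut for Boz<T> {
    fn deref_mut(&mut self) -> &mut T {
        // "Malicious": swap the two values, so the value that lived behind
        // `current` is moved to the address of `spare`.
        std::mem::swap(&mut *self.current, &mut *self.spare);
        &mut self.current
    }
}

fn demo() -> (u32, u32, bool) {
    let boz = Boz {
        current: Box::new(Unmovable { id: 1, _pinned: PhantomPinned }),
        spare: Box::new(Unmovable { id: 2, _pinned: PhantomPinned }),
    };
    // SAFETY: deliberately NOT upheld. `new_unchecked` requires that
    // `Boz::deref_mut` never moves the pointee, and `Boz` breaks that.
    let mut pinned = unsafe { Pin::new_unchecked(boz) };

    let addr_before = &*pinned as *const Unmovable;
    let id_before = pinned.id;

    // `as_mut` calls `Boz::deref_mut`, which moves value 1 away.
    let _ = pinned.as_mut();

    let addr_after = &*pinned as *const Unmovable;
    let id_after = pinned.id;

    // Same address, different value: the "pinned" value was moved.
    (id_before, id_after, addr_before == addr_after)
}

fn main() {
    assert_eq!(demo(), (1, 2, true));
}
```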
Future
Pin also implements Future, which is closely related to Unpin, so we’ll cover that in the next section.
Unpin and Future
One thing that often confuses beginners about Rust’s pinning API is Unpin, so it is important to understand Unpin thoroughly, and in particular its relationship to Future.
As mentioned before, Unpin is an auto trait, and almost all types implement Unpin, including some you might not expect. For example:
- &T: impl<'a, T: ?Sized + 'a> Unpin for &'a T {}
- &mut T: impl<'a, T: ?Sized + 'a> Unpin for &'a mut T {}
- *const T: impl<T: ?Sized> Unpin for *const T {}
- *mut T: impl<T: ?Sized> Unpin for *mut T {}
- Others, including Box, Arc, Rc, etc.
Note that here they are Unpin regardless of whether T satisfies T: Unpin or not. The reason for this has already been stated: The ability of Pin to pin T has nothing to do with whether P is Unpin or not, but only with T.
As mentioned in the previous article, the only types that are !Unpin are std::marker::PhantomPinned, types that contain PhantomPinned, and the structures generated when .await is desugared; this is not repeated here.
Unpin is a safe trait
Another important feature: Unpin is a safe trait, which means you can implement Unpin for any type under safe Rust, including your Future type.
We prepare two assert functions in advance, which will be used later.
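A minimal sketch of what this could look like, assuming a hypothetical Dummy future. The PhantomPinned field would normally make it !Unpin; the explicit impl is fine here only because nothing in the type relies on being pinned. NoopWake is a helper that exists just to build a Context without a runtime.

```rust
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

fn assert_future<F: Future>(_f: &F) {}
fn assert_unpin<T: Unpin>(_t: &T) {}

/// Hypothetical future with a `PhantomPinned` field.
struct Dummy {
    _pinned: PhantomPinned,
}

impl Future for Dummy {
    type Output = u8;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
        Poll::Ready(42)
    }
}

// No `unsafe` needed: `Unpin` is a safe trait.
impl Unpin for Dummy {}

/// Waker that does nothing.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_dummy() -> Poll<u8> {
    let mut dummy = Dummy { _pinned: PhantomPinned };
    assert_future(&dummy);
    assert_unpin(&dummy);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // `Dummy: Unpin`, so the safe `Pin::new` is enough to poll it.
    Pin::new(&mut dummy).poll(&mut cx)
}

fn main() {
    assert_eq!(poll_dummy(), Poll::Ready(42));
}
```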
If you want to poll a future like this (call it Dummy) inside another Future, that is no problem at all. The futures crate even provides a series of unpin versions of methods to help you do this, such as FutureExt::poll_unpin().
Its receiver is &mut self, not self: Pin<&mut Self>.
However, the pin projection scenario requires special attention: if your type has a structurally pinned field of a !Unpin type, you cannot implement Unpin for the type. See “Pinning is structural for field” in the std::pin documentation for details.
Why Future can be Unpin
Some people may ask, “Wasn’t Pin originally designed to keep self-referential structures from being moved when implementing Future? Why is it possible to implement Unpin for a Future type?” The reason is this: if your Future is a self-referential structure, then of course it can’t be Unpin, but otherwise it’s perfectly fine to implement Unpin. The example above, and the Future types of many third-party libraries, are not self-referential structs, so they can be moved with confidence and therefore can be Unpin. Another advantage is that you can use the safe Pin::new() to construct the Pin used to poll the future, without having to deal with unsafe.
Pin’s Future implementation
The reason we moved here to talk about the Future implementation of Pin is that 1.56 has a PR #81363 that removes the P: Unpin restriction. Let’s first look at why we need to implement Future for Pin, and then analyze why the Unpin restriction can be let go here.
The reason for implementing Future for Pin is simply to make it easier to call poll(), especially in the pin projection scenario. Since the self of poll() is of type Pin<&mut Self>, you can’t call poll() directly on a future value.
You have to construct a Pin<&mut Dummy> before you can call poll(). After implementing Future for Pin, you can just write: Pin::new(&mut dummy).poll(ctx), otherwise you need to write Future::poll(Pin::new(&mut dummy), ctx).
Next, let’s see why P: Unpin is not needed here. The purpose of this method is to poll P::Target, which is a Future. The Self of poll() is Pin<P<T>>, so self is Pin<&mut Pin<P<T>>> (note the two layers of Pin). We need to safely convert Pin<&mut Pin<P<T>>> to Pin<&mut T> in order to call poll() on P::Target. Getting Pin<&mut T> is easy with Pin::as_mut(), and both the old and the new version end up calling as_mut(), so there is no problem there. But as_mut() takes &mut self, which means we first have to get &mut Pin<P<T>>. If we map Pin<&mut Pin<P<T>>> onto the basic form Pin<P<T>>, then &mut plays the role of P and Pin<P<T>> plays the role of T, so getting &mut Pin<P<T>> from Pin<&mut Pin<P<T>>> is really getting &mut T from Pin<P<T>>. Both get_mut() and get_unchecked_mut() can do this; the only difference is the Unpin restriction, and that is what the PR changes. Without the Unpin restriction, we have to use the unsafe get_unchecked_mut(). It is completely safe here, though, because we call as_mut() as soon as we get &mut Pin<P<T>> and never move it. So the previous P: Unpin bound was redundant. For more details, see the documentation and source code comments for Pin::as_deref_mut().
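The reasoning above can be sketched as a free function. poll_through and NoopWake are names made up for this example; the body follows the steps described in the text rather than quoting the standard library source.

```rust
use std::future::Future;
use std::ops::DerefMut;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Converts `Pin<&mut Pin<P>>` to `Pin<&mut P::Target>` and polls it.
fn poll_through<P>(
    outer: Pin<&mut Pin<P>>,
    cx: &mut Context<'_>,
) -> Poll<<P::Target as Future>::Output>
where
    P: DerefMut,
    P::Target: Future,
{
    // `Pin<P>` is not known to be `Unpin`, so `get_mut()` is unavailable.
    // SAFETY: the `&mut Pin<P>` is only used to call `as_mut()`; the inner
    // `Pin<P>` is never moved or replaced.
    let inner: &mut Pin<P> = unsafe { outer.get_unchecked_mut() };
    <P::Target as Future>::poll(inner.as_mut(), cx)
}

/// Waker that does nothing, used to build a `Context` without a runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn demo() -> Poll<i32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut boxed = Box::pin(async { 7 });
    let out = poll_through(Pin::new(&mut boxed), &mut cx);
    out
}

fn main() {
    assert_eq!(demo(), Poll::Ready(7));
}
```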
Why Unpin constraints are needed
As mentioned above, some asynchronous-related APIs require your type to meet Unpin in order to be called. As far as I can tell, these APIs fall into three general categories.
1. **Scenarios that require &mut future.** For example, tokio::select!(), a macro that requires your Future to satisfy Unpin.
2. **The AsyncRead/AsyncWrite scenario.** For example, the methods of tokio::io::AsyncWriteExt require your Self to satisfy Unpin.
3. **The Future itself is Unpin and does not want to deal directly with Pin.** The FutureExt::poll_unpin() method mentioned above falls into this category.
Class (2) mainly comes from the fact that the methods of AsyncRead / AsyncWrite take self as Pin<&mut Self>. There has been quite a lot of discussion about this in the community. It is not the focus of this article, so check the following material if you are interested.
- futures-rs: Should AsyncRead and AsyncWrite take self by Pin?
- tokio: Should AsyncRead/AsyncWrite required pinned self?
- Tokio’s AsyncReadExt and AsyncWriteExt require Self: Unpin. Why and what to do about it?
In addition, tower is also considering whether to add Pin<&mut Self>: Pinning and Service.
Regarding class (1), the main reason is that the implementation of Future for &mut F requires F: Unpin: impl<F: ?Sized + Future + Unpin> Future for &mut F.
So it comes down to figuring out why Unpin is needed here. Start with a scenario: we have a future that we need to keep polling in a loop, but each use (an .await, or a branch of select!) takes the future by value and would consume it. So we mutably borrow the future instead to avoid giving up ownership. But once there is a &mut future, there is a risk of moving the future (“the root of all evil”), so either your future is Unpin, or you have to pin it first and then borrow it mutably (i.e. &mut Pin<&mut future>). And it just so happens that Pin<P> where P: DerefMut implements Future (as mentioned in the previous section), and Pin<P> also satisfies Unpin! It fits perfectly: we can simply implement Future for &mut F as long as F satisfies Future + Unpin. The advantage is that if your future satisfies Unpin, you can just poll it multiple times in the loop without worrying about moves; if your future doesn’t satisfy Unpin, that’s fine too, just pin it. For example, tokio::time::Sleep doesn’t satisfy Unpin, so you need to pin it with tokio::pin!() before &mut sleep can be used in a select! loop.
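The same pattern can be sketched with the standard library alone. Here std::pin::pin! stands in for tokio::pin!(), and poll_once, YieldOnce, two_steps and NoopWake are helpers invented for this example in place of a real runtime and timer.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` on the first poll and `Ready` on the second.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn two_steps() -> &'static str {
    YieldOnce(false).await;
    "done"
}

/// Takes the future by value, the way a `select!` branch does.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let out = fut.as_mut().poll(&mut cx);
    out
}

fn demo() -> (bool, Poll<&'static str>) {
    let fut = two_steps(); // compiler-generated future, `!Unpin`
    // poll_once(&mut fut); // does not compile: `&mut F: Future` needs `F: Unpin`
    let mut fut = pin!(fut); // `Pin<&mut F>` is `Future + Unpin`
    let first = poll_once(&mut fut).is_pending();
    let second = poll_once(&mut fut);
    (first, second)
}

fn main() {
    assert_eq!(demo(), (true, Poll::Ready("done")));
}
```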
By the same token, the implementation of Future for Box<F> also requires Unpin.
Other scenarios that require Pin
I often encounter people asking questions like “Do I need to use Pin to solve this scenario?” I look at the question and see that it has nothing to do with Pin, so I reply with this classic quote.
Rust Community Quote: Whenever you wonder if Pin could be the solution, it isn’t.
The pinning API is designed for generality: it is not only for solving the self-referential struct move problem in async code, and there are other scenarios that need Pin as well.
Intrusive collections
Intrusive collections are another application scenario for Pin. The documentation for Pin mentions the example of an intrusive doubly-linked list, and the same applies to other intrusive data structures (e.g. intrusive singly linked lists). However, the documentation only spends a few sentences on it and is not easy to follow, so I will briefly summarize it here.
First of all, you need to understand what intrusive collections are. Almost all the collections we normally use are non-intrusive, such as Vec and LinkedList in the standard library. The characteristic of a non-intrusive collection is that the elements are completely decoupled from the collection itself: the collection does not need to care what type each element is, and it can hold elements of any type. In an intrusive collection, by contrast, the collection intrudes into the element: the prev or next pointer is defined inside the element itself.
Using C++ as an example, a non-intrusive doubly linked list defines a separate node type that holds the prev and next pointers together with the element value.
The intrusive version instead puts the prev and next pointers directly into the element type.
The Rust version of an intrusive element would look much the same.
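Here is a rough Rust sketch of the two layouts. Node, Task and their methods are hypothetical; the intrusive Task keeps only a next pointer to stay short (a doubly linked version would add prev), and the sketch assumes the caller keeps every linked task alive, which a real intrusive list would enforce by unlinking on drop.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

/// Non-intrusive: the list's node type owns the links and wraps any `T`.
#[allow(dead_code)]
struct Node<T> {
    prev: Option<NonNull<Node<T>>>,
    next: Option<NonNull<Node<T>>>,
    value: T,
}

/// Intrusive: the element itself carries the link.
struct Task {
    id: u32,
    next: Option<NonNull<Task>>,
    _pinned: PhantomPinned,
}

impl Task {
    fn new(id: u32) -> Pin<Box<Task>> {
        Box::pin(Task { id, next: None, _pinned: PhantomPinned })
    }

    /// Stores the address of `other` in `self.next`.
    fn link_to(self: Pin<&mut Self>, other: Pin<&Task>) {
        // SAFETY: only a pointer field is written; `self` is not moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.next = Some(NonNull::from(other.get_ref()));
    }

    fn next_id(&self) -> Option<u32> {
        // SAFETY: this sketch assumes the linked task is still alive
        // and has not moved, which is what pinning is for.
        self.next.map(|n| unsafe { n.as_ref().id })
    }
}

fn demo() -> Option<u32> {
    let second = Task::new(2);
    let mut first = Task::new(1);
    first.as_mut().link_to(second.as_ref());

    // Moving the `Pin<Box<Task>>` handle only moves the box pointer;
    // the task itself stays at its heap address.
    let first = first;
    first.next_id()
}

fn main() {
    assert_eq!(demo(), Some(2));
}
```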
You can see that the biggest difference between the two is whether the pointers live in the collection’s own node type or in the element. Both kinds of collection have their advantages and disadvantages: the intrusive kind has better performance, but it is not generic and requires the collection to be defined again for each element type. This is not the focus of this article; for more details you can take a look at the following material.
- Intrusive containers provided by Google Fuchsia
- Intrusive linked lists
- Safe Intrusive Collections with Pinning
So why do intrusive collections need Pin? The reason is that elements point to each other through prev or next pointers, so if an element in the middle is moved, the addresses that other elements hold for it become invalid, resulting in unsafe behavior. Rust has a library called intrusive-collections that provides many intrusive collection types, and Tokio also defines its own intrusive collections; no doubt they all use Pin.
Other
In fact, whenever a value must be prevented from being moved, Pin is in theory the tool for it. I can’t think of any other cases for now, so I’ll add them later if I find new ones. If you know of others, please let me know.
Summary
This article is a little long, so let’s summarize.
- The API of Pin is very well designed, even symmetrical, and its methods fall roughly into 5 categories. Those involving Unpin and &mut T are further split into safe and unsafe versions.
- The #[fundamental] and #[repr(transparent)] attributes of Pin are important, but you generally don’t need to care about them.
- Among the traits implemented by Pin, focus on Unpin, Deref/DerefMut and Future; understanding them will let you fully master Pin.
- Unpin and Future are very closely related. Unpin is a safe trait that can in theory be implemented for any type, and a Future can also be Unpin. Some asynchronous APIs require Unpin, and you should understand why rather than just use them.
- Pin is a generic API, and besides async / await there are other scenarios that need Pin, such as intrusive collections.
The Pin projection, mentioned several times in the article, is not expanded, so we will discuss it in detail in the next article. See you soon!