You have probably heard that safety is one of Rust’s headline features. Unfortunately, hardware itself is unsafe, so every “safe” abstraction has to be built on top of “unsafe” code. As a result, safety in the full sense of the word is very difficult to achieve and would be extremely limiting in functionality.

So let’s see where the boundary of Rust’s safety lies.

What does Rust consider not to be “unsafe”?

I believe we all understand what Rust considers safe, so I won’t go into it here. There are, however, some behaviors that we might regard as unexpected or even dangerous, yet Rust does not consider them unsafe:

  • Deadlocks
  • Memory and resource leaks
  • Exiting without calling destructors
  • Exposing randomized base addresses through pointer leaks
  • Integer overflow
  • Logic errors

The first four are well understood, especially memory leaks, which The Book discusses (note that the standard library’s std::mem::forget and Box::leak are not unsafe). The two worth discussing here are integer overflow and logic errors.
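As a small illustration that leaking needs no unsafe, here is a minimal sketch. The Noisy type and the leak_demo function are hypothetical names made up for this example; std::mem::forget and Box::leak are the real standard library functions.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical helper type: records in a shared flag whether its destructor ran.
struct Noisy<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Returns (destructor ran after a normal drop, destructor ran after mem::forget).
fn leak_demo() -> (bool, bool) {
    let normal = Cell::new(false);
    let forgotten = Cell::new(false);

    // Ordinary drop: the destructor runs.
    drop(Noisy { dropped: &normal });
    // mem::forget is a safe function: the value is leaked and the destructor never runs.
    mem::forget(Noisy { dropped: &forgotten });

    (normal.get(), forgotten.get())
}

fn main() {
    let (normal, forgotten) = leak_demo();
    println!("dropped normally: {normal}, dropped after forget: {forgotten}");

    // Box::leak is also safe: it trades the Box for a reference that is never freed.
    let leaked: &'static mut i32 = Box::leak(Box::new(42));
    *leaked += 1;
    println!("leaked value: {leaked}");
}
```

Nothing here is marked unsafe, which is the point: a skipped destructor or a leaked allocation may be a bug, but it is not a memory safety violation.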

Integer overflows

If a piece of code contains an arithmetic overflow, it’s the programmer’s fault. In the following discussion, we need to distinguish between arithmetic overflows and wrapping arithmetic. The former is wrong, while the latter is expected.

When the programmer has enabled debug_assert! assertions (e.g., by compiling in debug mode), the compiler must insert dynamic checks that panic on overflow. Other kinds of builds (e.g., release mode) may either panic or silently wrap the value on overflow, at the implementation’s discretion.

In the case of implicitly wrapped overflow, implementations must provide well-defined (even if still considered erroneous) results by using two’s complement overflow conventions.

The Rust standard library provides methods on the integer types that let the programmer perform wrapping arithmetic explicitly. For example, i32::wrapping_add provides two’s complement, wrapping addition.

The standard library also provides a Wrapping<T> type that ensures that all standard arithmetic operations for T have wrapping semantics.
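The explicit methods can be sketched as follows. The add_u8 helper is a hypothetical name for this example; checked_add, wrapping_add, overflowing_add and Wrapping are standard library items. Unlike the plain + operator, their results do not depend on the build mode.

```rust
use std::num::Wrapping;

/// Adds two bytes in three explicit ways; none of them depends on the build mode.
fn add_u8(a: u8, b: u8) -> (Option<u8>, u8, (u8, bool)) {
    (
        a.checked_add(b),     // None on overflow
        a.wrapping_add(b),    // wraps modulo 256
        a.overflowing_add(b), // wrapped value plus an overflow flag
    )
}

fn main() {
    let (checked, wrapped, overflowing) = add_u8(250, 10);
    println!("{checked:?} {wrapped} {overflowing:?}");

    // i32::wrapping_add follows two's complement: MAX + 1 wraps around to MIN.
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN);

    // Wrapping<T> gives the ordinary operators wrapping semantics.
    let x = Wrapping(u8::MAX) + Wrapping(1);
    assert_eq!(x.0, 0);
}
```

Using checked_add where overflow is a bug and wrapping_add where wrap-around is intended makes the distinction between the two cases visible in the code itself.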

See RFC 560 for error conditions, rationale, and more details on integer overflows.

Logic errors

Safe code may impose additional logical constraints that can be checked neither at compile time nor at run time. If a program breaks such a constraint, its behavior may be unspecified, but it will not result in undefined behavior. This may include panics, incorrect results, aborts, or non-termination (infinite loops). The behavior may also vary between runs, builds, or kinds of builds.

For example, implementations of Hash and Eq require that equal values have equal hashes. Another example is data structures like BinaryHeap, BTreeMap, BTreeSet, HashMap, and HashSet, which define constraints against modifying their keys while the keys are in the data structure. Violating such constraints is not considered unsafe, but the behavior of the program becomes unpredictable and may go wrong at any time.
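To make the Hash and Eq constraint concrete, here is a minimal sketch of a key type that breaks it on purpose. The Name type and the lookup_other_spelling function are hypothetical, invented for this illustration. The code compiles without unsafe, and the lookup result is unspecified rather than undefined.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical key type with a deliberately broken contract:
/// Eq ignores ASCII case, but Hash does not.
#[derive(Debug)]
struct Name(String);

impl PartialEq for Name {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}
impl Eq for Name {}

impl Hash for Name {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // BUG (on purpose): values that compare equal may now hash differently.
        self.0.hash(state);
    }
}

/// Inserts one spelling and looks up another. The result is unspecified:
/// it depends on the hasher and the table layout. It is never undefined behavior.
fn lookup_other_spelling() -> bool {
    let mut set = HashSet::new();
    set.insert(Name("Rust".to_string()));
    set.contains(&Name("RUST".to_string()))
}

fn main() {
    // Eq says the two spellings are the same value...
    assert!(Name("Rust".to_string()) == Name("RUST".to_string()));
    // ...but whether the set finds it is not something the program can rely on.
    println!("found: {}", lookup_other_spelling());
}
```

The fix would be to hash a case-normalized form of the string, so that Hash agrees with Eq again.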

What Rust considers “undefined”

Undefined behavior is an interesting concept, something of an old friend to programmers writing C and C++, and a lot of code out there even relies on it.

Rust code is incorrect if it exhibits any of the behaviors in the following list, and this includes code inside unsafe blocks and unsafe functions. unsafe only means that avoiding undefined behavior is the programmer’s responsibility; it does not change the requirement that Rust programs must never cause undefined behavior. In other words, there must be no undefined behavior, whether or not unsafe is used.

When writing unsafe code, it is the programmer’s responsibility to ensure that any safe code that interacts with the unsafe code cannot trigger these behaviors. Unsafe code that satisfies this property is said to be sound for any safe caller; unsafe code is unsound if safe code can misuse it to exhibit undefined behavior.
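The difference between sound and unsound can be sketched with a small wrapper around slice::get_unchecked. The function names get_or_zero and get_unsound are hypothetical; the unsound variant is shown only as a comment so that the example itself stays free of undefined behavior.

```rust
/// Sound: the bounds check happens before the unchecked access,
/// so no safe caller can trigger an out-of-bounds read.
fn get_or_zero(data: &[u32], index: usize) -> u32 {
    if index < data.len() {
        // SAFETY: index < data.len() was checked just above.
        unsafe { *data.get_unchecked(index) }
    } else {
        0
    }
}

// An unsound variant would look almost the same but skip the check:
//
// fn get_unsound(data: &[u32], index: usize) -> u32 {
//     unsafe { *data.get_unchecked(index) } // UB when index >= data.len()
// }
//
// It compiles, and callers need no `unsafe` to misuse it, which is exactly the problem.

fn main() {
    let data = [10, 20, 30];
    println!("{} {}", get_or_zero(&data, 1), get_or_zero(&data, 99));
}
```

Both functions have a safe signature, so the compiler cannot tell them apart; only the author of the unsafe block can guarantee that every possible safe call is fine.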

Be aware that the following list is not exhaustive. There is no formal model of Rust’s semantics for what is and is not allowed in unsafe code, so other behaviors may be considered undefined as well. The list below contains only the undefined behaviors identified so far. Before writing unsafe code, please read the Rustonomicon.

  • Data races

  • Evaluating a dereference expression (*expr) on a raw pointer that is dangling or unaligned, even in place expression context (e.g. addr_of!(&*expr)).

  • Breaking the pointer aliasing rules. &mut T and &T follow LLVM’s scoped noalias model, except when the &T contains an UnsafeCell<U>.

  • Mutating immutable data. All data inside a const item is immutable. In addition, all data reached through a shared reference or owned by an immutable binding is immutable, unless that data is contained within an UnsafeCell<U>.

  • Invoking undefined behavior via compiler intrinsics.

  • Executing code compiled with target features that the current platform does not support (see target_feature; this usually results in SIGILL).

  • Calling a function with the wrong call ABI, or unwinding from a function with the wrong unwind ABI.

  • Producing an invalid value, even in private fields and locals. A value is “produced” whenever it is assigned to or read from a place, passed to a function/primitive operation, or returned from a function/primitive operation. The following values are invalid:

    • A value in a bool other than false (0) or true (1).

    • A discriminant in an enum that is not included in the type definition.

    • A null fn pointer.

    • A value in a char that is a surrogate or above char::MAX.

    • ! (all values are invalid for this type).

    • An integer, floating-point value, or raw pointer from uninitialized memory, or uninitialized memory in str.

    • A reference or Box<T> that is dangling, unaligned, or points to an invalid value.

    • Invalid metadata in a wide reference, Box<T>, or raw pointer:

      • dyn Trait metadata is invalid if it is not a pointer to a vtable for Trait that matches the actual dynamic trait the pointer or reference points to.
      • Slice metadata is invalid if the length is not a valid usize (e.g., a usize read from uninitialized memory).

    • An invalid value for a type with a custom definition of invalid values (a bit hard to grasp at first), such as NonNull<T> and NonZero* in the standard library.

Note: Uninitialized memory is also implicitly invalid for any type that has a restricted set of valid values. In other words, the only cases in which reading uninitialized memory is permitted are inside unions and in padding (the gaps between the fields/elements of a type).

Note: Undefined behavior affects the entire program. For example, calling a function in C that exhibits undefined behavior in C means that your entire program contains undefined behavior, which also affects Rust code. And vice versa, undefined behavior in Rust can adversely affect the code executed by any FFI calls in other languages.
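Several of the “invalid value” entries above have checked, safe counterparts in the standard library. The following is a minimal sketch of that idea, not an exhaustive mapping. byte_to_bool and init_then_read are hypothetical helper names; char::from_u32, NonZeroU32::new and MaybeUninit are real standard library APIs.

```rust
use std::mem::MaybeUninit;
use std::num::NonZeroU32;

/// Converts a raw byte to a bool without ever producing an invalid bool.
/// (Transmuting a byte such as 2 into a bool would be undefined behavior.)
fn byte_to_bool(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Initializes memory before reading it, instead of reading uninitialized bytes.
fn init_then_read() -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(7);
    // SAFETY: `slot` was fully initialized by the write above.
    unsafe { slot.assume_init() }
}

fn main() {
    assert_eq!(byte_to_bool(2), None);

    // 0xD800 is a surrogate and 0x11_0000 is above char::MAX: both are rejected.
    assert_eq!(char::from_u32(0xD800), None);
    assert_eq!(char::from_u32(0x11_0000), None);
    assert_eq!(char::from_u32(0x41), Some('A'));

    // NonZeroU32 has a custom invalid value: zero.
    assert!(NonZeroU32::new(0).is_none());

    assert_eq!(init_then_read(), 7);
}
```

The unchecked counterparts of these functions are unsafe precisely because they let the caller skip the validity check.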

Dangling pointers

A reference/pointer is dangling if it is null, or if not all of the bytes it points to belong to the same allocation (for example, a block returned by malloc). The span of bytes it points to is determined by the pointer value and the size of the pointee type (using size_of_val). Consequently, if the span is empty, “dangling” means the same as “null”.

Note that slices and strings point to their entire range, so it is important that the length metadata is never too large. In particular, allocations, and therefore slices and strings, cannot be larger than isize::MAX bytes.
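Both points can be observed from safe code. The sketch below uses two hypothetical helper functions, empty_slice_info and oversized_request_fails. It shows that an empty slice has a non-null pointer covering zero bytes, and that Vec refuses a request for more than isize::MAX bytes.

```rust
use std::mem::size_of_val;

/// Reports whether the pointer of an empty slice is null, plus the slice's size in bytes.
fn empty_slice_info() -> (bool, usize) {
    let empty: &[u64] = &[];
    (empty.as_ptr().is_null(), size_of_val(empty))
}

/// Asks a Vec<u8> for more than isize::MAX bytes; the request is refused
/// with an error instead of producing an oversized allocation.
fn oversized_request_fails() -> bool {
    let mut v: Vec<u8> = Vec::new();
    let too_big = isize::MAX as usize + 1;
    v.try_reserve_exact(too_big).is_err()
}

fn main() {
    let (is_null, bytes) = empty_slice_info();
    println!("empty slice: null = {is_null}, bytes = {bytes}");
    println!("oversized request refused: {}", oversized_request_fails());
}
```

Because the span of an empty slice contains no bytes, its pointer only has to be non-null (and aligned) for the reference to be valid.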


Reference https://www.purewhite.io/2021/08/11/rust-considered-unsafe-undefined/