10/21/2024
Post: Rust Data

Rust Data

bind, copy, borrow, move, clone, mutate


Synopsis:

The Rust type system plays an important role in guaranteeing memory and thread safety, using copy and move semantics, index bounds checking, and a "no shared concurrent mutability" invariant enforced for references. In this post we summarize Rust data declarations, then focus on copy, move, and clone operations.
Rust's data operations are similar to, but simpler and safer than, the corresponding C++ operations. As in C++, variables are instantiated on definition and deallocated when the thread of execution leaves the scope where they were defined. Unlike C++, however, Rust range checks all indexes at run-time, and its build process statically analyzes all references to maintain a "no shared mutability" invariant and to ensure that no reference outlives its referent. Rust data types are partitioned into non-overlapping Copy and Move categories. The allowed operations are based on a single-owner policy.

1. Our goal is to understand the terms:

- Bind: Associate an identifier with a value.
- Copy: Bind to a copy of the value of a blittable type. Executed implicitly by compiler-generated code that copies bytes from the source to the destination location. Fast if the data occupies only a few bytes of storage, like the primitive types.
- Borrow: Create a named reference (a pointer with special syntax and semantics) to an identifier's location. References used for borrows must satisfy Rust's ownership rules, discussed in the Ownership Bite. Unlike raw pointers, references can be dereferenced in safe code.
- Move: Transfer ownership of a value's resources, usually executed implicitly. Accomplished by copying the source's small stack-based handle, including its pointer to resources allocated on the heap, to the destination and invalidating the source instance. That's fast, copying only a few bytes.
- Clone: Create a copy of a non-blittable type, invoked explicitly by program code. Slower than Move.
- Mutate: Change the value associated with a mutable identifier.
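
The six terms can be seen together in a few lines. The following is a minimal sketch, not code from the original post; the function name demo_operations and the values used are made up for illustration.

```rust
// Hypothetical helper: exercises bind, copy, borrow, clone, move, and mutate.
fn demo_operations() -> (i32, usize, String, String, i32) {
    let k: i32 = 42;                 // bind: k names the value 42
    let j = k;                       // copy: i32 is Copy, so k stays valid
    let s = String::from("hello");   // bind: s owns heap-allocated text
    let len = {
        let r = &s;                  // borrow: r refers to s without owning it
        r.len()
    };
    let c = s.clone();               // clone: explicit copy of the heap data, s stays valid
    let t = s;                       // move: ownership transfers to t, s is now invalid
    let mut n = j;                   // bind a mutable identifier
    n += 1;                          // mutate
    (k, len, t, c, n)
}

fn main() {
    let (k, len, t, c, n) = demo_operations();
    println!("k = {}, len = {}, t = {}, c = {}, n = {}", k, len, t, c, n);
}
```

Uncommenting a use of s after the move line would turn the sketch into a compile error, which is the point of the single-owner policy.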

2. Rust Types

First, we need to say a few words about Rust types:
A type is a set of allowed values together with the operations that are legal for that set.
The Rust language defines a rich set of primitive types:
- bool
- char (a 4-byte Unicode scalar value)
- integers: i8, i16, i32, i64, i128, isize, u8, u16, u32, u64, u128, usize
- floats: f32, f64
- aggregates: array: [T; N], slice: [T], str: literal string "....", tuple: (T1, T2, ...), struct { T1, T2, ... }
Assuming that T is a primitive type, each of these occupies a contiguous block of memory whose size depends on the type. They can all be copied by a memcpy operation, i.e., they are blittable, again provided that T is primitive. Rust expresses blittability with the Copy trait, so we say the Rust primitives are Copy types. Two qualifications apply: slices and str are unsized and are normally handled through references such as &[T] and &str, and a struct is Copy only if all of its fields are Copy and it opts in with #[derive(Clone, Copy)]. The Rust libraries define a large set of non-primitive types:
- String: a stack-based object holding a collection of utf-8 encoded characters in the heap
- Vec<T>: very like a String, but holding a heap-based collection of an arbitrary type, T
- VecDeque<T>: a stack-based object holding a heap-based collection of T objects with efficient access to both front and back
- HashMap<K, V>: an associative container holding key-value pairs in the heap
- many more
Each value of a generic type T, K, or V has a fixed size that is constant for that type. A String, in contrast, is composed of characters whose encoded sizes depend on their content, varying between 1 and 4 bytes, since they are utf-8 values. The types listed above each occupy more than one contiguous block of memory and so cannot be blitted. They can be moved, but not implicitly copied. If they implement the Clone trait, as all of the types above do, they can be explicitly copied by invoking their clone operation. We say they are Move types. User-defined types are Move by default; one that occupies a single contiguous block of memory, with every field Copy, can opt in to Copy by deriving the trait. No type can be both Copy and Move.
The Rust Story contains more details on Rust types with examples: Rust types
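
Whether a type is Copy or only Clone can be checked at compile time with trait bounds. The sketch below is an illustration under that assumption; the helper names assert_copy and assert_clone are hypothetical and are not part of the standard library.

```rust
use std::collections::{HashMap, VecDeque};
use std::mem::size_of;

// Hypothetical helper: compiles only for types that implement Copy.
fn assert_copy<T: Copy>() {}

// Hypothetical helper: compiles only for types that implement Clone.
fn assert_clone<T: Clone>() {}

fn main() {
    // Primitives and aggregates of primitives are Copy.
    assert_copy::<bool>();
    assert_copy::<char>();
    assert_copy::<i32>();
    assert_copy::<f64>();
    assert_copy::<[i32; 4]>();
    assert_copy::<(i32, f64)>();
    assert_copy::<&str>();        // a shared reference is Copy
    // assert_copy::<String>();   // would not compile: String is not Copy

    // The library collections are Clone but not Copy.
    assert_clone::<String>();
    assert_clone::<Vec<i32>>();
    assert_clone::<VecDeque<i32>>();
    assert_clone::<HashMap<String, i32>>();

    println!("char occupies {} bytes", size_of::<char>());
    println!("[i32; 4] occupies {} bytes", size_of::<[i32; 4]>());
    println!("a String handle occupies {} bytes on the stack", size_of::<String>());
}
```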

3. Binding to a Value

Bind - associate an identifier with a memory location
  • A type is a set of legal values with associated operations.
  • Every identifier has a type:
    let k: i32 = 42;
    let signifies a binding is being created. i32 is the type of a 32 bit integer. 42 is a value placed in the memory location associated with k
  • Type inference:
    let k = 42;
    This binding is legal and has the same meaning as the previous binding. In the absence of other information, Rust assigns the type i32 to any unsuffixed integer literal.
  • Code can explicitly specify the type of a literal with a suffix, e.g.:
    let m = 42u8;
    Usually a Rust developer will choose either of the first two definition styles.
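
One way to see which type each binding style produces is to ask for the size of the bound value. This is a small sketch; the function name binding_sizes is made up for illustration.

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns the storage size, in bytes, of each binding.
fn binding_sizes() -> (usize, usize, usize) {
    let k: i32 = 42;   // explicit type annotation
    let n = 42;        // inferred: unsuffixed integer literals default to i32
    let m = 42u8;      // literal suffix selects u8
    (size_of_val(&k), size_of_val(&n), size_of_val(&m))
}

fn main() {
    let (k, n, m) = binding_sizes();
    println!("k: {} bytes, n: {} bytes, m: {} bytes", k, n, m);
}
```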

4. Binding to an identifier

Binding to an identifier has several forms:
  • let j:i32 = k;  // makes copy for j because k is blittable
  • let l = &k;       // l makes a reference to k, called a borrow
  • let s:String = "a string".to_string();
  • let t = s;         // moves s into t, e.g., transfers ownership as s is not blittable
Both sides of a binding must be of the same type. Rust will not implicitly convert an instance to another type, apart from a few coercions such as dereferencing smart pointer types.
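
The binding forms above, collected into one compilable sketch. The function name bind_forms is hypothetical, and the &String to &str line is added here only to illustrate the dereferencing exception mentioned above.

```rust
// Hypothetical helper: copy, borrow, move, and one deref coercion.
fn bind_forms() -> (i32, i32, i32, String, usize) {
    let k: i32 = 42;
    let j: i32 = k;                          // copy: k is blittable and stays valid
    let l = &k;                              // borrow: l has type &i32
    let s: String = "a string".to_string();
    let t = s;                               // move: t now owns the heap data
    // println!("{}", s);                    // would not compile: s was moved
    let view: &str = &t;                     // deref coercion: &String becomes &str
    let n = view.len();
    (k, j, *l, t, n)
}

fn main() {
    let (k, j, l, t, n) = bind_forms();
    println!("k = {}, j = {}, *l = {}, t = {:?}, len = {}", k, j, l, t, n);
}
```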
 

5. Assignment

As in many languages, assignment in Rust is an expression like:
  • x = y  // copy if x and y are Copy types, y is valid after assignment
  • t = s  // move if s and t are Move types, s is invalid after assignment
Both sides of an assignment expression must be of the same type. We usually encounter assignments in statements, i.e., expressions terminated with the semicolon character.
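
A short sketch of both assignment cases follows. It assumes the targets are declared mut, which Rust requires for reassignment; the function name assign is made up for illustration.

```rust
// Hypothetical helper: assignment copies Copy types and moves Move types.
fn assign() -> (i32, i32, i32, String, usize) {
    let y = 7;
    let mut x = 1;
    let old_x = x;                          // read x before it is overwritten
    x = y;                                  // copy: y is still valid afterwards

    let s = String::from("new");
    let mut t = String::from("old value");
    let old_len = t.len();
    t = s;                                  // move: t's old value is dropped, s is invalid
    // println!("{}", s);                   // would not compile: s was moved

    (old_x, x, y, t, old_len)
}

fn main() {
    let (old_x, x, y, t, old_len) = assign();
    println!("x: {} -> {}, y = {}, t = {:?}, old t length = {}", old_x, x, y, t, old_len);
}
```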

6. Copy and Borrow

Figure 1. CopyType Copy
Borrows are non-owning references pointing to an identifier's location.
  • Copies happen implicitly when an identifier is bound to a Copy type:
    let i = 3; let mut j = i;   // copy
    or when one Copy type is assigned to another:
    j = i + 1;   // copy
  • Borrows happen when binding references to other identifiers:
    let r = &i;   // borrow
    A reference, like &i, is just a pointer to the memory location bound to i. Unless it is declared mut, r cannot be reset to refer to another location, and it is subject to Rust ownership rules, which are discussed in this Rust Bite.
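
The following sketch shows that a copy is independent of its source while a borrow reads through to it. The function name copy_and_borrow is hypothetical.

```rust
// Hypothetical helper: a copy has its own storage, a borrow shares the source's.
fn copy_and_borrow() -> (i32, i32, i32, bool) {
    let i = 3;
    let mut j = i;          // copy: j is independent of i
    j += 1;                 // changing j leaves i untouched
    let r = &i;             // borrow: r points at i's location
    let r2 = r;             // shared references are themselves Copy
    (i, j, *r + *r2, std::ptr::eq(r, r2))
}

fn main() {
    let (i, j, sum, same) = copy_and_borrow();
    println!("i = {}, j = {}, *r + *r2 = {}, same location = {}", i, j, sum, same);
}
```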

7. Copy, Move, and Clone Traits

Figure 2. Str Copy
Figure 3. String Move
Figure 4. String Clone
Traits are like interface contracts. They specify a behavior that a type must implement if it has that trait. Copy types are types that implement the Copy trait.
  • To be eligible for the Copy trait they must be blittable.
  • The str type represents immutable string data such as literal strings. Each literal is stored in static memory with program lifetime, and is always accessed through a reference, e.g., let s: &str = "a literal string". So the reference is copied, as shown in Figure 2.
Move types are types that don't implement the Copy trait.
  • Move types are non-blittable, with one exception: mutable references, which are blittable but not Copy.
  • Implementing the Drop trait makes a type a Move type, even if it is blittable.
  • When the thread of execution leaves a scope, all Move types declared in that scope are dropped, returning their resources with Drop::drop(). That's similar to a C++ destructor invocation.
Clone types are types that implement the Clone trait.
  • Data types with the Clone trait provide a clone() member function that creates a new instance of the type that has the same structure and copies of any resources held by the cloner.
  • Examples of Clone types are the collections, e.g., Strings, Vecs, VecDeques, Maps, ...
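
For user-defined types the three traits look like this. This is a minimal sketch; Point, Named, and Guard are hypothetical types invented for the illustration, and Guard counts its own drops so the effect of leaving a scope can be observed.

```rust
use std::cell::Cell;

// Hypothetical Copy type: every field is Copy and the type opts in.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point { x: f64, y: f64 }

// Hypothetical Move type: it owns heap data, so it can be Clone but not Copy.
#[derive(Debug, Clone, PartialEq)]
struct Named { name: String }

// Hypothetical Move type: its field is blittable, but implementing Drop rules out Copy.
struct Guard<'a>(&'a Cell<u32>);

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);   // record that drop ran
    }
}

fn copy_point() -> (Point, Point) {
    let p = Point { x: 1.0, y: 2.0 };
    let q = p;                          // copy: p is still valid
    (p, q)
}

fn clone_named() -> (Named, Named) {
    let a = Named { name: String::from("a") };
    let b = a.clone();                  // explicit clone: a is still valid
    (a, b)
}

// Returns the drop count seen inside the inner scope and after leaving it.
fn drops_at_scope_exit() -> (u32, u32) {
    let count = Cell::new(0);
    let inside;
    {
        let _g = Guard(&count);
        inside = count.get();
    }                                   // _g is dropped here
    (inside, count.get())
}

fn main() {
    println!("{:?}", copy_point());
    println!("{:?}", clone_named());
    println!("{:?}", drops_at_scope_exit());
}
```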

8. Move and Clone

  • A move transfers a Move type's heap resources to another instance of that type.
    • The String, s, shown in Figure 3, is moved to t with the statement:
      • let t = s;   // s is now invalid
    • A move transfers ownership of resources.
  • A clone copies a Move type's heap resources to a new instance of that type.
    • The String, s, shown in Figure 4, is cloned with the statement:
      • let t = s.clone();   // s is still valid
    • A clone copies resources to the target.
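
The difference can be observed by comparing the addresses of the heap buffers. The sketch below uses String::as_ptr for that; the function names are hypothetical.

```rust
// Hypothetical helper: after a move, the new owner holds the same heap buffer.
fn move_keeps_buffer() -> bool {
    let s = String::from("hello");
    let before = s.as_ptr();            // address of s's heap buffer
    let t = s;                          // move: s is now invalid
    before == t.as_ptr()
}

// Hypothetical helper: a clone allocates a new buffer with equal contents.
fn clone_makes_new_buffer() -> (bool, bool) {
    let s = String::from("hello");
    let t = s.clone();                  // clone: s is still valid
    (s.as_ptr() != t.as_ptr(), s == t)
}

fn main() {
    println!("move keeps the heap buffer: {}", move_keeps_buffer());
    let (distinct, equal) = clone_makes_new_buffer();
    println!("clone uses a new buffer: {}, contents equal: {}", distinct, equal);
}
```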

9. Mutation

By default, Rust data is immutable - it can't be changed. Code has to opt in to mutation in order to change the value of an identifier. We do that with the mut qualifier.
  • Immutable data:
    let i = 1;
    // i += 1; won't compile
  • Mutable data:
    let mut j = 1;
    j += 1; // compiles since j is mutable
Mutability of data is an important part of Rust's ownership policies, designed to ensure memory safety.
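
Mutation can also happen through a mutable borrow, which is where the ownership rules come into play. A minimal sketch, with hypothetical function names bump and mutate:

```rust
// Hypothetical helper: mutates the caller's value through a mutable reference.
fn bump(n: &mut i32) {
    *n += 1;
}

fn mutate() -> (i32, String) {
    let mut j = 1;
    j += 1;                     // direct mutation: j is declared mut
    bump(&mut j);               // mutation through a mutable borrow

    let mut s = String::from("ab");
    s.push('c');                // push needs a mutable String

    (j, s)
}

fn main() {
    let (j, s) = mutate();
    println!("j = {}, s = {}", j, s);
}
```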

10. Traits Preview:

A trait specifies one or more function signatures that a type with that trait is obligated to implement. Marker traits, like Copy, are an exception: they declare no signatures, but they affect the code generated by the compiler. Traits are used to constrain generic parameter types and to support dynamic dispatching in polymorphic designs. std::marker::Copy and std::clone::Clone are traits defined by the standard library. There are many more, some of which we will meet in Rust Bites. Move is not one of them. I've included Move in this table because in some ways it acts like a trait: data is moved when constructing or assigning if and only if it is not Copy. So I think of Move as a trait even though you cannot use it to constrain generic types (more on that in the Functs Rust Bite).
| Trait | Applies to | Examples | Consequences |
|---|---|---|---|
| Copy | Single contiguous memory block ==> blittable | ints, floats, aggregates of Copy types | Copies value from one memory location to another. Source valid after copy. |
| "Move" | Non-contiguous block ==> not blittable | Strings, Vecs, VecDeques, ...: stack-based aggregates managing instances in the heap | Transfers data ownership to another identifier. Source invalid after move. Attempt to use "Moved" variable is compile error. |
| Clone | Most types | Structs, Strings, Vecs, VecDeques, ... | Makes copy of resources for another identifier. Source valid after clone. |
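
Copy and Clone can constrain generic parameters, while "Move" cannot because it is not a real trait. The sketch below illustrates the first half of that statement; the function names duplicate_copy and duplicate_clone are made up for the example.

```rust
// Hypothetical helper: accepts only Copy types, so x may be used twice.
fn duplicate_copy<T: Copy>(x: T) -> (T, T) {
    (x, x)
}

// Hypothetical helper: accepts any Clone type and copies it explicitly.
fn duplicate_clone<T: Clone>(x: &T) -> (T, T) {
    (x.clone(), x.clone())
}

fn main() {
    println!("{:?}", duplicate_copy(5));
    println!("{:?}", duplicate_clone(&String::from("a")));
    // duplicate_copy(String::from("a"));   // would not compile: String is not Copy
}
```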

11. Formatting Data

Much of the data generated in Rust code will wind up being formatted for display or for building strings. We do that with the print! and format! macros, which take format specifications like this:
Format Specifiers:
    let arg1 = "abc";
    let arg2 = 123;
    let s: String = format!("\n {:?} and {}", arg1, arg2);
The last line contains a format operation using the format! macro. Each "{...}" is a placeholder for a format specification. The first, {:?}, specifies that the value of arg1 is to be formatted using the debug specifier, i.e., a relatively simple format available, through the Debug trait, to most Rust library and user-defined types. The second, {}, specifies that the type's custom display format, supplied by the Display trait, is to be used, because it doesn't hold a specifier. There are a lot of predefined specifiers and options described in the std::fmt module documentation. We will use these as needed without further clarification.
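
The two specifiers behave differently even for simple values, and a user-defined type can supply both formats. This is a minimal sketch; the Point type and the render function are hypothetical.

```rust
use std::fmt;

// Hypothetical type: Debug is derived, Display is written by hand.
#[derive(Debug)]
struct Point { x: i32, y: i32 }

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn render() -> (String, String, String) {
    let arg1 = "abc";
    let arg2 = 123;
    let s = format!("{:?} and {}", arg1, arg2);   // Debug adds quotes around a &str
    let p = Point { x: 1, y: 2 };
    (s, format!("{:?}", p), format!("{}", p))
}

fn main() {
    let (s, debug, display) = render();
    println!("{}", s);         // prints: "abc" and 123
    println!("{}", debug);     // prints: Point { x: 1, y: 2 }
    println!("{}", display);   // prints: (1, 2)
}
```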

12. Conclusions:

The notions of Move, Copy, Clone, Borrow, and Drop are central to the Rust memory model and Rust's guarantees about memory and data race safety. You can see more about that in the ownership Rust Bite. Binding and mutation provide Rust's interface to data values. Combining all of these things allows Rust to support a scope-based value model for data without the complexity of building multiple constructors (copy, move, conversion), user-defined assignment operators, and destructors.

13. Exercises:

Note:
In order to build and run with cargo from the Visual Studio Code terminal you need to open VS Code in the package folder for the code you want to build and run. That's the folder where the package's Cargo.toml file resides.
  1. Create an instance of a blittable type and show when it is copied.
    • Can you prove that it was copied?
  2. Create an instance of a non-blittable type and show when it is moved.
    • Can you prove that it was moved?
    • Can you show that the moved-from is invalid?
  3. Repeat the second exercise but clone the non-blittable type instead of moving it.
    • Can you show that the cloner is still valid?

14. Solution for Exercise #1

Solution
  Addresses are different, values are the same => copy.   voila!
Note that lines 5 and 6 are all you need to complete the main question. The remaining lines address the "Can you prove ..." addendum.
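
The solution listing itself is not reproduced in this text. What follows is a minimal sketch of one possible solution, with hypothetical names, so its line numbers do not correspond to the lines 5 and 6 mentioned in the note.

```rust
// Hypothetical helper: returns (addresses differ, values equal).
fn copied_distinct() -> (bool, bool) {
    let a: i32 = 42;
    let b = a;                      // copy: i32 is blittable, a stays valid

    let pa = &a as *const i32;      // address of a
    let pb = &b as *const i32;      // address of b
    println!("a = {} at {:p}", a, pa);
    println!("b = {} at {:p}", b, pb);

    (pa != pb, a == b)
}

fn main() {
    let (different_addresses, same_values) = copied_distinct();
    println!(
        "addresses differ: {}, values equal: {}",
        different_addresses, same_values
    );
}
```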