05/04/2022
RustBites - Data Operations

Rust Bite - Data Operations

bind, copy, borrow, move, clone, mutate

In this bite, we focus on basic data operations:

1. Our goal is to understand the terms:

- Bind: Associate an identifier with a value.
- Copy: Bind to a copy of the value of a blittable type. Executed implicitly by compiler-generated code that copies bytes from the source to the destination location. Fast.
- Borrow: Create a named reference (a pointer with special syntax and semantics) to an identifier's location. References used for borrows must satisfy Rust's ownership rules, discussed in an upcoming Bite. Unlike raw pointers, borrows can be dereferenced in safe code.
- Move: Transfer ownership of a value's resources, usually executed implicitly. The destination receives a bitwise copy of the source's handle, including its pointer to any data allocated on the heap, and the source instance is invalidated. The heap data itself is not copied. Fast.
- Clone: Create a copy of a non-blittable type, invoked explicitly by program code. Slower than a move because the resources are duplicated.
- Mutate: Change the value associated with a mutable identifier.

2. Rust Types

First, we need to say a few words about Rust types:
A type is a set of allowed values together with the operations that are legal for that set.
The Rust language defines a rich set of primitive types:
- bool
- char (a Unicode scalar value, always 4 bytes)
- integers: i8, i16, i32, i64, i128, isize, u8, u16, u32, u64, u128, usize
- floats: f32, f64
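- aggregates: array: [T; N], slice: [T], str: string slice, usually seen as a literal "....", tuple: (T1, T2, ...), struct { T1, T2, ... }
Assuming that T is a primitive type, each of these occupies a contiguous block of memory whose size depends on the type. They can all be copied by a memcpy operation, i.e., they are blittable, again provided that T is primitive. Rust expresses blittability with the Copy trait, and we say these are Copy types. Two qualifications apply: slices and str have no size known at compile time, so they are handled through references such as &[T] and &str, and it is those references that are copied; and a user-defined struct is Copy only if it implements the Copy trait.
The Rust libraries define a large set of non-primitive types: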
- String: a stack-based object holding UTF-8 encoded text in the heap
- Vec<T>: much like a String, but holding a heap-based collection of an arbitrary type, T
- VecDeque<T>: a stack-based object holding a heap-based collection of T objects with efficient access to both front and back
- HashMap<K, V> and BTreeMap<K, V>: associative containers holding key-value pairs in the heap
- ...
Values of the generic types T, K, and V have a fixed size for each type. A String holds UTF-8 encoded text in which each character occupies between 1 and 4 bytes, depending on its content. (A standalone char value, by contrast, is always 4 bytes.) The types listed above each occupy more than one contiguous block of memory and so cannot be blitted. They can be moved, but not copied. If they implement the Clone trait, and all of the types above do when their element types do, they can be explicitly copied by invoking their clone operation. We say they are Move types. User-defined types may be either Copy or Move: a type can be Copy only if it occupies one contiguous block of memory, all of its parts are Copy, and it implements the Copy trait; otherwise it is a Move type. No type can be both.
The Rust Story contains more details on Rust types with examples: Rust types
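To make the distinction between Copy and Move types a little more concrete, here is a minimal sketch that inspects a few sizes. The helper names is_copy and utf8_len are hypothetical, introduced only for illustration. The size printed for the String handle depends on the target and on the current standard library layout, which the language does not promise.

```rust
use std::mem::size_of;

// Hypothetical helper: a call compiles only for types that implement Copy.
fn is_copy<T: Copy>() -> bool {
    true
}

// Number of bytes a string's text occupies when encoded as UTF-8.
fn utf8_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // Primitives have fixed sizes and are Copy.
    println!("i32:  {} bytes", size_of::<i32>());
    println!("f64:  {} bytes", size_of::<f64>());
    println!("char: {} bytes", size_of::<char>());
    println!("[i32; 3] is Copy:   {}", is_copy::<[i32; 3]>());
    println!("(i32, f64) is Copy: {}", is_copy::<(i32, f64)>());
    println!("&str is Copy:       {}", is_copy::<&str>());

    // The String handle has a fixed size; its text lives in the heap.
    println!("String handle: {} bytes", size_of::<String>());

    // UTF-8 text uses 1 to 4 bytes per character.
    println!("\"a\" uses {} byte(s)", utf8_len("a"));
    println!("U+00E9 uses {} byte(s)", utf8_len("\u{e9}"));
    println!("U+20AC uses {} byte(s)", utf8_len("\u{20ac}"));

    // is_copy::<String>(); // won't compile: String is not Copy
}
```

Uncommenting the last line is a quick way to ask the compiler whether a type is Copy.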

3. Binding to a Value

Bind - associate an identifier with a memory location
  • Remember, a type is a set of legal values with associated operations.
  • Every identifier has a type:
    let k: i32 = 42;
    let signifies that a binding is being created. i32 is the type of a 32-bit integer. 42 is the value placed in the memory location associated with k.
  • Type inference:
    let k = 42;
    This binding is legal and has the same meaning as the previous binding. In the absence of other information, Rust assigns the type i32 to an unadorned integer literal. A literal that does not fit in 32 bits is then rejected by the compiler rather than silently widened.
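The following sketch shows explicit and inferred bindings side by side. The helper type_of is hypothetical; it relies on std::any::type_name, whose exact output is intended for diagnostics and is not guaranteed to stay the same across compiler versions.

```rust
use std::any::type_name;

// Hypothetical helper: reports the type the compiler chose for a value.
fn type_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let k: i32 = 42; // explicit type annotation
    let m = 42;      // inferred: integer literals default to i32
    let x = 42.0;    // inferred: float literals default to f64
    let b = 42u8;    // a suffix "adorns" the literal with a type

    println!("k: {}", type_of(&k));
    println!("m: {}", type_of(&m));
    println!("x: {}", type_of(&x));
    println!("b: {}", type_of(&b));
}
```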

4. Binding to an identifier

Binding to an identifier has several forms:
  • let j:i32 = k; // makes copy for j because k is blittable
  • let l = &k;    // l makes a reference to k, called a borrow
  • let s:String = "a string".to_string();
  • let t = s;     // moves s into t, i.e., transfers ownership, as s is not blittable
Both sides of a binding expression must be of the same type. Rust will not implicitly convert one numeric type to another, for example i32 to i64.
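Here is a small sketch of the binding forms listed above, plus one way to request a conversion explicitly. The helper widen is hypothetical; i64::from is the standard lossless conversion from i32.

```rust
// Hypothetical helper: widens an i32 to an i64 explicitly.
fn widen(k: i32) -> i64 {
    i64::from(k) // explicit, lossless conversion
}

fn main() {
    let k: i32 = 42;
    let j: i32 = k;         // copy: k is still usable
    let l = &k;             // borrow: l refers to k
    let s: String = "a string".to_string();
    let t = s;              // move: s is no longer usable

    // let w: i64 = k;      // won't compile: mismatched types
    let w: i64 = widen(k);  // the conversion has to be requested

    println!("{} {} {} {} {}", k, j, l, t, w);
    // println!("{}", s);   // won't compile: s was moved
}
```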
 

5. Assignment

As in many languages, assignment in Rust is an expression like:
  • x = y  // copy if x and y are Copy types, y is valid after assignment
  • t = s  // move if s and t are Move types, s is invalid after assignment
Both sides of an assignment expression must be of the same type, and a target that already holds a value must be declared mut. We usually encounter assignments in statements, i.e., expressions terminated with the semicolon character.
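As a sketch of the two assignment cases, the hypothetical function assign_demo below performs one copy assignment and one move assignment and returns the results so they can be inspected.

```rust
// Runs both kinds of assignment and returns the final values.
fn assign_demo() -> (i32, i32, String) {
    let y = 2;
    let mut x = 1;
    println!("before: x = {}, y = {}", x, y);
    x = y; // copy: y is valid after assignment

    let s = String::from("new");
    let mut t = String::from("old");
    println!("before: t = {}", t);
    t = s; // move: s is invalid after assignment, the old value of t is dropped
    // println!("{}", s); // won't compile: s was moved

    (x, y, t)
}

fn main() {
    let (x, y, t) = assign_demo();
    println!("after:  x = {}, y = {}, t = {}", x, y, t);
}
```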

6. Copy and Borrow

Figure 1. CopyType Copy
Borrows are non-owning references pointing to an identifier.
  • Copies happen implicitly when an identifier is bound to a Copy type:
    let i = 3; let mut j = i;   // copy
    or when one copy type is assigned to another:
    j = i + 1;   // copy; j must be declared mut to be assigned again
  • Borrows happen when binding references to other identifiers:
    let r = &i;   // borrow
    A reference, like &i, is just a pointer to the memory location bound to i. Because r is not declared mut, it cannot be reset to refer to another location. It is also subject to Rust's ownership rules, which we will discuss in a later Bite.
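A minimal sketch of the difference between a copy and a borrow: changing the copy leaves the original alone, while the borrow always reads the original. The function name copy_then_borrow is hypothetical.

```rust
// Returns (i, j, value seen through r) after changing the copy j.
fn copy_then_borrow() -> (i32, i32, i32) {
    let i = 3;
    let mut j = i; // copy: j gets its own storage
    j += 10;       // changing the copy leaves i untouched
    let r = &i;    // borrow: r points at i's location
    (i, j, *r)     // *r dereferences the borrow
}

fn main() {
    let (i, j, via_r) = copy_then_borrow();
    println!("i = {}, j = {}, *r = {}", i, j, via_r);
}
```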

7. Copy, Move, and Clone Traits

Figure 2. Str Copy
Figure 3. String Move
Figure 4. String Clone
Traits are like interface contracts. They specify a behavior that a type must implement if it has that trait. Copy types are types that implement the Copy trait.
  • To be eligible for the Copy trait they must be blittable.
  • The str type represents immutable string data, most often literal strings. Each literal is stored in static memory with program lifetime. A str is always accessed through a reference, e.g., let s: &str = "a literal string"; so it is the reference that is copied, as shown in Figure 2.
Move types are types that don't implement the Copy trait.
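  • Move types are usually non-blittable, with two exceptions.
  • A blittable user-defined type that does not implement Copy is a Move type.
  • Adding the Drop trait makes a Move type, even if blittable, because a type cannot implement both Drop and Copy.
  • When the thread of execution leaves a scope, all values owned by identifiers declared in that scope are dropped. Types that implement Drop return their resources with Drop::drop(). That's similar to a C++ destructor invocation.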
Clone types are types that implement the Clone trait.
  • Data types with the Clone trait provide a clone() member function that creates a new instance of the type that has the same structure and copies of any resources held by the cloner.
  • Examples of Clone types are the collections, e.g., Strings, Vecs, VecDeques, Maps, ...
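The sketch below illustrates these three traits on user-defined types. Point, Guard, and count_drops are hypothetical names, and the Cell counter exists only to make the drops observable.

```rust
use std::cell::Cell;

// A blittable user-defined type that opts in to Copy (Copy requires Clone).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// A blittable type that implements Drop, so it cannot be Copy.
struct Guard<'a> {
    id: i32,
    drops: &'a Cell<u32>, // hypothetical counter used to observe drops
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        println!("dropping Guard {}", self.id);
        self.drops.set(self.drops.get() + 1);
    }
}

// Creates two guards in an inner scope and reports how many were dropped.
fn count_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let _a = Guard { id: 1, drops: &drops };
        let _b = Guard { id: 2, drops: &drops };
        println!("leaving scope");
    } // both guards are dropped here, in reverse order of declaration
    drops.get()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // copy: p is still valid because Point is Copy
    println!("p = ({}, {}), q = {:?}", p.x, p.y, q);
    println!("guards dropped: {}", count_drops());
}
```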

8. Move and Clone

  • A move transfers a Move type's heap resources to another instance of that type.
    • The String, s, shown in Figure 3, is moved to t with the statement:
      • let t = s;   // s is now invalid
    • Move transfers ownership of resources.
  • A clone copies a Move type's heap resources to a new instance of that type.
    • The String, s, shown in Figure 4, is cloned with the statement:
      • let t = s.clone();   // s is still valid
    • Clone operation copies resources to the target.
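One way to observe the difference is to compare the address of the String's heap buffer before and after each operation. This is a sketch with hypothetical function names; as_ptr returns the address of the heap buffer, not of the String handle.

```rust
// Moves a String and reports whether the heap buffer stayed in place.
fn move_keeps_buffer() -> bool {
    let s = String::from("a string");
    let before = s.as_ptr(); // address of the heap buffer
    let t = s;               // move: only the handle is transferred
    before == t.as_ptr()
}

// Clones a String and reports whether the clone has its own, equal buffer.
fn clone_makes_new_buffer() -> bool {
    let s = String::from("a string");
    let t = s.clone();       // clone: heap data is duplicated
    s.as_ptr() != t.as_ptr() && s == t
}

fn main() {
    println!("move keeps the same heap buffer: {}", move_keeps_buffer());
    println!("clone allocates a new buffer:    {}", clone_makes_new_buffer());
}
```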

9. Mutation

By default, Rust data is immutable - it can't be changed. Code has to opt in to mutation in order to change the value of an identifier. We do that with the mut qualifier.
  • Immutable data:
    let i = 1;
    // i += 1; won't compile
  • Mutable data:
    let mut j = 1;
    j += 1; // compiles since j is mutable
Mutability of data is an important part of Rust's ownership policies, designed to ensure memory safety.
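A short sketch of opting in to mutation, including mutation through a mutable borrow, which the ownership Bite treats in detail. The helper increment is hypothetical.

```rust
// Hypothetical helper: mutates the caller's value through a mutable borrow.
fn increment(n: &mut i32) {
    *n += 1;
}

fn main() {
    let i = 1;
    // i += 1;          // won't compile: i is immutable

    let mut j = 1;
    j += 1;             // direct mutation
    increment(&mut j);  // mutation through a mutable borrow
    println!("i = {}, j = {}", i, j);

    let mut v = vec![1, 2];
    v.push(3);          // methods that change a value also need mut
    println!("{:?}", v);
}
```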

10. Traits Preview:

A trait specifies one or more function signatures that a type with that trait is obligated to implement. Marker traits, like Copy, are an exception: they declare no signatures, but they affect the code generated by the compiler. Traits are used to constrain generic parameter types and to support dynamic dispatch in polymorphic designs. std::marker::Copy and std::clone::Clone are traits defined by the standard library. There are many more, some of which we will meet in later Bites. Move is not one of them. I've included Move in this table because in some ways it acts like a trait: data is moved when constructing or assigning if and only if it is not Copy. So I think of Move as a trait even though you cannot use it to constrain generic types (more on that in the Functs Bite).
Trait | Applies to | Examples | Consequences
--- | --- | --- | ---
Copy | Single contiguous memory block ==> blittable | ints, floats, aggregates of Copy types | Copies value from one memory location to another. Source valid after copy.
"Move" | Non-contiguous block ==> not blittable | Strings, Vecs, VecDeques, ... (stack-based aggregates managing instances in the heap) | Transfers data ownership to another identifier. Source invalid after move. Attempt to use "Moved" variable is compile error.
Clone | Most types | Structs, Strings, Vecs, VecDeques, ... | Makes copy of resources for another identifier. Source valid after clone.
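As a sketch of how Copy and Clone can constrain generic parameters, consider the two hypothetical functions below. There is no corresponding way to write a "Move" bound, which is the point made above.

```rust
// Accepts only Copy types: the argument can be used twice without cloning.
fn pair_of<T: Copy>(x: T) -> (T, T) {
    (x, x)
}

// Accepts any Clone type: each duplicate must be requested explicitly.
fn pair_of_clone<T: Clone>(x: &T) -> (T, T) {
    (x.clone(), x.clone())
}

fn main() {
    println!("{:?}", pair_of(7));
    // pair_of(String::from("s")); // won't compile: String is not Copy

    let s = String::from("s");
    println!("{:?}", pair_of_clone(&s));
    println!("{}", s); // s is still valid: it was only borrowed
}
```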

11. Formatting Data

Much of the data generated in Rust code will wind up being formatted for display or for building strings. We do that with the print! and format! macros, which take format specifications like this:
  • Format Specifiers:
    let arg1 = "abc";
    let arg2 = 123;
    let s: String = format!("\n {:?} and {}", arg1, arg2);
The last line contains a format operation using the format! macro. The "{...}" items are placeholders for format specifications. The first, {:?}, specifies that the value of arg1 is to be formatted using the debug specifier, i.e., a relatively simple format provided by the Debug trait, which most Rust library types implement and user-defined types can derive. The second, {}, holds no specifier, so the type's Display format is used. There are many predefined specifiers and options, described in the std::fmt module documentation. We will use these as needed in each of the remaining Bites without further clarification.
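For a user-defined type, the two placeholders can be supported as in this sketch: Debug is derived, and Display is written by hand. The type Span and the function describe are hypothetical names chosen for illustration.

```rust
use std::fmt;

#[derive(Debug)] // compiler-generated debug format, used by {:?}
struct Span {
    start: u32,
    end: u32,
}

// Hand-written display format, used by the {} placeholder.
impl fmt::Display for Span {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}..{}]", self.start, self.end)
    }
}

// Formats one Span with both placeholders.
fn describe(s: &Span) -> String {
    format!("{:?} and {}", s, s)
}

fn main() {
    let arg1 = "abc";
    let arg2 = 123;
    let s: String = format!("\n {:?} and {}", arg1, arg2);
    println!("{}", s);
    println!("{}", describe(&Span { start: 1, end: 4 }));
}
```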

12. Conclusions:

The notions of Move, Copy, Clone, Borrow, and Drop are central to the Rust memory model and Rust's guarantees about memory and data race safety. We will see more about that in the ownership Bite. Binding and mutation provide Rust's interface to data values. Combining all of these things allows Rust to support a scope-based value model for data without the complexity of building multiple constructors (copy, move, conversion), user-defined assignment operators, and destructors.

13. Exercises:

Note:
In order to build and run with cargo from the Visual Studio Code terminal, you need to open VS Code in the package folder for the code you want to build and run. That's the folder where the package's Cargo.toml file resides.
  1. Create an instance of a blittable type and show when it is copied.
    • Can you prove that it was copied?
  2. Create an instance of a non-blittable type and show when it is moved.
    • Can you prove that it was moved?
    • Can you show that the moved-from is invalid?
  3. Repeat the second exercise but clone the non-blittable type instead of moving it.
    • Can you show that the cloner is still valid?

14. Solution for Exercise #1

Solution
  Addresses are different, values are the same => copy.   voila!
Note that lines 5 and 6 are all you need to complete the main question. The remaining lines address the "Can you prove ..." addendum.
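One way to approach Exercise #1 is sketched below. It is offered as an illustration rather than as the original solution listing, so the line numbers mentioned in the note above do not apply to it. The helpers addr_of and prove_copy are hypothetical.

```rust
// Hypothetical helper: the address of a value, as a number.
fn addr_of<T>(r: &T) -> usize {
    r as *const T as usize
}

// Copies an i32 and returns (values are equal, addresses differ).
fn prove_copy() -> (bool, bool) {
    let i = 42;
    let j = i; // the copy
    (i == j, addr_of(&i) != addr_of(&j))
}

fn main() {
    let i = 42;
    let j = i; // the copy; i is still valid
    println!("i = {} at {:p}", i, &i);
    println!("j = {} at {:p}", j, &j);

    let (same_value, different_address) = prove_copy();
    println!("same value: {}, different address: {}", same_value, different_address);
}
```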