11/25/2023
Bits: Rust Data Types
types, initialization, ownership, borrowing
Synopsis:
This page demonstrates simple uses of the most important Rust types. The purpose is to quickly acquire some familiarity with types and their uses.
- Rust provides two categories of types: copy and move. Copy types occupy a single contiguous block of stack memory. Move types have control blocks in stack memory used to manage resources in the heap.
- Primitive types and aggregates of primitive types are copy. Construction, assignment, and pass-by-value copy the source's value to the destination.
- All other types are move. Construction, assignment, and pass-by-value move ownership of the value's heap resources to the destination. This makes the source invalid. Attempting to use a "moved" variable is a compile error.
- The source and destination of copy and move operations must have exactly the same type.
- More details about copy and move operations can be found in Rust Bite - Data Operations, including diagrams and code examples.
- Most move types provide clone operations that support making copies, but using code must explicitly call clone().
- Rust supports making fixed references to either copy or move types. These may refer to instances in a function's stack memory, in the native heap, or in the program's static memory.
- Rust references are constrained by Rust ownership rules to support memory and data race safety by construction. Every value in memory has a single owner, responsible for its creation, access, and deallocation.
- Ownership can be borrowed by creating a reference. Program code may create an arbitrary number of non-mutable references to some variable, but may only create a single mutable reference that borrows exclusive access to the variable.
- Rust's type system plays a dominant role in providing memory and data race safety.
- Here we begin to see significant differences between the languages, especially when comparing statically typed languages like C++, Rust, and C#, with dynamically typed languages like Python and JavaScript.
Rust Types Details
Table 1.0 Rust Types
Type | Comments | Example |
---|---|---|
-- Integral types -- | | |
bool | values true and false | |
i8, i16, i32, i64, i128, isize, u8, u16, u32, u64, u128, usize | signed and unsigned integer types | |
-- Floating point types -- | | |
f32, f64 | values have finite precision, and may have approximate values | |
-- Literal string types -- | | |
&str | A literal string | let second = ls.chars().nth(1); |
&str | Slice of a literal string | |
-- Aggregate types -- | | |
[T; N] | An array of N elements all of type T | let first = arr[0]; |
&[T] | Slice of array of elements of type T | let second = arrs[1]; |
(T1, T2, ...) | collection of heterogeneous types accessed by position | let third = tu.2; |
Result<T,E> | Result enum holds result | |
Option<T> | Option enum holds optional value | |
-- std::library types -- | | |
String | Expandable collection of utf-8 characters allocated in the heap | |
Vec<T> | Expandable generic collection of items of type T | |
VecDeque<T> | Expandable double-ended generic collection of items of type T | v.push_front(1.5); ... |
HashMap<K,V> | Associative container of Key-Value pairs, held in a hash table. | map.insert("zero", 0); ... |
LinkedList, BTreeMap, HashSet, BTreeSet, BinaryHeap, ... | The Rust std::library defines types for threading and synchronization, reading and writing to streams, anonymous functions, ... | Crate std |
-- User-defined types -- | | |
User-defined types | Based on structs and enums, these will be discussed in the next Bit. | |
Rust Type System Details
Table 2. Rust Copy and Move Operations
Operation | Example | Copy type | Move type |
---|---|---|---|
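Construction | | copies the source's value; the source remains valid | moves ownership of the heap resources; the source becomes invalid |
Assignment | | copies the source's value; the source remains valid | moves ownership of the heap resources; the source becomes invalid |
Pass-by-value | | copies the source's value; the source remains valid | moves ownership of the heap resources; the source becomes invalid |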
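The three operations in Table 2 can be sketched for one copy type (i32) and one move type (String). This is a minimal illustration, not code from the Bits repository; the helper names take_int and take_string are hypothetical.

```rust
// Hypothetical helpers, used only to demonstrate pass-by-value.
fn take_int(n: i32) -> i32 {
    n + 1
}

fn take_string(s: String) -> usize {
    s.len()
}

fn main() {
    // Copy type: every operation copies the value, so `a` stays valid.
    let a: i32 = 42;
    let b = a; // construction copies
    let mut c = 7;
    println!("c before assignment: {c}");
    c = a; // assignment copies
    let d = take_int(a); // pass-by-value copies
    println!("a: {a}, b: {b}, c: {c}, d: {d}");

    // Move type: every operation transfers ownership of the heap buffer.
    let s = String::from("hello");
    let t = s; // construction moves; `s` is now invalid
    // println!("{s}"); // compile error: borrow of moved value `s`
    let mut u = String::from("old");
    println!("u before assignment: {u}");
    u = t; // assignment moves; the old value of `u` is dropped
    let v = u.clone(); // explicit clone keeps an independent copy
    let n = take_string(u); // pass-by-value moves; `u` is now invalid
    println!("v: {v}, n: {n}");
}
```

Uncommenting the marked line shows the compile error that the move rules produce.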
Table 3. Rust Copy and Move Types
Property | Members | Notes |
---|---|---|
Primitive types | integers, floats, literal strings | These are all copy types |
Aggregate types | arrays, slices of arrays, tuples, Result<T,E>, Option<T> | These are copy types if their members are copy, otherwise they are move types |
std::library collection types | String, Vec<T>, VecDeque<T>, HashMap<K,V>, ... | These are all move types |
Table 4. Rust Type System Attributes
Static typing | All types are known at compile time and are fixed throughout program execution. |
Inference | Compiler infers types in expressions if not explicitly annotated. Occasionally inference fails and explicit annotation is required. |
Strong typing | Types are exhaustively checked and there are very few implicit conversions. |
Algebraic data types | Types created using enums and structs. Unlike enums in many other languages, Rust enums can hold named discriminants with associated data of arbitrary type. That, combined with Rust's matching operations, simplifies state and error handling. |
Generics | Generics provide types and functions with unspecified parameters, supporting code reuse and abstraction over types. Generic parameters are specified at the call site, often bound by constraints. |
Traits | Traits are similar to Java and C# interfaces. They define shared behavior that types can implement, supporting abstraction over behavior. Traits define behavior by declaring trait-specific functions. A trait may, but need not, define its functions' contents. Generics and Traits are covered here. |
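The last three rows of Table 4 can be illustrated together with a short sketch. The names Shape, Area, and report are hypothetical and chosen only for this example. It shows an enum whose variants carry data, a match over those variants, a trait with one required and one default function, and a generic function bound by constraints.

```rust
use std::fmt::Debug;

// Algebraic data type: each variant may carry its own data.
#[derive(Debug)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    Point,
}

// Trait: `area` must be implemented, `describe` has a default body.
trait Area {
    fn area(&self) -> f64;
    fn describe(&self) -> String {
        format!("area = {:.2}", self.area())
    }
}

impl Area for Shape {
    fn area(&self) -> f64 {
        // match must handle every variant, or the code does not compile
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect { width, height } => width * height,
            Shape::Point => 0.0,
        }
    }
}

// Generic function: T is any type that implements both Area and Debug.
fn report<T: Area + Debug>(item: &T) -> String {
    format!("{:?}: {}", item, item.describe())
}

fn main() {
    let shapes = [
        Shape::Rect { width: 2.0, height: 3.0 },
        Shape::Circle { radius: 1.0 },
        Shape::Point,
    ];
    for s in &shapes {
        println!("{}", report(s));
    }
}
```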
1.0 Initialization
1.1 Initializing language-defined types
/*-- initialize language defined types --*/
let i = 42i8; // data type specified
let b = true; // data type inferred
let f:f32 = 3.14159; // variable type specified
let c:char = 'z';
let sl:&str = "a literal string";
let second = sl.chars().nth(1);
let st = "an owned string".to_string();
let fourth = st.chars().nth(3);
let arr:[i32; 3] = [1, 2, 3];
let first = arr[0];
let tp: (i32, f64, char) = (1, 2.5, 'a');
let second = tp.1;
let iref: &i8 = &i;
let aref: &[i32; 3] = &arr;
Output
-------------------------
create and initialize
-------------------------
--- initialize language-defined types ---
--- i8 ---
i, i8
value: 42, size: 1
--- bool ---
b, bool
value: true, size: 1
--- f32 ---
f, f32
value: 3.15927, size: 4
--- char ---
c, char
value: 'z', size: 4
--- &str ---
ls, &str
value: "literal string", size: 16
second, core::option::Option<char>
value: Some('i'), size: 4
--- alloc::string::String ---
st, alloc::string::String
value: "an owned string", size: 24
fourth, core::option::Option<char>
value: Some('o'), size: 4
--- [i32; 3] ---
arr, [i32; 3]
value: [1, 2, 3], size: 12
first, i32
value: 1, size: 4
--- (i32, f64, char) ---
tp, (i32, f64, char)
value: (1, 2.5, 'a'), size: 16
second, f64
value: 2.5, size: 8
--- &i8 ---
iref, &i8
value: 42, size: 8
--- &[i32; 3] ---
aref, &[i32; 3]
value: [1, 2, 3], size: 8
second, i32
value: 2, size: 4
--- &str ---
lscs, &str
value: "iter", size: 16
second, core::option::Option<char>
value: Some('t'), size: 4
--- &[i32] ---
sla, &[i32]
value: [2, 3], size: 16
second, i32
value: 3, size: 4
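The last two output groups show a string slice (lscs) and an array slice (sla) whose initialization is not listed above. The sketch below is one way those values could be produced. The variable names come from the output, but the slicing expressions are an assumption, not the page's original code.

```rust
use std::mem::size_of;

fn main() {
    // Slice of a literal string: a &str that views part of the literal.
    let ls: &str = "literal string";
    let lscs: &str = &ls[1..5]; // bytes 1 through 4
    let second_char = lscs.chars().nth(1);
    println!("lscs: {lscs:?}, second: {second_char:?}");

    // Slice of an array: a &[i32] that views part of the array.
    let arr: [i32; 3] = [1, 2, 3];
    let sla: &[i32] = &arr[1..]; // elements from index 1 to the end
    let second_item = sla[1];
    println!("sla: {sla:?}, second: {second_item}");

    // Both slice references are fat pointers: an address plus a length.
    println!(
        "size of &str: {}, size of &[i32]: {}, size of usize: {}",
        size_of::<&str>(),
        size_of::<&[i32]>(),
        size_of::<usize>()
    );
}
```

On a 64-bit target the two slice references each occupy 16 bytes, which matches the sizes reported in the output above.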
1.11 Specifying locations
/*-- static, stack, and heap allocations --*/
static PI:f64 = 3.1415927; // static memory
// static address: &PI = 0x7ff730eb7668
------------------------------------
let f:f64 = 3.1415927; // stack memory
// stack address: &f = 0x40a1cff8c0
------------------------------------
let s:&str = "3.1415927"; // reference in stack memory, characters in static memory
// stack address: &str = 0x40a1cff910
------------------------------------
let g = Box::new(3.1415927f64); // heap memory
// heap address: &*g = 0x1b78cc05bd0
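The addresses in the comments above can be printed with the {:p} format. The following is a minimal sketch that assumes the same four declarations. The printed addresses vary by platform and run, so none are shown here.

```rust
static PI: f64 = 3.1415927; // static memory

fn main() {
    let f: f64 = 3.1415927; // stack memory
    let s: &str = "3.1415927"; // reference on the stack, characters in static memory
    let g = Box::new(3.1415927f64); // Box on the stack, f64 in the heap

    println!("static address: &PI   = {:p}", &PI);
    println!("stack address:  &f    = {:p}", &f);
    println!("stack address:  &s    = {:p}", &s);
    println!("static address: chars = {:p}", s.as_ptr());
    println!("stack address:  &g    = {:p}", &g);
    println!("heap address:   &*g   = {:p}", &*g);
}
```

Comparing the output lines shows which values share a memory region: the characters of s sit near PI, while the f64 owned by g is far from the Box g itself.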
1.2 Initializing standard library types
/*-- standard library types --*/
use std::collections::{HashMap, VecDeque};
let v:Vec<i32> = vec![1, 2, 3, 2, 1];
// size: 24 bytes, control block holds
// ptr to ints on heap, length, capacity
------------------------------------
let st:String = "an owned string".to_string();
// size: 24 bytes, control block holds
// ptr to chars on heap, length, capacity
------------------------------------
let mut vecdeq = VecDeque::<f64>::new();
vecdeq.push_front(1.0);
vecdeq.push_front(1.5);
vecdeq.push_front(2.0);
vecdeq.push_front(1.5);
vecdeq.push_front(1.0);
------------------------------------
let mut map = HashMap::<&str,i32>::new();
map.insert("zero", 0);
map.insert("one", 1);
map.insert("two", 2);
Output
--- initialize std::lib types ---
vec, alloc::vec::Vec<i32>
value: [1, 2, 3, 2, 1], size: 24
st, alloc::string::String
value: "an owned string", size: 24
vecdeq, alloc::collections::vec_deque::VecDeque<f64>
value: [1.0, 1.5, 2.0, 1.5, 1.0], size: 32
map, std::collections::hash::map::HashMap<&str, i32>
value: {"one": 1, "zero": 0, "two": 2}, size: 48
2.0 Safety and Ownership
- Read and write operations happen only within program allocated memory.
- References are guaranteed to refer to valid targets.
- Threads may share data only within the confines of a lock. Each thread must acquire a lock to access data and then release it to enable access by other threads.
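The third guarantee can be sketched with the standard library's Arc and Mutex types. This is an illustrative example, not code from the Bits repository; the function name count_with_threads is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: several threads increment one shared counter.
fn count_with_threads(num_threads: usize, increments: usize) -> usize {
    // Arc gives shared ownership across threads, Mutex guards access.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..num_threads {
        let shared = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                // lock() blocks until this thread holds the lock
                let mut guard = shared.lock().unwrap();
                *guard += 1;
                // the guard is dropped here, which releases the lock
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total: {}", count_with_threads(4, 1000));
}
```

The data inside the Mutex is reachable only through the guard returned by lock(), so the compiler rejects any attempt to modify the counter without holding the lock.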
2.1 Single Ownership Policy
- Binding a name to a value assigns ownership of the value to the named variable. Only the owner has the authority to modify, use, or deallocate the value. No other variable can access or modify the value without borrowing the privilege to do so from the owner.
- A collection like Vec<T> owns its members. In the code segment below, v owns the vector, and the vector owns the elements [1, 2, 3]. Program code can access them by taking a reference:
let v = vec![1, 2, 3];
let rv = &v[1];
2.2 Borrows - Rust References
- References are used to borrow access to a variable from its owner, e.g., let rv = &v. Rust enforces strict rules for the use of references in order to guarantee memory and thread safety:
- Non-mutable references can share read-only access to a value. Multiple shared references to a single owned value are valid.
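- Mutable references, often called exclusive references, provide exclusive access to the value. No other references can access the value until the end of the mutable reference's lifetime. With the current borrow checker that lifetime ends at the reference's last use, which may be earlier than the end of its scope.
- These rules apply to use, not declaration. A conflicting reference may be declared, but the compiler rejects the code if that reference is used while the conflict exists.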
- This is a strong constraint that may occasionally affect the way programs are designed. No shared mutability ensures that references to a collection, like Vec<T>, don't dangle when the collection reallocates its resource memory to provide more capacity.
Example: Attempt to mutate collection with active reference
show_op("attempt to mutate Vec while immutable ref exists");
let mut v3 = vec![1, 2, 3];
let rv3 = &v3[1];           // ok to declare
v3.push(4);                 // v3 will reallocate if capacity is 3
println!(" rv3: {rv3:?}");  // not ok to use
Example: Compiler error message
C:\github\JimFawcett\Bits\Rust\rust_data
> cargo run
   Compiling rust_data v0.1.0 (C:\github\JimFawcett\Bits\Rust\rust_data)
error[E0502]: cannot borrow `v3` as mutable because it is also borrowed as immutable
   --> src\main.rs:417:3
    |
416 |   let rv3 = &v3[1]; // ok to declare
    |              -- immutable borrow occurs here
417 |   v3.push(4); // v3 will reallocate if capacity is 3
    |   ^^^^^^^^^^ mutable borrow occurs here
418 |   println!(" rv3: {rv3:?}"); // not ok to use
    |                   ------- immutable borrow later used here

For more information about this error, try `rustc --explain E0502`.
error: could not compile `rust_data` (bin "rust_data") due to previous error
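The error above goes away once no reference is used across the mutation. The sketch below shows three common ways to arrange that. It is an illustration under the same setup as the failing example, not code from the Bits repository.

```rust
fn main() {
    let mut v3 = vec![1, 2, 3];

    // Option 1: finish using the reference before mutating.
    let rv3 = &v3[1];
    println!("rv3 before push: {rv3:?}");
    v3.push(4); // ok: rv3 is not used after this point

    // Option 2: copy the element out instead of holding a reference.
    let second = v3[1]; // i32 is a copy type
    v3.push(5);
    println!("copied element: {second}");

    // Option 3: take a fresh reference after the mutation.
    let rv3 = &v3[1];
    println!("rv3 after pushes: {rv3:?}, v3: {v3:?}");
}
```

Option 2 works only for copy types; for move types the equivalent is an explicit clone().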
2.3 Copy, Move, and Clone Operations
- Copy of primitives and Move are efficient because they copy only a few bytes. Clone is expensive for a collection as it copies both the collection control block and all its heap resources.
- Some Rust types implement the Copy trait. All primitives and aggregations of primitives are "copy" types. All other types are "move".
- For all copy types, construction, assignment, and pass-by-value result in a data copy with no transfer of ownership, so the source is still valid.
A literal string has type &str. Its characters are a constant value stored in static memory and are always accessed through a reference, &str. The reference &str is copy. So copying an &str results in a second reference pointing to the original address.
Figure 1. Str Copy
Example: &str copy
let lstrs = "literal string"; // copy type
let lstrd = lstrs;
println!(" source: {lstrs:?}, address: {:p}", lstrs);
println!(" destin: {lstrd:?}, address: {:p}", lstrd);
nl();
println!(
  " Note: a literal string is a fixed value in memory. All access occurs through a reference, so copies just copy the reference. Both variables point to the same address."
);
Example: Results of &str copy
--- direct copy &str ---
 source: "literal string", address: 0x7ff7e742b590
 destin: "literal string", address: 0x7ff7e742b590

 Note: a literal string is a fixed value in memory. All access occurs through a reference, so copies just copy the reference. Both variables point to the same address.
- For all move types, ownership is transferred to another variable by construction, assignment, or pass-by-value as a function argument. A "moved-from" variable is invalid. Any use of it results in a compile error. Move is a very efficient operation, copying only a few bytes instead of the entire object. When a copy is needed, the developer can choose to explicitly clone the original value.
Figure 2. String Move
Example: String Move
let s = String::from("a string");
let addrvec = &s;
let addrzero = std::ptr::addr_of!(s.as_bytes()[0]);
println!("address of s: {:p}", addrvec);
println!("address of first byte of s's char buffer: {:p}\n", addrzero);
let t = s; // move
let addrvec = &t;
let addrzero = std::ptr::addr_of!(t.as_bytes()[0]);
println!("address of t: {:p}", addrvec);
println!("address of first byte of t's char buffer: {:p}\n", addrzero);
Example: Results of String Move
--- demo_move for String ---
--- let s = String::from("a string") ---
address of s: 0x25758ff558
address of first byte of s's char buffer: 0x1725ee5dcd0

--- let t:String = s; // move ---
address of t: 0x25758ff600
address of first byte of t's char buffer: 0x1725ee5dcd0

Note: s and t are unique objects that share same buffer but now, s is invalid
- Move types usually implement the Clone trait with the function clone(). Calling clone makes an independent copy of the source. The original and clone are independent entities with different owners. They have the same values immediately following the clone operation, but may mutate to different values, independently.
Figure 3. String Clone
Example: String Clone
let s_src = String::from("a string");
let s_src_addr = &s_src;
let s_src_bufaddr = std::ptr::addr_of!(s_src.as_bytes()[0]);
println!(" s_src: {:?}, address: {:p}", s_src, s_src_addr);
println!(" s_src_bufaddr: {:p}", s_src_bufaddr);
let s_cln = s_src.clone();
let s_cln_addr = &s_cln;
let s_cln_bufaddr = std::ptr::addr_of!(s_cln.as_bytes()[0]);
println!(" s_cln: {:?}, address: {:p}", s_cln, s_cln_addr);
println!(" s_cln_bufaddr: {:p}", s_cln_bufaddr);
nl();
println!(
  " Note: s_src and s_cln have different addresses and their buffers have different addresses. So they are unique entities."
);
Example: Results of String Clone
--- string clone ---
 s_src: "a string", address: 0x7610d0f390
 s_src_bufaddr: 0x229473682f0
 s_cln: "a string", address: 0x7610d0f448
 s_cln_bufaddr: 0x22947375d00

 Note: s_src and s_cln have different addresses and their buffers have different addresses. So they are unique entities.
2.4 Indexing
The array, array slice, and all of the std::library sequential collections support indexing. Rust checks each index operation at run time and, if an out-of-bounds index is used, the current thread of execution will "panic", causing an orderly thread termination before any out-of-bounds read or write is executed. This prevents a memory access vulnerability common to other languages.
2.5 Analysis and Display Functions
Analysis & Display Function Code
#![allow(unused_mut)]
#![allow(dead_code)]
#![allow(clippy::approx_constant)]
/* rust_data::bits_data_analysis.rs */
/*-----------------------------------------------
Note:
Find all Bits code, including this in
https://github.com/JimFawcett/Bits
You can clone the repo from this link.
-----------------------------------------------*/
use std::fmt::Debug;
/*-- show_type --------------------------------------
  Shows compiler recognized type and data value
*/
pub fn show_type<T: Debug>(t: &T, nm: &str) {
  let typename = std::any::type_name::<T>();
  print!(" {nm}, {typename}");
  println!(
    "\n value: {:?}, size: {}",
    // smart formatting {:?}
    t, std::mem::size_of::<T>()
    // handles both scalars and collections
  );
}
/*---------------------------------------------------
  show string wrapped with long dotted lines above
  and below
*/
pub fn show_label(note: &str, n:usize) {
  let mut line = String::new();
  for _i in 0..n {
    line.push('-');
  }
  print!("\n{line}\n");
  print!(" {note}");
  print!("\n{line}\n");
}
pub fn show_label_def(note:&str) {
  show_label(note, 50);
}
/*---------------------------------------------------
  show string wrapped with dotted lines above
  and below
*/
pub fn show_note(note: &str) {
  print!("\n-------------------------\n");
  print!(" {note}");
  print!("\n-------------------------\n");
}
/*---------------------------------------------------
  show string wrapped in short lines
*/
pub fn show_op(opt: &str) {
  println!("--- {opt} ---");
}
/*---------------------------------------------------
  print newline
*/
pub fn nl() {
  println!();
}
/*---------------------------------------------------
  Show initialization of Rust's types
*/
fn show_formatted<T:Debug>(t:&T, nm:&str) {
  show_op(std::any::type_name::<T>());
  show_type(t, nm);
}
3.0 Epilogue
Most of the operations for language-defined and std::library types discussed in this page will be covered again for user-defined types in Bits_ObjectsRust.
4.0 References
Link | Comments |
---|---|
Crate std | Rust documentation about primitive and library defined types. |
fat pointers | Pointers to dynamically sized types hold an address plus metadata: a length for slices, a vtable pointer for trait objects. |
Rust Type System #1 | Ownership, aliasing, lifetime |
Rust Type System #2 | Algebraic types, generic associated types |
Rust Type System #3 | Generic container types, interior mutability |