Idioms and Patterns Story
Object Models

Comparisons of C#, C++, and Rust


"I get the impression that those who have experience of languages like C, C++, even assembler, get the idea of Rust more easily. They have already spent years chasing bugs caused by memory usage errors, null pointers, data races in threaded code etc. They appreciate how Rust is helping with all those issues."
ZiCog

1. Prologue

In this story we look at common idioms and patterns used to build programs and software systems. Each will be implemented in three programming languages: C#, C++, and Rust. That may help developers familiar with one of the languages to start learning the others. It also provides a useful cache of techniques to help remember key implementation strategies. In order to write code effectively, you need an initial design and a good mental model of what your program will do when it executes programming language statements. That understanding helps you predict, with some precision, the context-appropriate sequences of statements that implement your ideas. Without that understanding you don't know what to write. You will frequently find that each language implements an idiom in a different way, sometimes a radically different way. Each language is built around a different object model that governs how the code you write executes in practice. There are other differences, discussed below and in the idiom and pattern implementations in the following pages.

2. Models:

The two most important models describe how a program uses memory, and how it creates and manipulates objects residing in memory. Most modern imperative languages have approximately the same memory model. There are, however, significant differences in how these languages use objects. The intent of this page is to help you understand differences in object models for C#, C++, and Rust. We first review a memory model, common to all three of these languages.
Definition: Memory Model
Programs for the languages we consider here divide memory into three parts:
  1. Static memory holds its contents for the entire lifetime of a program. Static memory holds compiled code and values declared as "static" which are initialized only once.
  2. Stack memory is allocated when a thread of execution enters a program scope and is deallocated when the thread leaves that scope. Each scope is defined by a code block enclosed by braces: '{' and '}'.
  3. Heap memory is allocated (from a process heap) when a program requests storage for some type and lives until the program releases that storage or the program terminates.
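The three regions can be sketched in a few lines of Rust. This is a minimal illustration, not part of the original page; the names GREETING, stack_value, and heap_value are made up for the example.

```rust
// Static memory: initialized once and valid for the whole program run.
static GREETING: &str = "hello";

// Stack memory: `local` lives in this function's stack frame.
fn stack_value() -> i32 {
    let local = 42;
    local // the value is copied out before the frame is released
}

// Heap memory: Box::new requests storage from the process heap.
fn heap_value() -> Box<i32> {
    Box::new(42)
}

fn main() {
    let s = stack_value();
    let h = heap_value();
    println!("{} {} {}", GREETING, s, h);
} // `h` leaves scope here, which releases its heap storage
```

The Box in heap_value is returned to the caller, so the heap storage outlives the stack frame that requested it and is released only when the caller's binding goes out of scope.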
Definition: Object Model
Instances:
An instance is a typed memory location initialized with some value.
Objects:
An object is an instance of a non-primitive type.
Types:
A type is a pattern for creating instances. If the type is defined by the language we usually use the term primitive, e.g., int or double. Otherwise we just use the term type, e.g., struct, vector. Non-primitive types are defined by standard libraries and developers. All three of these languages support development of new types, e.g., person, order, graph, ... To indicate that a is an instance of type A we use the notation:
a ∈ A
An Object Model describes:

  1. Types:
    A type is defined by a memory configuration with a specified set of allowed operations. Examples: integers, floating point numbers, vectors, ... An instance results from instantiating a type, i.e., declaring a name associated with the type and allocating and initializing the memory location it represents.

  2. Execution environment:
    Defines where objects are hosted, how their values are accessed, how operations on objects are processed, how errors are handled, and what information about objects is available to programs.

  3. Scope:
    Program scopes are defined by code blocks enclosed in braces, i.e., '{' and '}'. They come in two flavors: compile-time only and run-time scopes.
    Compile-time scopes define names and access privileges for namespaces, structs, classes, and enums. They affect what program statements are allowed to compile, but have no run-time effects. Run-time scopes are defined by functions, methods, lambdas, and control statements. They have both compile-time and run-time effects. Whenever a thread of program execution enters a scope, a new allocation of stack memory, called a stack-frame, is acquired to store local variables and calling parameters for methods, functions and lambdas. When the thread of execution leaves that scope the memory is returned to its process for reuse.

  4. Object layout:
    Primitive types, defined by the language, occupy a contiguous block of memory and may be copied and assigned. A copy or assignment results in two named memory locations, each with the same value immediately following the operation (subject to cache flushing). Objects are instances of types defined by the language libraries and by developers. They often occupy two or more disjoint regions of memory, linked internally by addresses, usually with a control block in stack or static memory and data stored in a process heap. Assignment and copy operations on these objects have semantics defined differently by each of the languages we consider here.
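As a rough illustration of the two layouts, the Rust sketch below compares a primitive with a Vec. The helper elements_stored_elsewhere is a hypothetical name introduced here; it checks that the vector's elements do not live inside the Vec value itself. The exact size of the Vec control block is an implementation detail, so the program prints it rather than assuming a value.

```rust
use std::mem::size_of;

// Returns true when the vector's elements live outside the Vec value itself.
fn elements_stored_elsewhere(v: &Vec<i32>) -> bool {
    let control_start = v as *const Vec<i32> as usize;
    let control_end = control_start + size_of::<Vec<i32>>();
    let data = v.as_ptr() as usize;
    data < control_start || data >= control_end
}

fn main() {
    let n: i32 = 7; // primitive: one contiguous block
    let m = n; // copy: two independent locations holding the same value
    let v = vec![1, 2, 3]; // control block in this stack frame, elements on the heap

    println!("size of i32: {}", size_of::<i32>());
    println!("size of Vec control block: {}", size_of::<Vec<i32>>());
    println!("elements stored elsewhere: {}", elements_stored_elsewhere(&v));
    println!("{} {}", n, m);
}
```

Here the control block happens to sit in main's stack frame because v is a local variable; the elements are reached through an address stored in that block.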

  5. Object relationships:
    Figure 1. Object Relationships
    There are four relationships between objects for all three languages. C++ offers an infrequently used fifth relationship:
    • Inheritance:
      Base types declare methods and associated functions. They may, but often don't, supply implementations. Base methods may be declared virtual. That causes compilation to lay out a virtual function pointer table used for dispatching methods dynamically, i.e., deciding at run-time which derived class override to invoke. Derived classes inherit from a base and are obligated to supply implementations for every undefined method declared in the base. A base class becomes part of the memory footprint of every class that derives from that base.
    • Composition:
      Include in the composing class instance an instance of a composed class. This supports factoring complex operations into sets of simpler parts. The composed instance becomes part of the memory footprint of the composing class; and is constructed as part of the composer construction.
    • Aggregation:
      Include in the aggregating class a reference to an aggregated class. The aggregated class instance is created, at run-time, by one of the aggregating class methods. An aggregated class instance does not become part of the footprint of the aggregating class, and is not inherently created as part of the aggregating class construction.
    • Using:
      A reference to a used class is passed as an argument in a using class method. The used class is not created by the user and is not part of the using class footprint.
    • Friendship:
      The C++ language allows a class to grant friendship to a non-member function, or to another class. Friendship allows the friend access to private data of the granting class. This is primarily used to support overloading insertion and extraction operators that are used for, but are not part of, the granting class.
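The first four relationships might look like this in Rust. This is a sketch under assumptions: Shape, Point, Rect, Label, Labeled, and describe are hypothetical names chosen for illustration, and Rust's trait implementation stands in for inheritance. Friendship is specific to C++ and is not shown.

```rust
// Inheritance, Rust style: a type inherits an interface by implementing a trait.
trait Shape {
    fn area(&self) -> f64;
}

struct Point {
    x: f64,
    y: f64,
}

// Composition: the Point is embedded in every Rect's memory footprint.
struct Rect {
    origin: Point,
    width: f64,
    height: f64,
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

struct Label {
    text: String,
}

// Aggregation: the Label is created on demand and stored separately on the heap.
struct Labeled {
    rect: Rect,
    label: Option<Box<Label>>,
}

impl Labeled {
    fn new(rect: Rect) -> Self {
        Labeled { rect, label: None }
    }

    fn set_label(&mut self, text: &str) {
        self.label = Some(Box::new(Label { text: text.to_string() }));
    }
}

// Using: the function borrows a Shape that it neither creates nor owns.
fn describe(shape: &dyn Shape) -> String {
    format!("area = {}", shape.area())
}

fn main() {
    let rect = Rect { origin: Point { x: 0.0, y: 0.0 }, width: 2.0, height: 3.0 };
    let mut item = Labeled::new(rect);
    item.set_label("unit");
    println!("origin = ({}, {})", item.rect.origin.x, item.rect.origin.y);
    println!("{}", describe(&item.rect));
    if let Some(label) = &item.label {
        println!("label = {}", label.text);
    }
}
```

Note that the Labeled instance is constructed without a Label; the aggregated part only comes into existence when set_label runs, which matches the description of aggregation above.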

  6. Instance management: Managing instances consists of:
    • Binding
      Binding associates a name with a newly created typed memory location.
        let dst:D = D::new();  // Rust
        D dst;                 // C++
        D dst = new D();       // C#
      C# and C++ allow only one binding of a specific name within the same scope. Rust allows rebinding the same name multiple times.
    • Assignment
      The destination of an assignment acquires the same state as the source.
        dst = src
      The semantics of assignment are quite different for each of the three languages (see below).
    • Access
      The ability of one instance to read contents of another instance. Rules for access are determined by ownership policies, class declarations, and location. The rules differ significantly for the three languages (see below).
    • Mutation
      The ability to modify contents of an accessible instance. Mutation rules are determined by ownership policy, binding syntax, and class declaration.
    • Destruction
      Return resources attached to an instance to the program's process. Rules for destruction are determined by the execution environment. C# uses garbage collection; C++ and Rust use scope-based deallocation.
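The five activities can be walked through in one short Rust function. The type D mirrors the placeholder name used in the binding example; the function manage and the strings it records are assumptions made for this sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
struct D {
    name: String,
}

impl D {
    fn new(name: &str) -> Self {
        D { name: name.to_string() }
    }
}

impl Drop for D {
    // Destruction: runs implicitly when the owner leaves its scope.
    fn drop(&mut self) {
        println!("dropping {}", self.name);
    }
}

fn manage() -> Vec<String> {
    let mut seen = Vec::new();

    let dst = D::new("first"); // binding
    seen.push(dst.name.clone());
    let dst = D::new("second"); // rebinding the same name to a new instance
    seen.push(dst.name.clone());

    let n = 1;
    let k = n; // assignment of a primitive copies; `n` stays valid
    seen.push((n + k).to_string());

    let src = D::new("third");
    let replica = src.clone(); // explicit, independent replica
    let mut moved = src; // assignment moves; `src` can no longer be used
    moved.name.push_str("-edited"); // mutation requires a `mut` binding
    let view = &moved; // access through a shared reference
    seen.push(view.name.clone());
    seen.push(replica.name.clone());

    seen
} // every D created above is dropped here

fn main() {
    println!("{:?}", manage());
}
```

Rebinding dst does not destroy the first instance early; it simply becomes unreachable by name and is dropped with the others when manage returns.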

  7. Operations: Operations are operators, functions, methods, and lambdas. The biggest difference among the three languages is which operations can be overloaded.
    • operators
      Operators are functions that have an alternate symbolic representation, e.g., +, -, =, ...
    • functions
      Functions are named code blocks that accept zero or more parameters and may return a value of some type.
    • methods
      Methods are functions that are bound to a class and have access to the class's member data.
    • associated methods
      Associated methods are bound to a class, but do not have access to its member data.
    • lambdas
      Lambdas are anonymous function objects that can be defined locally within functions and methods. They compile to instances of a compiler-generated class with a function that contains the lambda code block and member data that is captured from the scope in which the lambda was defined.
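A small Rust sketch of each kind of operation follows. The Meters type and the functions zero, doubled, and total are hypothetical names for illustration. In Rust the + operator is supplied by implementing the std::ops::Add trait, and what this page calls an associated method is usually called an associated function.

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

// Operator: `+` is an alternate spelling of the function Add::add.
impl Add for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

impl Meters {
    // Associated function: bound to the type, no access to instance data.
    fn zero() -> Meters {
        Meters(0.0)
    }

    // Method: receives `self`, so it can read member data.
    fn doubled(&self) -> Meters {
        Meters(self.0 * 2.0)
    }
}

// Free function: a named code block with parameters and a return value.
fn total(parts: &[Meters]) -> Meters {
    parts.iter().fold(Meters::zero(), |acc, m| acc + *m)
}

fn main() {
    let offset = Meters(1.0);
    // Lambda (closure): captures `offset` from the enclosing scope.
    let shift = |m: Meters| m + offset;

    let parts = [Meters(1.0), Meters(2.5)];
    println!("{:?}", total(&parts));
    println!("{:?}", shift(Meters::zero()).doubled());
}
```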

3. Object Model Details:

Topic: Types
  C#:
    Value types: bool, int, char, double, enum, struct, ..
    Reference types: string, class, interface, array, delegate, ..
    Collections are reference types: ArrayList, Hashtable, Stack, Queue, Dictionary, SortedList, ..
  C++:
    Primitive types: bool, int, unsigned int, char, wchar_t, float, double, ..
    Aggregate types: array, struct, class, enum
    Collections: vector, deque, set, map, unordered_set, unordered_map, ..
  Rust:
    Primitive types: bool, u8, i8, .. u64, i64, f32, f64
    Aggregate types: array, struct, enum
    Collections: Vec, VecDeque, HashSet, HashMap, ..
Topic: Execution environment
  C#:
    Common Language Runtime (CLR), a stack-based virtual machine. Runs byte-code (CIL), which is just-in-time compiled to native code. Manages all reference types on the heap, including every library and user-defined class. Type resources are eventually garbage collected.
  C++:
    Compiles to native code. Provides initialization and termination code that handles its process. All instances, of any type, may reside in stack, heap, or static memory. Type resources are returned implicitly by the type's destructor (primitive values are discarded with the stack frame).
  Rust:
    Compiles to native code. Provides initialization and termination code that handles its process. All instances, of any type, may reside in stack or heap memory (static items are also allowed when initialized by a constant expression). Type resources are returned implicitly by the drop function (primitive values are discarded with the stack frame).
Topic: Object Relationships
  C#:
    inheritance: Reference types only. Base class embedded within memory footprint of derived.
    composition: Only value types, embedded in composer.
    aggregation: Reference type created at run-time and stored separately.
    using: Non-owning, given access to used object as reference argument of user's member function.
  C++:
    inheritance: Structs and classes only. Base class embedded within memory footprint of derived.
    composition: All types, embedded in composer.
    aggregation: All types, created at run-time and stored separately.
    using: Non-owning, given access to used object as reference argument of user's member function.
  Rust:
    inheritance: Structs may inherit only traits (similar to interfaces). Traits are usually function declarations without implementation.
    composition: All types, embedded in composer.
    aggregation: All types, created at run-time and stored separately.
    using: Non-owning, given access to used object as reference argument of user's member function.
[Figure: Object Layout for C#, C++, and Rust]
Topic: Instance management
  C#:
    The CLR creates reference type instances on its managed heap. Instances are eventually removed by garbage collection. The CLR creates exceptions and supports event notification. It provides detailed type information through reflection.
    Reference types do not provide value type behavior. Copy and assignment do not result in independent instances. These operations result in multiple references to the same heap-based object.
    C# reference types cannot be value types, but value behavior can be approximated, as the string class shows. The C# string class has no interface for mutation. To change the character sequence associated with some named string reference, a new string instance is created with capacity to hold the desired change, and the original string's character sequence is used to build the new one. Note that this is a new instance, and the old one becomes eligible for garbage collection.
  C++:
    C++ supports creation of instances in stack, heap, or static memory. Instance resources are returned when an instance goes out of scope, by calling its destructor. That always happens when a thread of execution leaves the scope where an instance was defined.
    C++ supports function and operator overloading for instance management, e.g., copy and move constructors, copy and move assignment operators, and destructors. Appropriate definitions of these operations enable value behavior, e.g., creation and assignment of instances that are independent replicas of the source instances.
    If the bases and members of a class have correct copy, assignment, and destruction semantics, the class does not need to provide these operations; compiler-generated versions work correctly. These conditions are satisfied if class members are limited to primitive types, strings, and collection types provided by the standard template library (STL).
  Rust:
    Rust supports creation of instances in stack or heap memory. Instances are removed by a drop operation that is implicitly called when a thread of execution leaves the scope in which the instance was defined.
    Rust does not allow the assignment operator to be overloaded, so implicit value (copy) behavior is available only for types that implement the Copy trait, such as the primitive types and structs of such types that derive Copy. For other types, initialization from another instance and assignment result in a move operation, which transfers ownership of the source instance's resources to the destination instance and invalidates the source. This is usually a very efficient operation. Rust does support clone operations, which a program calls explicitly, resulting in an independent replica of the source instance.
Topic: Operations
  C#:
    All C# operations are processed by the CLR stack-based virtual machine. For example, an add operation starts by loading the first addend from memory and pushing it onto the CLR stack. It then loads the second addend and pushes it onto the stack. Then an add operation runs that pops the two arguments, adds them, and pushes the result onto the stack. The operation completes by popping the result and storing it in memory.
    At a more abstract level, in C#, all operations are methods of a class. A method may be resolved at compile time and invoked directly, or, if virtual, may be dispatched dynamically via a virtual function pointer table. If the called method is known at compile-time, C#'s compiler will optimize away dynamic dispatching, binding directly to the known method.
    The value of every variable of reference type is a reference to an instance on the CLR heap. For example, every declaration of class-typed member data is a declaration of a reference to an instance of that data, stored on the managed heap, not the instance itself.
    The assignment operator cannot be overloaded in C#. Assignments are applied to references to class instances, with the result that the source and destination references refer to the same instance.
    C# supports definition of lambdas within a member function's code block. Lambdas are compiled into a class with a method that contains the lambda code block, and, if the code block uses data defined prior to the lambda definition in the same parent scope, then that data becomes member(s) of the compiler-generated class.
  C++:
    All C++ operations are processed directly in native code. For example, an add operation loads both arguments into registers, adds them, and stores the result into the specified location.
    C++ operations may be free functions, static functions bound to a class, or member functions with access to the class's member data. Non-static member functions may be declared virtual, resulting in the generation of a virtual function pointer table. That enables, for derived classes, dynamic dispatch where the function is called via a pointer typed as a pointer to a base. If the base pointer is bound to a derived class instance, the call resolves to the function defined by the class of that instance.
    The name of every variable that is not a pointer or reference is the name of an instance of some type.
    C++ supports overloading the assignment operator to assign the state of a source instance to the state of a destination instance. These instances are independent. Immediately following assignment they have the same state, but may then evolve independently.
    C++ also supports the definition of lambdas. A lambda is compiled into a class which overloads operator(), a method that contains the lambda code block, and, if the code block uses data defined prior to the lambda definition in the same parent scope, then that data becomes member(s) of the compiler-generated class.
  Rust:
    All Rust operations are processed in native code.
    Rust assignment operations and pass-by-value function calls result, for primitive types and other types that implement the Copy trait, in copy operations. For all other types these operations result in moves, i.e., transfers of ownership from the source to the destination. That is efficient, but invalidates the source. To avoid invalidation one may assign or pass a clone of the source instance.
    Rust does not support overloading functions or methods by signature, and the assignment operator cannot be overloaded. Other operators, e.g., + and ==, can be defined for user types by implementing the corresponding traits from std::ops and std::cmp.
    Rust qualifies its operations with Ownership Rules. Every object instance has a unique owner. The owner creates its associated instance with a binding statement. When the owner goes out of scope the instance's resources are returned to the process with an implicit drop operation, much like a C++ destructor.
    A binding has to be qualified with the mut keyword for its instance to be mutable. Reference views of an instance can be created and, if declared mutable (&mut), a reference is allowed to mutate the instance it views. While references are active, the owner of the instance is not allowed to mutate its data; it may do so again once all references become inactive, which happens after their last use and no later than the end of their scope. That's what happens when you pass an argument to a function by reference. When the call returns, the reference is no longer active.
    Rust also supports the definition of lambdas, in a manner very similar to that used by C++.
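The Rust ownership rules described above can be exercised in a short sketch. The function names add_suffix, char_count, consume, and ownership_demo are assumptions introduced for this example, not names from the original page.

```rust
// Takes an exclusive (mutable) reference for the duration of the call.
fn add_suffix(text: &mut String, suffix: &str) {
    text.push_str(suffix);
}

// Takes a shared reference: read-only access.
fn char_count(text: &str) -> usize {
    text.chars().count()
}

// Takes its argument by value: the caller's instance is moved in.
fn consume(text: String) -> usize {
    text.len()
}

fn ownership_demo() -> (String, usize, usize) {
    let mut owner = String::from("hi"); // `owner` owns the heap-allocated text
    add_suffix(&mut owner, "!"); // the borrow ends when the call returns
    owner.push('?'); // so the owner may mutate again

    let view = &owner; // a shared reference becomes active
    let n = char_count(view); // last use of `view`
    // Mutating `owner` between the two lines above would be rejected by the compiler.
    owner.push_str(" ok"); // allowed: `view` is no longer used

    let kept = owner.clone(); // clone to keep a usable instance
    let len = consume(owner); // pass by value moves `owner`
    // Using `owner` here would not compile; it was moved into `consume`.
    (kept, n, len)
}

fn main() {
    println!("{:?}", ownership_demo());
}
```

The commented-out cases are left as comments on purpose: each one is a compile-time error, which is how Rust enforces the rules rather than detecting violations at run time.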

4. References

Reference link: Description
  • Why Rust? - Jim Blandy: Monograph on why Rust is important.
  • Rust users forum: Answers to a broad range of questions from beginner to advanced.
  • A half-hour to learn Rust: Code fragments with commentary that cover most of the Rust ideas.
  • Rust by Example: Rust docs - walkthrough of syntax.
  • Rust Cookbook: Rust docs - a collection of example projects using the Rust libraries and external crates.
  • A Gentle Introduction To Rust: Read early in your Rust travels.
  • The Rust Book: Rust docs - walkthrough of syntax.
  • Rust cheat sheet: Quite extensive list of cheats and helpers.
  • Rust Containers: Container diagrams.
  • The Rust Reference Book: Rust's approximation of a language standard. Clear and well written, with a glossary at the end. Its github site shows that changes are still being actively incorporated.
  • RIP Tutorial on Rust: Comprehensive coverage from RIP, the stackoverflow archive.
  • Learning Rust ebook: Comprehensive coverage from RIP, the stackoverflow archive.
  • Rust - Awesome Book: Lots of interesting discussions from RIP, the stackoverflow archive.
  • Shared reference &T and exclusive reference &mut T: More accurate description than immutable reference and mutable reference.
  • Getting Started with Rust on Windows and Visual Studio Code: Install Rust, Verify, Configure Visual Studio Code, Create Hello World, Create Build Task, Configuring Unit Tests, Configure Debugging.
  • rust-lang.org home page: Links to download and documentation.
  • (video) Rust - Crash Course | Rustlang: Code demo using basic types.
  • Tutorial - tutorialspoint.com: Tutorials for most of the Rust parts with code examples.
  • Blog - Pascal's Scribbles: Pascal Hertleif - Rust contributor.
  • Blog - Barely Functional: Michael Gattozzi - Rust contributor.
  • Blog - S.Noyberg: Examples of async/await, tokio, lifetime, ...
  • Compilation details: kmcallister.github.io

5. Epilogue

The next page discusses the tools you need to build and run projects written in C#, C++, and Rust. The pages after that provide sequences of code examples for idioms and principles in each of the three languages. Object model differences will often be pointed out in comments within the code blocks.