10/18/2022

Chapter #3 - C++ Data Types and Data

sizes, initialization, standard types

3.0 Data Prologue

This chapter focuses on data, e.g., types, type qualifiers, type conversions, data values, initialization, value conversions, and memory layout.

Most of the code examples used in this Chapter have complete working demonstrations in CppStoryRepo. You may wish to download and run the code while you are studying this Chapter.

Quick Starter Example - Easy Data In & Out with std::tuple

This example uses std::tuple, a simple data aggregator, introduced with C++11. It holds an arbitrary number of items, each with arbitrary types. We could have declared the tuple as shown in the output pane, below.

In this example we simply initialize it with literal data items, as shown in the code pane, and allow template type deduction¹ to determine types of our tuple's items.

Code: Gathering Data with std::tuple

  std::tuple tp{ 1, 1.5, "two" };   /* easy in */
  auto [ta, tb, tc] = tp;           /* easy out */

  displayType(
    tp, " <= std::tuple tp{ 1, 1.5, \"two\" }", false
  );
  displayValues(ta, tb, tc);

  /* code only compiles with C++17 option */

Output

  class std::tuple<int,double,char const *> <= std::tuple tp { 1, 1.5, "two" }
  1 1.5 two

std::tuple holds a linear sequence of instances. In the first line of code std::tuple tp has been initialized with a brace initializer. In the second line each item is extracted with a structured binding.

That declared one variable for each item in the tuple, with type deduced with template-style type deduction induced by the auto declaration. So, our code has an int ta, a double tb, and a const char * tc.

Note: structured binding only works for containers holding public data, as is the case for std::tuple, std::pair, and structs with public data members.

Conclusion:

With additions to C++ since C++11 it is easy to gather data into collections and easy to extract that back into its individual items.
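The gather-and-extract pattern can also be sketched as a small self-contained unit. The names makeRecord and describe are hypothetical, chosen only for this illustration, and the code assumes a C++17 compiler.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical helper: gathers three values into one tuple (easy in).
std::tuple<int, double, std::string> makeRecord(int id, double weight,
                                                const std::string& label) {
  return std::make_tuple(id, weight, label);
}

// Hypothetical helper: unpacks the tuple with a structured binding (easy out).
std::string describe(const std::tuple<int, double, std::string>& rec) {
  const auto& [id, weight, label] = rec;  // names bound to the tuple's elements
  (void)weight;                           // not used in this sketch
  return label + ":" + std::to_string(id);
}
```

Binding with const auto& avoids copying the tuple; plain auto, as in the example above, makes a copy first and binds to that.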

¹ Template type deduction will be covered in Chapter 6 - templates.

This chapter presents the first part of a two-part story about the C language-like facilities provided by C++, excluding classes and templates. The second part is about operations provided by the language, with the same exclusions.

We will, of course, cover classes and templates in a "modern C++" part, starting with Chapter 4.

3.1 Types

C++ is a statically typed language. All data is defined with a type and name, using a declaration statement, and the language enforces a definition first rule: variable use must occur after its definition, in compiler scan order.

Type

A set of allowed values
Operations that may be applied to these values

A collection of fundamental types is built into the C++ language.

Fundamental Data Types

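integral: bool, char, char16_t, char32_t, wchar_t, short, int, long, long long (std::byte, from <cstddef>, is a library-defined enumeration rather than a built-in type)
floating point: float, double, long double
derived types: array, pointer, reference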

These are augmented with an open-ended collection of library and user defined types:

User defined types

struct S { ... }
class C { ... }
enum class E { ... }
alias: using [NewName] = [TypeName]

3.1.1 Type Qualifiers

The C++ language defines qualifiers that modify how an instance of a type may be used: const, constexpr, and volatile. It also defines qualifiers that affect storage of data: short, long, signed, unsigned, and extern.

const qualifier - const int i = 0;

const

The const qualifier affects what operations apply and what methods may be applied to a data item.

const prepended on a type declaration: const int ci = 1; makes ci immutable, so it must be initialized where it is defined.
const on function and method parameters: void f(const std::string& s) prevents the caller's variable s from being changed in f.
const postpended to a method declaration and definition: void X::f(int i) const { ... } prevents changes to an instance, x ε X, when invoked as x.f(i);
const applied to pointers: const X* pX = &x; prevents x from being changed via pX, e.g., *pX = x2; will fail to compile. X* const pX = &x; prevents pX from being assigned a new address.

Code example:

  void f(const std::string& s) {
    if(s.size() > 0) {
      char first = s[0];  // ok
      // s[0] = 'z'; -- fails to compile since s is const
    }
  }

Code example:

  X x{3};
  const X* pX = &x;
  X x1 = *pX;  // ok
  *pX = 2;     // fails to compile
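The three pointer-related uses of const can be put side by side in a short sketch. The class X below is a hypothetical stand-in for the X used above, and readThrough and bump are names invented for the illustration.

```cpp
#include <cassert>

class X {                             // hypothetical stand-in for X
public:
  explicit X(int v) : v_(v) {}
  int value() const { return v_; }    // const method: callable through const X*
  void set(int v) { v_ = v; }         // non-const method: needs a mutable X
private:
  int v_;
};

// Pointer to const: reading is allowed, p->set(...) would not compile.
int readThrough(const X* p) { return p->value(); }

// Const pointer to mutable X: the pointee may change,
// but p itself cannot be assigned a new address inside bump.
void bump(X* const p) { p->set(p->value() + 1); }
```

Reading the declaration from right to left helps: const X* is a pointer to a const X, while X* const is a const pointer to an X.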

short and long Qualifiers - short int;

short and long

Qualifiers short and long affect amount of storage allocated and range of values a qualified data item may hold.

short prepended on a type definition for int short int i; guarantees that sizeof(short int) <= sizeof(int);
long prepended on a type definition for int or double long int l; long double d; guarantees that sizeof(int) <= sizeof(long int) <= sizeof(long long int) and that long double has at least the precision of double

Code example:

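  short int iArrs[5]{ 1, 2, 3, 4, 5 };       // size of iArrs is 10 bytes
  int iArr[5]{ 1, 2, 3, 4, 5 };              // size of iArr is 20 bytes
  long int iArrl[5]{ 1, 2, 3, 4, 5 };        // size of iArrl is 20 bytes (40 where long is 8 bytes)
  long long int iArrll[5]{ 1, 2, 3, 4, 5 };  // size of iArrll is 40 bytes
  // sizes assume 2-byte short, 4-byte int and long, 8-byte long long;
  // the language guarantees only the ordering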
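Because the exact byte counts vary between platforms, it can be safer to let the compiler check the ordering and compute sizes for us. The following is a minimal sketch; arrayBytes is a hypothetical helper, not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>

// The language fixes the ordering of the integer sizes, not the byte counts.
static_assert(sizeof(short) <= sizeof(int), "short is no larger than int");
static_assert(sizeof(int) <= sizeof(long), "int is no larger than long");
static_assert(sizeof(long) <= sizeof(long long), "long is no larger than long long");

// Hypothetical helper: bytes occupied by a native array of N elements of type T.
template<typename T, std::size_t N>
constexpr std::size_t arrayBytes(const T (&)[N]) {
  return N * sizeof(T);
}
```

For a native array arr, arrayBytes(arr) returns the same number as sizeof(arr), whatever the platform's integer sizes happen to be.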

signed and unsigned Qualifiers - unsigned int;

signed and unsigned

Qualifiers signed and unsigned affect placement of ranges of values an int may hold.

signed (the default for int; whether plain char is signed is implementation defined) may be prepended on a type definition for int or char signed int i; acceptable values of i occupy a range nearly symmetric around the value 0
unsigned may be prepended on a type definition for int or char unsigned int j; acceptable values of j lie in a range bounded at the bottom with 0

Code example:

  signed int i = -25;         // range is -2,147,483,648 to 2,147,483,647 for a 32-bit int
  unsigned int j = 15;        // range is 0 to 4,294,967,295 for a 32-bit int
  unsigned int k = UINT_MAX;  // UINT_MAX is defined in <climits>
  k + 1 == 0;                 // value roll-over, statement is true, no compiler warning
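Roll-over of unsigned values is well defined, so it can be demonstrated and guarded against in code. This sketch uses std::numeric_limits from <limits>; the function names wrapIncrement and addWouldWrap are hypothetical.

```cpp
#include <cassert>
#include <limits>

// Unsigned arithmetic wraps modulo 2^N, so the maximum value plus one is zero.
unsigned int wrapIncrement(unsigned int v) { return v + 1u; }

// Hypothetical guard: true when a + b would wrap around instead of
// producing the mathematical sum.
bool addWouldWrap(unsigned int a, unsigned int b) {
  return a > std::numeric_limits<unsigned int>::max() - b;
}
```

Checking before adding, as addWouldWrap does, avoids relying on the wrapped result after the fact.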

static Qualifier - static int;

static

The static qualifier allocates storage in static memory.

static prepended on a local type definition (in a function) static int i = 3; instructs the compiler to place storage for this named entity in static memory initialized to the value 3. That means that it will be initialized exactly once, no later than the first time the function is entered. It will not be re-initialized on subsequent entries, so its value persists across function calls, only changing when the function's code assigns to it.
static prepended on a member variable of a class or struct places that variable in static memory, so every instance of the class or struct shares the same variable.
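Both uses of static can be seen in a few lines. In this sketch nextId and Tracker are hypothetical names invented for the illustration.

```cpp
#include <cassert>

// The static local is initialized once and keeps its value between calls.
int nextId() {
  static int id = 3;   // initialized the first time control passes through
  return id++;         // later calls see the updated value
}

// One counter shared by every Tracker instance.
struct Tracker {
  static int live;           // declaration of the shared member
  Tracker() { ++live; }
  ~Tracker() { --live; }
};
int Tracker::live = 0;       // definition: storage in static memory
```

Successive calls to nextId return 3, 4, 5, and so on, and Tracker::live counts the instances currently alive.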

extern Qualifier - extern int;

extern

The extern qualifier disables allocation of storage.

extern prepended on a type definition at namespace or global scope extern int i; provides a declaration, but instructs the compiler that another compilation unit (.cpp file) provides storage so it should not be provided in this compilation unit.

inline Qualifier - inline const int N = 3;

inline

The inline qualifier for global variables was introduced with C++17.

inline prepended on a type definition in a header file inline const double pi = 3.14159; instructs the linker to provide storage only once, even if the header file is included in multiple cpp files used to build one image. This eliminates one source of multiple definition linker errors.

volatile qualifier - volatile int i;

volatile

The qualifier volatile affects what compiler optimizers may do with accesses to a data item's storage. Note that, unlike Java and C#, volatile qualification in C++ does not make reads and writes atomic or order them between threads; std::atomic serves that purpose.

volatile prepended on a type definition volatile int i; instructs the compiler that variable i may be changed by an event external to the program, so every read and write in the source must actually be performed rather than cached in a register or optimized away.

constexpr qualifier - constexpr int N = 10;

constexpr

Qualifier constexpr makes an initialized value known at compile-time.

constexpr prepended on a definition initialized with a constant expression constexpr int N = 10; implies const and allows N to be used for compile-time constructs like native array sizes and template arguments. We will discuss use of constexpr with an if selection when we discuss template metaprogramming.

Code example:

  constexpr unsigned int N = 10;
  double dArr[N] { 0.5, 1.0, 1.5 };
  // compiles because N and initializers are known at compile-time
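A constexpr value can feed both native array bounds and template arguments, and constexpr functions extend the same idea to computed values. The following sketch uses a smaller N than the example above; squared and table are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 4;   // known at compile time

// A constexpr function can be evaluated by the compiler
// when its arguments are constant expressions.
constexpr std::size_t squared(std::size_t n) { return n * n; }

static_assert(squared(N) == 16, "checked during compilation");

// Array bounds and template arguments both require compile-time constants.
double dArr[N]{ 0.5, 1.0, 1.5 };       // the fourth element is zero-initialized
std::array<int, squared(N)> table{};   // 16 ints, all zero
```

If N were an ordinary non-const variable, neither declaration would compile in standard C++.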

3.1.2 Type Specifiers

Since C++11, the language has provided two type specifiers, auto and decltype, that use type deduction to establish the type of an expression. One important use is to provide a declaration for a lambda. Lambda types are constructed by the compiler and the names of those types are not public. auto is a type placeholder for that construction. The decltype specifier evaluates the type of an expression, which then may be used to declare a variable of that type.

auto and decltype specifiers - auto d{ 1.5 };

auto

auto is not a declaration qualifier; it is a placeholder for the declared type.

auto as a type definition auto r{ 3 }; auto s{ 1.5 }; instructs the compiler to use type deduction to supply the type, so r is type int and s is type double.
auto is frequently used to deduce the type of a function return value, auto r = f();, or as the type of a variable enumerated in a range-based for, for(auto item : container) { ... }. auto type deduction is similar to the type deduction process used for templates.

Code example:

  std::vector<int> vecInt { 1, 2, 3 };
  for(auto& item : vecInt) {
    item *= 2;
  }

decltype

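Like auto, decltype is a type specifier.

decltype as a type definition decltype(make_lambda()) x = make_lambda(); x is a lambda of the same type as created by the function make_lambda. The initializer is needed because lambda types could not be default constructed before C++20, and still cannot be when they capture.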

Code example:

  decltype(vecInt) vecInt2;
  // results in std::vector<int> instance, named vecInt2,
  // with default initialization
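The two specifiers often work together, as in this minimal sketch. The names makeScaler, doubleAll, and Scaler are hypothetical; the deduced function return type assumes C++14 or later.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// The closure type has no name we can write, so the
// return type is deduced with auto.
auto makeScaler(int factor) {
  return [factor](int x) { return factor * x; };
}

// auto& binds to each element in place, so the loop modifies the vector.
void doubleAll(std::vector<int>& v) {
  for (auto& item : v) { item *= 2; }
}

// decltype names a type from an expression without evaluating it.
using Scaler = decltype(makeScaler(0));
static_assert(std::is_same<decltype(1.5), double>::value,
              "the literal 1.5 has type double");
```

A Scaler variable still has to be initialized from makeScaler, because the closure captures factor and so has no default constructor.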

3.2 Exploring Types

Occasionally it is important to know the sizes of variables in our designs. For example, if we need to serialize an object into a byte array to pass over a socket, we need to know the size of the object. Also, knowing the sizes of types may help us understand how they work. That's illustrated in the lambda example, below.

The operator sizeof can be applied to a type or an expression:

size_t sz = sizeof(arg);

It returns the size of the arg parameter in bytes for all types except references. For references sizeof(X&) returns the size of the referenced type, not the size of the reference. The language does not specify the size of a reference, but compilers usually implement one as a pointer, which, on my work machine, is 4 bytes, i.e., 32 bits. You can verify this by constructing a class Xref with a single reference member and evaluating sizeof(Xref).

The operator typeid can be applied to a type or an expression as:

const std::type_info& ti = typeid(CT);

where CT is any complete type. In order to use typeid you need to include <typeinfo>. There is much less information in type_info instances like ti than you would get from reflection results with languages like C# and Java.

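However, type_info has useful methods: operator==(const type_info&) and operator!=(const type_info&) to query whether two instances belong to the same type, and name(), so we can discover the type name of some instance. The format of that name is implementation defined: Visual C++ returns a readable name, while GCC returns a mangled one.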
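Since type_info instances cannot be copied, code that uses them holds them by const reference. A minimal sketch, with the hypothetical helper names sameType and typeNameOf:

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Compares the types of two instances using type_info::operator==.
template<typename A, typename B>
bool sameType(const A& a, const B& b) {
  const std::type_info& ta = typeid(a);   // held by reference, never copied
  const std::type_info& tb = typeid(b);
  return ta == tb;
}

// Returns the implementation-defined name of t's type.
template<typename T>
std::string typeNameOf(const T& t) {
  return typeid(t).name();
}
```

The string returned by typeNameOf differs between compilers, so it is useful for diagnostics but should not be parsed or compared across platforms.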

Example: Using sizeof and typeid to explain how lambdas work

Here is a template function, useful for exploring types, that uses both sizeof and typeid:

 template<typename T>
  void displayType(const T& t, const std::string& msg = "", bool showSize = true)
  {
    std::cout << "\n  ";
    if (showSize)
      std::cout << sizeof(t) << " = size of ";
    std::string typeName = typeid(t).name();
    if (typeName.size() > 75)
      typeName = typeName.substr(0, 75) + "...";
    std::cout << typeName;
    if (msg.size() > 0)
      std::cout << msg;
  }

It provides one way to begin understanding how lambdas work:

  int i{ 42 };
  auto f = [i](const std::string& s) {  
    std::cout << s << i;
  };
  f("\n  The meaning of life is ");
  displayType(f);

Here, f is a lambda, declared with auto because we don't know the lambda type. The [i] part of the expression says that this lambda captures the value of i, i.e., 42. The (const std::string& s) defines a string parameter, passed to the lambda code by const reference. The body { ... } simply displays the parameter and the captured integer value. The output of this code is:

  The meaning of life is 42
  4 = size of class <lambda_1bed100c18f4584a5d93f1a5d7e27280>

So, the displayType function has informed us that the lambda is a class, it disclosed the lambda's type name, obviously not intended for human consumption, and showed us that its size is just the size of the captured int value. It's highly likely that the compiler created a functor, something like this:

  class lambda_1bed1... {
  public:
    void operator()(const std::string& s) const {  // const by default for lambdas
      std::cout << s << i;
    }
  private:
    int i = 42;
  };

So lambdas can capture data, be passed into or returned from functions, taking their data with them (as long as they hold a value, not a reference). We say that the lambda and its local scope are a closure. A lambda can use any data in its closure as part of its operation. Essentially, a lambda has two sets of inputs: from its capture list [i, j, x] which come from the lambda's closure, and (T1 arg1, T2 arg2, ... ) which are passed in from the scope where the lambda is invoked. The lambda's designer provides the capture inputs and the lambda user provides the arguments.
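The difference between capturing a value and capturing a reference can be sketched as follows. The names makeGreeter and captureDemo are hypothetical, and std::function is used only to give the returned lambda a nameable type.

```cpp
#include <cassert>
#include <functional>
#include <string>

// The closure copies prefix, so it stays valid after makeGreeter returns.
std::function<std::string(const std::string&)> makeGreeter(std::string prefix) {
  return [prefix](const std::string& name) { return prefix + name; };
}

// Shows what each kind of capture sees after the captured variable changes.
int captureDemo() {
  int i = 1;
  auto byValue = [i]() { return i; };    // copies i at the point of creation
  auto byRef   = [&i]() { return i; };   // refers to the live variable i
  i = 10;
  return byValue() + byRef();            // 1 from the copy, 10 from the reference
}
```

Returning byRef from captureDemo would leave it holding a dangling reference, which is why only value-capturing lambdas can safely take their data with them.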

In the next section, we continue exploring types by looking at their layout in memory.

3.3 Type Structure

Fig 1. String Layout

There are also issues of type structure we need to think about. The fundamental types place all their value information in one place, in the stack frame where they are declared. However, not all types are structured that way. A std::string, for example, stores its control state, e.g., iterator and character count, in the stack frame where it is declared, but its user state, i.e., its character sequence, is normally stored in the heap (many implementations keep very short strings inside the string object itself).

The sizeof operator measures only the stack allocation. So, for example, it will return the same size for an empty string and a string with 100 characters. For the STL sequential containers with contiguous heap memory, i.e., string and vector, you can call their capacity() method to measure that allocation; deque is not contiguous and has no capacity() method. Here's an example:

  std::string str("content of a std::string");
  std::cout << "\n " << str;
  std::cout << "\n string size = " << (str.size() + 1) * sizeof(char);
  std::cout << "\n allocation size = " << str.capacity() * sizeof(char);

When this code executes you get:

  content of a std::string
  string size = 25
  allocation size = 31

For all the other containers measuring heap allocation is much more effort, but the ideas will be about the same. You would have to iterate through the container, measuring each of the elements, accessed by the container's iterator. Even the fundamental types differ in structure:

All of the fundamental types except floats have a simple structure consisting of a sequence of bytes in one location.
The float type partitions its storage into:
- 1 bit sign
- 8 bit exponent
- 23 bit fraction
The double type has the partitions:
- 1 bit sign
- 11 bit exponent
- 52 bit fraction

This float/double structure has consequences:

The majority of values don't have an exact representation. Integers have exact floating representations only while they fit in the fraction, up to 2^24 for float and 2^53 for double, and numbers like 1.0/3.0 have an infinite number of trailing digits, e.g., 0.33333333333333...., and will be stored imprecisely.
So these values have granularity with differences within granules that can't be represented.
The size of the granularity depends on the magnitude of the float. The fractional value of one float may be the same as the fractional value of another, but if the exponent of one is larger than the other, the absolute values of granularity will differ.
It is entirely possible for two floats that are conceptually distinct to have the same floating representation, so you need to be careful with equality comparisons.
Repeating operations on a float in a loop may exhibit increasing error in computed results due to the accumulation of granularity errors.

Here's data from an experiment with floats:

  --- demo float granularity ---

  float f = 1.00000
  f = (f + 1.0)/3.0 ==> 0.666667
  f = (f + 1.0)/3.0 ==> 0.555556
  f = (f + 1.0)/3.0 ==> 0.518519
  f = (f + 1.0)/3.0 ==> 0.506173
  f = (f + 1.0)/3.0 ==> 0.502058
  f = (f + 1.0)/3.0 ==> 0.500686
  f = (f + 1.0)/3.0 ==> 0.500229
  f = (f + 1.0)/3.0 ==> 0.500076
  f = (f + 1.0)/3.0 ==> 0.500025
  f = (f + 1.0)/3.0 ==> 0.500008

  Reversing process:
  f = 3.0 * f - 1.0 ==> 0.500025
  f = 3.0 * f - 1.0 ==> 0.500076
  f = 3.0 * f - 1.0 ==> 0.500229
  f = 3.0 * f - 1.0 ==> 0.500686
  f = 3.0 * f - 1.0 ==> 0.502057
  f = 3.0 * f - 1.0 ==> 0.506170
  f = 3.0 * f - 1.0 ==> 0.518510
  f = 3.0 * f - 1.0 ==> 0.555531
  f = 3.0 * f - 1.0 ==> 0.666594
  f = 3.0 * f - 1.0 ==> 0.999782
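An experiment of this kind can be sketched in a few lines. This is not the code that produced the data above: it uses float literals throughout, so its digits may differ, and the names roundTrip, absorbsOne, and nearlyEqual are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Applies f = (f + 1)/3 steps times, then reverses with f = 3f - 1.
// In exact arithmetic the result would equal start.
float roundTrip(float start, int steps) {
  float f = start;
  for (int k = 0; k < steps; ++k) { f = (f + 1.0f) / 3.0f; }
  for (int k = 0; k < steps; ++k) { f = 3.0f * f - 1.0f; }
  return f;
}

// True when the granule at big is so coarse that adding 1 changes nothing.
bool absorbsOne(float big) {
  float sum = big + 1.0f;
  return sum == big;
}

// Equality test with a relative tolerance instead of operator==.
bool nearlyEqual(float a, float b, float relTol) {
  return std::fabs(a - b) <= relTol * std::fmax(std::fabs(a), std::fabs(b));
}
```

With IEEE single precision, absorbsOne is true at 16777216.0f, i.e., 2^24, where neighboring floats are 2 apart. Comparing the result of roundTrip with nearlyEqual rather than == is the usual defense against accumulated granularity error.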

There are many domains where float granularity matters:

Scientific computing, e.g., processing models for particles, gravity, super-conductors, gene expression, climate dynamics, ...
Processing financial models for derivative and currency trading
Graphics processing using projections and ray-tracing
Medical imaging
Radar and Sonar signal processing
Finite-element analysis for fluid flow
Navigation computing for aircraft and spacecraft
Autonomous vehicle control.
Neural network modeling
Marketing models
Control of machine tools used to create manufactured products
....

The point is that you may well, during the course of your career, work in domains where floating point granularity matters.

3.4 Managing C++ Type System with Casts

Cast operations are a mechanism for mapping an instance of one type into an instance of another type or modifying the type rules applied to a given instance. Since C++98 we've had four named cast forms:

static_cast Syntax	static_cast Semantics
T₂ t₂ = static_cast<T₂>(t₁);	Convert t₁ ε T₁ to an instance of T₂ usually by constructing a new T₂ instance using data from t₁.

Example: static_cast<T>

 /*--------------------------------------------------------------------------------------
  purpose of static_cast is to create a new instance of a destination type, based on
  data stored in the source type.
 --------------------------------------------------------------------------------------*/  

  int i = 1.75;  // i = 1, with compiler warning of loss of significance
  int j = static_cast<int>(1.75);  // j = 1, no compiler warning

const_cast Syntax	const_cast Semantics
T& tRef = const_cast<T&>(t);	Strip off const qualifier from a const type. Intent is to pass const variable to platform API functions which can't inform compiler that they won't change variable, but designer knows they won't.

Example: const_cast<T>

 /*--------------------------------------------------------------------------------------
  purpose of const_cast is to allow passing const data to functions that won't change
  value even though not declared as const functions, OS API calls for example.
 --------------------------------------------------------------------------------------*/   

  void mockAPIfunction(std::string* pStr) {
    std::cout << "\n  inside mock API function: " << *pStr;
  }

  void demoConstCast(const std::string& str) {

    displaySubtitle("const_cast");

    std::cout << "\n  " << str;
    //mockAPIfunction(&str);  fails to compile since str is const

    /*--- useful operation using sRef ---*/

    std::string& sRef = const_cast<std::string&>(str);
    // created non-const reference to const str

    mockAPIfunction(&sRef);  // succeeds since sRef is not const

    /*--- evil operation on sRef, violates contract of function interface ---*/
    /* don't do this */
    sRef = "changed";
    std::cout << "\n  " << str;  // now has changed value
    std::cout << "\n";
  }

  const_cast
 ------------
  const string
  inside mock API function: const string   
  changed

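This illustrates that the compiler will let you change the source of the const_cast, but you should never do that. If the referenced object was itself defined as const, modifying it through sRef is undefined behavior.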

dynamic_cast Syntax	dynamic_cast Semantics
Derived* pDer = dynamic_cast<Derived*>(pBase); if(pDer != nullptr) { pDer->Derived::memFun(); }	Intent is to call, starting with a Base pointer, a Derived member function that is not in the Base interface. If the Base pointer does point to an instance of the requested type, the cast returns the pBase address, typed as a Derived pointer; otherwise it returns nullptr.

Example: dynamic_cast<T>

 /*--------------------------------------------------------------------------------------
  dynamic_cast grants access to derived class interfaces starting with base pointer
 --------------------------------------------------------------------------------------*/   

  class Base {
  public:
    virtual ~Base() {}
    virtual void say() {
      std::cout << "\n  hello from Base::say() via " << typeid(*this).name();
    }
  };

  class Derived1 : public Base {
  public:
    virtual ~Derived1() {}
    void say1() {
      std::cout << "\n  hello from Derived1::say1() via " << typeid(*this).name();
    }
  };

  class Derived2 : public Base {
  public:
    virtual ~Derived2() {}
    void say2() {
      std::cout << "\n  hello from Derived2::say2() via " << typeid(*this).name();
    }
  };

  auto putline = [](int n=1) { 
    for(int i=0; i<n; ++i)
      std::cout << std::endl; 
  };

  void demoDynamicCast() {

    displaySubtitle("dynamic_cast");

    std::cout << "\n --- calls from objects ---\n";
    Base b; b.say();
    putline();
    Derived1 d1; d1.say(); d1.say1();
    putline();
    Derived2 d2; d2.say(); d2.say2();
    putline();

    std::cout << "\n --- call via base pointer ---\n";
    Base* pBase = &d1; pBase->say(); // pBase->say1(); not accessible from Base*
    putline();

    std::cout << "\n --- call via dynamic_cast derived pointer ---\n";

    Derived1* pDer1 = dynamic_cast<Derived1*>(pBase);
    if (pDer1) {
      pDer1->say1();
    }
    putline();
  }

  dynamic_cast
 --------------

 --- calls from objects ---
  hello from Base::say() via class Base

  hello from Base::say() via class Derived1
  hello from Derived1::say1() via class Derived1

  hello from Base::say() via class Derived2
  hello from Derived2::say2() via class Derived2    

 --- call via base pointer ---
  hello from Base::say() via class Derived1

 --- call via dynamic_cast derived pointer ---
  hello from Derived1::say1() via class Derived1

reinterpret_cast Syntax	reinterpret_cast Semantics
T_dst* pT_dst = reinterpret_cast<T_dst*>(pT_src);	Apply type rules of T_dst to t_src ε T_src. Usually used for packing data into byte arrays and unpacking back to original type.

Example: reinterpret_cast<T>

 /*--------------------------------------------------------------------------------------
  purpose of reinterpret_cast is to apply new type rules to an existing instance.
  - packing double's bytes into byte array
  - unpacking byte array into another double
  - illustrates how data might be marshalled over a socket channel, where the
    byte array pretends to be the socket channel
  --------------------------------------------------------------------------------------*/    

  void demoReinterpretCast() {

    displaySubtitle("reinterpret_cast");

    double d1{ 3.5 };
    double d2;
    size_t Max = sizeof(d1);

    /* create byte array on heap referenced by std::unique_ptr<std::byte[]> */

    std::unique_ptr<std::byte[]> pBuffer(new std::byte[Max]);  // owning pointer, array form releases with delete[]
    std::byte* pBuffIndex = pBuffer.get();                     // non-owning pointer

    /* pack double d1 into byte array */

    std::byte* pByteSrc = reinterpret_cast<std::byte*>(&d1);
    std::byte* pSrcIndex = pByteSrc;  			   // non-owning pointers

    for (size_t i = 0; i < Max; ++i) {
      *pBuffIndex++ = *pSrcIndex++;
    }

    /* unpack byte array into double d2 */

    if (sizeof(d2) == sizeof(d1)) {
      std::byte* pByteDst = reinterpret_cast<std::byte*>(&d2);
      std::byte* pDstIndex = pByteDst;  			   // non-owning pointers
      pBuffIndex = pBuffer.get();
      for (size_t i = 0; i < Max; ++i) {
        *pDstIndex++ = *pBuffIndex++;
      }
    }

    /* show that src and dst have the same values */

    std::cout << "\n  src double = " << d1;
    std::cout << "\n  dst double = " << d2;

    // byte array on heap will be deallocated here
    // as std::unique_ptr goes out of scope
  }

    reinterpret_cast
   ------------------     
    src double = 3.5
    dst double = 3.5
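The same packing and unpacking can also be written without any casts by copying bytes with std::memcpy. This is an alternative sketch, not the demo's method; pack, unpack, and the alias DoubleBytes are hypothetical names, and std::byte assumes C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical alias: a buffer exactly large enough for one double.
using DoubleBytes = std::array<std::byte, sizeof(double)>;

// Copies the object representation of d into a byte buffer.
DoubleBytes pack(double d) {
  DoubleBytes buf{};
  std::memcpy(buf.data(), &d, sizeof d);
  return buf;
}

// Rebuilds a double from bytes produced by pack on the same platform.
double unpack(const DoubleBytes& buf) {
  double d;
  std::memcpy(&d, buf.data(), sizeof d);
  return d;
}
```

Like the reinterpret_cast version, this moves raw bytes, so sender and receiver must agree on byte order and floating point format.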

C++ also defines a member function cast operator that supports creating an instance of a specified destination type from an instance of the casting class:

C++ cast operator

class X {
public:
operator Y () { ... }
};

which supports expressions like:

X x;
Y y = x;

If we declared the cast as explicit:

class X {
public:
explicit operator Y () { ... }
};

then we would have to explicitly use static_cast:

X x;
Y y = static_cast<Y>(x);
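A concrete version of both forms may help. In this sketch Meters plays the role of Y, while Feet and Yards play the role of X with an implicit and an explicit cast operator respectively; all of these names are hypothetical.

```cpp
#include <cassert>

struct Meters { double value; };   // destination type, the Y of the text

class Feet {                       // implicit cast operator
public:
  explicit Feet(double ft) : ft_(ft) {}
  operator Meters() const { return Meters{ ft_ * 0.3048 }; }
private:
  double ft_;
};

class Yards {                      // explicit cast operator
public:
  explicit Yards(double yd) : yd_(yd) {}
  explicit operator Meters() const { return Meters{ yd_ * 0.9144 }; }
private:
  double yd_;
};

// Accepts Meters by value; a Feet argument converts implicitly,
// a Yards argument needs static_cast<Meters>.
double length(Meters m) { return m.value; }
```

Marking the operator explicit trades a little convenience for protection against conversions the caller did not intend.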

3.5 New Standard Types

The std::pair container has been part of the standard C++ library since before C++11. The remaining containers were all introduced with or after C++11. You'll see std::pair and std::tuple used in the examples below. All of the rest have been used in sample code that accompanies this chapter.

Special Containers

pair	contains two elements which may have distinct types
tuple	contains finite number of elements with distinct types
initializer_list	contains sequence of elements, all of the same type, normally filled with initialization list, e.g., { 1, 2, 3, ... }
any	holds value of any type, provides std::any_cast for retrieval
optional	used to return values or signal failure to return
variant	similar to any, but only holds values from a specified set of types

These new standard types are not essential, but are very effective "friction" reducers. They make routine code we write easier to produce and to understand. We will see examples of their use throughout the remainder of this story.
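A short sketch shows how optional, variant, and any reduce that friction. The function names parseDigit, kind, and unwrap, and the alias Value, are hypothetical; the code assumes C++17.

```cpp
#include <any>
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// optional: returns a value, or std::nullopt to signal failure.
std::optional<int> parseDigit(char c) {
  if (c >= '0' && c <= '9') { return c - '0'; }
  return std::nullopt;
}

// variant: holds exactly one value from a fixed set of types.
using Value = std::variant<int, std::string>;

std::string kind(const Value& v) {
  return std::holds_alternative<int>(v) ? "int" : "string";
}

// any: holds a value of any type; std::any_cast retrieves it and
// throws std::bad_any_cast if the requested type does not match.
double unwrap(const std::any& a) {
  return std::any_cast<double>(a);
}
```

With optional, the caller tests has_value() before using the result instead of relying on a sentinel such as -1.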

3.6 Initialization

We've already encountered initialization of data. This section takes a closer look to see how it works. For the fundamental types, there are four forms of syntax:

T t = u;
T t(u);
T t = { u };
T t{ u };

where u ε U is an instance of a type that can be coerced to a T type or promoted through a T(U u) constructor. The first two are equivalent, invoking T(U u). Note that there is no assignment in the first; it is an initialization, just like the second (for class types the only difference is that the first cannot use a constructor declared explicit). The third and fourth are also equivalent to each other, with the same caveat, but not to the first two. The difference is that narrowing conversions like:

int i = 1.75; // size of int is 4 bytes, size of double is 8 bytes

will succeed with either of the first two, with a compiler warning. But that is not true for braced initialization:

int j{ 1.75 }; // fails to compile

Braced initialization, like case 4, above, applies for most data definitions, and has been named "uniform initialization".

Examples of "uniform initialization"

std::string s4{ "four" };

double pi{ 3.1415927 };
double* pDouble{ &pi };
double& rDouble{ pi };

std::pair<int, double> pair1{ 4, 4.5 };

std::vector<std::pair<int, double>> pVec{ pair1, pair2, pair3, pair4 };

std::tuple<int, double, std::string> aTuple{ 1, 3.1415927, "some string" };

std::unordered_map<std::string, int> umap{ { "two", 2}, {"three", 3} };

std::unique_ptr<double> uPtr1{ std::make_unique<double>(pi) };
std::shared_ptr<double> sPtr1{ std::make_shared<double>(pi) };

Uniform initialization makes putting data into containers easy. Since C++17 it's been just as easy to extract data using structured bindings.

Examples of "structured binding"

auto [i5, i6] = std::pair{ 5, 6 };
auto [d, i, s] = std::tuple<double, int, std::string>{ 2.5, 3, "four" };

struct S { int i; char c; std::string s; };
S foobar{ 2, 'Z', "a string" };
auto [fee, fie, foe] = foobar;

The identifiers: i5, i6, d, i, s, fee, fie, foe are all instances with types deduced by auto, based on the types in the containers to which they are bound. You can use them, just like any other identifier, after the binding statement.

Note that these are declarations, so the identifier names must be different than any used before the binding statement in the same scope.
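Structured bindings can also bind by reference, which lets us modify the source or avoid copies. A minimal sketch, in which Point, mirror, and total are hypothetical names:

```cpp
#include <cassert>
#include <map>
#include <string>

struct Point { int x; int y; };   // aggregate with public members

// auto& binds the names to the members of the original object,
// so assignments change p itself.
void mirror(Point& p) {
  auto& [px, py] = p;
  px = -px;
  py = -py;
}

// const auto& binds to each map entry without copying the pair.
int total(const std::map<std::string, int>& counts) {
  int sum = 0;
  for (const auto& [key, value] : counts) {
    (void)key;        // the key is not needed for the sum
    sum += value;
  }
  return sum;
}
```

With plain auto the binding is made to a copy, so changes to the bound names would not reach the source object.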

Note: Structured binding only works for public data in the source entity.

Example of where "structured binding" does not work.

class C {
public:
  C(const std::string& str) : str_(str) {} // promotion ctor
  void say() { std::cout << "\n " << str_; }
  operator std::string() { return str_; } // cast operator
private:
  std::string str_;
};

/* uniform initialization works with promotion ctors */
C c{ "hello world" };

/* these fail to compile: can only bind to public data */
//auto [gotit] {c};
//auto [gotit] = { static_cast<std::string>(c) };

std::string gotit = c; // this works via the cast operator, but is not structured binding

3.7 Managing Allocation with Smart Pointers

C++ has a large collection of containers in the Standard Template Library that manage allocations very effectively. They are our first choice for managing data on the native heap¹. When building our own data managers and creational functions, using smart pointers, std::unique_ptr and std::shared_ptr, is preferred over working with raw pointers.

There are two benefits of doing that:

There are no memory leaks because smart pointers delete their allocations when they go out of scope.
Users of creational functions that return smart pointers can ignore allocation management. The smart pointers do that for them.

The demonstration, below, illustrates use of smart pointers and also illustrates the Dependency Inversion Principle. The demo provides factory functions that return std::unique_ptr<IWidget> to provide access to Widget instances on the heap. Because the user gets access only through an interface, IWidget, she doesn't depend on any of the implementation details for processing in the Widget class. Because she uses a factory, she doesn't depend on the details of creation, so she is completely isolated from the Widget implementation.

Smart Pointers

Chap2SmartPtrs.cpp

#include <memory>
#include <string>
#include <iostream>
#include "../Chapter7-Display/Chap7Display.h"

std::string displayHelper(const std::string& str) {
  std::string insert = (str.size() > 0 ? " " + str + " " : " ");
  return insert;
}

/*-- interface for Widget class --*/

struct IWidget {
  virtual ~IWidget() {}
  virtual void say() = 0;
  virtual std::string name() = 0;
  virtual void name(const std::string& nm) = 0;
};

/*-- Widget class --*/

class Widget : public IWidget {
public:
  Widget() = default;
  Widget(const std::string& nm) {
    name_ = nm;
  }
  ~Widget() {
    std::cout << "\n destroying " << displayHelper(name_);
  }
  virtual void say() override {
    std::cout << "\n Widget instance"
              << displayHelper(name_) << "here";
    std::cout << "\n my address is "
              << reinterpret_cast<long long>(this);
  }
  virtual std::string name() override {
    return name_;
  }
  virtual void name(const std::string& nm) {
    name_ = nm;
  }
private:
  std::string name_;
};

Smart Pointer Factories:

/*-------------------------------------------
  first factory function uses unique_ptr
*/
std::unique_ptr<IWidget> createWidget1() {
  return std::unique_ptr<Widget>(new Widget);
}
/*--------------------------------------------
  second factory function uses make_unique
*/
std::unique_ptr<IWidget> createWidget2() {
  return std::make_unique<Widget>();
}
/*-------------------------------------------
  third factory function uses initialized
  make_unique
*/
std::unique_ptr<IWidget> createWidget3(const std::string& name) {
  return std::make_unique<Widget>( Widget(name) );
}
/*-------------------------------------------
  fourth factory function uses static Widget
  for make_unique avoiding repeated constr.
*/
std::unique_ptr<IWidget> createWidget4(const std::string& name) {
  static Widget widget;
  widget.name(name);
  return std::make_unique<Widget>( widget );
}

Using Code:

int main() {

  displayTitle("Demonstrating Smart Pointers");

  displayDemo(
    "--- createWidget1 with std::unique_ptr ---"
  );
  std::unique_ptr<IWidget> pWidget = createWidget1();
  pWidget->say();
  pWidget->name("Joe");
  pWidget->say();
  putline();

  displayDemo(
    "--- createWidget2 with std::make_unique ---"
  );
  pWidget = createWidget2();
  pWidget->name("Zhang");
  pWidget->say();
  putline();

  displayDemo(
    "--- createWidget3 with initialized "
    "std::make_unique ---"
  );
  pWidget = createWidget3("Ashok");
  pWidget->say();
  putline();

  displayDemo(
    "--- createWidget4 with "
    "static initialized "
    "std::make_unique ---"
  );
  pWidget = createWidget4("Priyaa");
  pWidget->say();
  putline();

  displayDemo(
    "--- std::shared_ptr ---"
  );
  std::shared_ptr<IWidget> pS1Widget =
    std::make_shared<Widget>(Widget("Mike"));
  pS1Widget->say();
  std::shared_ptr<IWidget> pS2Widget = pS1Widget;
  pS2Widget->say();
  putline();

  displayDemo(
    "--- std::shared_ptr via factory---"
  );
  std::shared_ptr<IWidget> pS3Widget = createWidget4("Sally");
  pS3Widget->say();
  std::shared_ptr<IWidget> pS4Widget = pS3Widget;
  pS4Widget->say();
  putline();
}

Output:

Demonstrating Smart Pointers
==============================

--- createWidget1 with std::unique_ptr ---
Widget instance here
my address is 2786941491616
Widget instance Joe here
my address is 2786941491616

--- createWidget2 with std::make_unique ---
destroying Joe
Widget instance Zhang here
my address is 2786941490384

--- createWidget3 with initialized std::make_unique ---
destroying Ashok
destroying Zhang
Widget instance Ashok here
my address is 2786941490272

--- createWidget4 with static initialized std::make_unique ---
destroying Ashok
Widget instance Priyaa here
my address is 2786941491504

--- std::shared_ptr ---
destroying Mike
Widget instance Mike here
my address is 2786941451616
Widget instance Mike here
my address is 2786941451616

--- std::shared_ptr via factory---
Widget instance Sally here
my address is 2786941491280
Widget instance Sally here
my address is 2786941491280

destroying Sally
destroying Mike
destroying Priyaa
destroying Sally
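The output above shows "destroying Ashok" and "destroying Mike" before those widgets are ever used. Those lines come from temporaries: std::make_unique<Widget>( Widget(name) ) first builds a temporary Widget and then copies it into the heap object. The sketch below is not part of the chapter's code. It uses a hypothetical Tracked type and a hypothetical counter to show, under that assumption, how forwarding the constructor argument directly avoids the extra instance, and how a std::unique_ptr returned by a factory converts to a std::shared_ptr.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical counter: number of Tracked instances constructed so far.
static int constructedCount = 0;

// Hypothetical type for illustration only; it is not the chapter's Widget.
struct Tracked {
  std::string name;
  explicit Tracked(std::string nm) : name(std::move(nm)) { ++constructedCount; }
  Tracked(const Tracked& other) : name(other.name) { ++constructedCount; }
};

// Builds a temporary, then copies it into the heap object: two constructions.
inline std::unique_ptr<Tracked> makeViaTemporary(const std::string& nm) {
  return std::make_unique<Tracked>(Tracked(nm));
}

// Forwards the argument to the constructor: one construction, no temporary.
inline std::unique_ptr<Tracked> makeForwarded(const std::string& nm) {
  return std::make_unique<Tracked>(nm);
}

// A unique_ptr returned by a factory converts implicitly to a shared_ptr,
// which takes over ownership of the same heap object.
inline std::shared_ptr<Tracked> makeSharedFromFactory(const std::string& nm) {
  return makeForwarded(nm);
}
```

Calling makeViaTemporary("Ashok") raises the counter by two, while makeForwarded("Ashok") raises it by one. The chapter's factories may construct the temporary deliberately to make the destructor calls visible.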

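With the default allocator, all of the STL containers manage their elements with heap allocations, except for std::array. A std::array stores a fixed number of elements inside the array object itself, so a locally declared std::array lives entirely on the stack.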
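A small sketch of that difference follows. The helper names are hypothetical. It checks whether a container's element storage lies inside the bytes of the container object itself, which holds for std::array and not for std::vector.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// True if address p lies inside the bytes occupied by the object obj.
template <typename Obj>
bool storedInside(const Obj& obj, const void* p) {
  auto first = reinterpret_cast<std::uintptr_t>(&obj);
  auto addr  = reinterpret_cast<std::uintptr_t>(p);
  return addr >= first && addr < first + sizeof(Obj);
}

// std::array: the elements are members of the object.
inline bool arrayElementsAreInObject() {
  std::array<int, 4> a{ 1, 2, 3, 4 };
  return storedInside(a, a.data());
}

// std::vector: the object holds bookkeeping; the elements are on the heap.
inline bool vectorElementsAreInObject() {
  std::vector<int> v{ 1, 2, 3, 4 };
  return storedInside(v, v.data());
}
```

One consequence is that sizeof(std::array<int, 4>) is at least 4 * sizeof(int), while sizeof(std::vector<int>) does not depend on how many elements the vector holds.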

3.8 Building Data Structures with STL Containers

In this section we will see how easy it is to "snap together" STL containers into complex, useful data structures. Details of the STL will be deferred to Chapter 6.

In this example we want to save file and path information from a directory tree, starting with a specified root path. Our goal is to save each distinct filename and each distinct path only once. We accomplish that by using several STL container classes.

Each file name encountered may appear in more than one directory. Instead of saving multiple path names, we save multiple iterators (much smaller than the path names) into a path set.

We've simply connected the containers together, as shown in the using declarations, below. The design of the STL, which uses iterators to point to specific contents, works very well here. The complete declaration of the file information container is:

std::map<std::string,std::vector<std::set<std::string>::iterator>> --> std::set<std::string>

The using declarations make that much more readable:

FileInfo<File, PathRefs> --> PathSet<Path>

The class diagram and declaration are shown in the blocks, below.

Fig 1. FileInfo Data Structure

class FileInfoContainer
{
public:
  // file to path structure
  using File = std::string;
  using Path = std::string;
  using PathSet = std::set<Path>;
  using PathRef = PathSet::iterator;
  using PathRefs = std::vector<PathRef>;
  using FileInfo = std::map<File, PathRefs>;
  using iterator = FileInfo::iterator;
  using PathInfo = FileInfo;

  FileInfoContainer& add(const File& file, const Path& path);
  FileInfoContainer invertFileInfo(FileInfoContainer* pFIC);
  FileInfo fileInfo();
  iterator begin() { return fileInfo_.begin(); }
  iterator end() { return fileInfo_.end(); }
  size_t fileCount();
  size_t pathCount();
private:
  FileInfo fileInfo_;
  PathSet pathSet_;
  PathInfo pathInfo_;
};
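One way add could work is sketched below. This is an assumption about the approach, not the repository's implementation: it uses free-standing aliases and the hypothetical functions addFile and pathsOf instead of class members. The sketch relies on two properties of std::set: insert returns an iterator to the existing element when the value is already present, and iterators stay valid when other elements are inserted.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using File     = std::string;
using Path     = std::string;
using PathSet  = std::set<Path>;
using PathRef  = PathSet::iterator;
using PathRefs = std::vector<PathRef>;
using FileInfo = std::map<File, PathRefs>;

// Hypothetical stand-in for FileInfoContainer::add(file, path).
inline void addFile(FileInfo& fileInfo, PathSet& pathSet,
                    const File& file, const Path& path) {
  // Each distinct path is stored once; a duplicate yields the existing iterator.
  PathRef ref = pathSet.insert(path).first;
  // operator[] creates an empty PathRefs the first time a file name is seen.
  PathRefs& refs = fileInfo[file];
  // Skip the path if this file already refers to it.
  for (PathRef r : refs) {
    if (r == ref) return;
  }
  refs.push_back(ref);
}

// Hypothetical helper: dereferences one file's iterators to get its path strings.
inline std::vector<Path> pathsOf(const FileInfo& fileInfo, const File& file) {
  std::vector<Path> result;
  auto it = fileInfo.find(file);
  if (it != fileInfo.end()) {
    for (PathRef ref : it->second) result.push_back(*ref);
  }
  return result;
}
```

Adding the same file name under two directories leaves one key in the map with two iterators, and adding two files from the same directory leaves one entry in the path set.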

Output from a directory navigator that stores its file and directory information in a FileInfo container is shown below. There are two collections of output. The one on the left, labeled FileInfo, uses the FileInfo container. The one on the right, labeled PathInfo, uses another FileInfo container to hold the results of inverting the path and file relationships.

Instead of using file names as keys and sets of path names as values, we swap the roles of file and path. Instead of using fileInfoContainer.add(file, path) we invert that to fileInfoContainer.add(path, file). Both sets of output are useful. The one on the left, showing for each file name all the paths where it is found, can be used to find the latest version of a file, or older files to be discarded.

The output on the right would be useful for understanding how files are grouped into applications.
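The role swap described above can be sketched with plain maps of strings. The invert function and the Index alias are hypothetical, and they are simpler than the chapter's invertFileInfo, which works with iterators into a path set.

```cpp
#include <map>
#include <string>
#include <vector>

// key -> list of values; used both as file -> paths and as path -> files
using Index = std::map<std::string, std::vector<std::string>>;

// Hypothetical helper: every (file, path) pair becomes a (path, file) pair.
inline Index invert(const Index& fileToPaths) {
  Index pathToFiles;
  for (const auto& entry : fileToPaths) {
    const std::string& file = entry.first;
    for (const std::string& path : entry.second) {
      pathToFiles[path].push_back(file);
    }
  }
  return pathToFiles;
}
```

Because std::map visits its keys in sorted order, the files listed under each path come out sorted as well, matching the ordering seen in the PathInfo listing.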

FileInfo from path ".."
-------------------------
CallableObjects.cpp
  C:\github\JimFawcett\CppStory\Chapter3-CallableObjects
Chap1.cpp
  C:\github\JimFawcett\CppStory\Chapter1
Chap1.h
  C:\github\JimFawcett\CppStory\Chapter1
Classes.cpp
  C:\github\JimFawcett\CppStory\Chapter4-classes
Coercion.cpp
  C:\github\JimFawcett\CppStory\Chapter3-Coercion
Cpp11-BlockingQueue.h
  C:\github\JimFawcett\CppStory\Chapter3-Logger
CustomTraits.h
  C:\github\JimFawcett\CppStory\CustomTraits
Data.cpp
  C:\github\JimFawcett\CppStory\Chapter2-Data
DateTime.cpp
  C:\github\JimFawcett\CppStory\Chapter2-STL
  C:\github\JimFawcett\CppStory\Chapter3-Logger
  C:\github\JimFawcett\CppStory\DirWalker
DateTime.h
  C:\github\JimFawcett\CppStory\Chapter2-STL
  C:\github\JimFawcett\CppStory\Chapter3-Logger
  C:\github\JimFawcett\CppStory\DirWalker
Dev.cpp
  C:\github\JimFawcett\CppStory\Chapter1-Dev
Dev.h
  C:\github\JimFawcett\CppStory\Chapter1-Dev
DirWalker.cpp
  C:\github\JimFawcett\CppStory\Chapter2-STL
  C:\github\JimFawcett\CppStory\DirWalker
DirWalker.h
  C:\github\JimFawcett\CppStory\Chapter2-STL
  C:\github\JimFawcett\CppStory\DirWalker
Display.h
  C:\github\JimFawcett\CppStory\Chapter3-Logger
  C:\github\JimFawcett\CppStory\Display
Functions.cpp
  C:\github\JimFawcett\CppStory\Chapter3-functions
Functions1.cpp.html
  C:\github\JimFawcett\CppStory\Chapter3-functions
IDev.h
  C:\github\JimFawcett\CppStory\Chapter1-Dev
IPerson - Copy.h
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson.h
  C:\github\JimFawcett\CppStory\Chapter1-Dev
  C:\github\JimFawcett\CppStory\Chapter1-Person
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson1.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson2.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson3.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson4.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson5.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson6.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson7.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson8.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
IPerson9.h.html
  C:\github\JimFawcett\CppStory\Webber\ToHTML
Init.cpp
  C:\github\JimFawcett\CppStory\Chapter2-Init
Logger.cpp
  C:\github\JimFawcett\CppStory\Chapter3-Logger
Logger.h
  C:\github\JimFawcett\CppStory\Chapter3-Logger
Overloading.cpp
  C:\github\JimFawcett\CppStory\Chapter1-overloading
Overriding.cpp
  C:\github\JimFawcett\CppStory\Chapter1-Overriding
Overriding.h
  C:\github\JimFawcett\CppStory\Chapter1-Overriding
Person.cpp
  C:\github\JimFawcett\CppStory\Chapter1-Dev
  C:\github\JimFawcett\CppStory\Chapter1-Person
  C:\github\JimFawcett\CppStory\Webber\ToHTML
Person.h
  C:\github\JimFawcett\CppStory\Chapter1-Dev
  C:\github\JimFawcett\CppStory\Chapter1-Person
  C:\github\JimFawcett\CppStory\Webber\ToHTML
PersonDisplay.h
  C:\github\JimFawcett\CppStory\Chapter1-Person
PersonTest.cpp
  C:\github\JimFawcett\CppStory\Chapter1-PersonTest
STL.cpp
  C:\github\JimFawcett\CppStory\Chapter3-STL
STL_DataStructures.cpp
  C:\github\JimFawcett\CppStory\Chapter2-STL
Sizes.cpp
  C:\github\JimFawcett\CppStory\Chapter2-sizes
Survey.cpp
  C:\github\JimFawcett\CppStory\Chapter1-Survey
TestDev.cpp
  C:\github\JimFawcett\CppStory\Chapter1-DevTest
TestPerson.cpp
  C:\github\JimFawcett\CppStory\Chapter4-classes
ToHTML.cpp
  C:\github\JimFawcett\CppStory\Webber\ToHTML
const_cast.cpp
  C:\github\JimFawcett\CppStory
const_cast1.cpp.html
  C:\github\JimFawcett\CppStory
const_cast1.txt.html
  C:\github\JimFawcett\CppStory
dynamic_cast.cpp
  C:\github\JimFawcett\CppStory
dynamic_cast1.cpp.html
  C:\github\JimFawcett\CppStory
dynamic_cast2.cpp.html
  C:\github\JimFawcett\CppStory
reinterpret_cast.cpp
  C:\github\JimFawcett\CppStory
reinterpret_cast1.cpp.html
  C:\github\JimFawcett\CppStory
static_cast.cpp
  C:\github\JimFawcett\CppStory
static_cast1.cpp.html
  C:\github\JimFawcett\CppStory

number of files: 56
number of paths: 23

PathInfo from path ".."
-------------------------
C:\github\JimFawcett\CppStory
  const_cast.cpp
  const_cast1.cpp.html
  const_cast1.txt.html
  dynamic_cast.cpp
  dynamic_cast1.cpp.html
  dynamic_cast2.cpp.html
  reinterpret_cast.cpp
  reinterpret_cast1.cpp.html
  static_cast.cpp
  static_cast1.cpp.html
C:\github\JimFawcett\CppStory\Chapter1
  Chap1.cpp
  Chap1.h
C:\github\JimFawcett\CppStory\Chapter1-Dev
  Dev.cpp
  Dev.h
  IDev.h
  IPerson.h
  Person.cpp
  Person.h
C:\github\JimFawcett\CppStory\Chapter1-DevTest
  TestDev.cpp
C:\github\JimFawcett\CppStory\Chapter1-Overriding
  Overriding.cpp
  Overriding.h
C:\github\JimFawcett\CppStory\Chapter1-Person
  IPerson.h
  Person.cpp
  Person.h
  PersonDisplay.h
C:\github\JimFawcett\CppStory\Chapter1-PersonTest
  PersonTest.cpp
C:\github\JimFawcett\CppStory\Chapter1-Survey
  Survey.cpp
C:\github\JimFawcett\CppStory\Chapter1-overloading
  Overloading.cpp
C:\github\JimFawcett\CppStory\Chapter2-Data
  Data.cpp
C:\github\JimFawcett\CppStory\Chapter2-Init
  Init.cpp
C:\github\JimFawcett\CppStory\Chapter2-STL
  DateTime.cpp
  DateTime.h
  DirWalker.cpp
  DirWalker.h
  STL_DataStructures.cpp
C:\github\JimFawcett\CppStory\Chapter2-sizes
  Sizes.cpp
C:\github\JimFawcett\CppStory\Chapter3-CallableObjects
  CallableObjects.cpp
C:\github\JimFawcett\CppStory\Chapter3-Coercion
  Coercion.cpp
C:\github\JimFawcett\CppStory\Chapter3-Logger
  Cpp11-BlockingQueue.h
  DateTime.cpp
  DateTime.h
  Display.h
  Logger.cpp
  Logger.h
C:\github\JimFawcett\CppStory\Chapter3-STL
  STL.cpp
C:\github\JimFawcett\CppStory\Chapter3-functions
  Functions.cpp
  Functions1.cpp.html
C:\github\JimFawcett\CppStory\Chapter4-classes
  Classes.cpp
  TestPerson.cpp
C:\github\JimFawcett\CppStory\CustomTraits
  CustomTraits.h
C:\github\JimFawcett\CppStory\DirWalker
  DateTime.cpp
  DateTime.h
  DirWalker.cpp
  DirWalker.h
C:\github\JimFawcett\CppStory\Display
  Display.h
C:\github\JimFawcett\CppStory\Webber\ToHTML
  IPerson - Copy.h
  IPerson.h
  IPerson1.h.html
  IPerson2.h.html
  IPerson3.h.html
  IPerson4.h.html
  IPerson5.h.html
  IPerson6.h.html
  IPerson7.h.html
  IPerson8.h.html
  IPerson9.h.html
  Person.cpp
  Person.h
  ToHTML.cpp

number of files: 56
number of paths: 23

The directory navigator, DirWalk<App>, was built using the C++17 std::filesystem library. It gets access to a FileInfoContainer through its App template parameter, which supplies the methods doDir(...) and doFile(...). Those methods simply add files with their paths into a FileInfoContainer member of the App class.

This example should give us an idea of how useful the STL types are. We will dig into the details in Chapter 6.

3.9 Data Epilogue

There are two parts to the data story, types and values. We've explored in some detail the C++ type system and mechanisms to declare and initialize instances of those types.

As with many programming languages, C++ employs a fairly strict type system. Violating the type rules almost always results in compile failure. We will see later that templates manage types in a more nuanced way.

Our interpretation of data, in this story, entails much more than the fundamental types provided by the language. We also include, in our definition of data, instances of standard library types and user-defined types.

This concludes Chapter 3. In the next chapter we will look at C++ operations: functions, methods, functors, lambdas, ...

3.10 Data - Programming Exercises

Write code that declares two floating point numbers and initializes one to a known value. Write a function that, using pointers, copies each byte of the initialized number into the other. Verify that after the copy both numbers hold the same value. Write another function that compares the two numbers, byte by byte, using pointers. You will want to ensure that no buffer overruns occur, by using the sizeof operator.
Write a function that accepts a string and returns a count of the number of words in the string. For this exercise assume that words always begin and end adjacent to whitespace characters, except at the beginning and end of the string.
Expand on the last exercise by building a key-value pair collection where each key is a discovered word and its value is the number of times that word occurs in the input string. You may wish to look at std::map<std::string, size_t>. Note that the first time you encounter a word your map will contain { word, 1 }. If you encounter that word again you don't add it as a new key. Instead you increment the value, e.g., { word, 2 }.
Extend this exercise one more time by creating a black-list of words not entered, e.g., articles and conjunctions: the, and, or, when, ... Look up each discovered word and enter it into the map only if it is not on the black-list. That means you will look up every word, so you want to use a data structure that does look-ups efficiently.
