Bits Data C++
11/25/2023
Bits: C++ Data Types
types, initialization, construction, assignment
Synopsis:
Most of a language's syntax and semantics derive directly from its type system design. This is true of all of the languages discussed in these Bits: C++, Rust, C#, Python, and JavaScript. C++ has the most complex type system, followed in order by C#, Rust, JavaScript, and Python. We will spend more time with this Bit than with the others because of its complexity and importance. This page demonstrates simple uses of the most important C++ types. The purpose is to quickly acquire some familiarity with types and their uses.
- Primitive types and aggregates of primitive types are copyable. Assignment and pass-by-value copy the source's value to the destination.
- For user-defined types C++ provides special class methods - constructors, operators, and destructors. When properly implemented they provide instance syntax that mimics that of primitives. We won't see that here, but will in the next Bit page.
- C++ supports move operations with a move constructor and a move assignment operator. Temporaries are moved when assigned or passed by value: the destination takes over the resources of the move source. This is a transfer of ownership and, for standard library types, leaves the source in a valid but unspecified state. Using a "moved" variable before giving it a new value is, unfortunately, not a compile error.
- C++ supports making fixed references to either primitive or user-defined types. These may refer to instances in stack memory, static memory, or in the native heap.
- C++ pointers and references are unconstrained and a frequent source of memory management errors. By convention, many of these errors can be avoided by using range-based for loops and smart pointers. For very large code bases it can be challenging to ensure that the conventions have been followed everywhere - think of projects with 100,000 source lines or more.
- Here we begin to see significant differences between the languages, especially when comparing statically typed languages like C++, Rust, and C#, with dynamically typed languages like Python and JavaScript.
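The conventions mentioned in the bullets above can be sketched in a few lines. This is only an illustration: the helper names `sum_items` and `sum_on_heap` are hypothetical and do not come from the Bits code base.

```cpp
#include <memory>
#include <vector>

// Hypothetical helper: reads a vector through a const reference and
// walks it with a range-based for loop, so there is no index or raw
// pointer arithmetic to get wrong.
int sum_items(const std::vector<int>& items) {
    int total = 0;
    for (const auto& item : items) {
        total += item;
    }
    return total;
}

// Hypothetical helper: the heap-allocated vector is owned by a
// std::unique_ptr, so it is released when owner goes out of scope.
int sum_on_heap() {
    auto owner = std::make_unique<std::vector<int>>();
    owner->push_back(1);
    owner->push_back(2);
    owner->push_back(3);
    std::vector<int>& ref = *owner;  // fixed reference to a heap instance
    ref.push_back(4);                // modifies the vector owned by owner
    return sum_items(ref);
}
```

Neither function contains a `new` or `delete`, which is the point of the convention.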
C++ Types Details
Table 1.0 C++ Types
Type | Comments | Example |
---|---|---|
-- Integral types -- | | |
bool | values true and false | |
int | with signed, unsigned, short, long qualifiers; 1 == sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long) | |
char | with signed and unsigned qualifiers; signedness of plain char is implementation-defined | |
wchar_t | 16 bits and holds UTF-16 code units on Windows, 32 bits and holds UTF-32 on Linux | |
-- Floating point types -- | | |
float, double | values have finite precision, and may have approximate values | |
-- Literal string types -- | | |
const char* | literal string, resides in static memory and is always null terminated. "Hello" is a const char[6] containing the chars: 'H', 'e', 'l', 'l', 'o', '\0'. | |
-- Aggregate types -- | | |
T[N] | Native array of N elements all of type T | auto first = arr[0]; |
std::tuple | collection of heterogeneous types accessed by position | char third = std::get<2>(tu); |
std::optional | holds optional value | |
-- Std::library types -- | | |
std::string | Expandable collection of ASCII characters allocated in the heap | |
std::wstring | Expandable collection of Unicode characters allocated in the heap | |
std::array | Fixed size generic array of items of type T | |
std::vector | Expandable generic collection of items of type T | |
std::deque | Expandable double-ended generic collection of items of type T | v.push_front(-1); ... |
std::unordered_map | Unordered associative container of Key-Value pairs, held in a table of bucket lists. | map.insert({"zero", 0}); ... |
std::map | Ordered associative container of Key-Value pairs, held in binary tree. | map.insert({"zero", 0}); ... |
forward_list, list, unordered_map, unordered_set, unordered_multimap, unordered_multiset, set, multiset, and several adapters like stack and queue | The C++ std::library also defines types for threading and synchronization, reading and writing to streams, anonymous functions, and many more. | Containers library |
-- User-defined Types -- | | |
User-defined types | Based on classes and structs, these will be discussed in the next Bit. | |
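A few of the table's claims can be checked directly by the compiler. The sketch below uses `static_assert` for the size relations and the string literal size; the helper functions `literal_length` and `third_of_tuple` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Size relations from the table, verified at compile time.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(short) <= sizeof(int), "short is no larger than int");
static_assert(sizeof(int) <= sizeof(long), "int is no larger than long");
static_assert(sizeof(long) <= sizeof(long long), "long is no larger than long long");

// "Hello" is a const char[6]: five characters plus the terminating null.
static_assert(sizeof("Hello") == 6, "literal includes the null terminator");

// Hypothetical helper: counts characters up to, not including, '\0'.
std::size_t literal_length() {
    const char* lit = "Hello";
    return std::char_traits<char>::length(lit);
}

// Hypothetical helper: tuple elements are accessed by position.
char third_of_tuple() {
    std::tuple<int, double, char> tu{ 1, 3.14, 'z' };
    return std::get<2>(tu);
}
```

If any of the size relations did not hold on a platform, the translation unit would fail to compile rather than fail at run time.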
C++ Type System Details
Table 2. C++ Copy and Move Operations
Operation | Example | Primitive or Aggregate of Primitives | Library or User-defined Type |
---|---|---|---|
If u ∈ T is a named variable | |||
Construction | |||
Assignment | |||
Pass-by-value | using |
||
If u ∈ T is a temporary or u = std::move(v), v ∈ T | |||
Construction | |||
Assignment | |||
Pass-by-value |
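The two halves of Table 2 can be illustrated with `std::vector`, which stands in here for any library or user-defined type that owns heap resources. This is a sketch, not the page's original demonstration code; all function names are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Pass-by-value: v is copy constructed from a named argument and
// move constructed from a temporary or from std::move(arg).
std::size_t take_by_value(std::vector<int> v) { return v.size(); }

// Named source: construction copies, so the copy owns its own buffer.
bool copy_has_own_buffer() {
    std::vector<int> u{ 1, 2, 3 };
    std::vector<int> t = u;              // copy construction
    return t.data() != u.data() && t == u;
}

// Named source: assignment copies the elements into the destination.
bool assignment_copies_elements() {
    std::vector<int> u{ 1, 2, 3 };
    std::vector<int> t;
    t = u;                               // copy assignment
    return t == u && t.data() != u.data();
}

// std::move(v): construction transfers the source's buffer.
bool move_transfers_buffer() {
    std::vector<int> v{ 1, 2, 3 };
    const int* before = v.data();
    std::vector<int> t = std::move(v);   // move construction
    return t.data() == before;
}

// Temporary argument: the parameter is initialized without a deep copy.
std::size_t pass_temporary() {
    return take_by_value(std::vector<int>{ 1, 2 });
}
```

For primitives the distinction disappears: moving an `int` is the same as copying it.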
Table 3. C++ Type System Attributes
Static typing | All types are known at compile time and are fixed throughout program execution. |
Inference | Compiler infers types in expressions if not explicitly annotated or if declared auto. |
Intermediate strength typing | Types are exhaustively checked but there are many implicit conversions. |
Generics | Generics provide types and functions with unspecified parameters, supporting code reuse and abstraction over types. Generic parameters are specified at the call site. |
Class Relationships | Class relationships are important tools for modeling both application and implementation domains. |
Concepts | Concepts are similar to Rust traits and Java and C# interfaces. They define shared behavior that types can implement, supporting abstraction over behavior. Concepts define behavior by declaring concept-specific functions. A template can use a requires clause with concept arguments to bound the types that are valid for a class or function. |
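The inference, implicit conversion, and generics rows of Table 3 can be seen in a short sketch. The function names are hypothetical, and the example stays within C++17, so the concepts row is not covered.

```cpp
#include <string>
#include <type_traits>

// Generics: T is an unspecified parameter that the compiler fills in
// from the arguments at each call site.
template <typename T>
T largest(T a, T b) {
    return (a < b) ? b : a;
}

// Inference: auto takes the type of its initializer, here double.
bool auto_infers_double() {
    auto d = 3.5;
    return std::is_same<decltype(d), double>::value;
}

// Intermediate strength typing: double converts to int implicitly,
// silently discarding the fractional part.
int implicit_narrowing() {
    double d = 3.9;
    int i = d;   // compiles; the value is truncated toward zero
    return i;
}
```

A call such as `largest(2, 7)` instantiates `largest<int>`, while `largest<std::string>("a", "b")` names the parameter explicitly because the two string literals would otherwise be deduced as character arrays.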
1.0 Initialization
1.1 Primitives
/*----------------------------------------------
All code used for output has been elided
*/
/*-- scalars --*/
bool b = true;
std::byte byte { 0x0f };
/* std::byte => enum class byte : unsigned char {} */
int i = 42; // equiv to int i { 42 };
double d = 3.1415927;
char ch = 'z';
const char* lst = "a literal string";
Output
--- bool ---
b: true
b: type: bool
b: size = 1
--- byte ---
byte: 0xf
byte: type: enum std::byte
byte: size = 1
--- int ---
i: 42
i: type: int
i: size = 4
--- double ---
d: 3.141593
d: type: double
d: size = 8
--- char ---
ch: z
ch: type: char
ch: size = 1
--- const char* ---
lst: "a literal string"
lst: type: char const * __ptr64
lst: size = 8
lst: char count = 16
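Because `std::byte` is a scoped enumeration rather than an arithmetic type, it supports bitwise operators but needs an explicit conversion to become an integer. A minimal sketch, with hypothetical function names:

```cpp
#include <cstddef>
#include <cstring>

// Bitwise operators are defined for std::byte; arithmetic is not.
int byte_to_int() {
    std::byte b{ 0x0f };
    b <<= 4;                         // 0xf0
    b |= std::byte{ 0x01 };          // 0xf1
    return std::to_integer<int>(b);  // explicit conversion is required
}

// The char count shown in the output excludes the terminating null.
std::size_t literal_char_count() {
    const char* lst = "a literal string";
    return std::strlen(lst);
}
```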
1.2 Aggregates
/*---------------------------------------------------
All code used for output has been elided
*/
/*-- array --*/
short int fa[] { 1, 2, 3, 4, 5 };
short int fa_alt[5] { 1, 2, 3, 4, 5 };
auto afirst = fa[0];
/*-- struct --*/
struct S { int a; char b; double c; };
S strct { 1, 'a', 3.1415927 };
auto sfirst = strct.a;
/*-- tuple --*/
std::tuple<int, double, char> tup { 1, 3.14, 'z' };
auto tfirst = std::get<0>(tup);
/*-- optional --*/
std::optional<double> opt1 { 3.1415927 };
std::optional<double> opt2; // { std::nullopt };
if(opt2 == std::nullopt) {
std::cout << "empty\n";
}
else {
std::cout << *opt2 << "\n";
}
Output
--- aggregate types ---
--- native array ---
fa[]: [ 1, 2, 3, 4, 5 ]
fa[]: type: short const * __ptr64
fa[]: size = 10
afirst = 1
--- struct ---
strct: S { 42, a, 3.14159 }
strct: type: struct `void __cdecl initialize_primitiv...
strct: size = 16
sizeof(strct.a) = 4
sizeof(strct.b) = 1
sizeof(strct.c) = 8
sfirst = 42
--- tuple ---
tup: { -1, 3.14, z }
tup: type: class std::tuple
tup: size = 24
tfirst = -1
--- optional ---
opt1: 3.14159
opt2: empty
opt1: type: class std::optional
opt1: size = 16
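Two C++17 conveniences pair well with the aggregates above: structured bindings for unpacking a tuple, and `value_or` for reading an optional without an explicit test. The sketch below is an illustration with hypothetical function names. It also shows why `sizeof(strct)` can exceed the sum of its member sizes: the compiler may insert padding for alignment.

```cpp
#include <optional>
#include <tuple>

struct S { int a; char b; double c; };

// Structured binding copies each tuple element into a named variable.
double sum_numeric_fields() {
    std::tuple<int, double, char> tup{ 1, 3.5, 'z' };
    auto [n, x, c] = tup;
    (void)c;             // the char field is not used in the sum
    return n + x;
}

// value_or returns the held value, or the fallback when empty.
double value_or_zero(const std::optional<double>& opt) {
    return opt.value_or(0.0);
}

// The struct is at least as large as its members combined; any
// difference is alignment padding chosen by the compiler.
bool struct_size_covers_members() {
    return sizeof(S) >= sizeof(int) + sizeof(char) + sizeof(double);
}
```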
1.3 Std::library Types
/*-- initialize std::library types --*/
/*-------------------------------------------------
All code for output has been elided
*/
/*-- generic array --*/
std::array<double, 7>
sarr { 1.0, 1.5, 2.0, 2.5, 2.0, 1.5, 1.0 };
double afirst = sarr[0];
/*-- expandable string --*/
std::string sstr = "a string";
char sfirst = sstr[0];
/*-- expandable indexable array --*/
std::vector<int> vec { 1, 2, 3, 4 };
vec.push_back(5);
int vthird = vec[2];
/*-- expandable double-ended queue --*/
std::deque<int> dq { 1, 2, 3 };
dq.push_front(0);
dq.push_back(4);
int qfirst = dq.front();
/*-- expandable associative container --*/
std::unordered_map<std::string, int> umap
{{"one", 1}, {"two", 2}, {"three", 3}};
umap.insert({"zero", 0});
int uZero = umap["zero"];
Output
--- std::array<double, 7> ---
sarr: [ 1.00, 1.50, 2.00, 2.50, 2.00, 1.50, 1.00 ]
sarr: type: class std::array<double,7>
--- std::string => std::basic_string<char> ---
sstr: "a string"
sstr: type: class std::basic_string<char,struct std:...
--- std::vector<int> ---
vec: [ 1, 2, 3, 4, 5 ]
vec: type: class std::vector<int,class std::allocat...
--- std::deque<int> ---
dq: [ 0, 1, 2, 3, 4 ]
dq: type: class std::deque<int,class std::allocato...
--- std::unordered_map<std::string, int> ---
umap: { {one, 1}, {two, 2}, {zero, 0}, {three, 3} }
umap: type: class std::unordered_map<class std::basi...
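One detail of the `umap["zero"]` lookup is worth a sketch: `operator[]` inserts a value-initialized entry when the key is absent, while `find` leaves the map unchanged. The function names below are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

using Map = std::unordered_map<std::string, int>;

// operator[] on a missing key inserts {"zero", 0} as a side effect.
std::size_t size_after_index_lookup() {
    Map m{ {"one", 1} };
    int missing = m["zero"];
    (void)missing;
    return m.size();
}

// find reports absence with end() and does not modify the map.
std::size_t size_after_find() {
    Map m{ {"one", 1} };
    bool found = (m.find("zero") != m.end());
    (void)found;
    return m.size();
}
```

When a lookup should never add entries, `find` or `at` is usually the safer choice.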
1.4 Heap Storage
/*-- initialized in heap memory --*/
/*--------------------------------------------------
All code for output has been elided
*/
/*-- raw pointer --*/
int* tmol = new int{42};
// using code here
delete tmol; // forgetting this causes memory leak
/*-- unique_ptr to scalar --*/
std::unique_ptr<int> suni =
std::make_unique<int>(-1);
int utmp = *suni;
// std::unique_ptr deallocates heap storage when
// it goes out of scope, so no memory leak.
/*-- shared_ptr to scalar --*/
std::shared_ptr<int> stmol
= std::make_shared<int>(42);
int stmp = *stmol;
// std::shared_ptr deallocates heap storage when
// all references go out of scope, so no memory leak.
/*-- shared_ptr to collection --*/
std::shared_ptr<std::vector<int>> pVec =
std::make_shared<std::vector<int>>(
std::vector<int>{ 1, 2, 3 }
);
auto vtmp = *pVec;
// only difference from scalar case is initialization
/*-- using aliases to simplify --*/
using VecI = std::vector<int>;
using SPtr = std::shared_ptr<VecI>;
SPtr pVec2 = std::make_shared<VecI>(VecI{ 1, 2, 3 });
// equivalent to pVec, just abbreviates syntax
Output
--- initialized in heap memory ---
--- raw pointer ---
*tmol: 42
*tmol: type: int
*tmol: size = 4
tmol: 0000016E4A956D80
tmol: type: int * __ptr64
tmol: size = 8
--- std::unique_ptr<int> ---
*suni: -1
*suni: type: int
*suni: size = 4
suni: 000001D7CFCE6300
suni: type: class std::unique_ptr<int,struct std::de...
suni: size=8
--- std::shared_ptr<int> ---
*stmol: 42
*stmol: type: int
*stmol: size=4
stmol: 0000016E4A967420
stmol: type: class std::shared_ptr<int>
stmol: size=16
--- std::shared_ptr<std::vector<int>> ---
*pVec: [ 1, 2, 3 ]
*pVec: type: class std::vector<int,class std::allocat...
*pVec: size=32
pVec: 0000016E4A968260
pVec: type: class std::shared_ptr<class std::vector<...
pVec: size=16
--- using aliases to simplify ---
*pVec2: [ 1, 2, 3 ]
*pVec2: type: class std::vector<int,class std::allocat...
pVec2: 0000016E4A968880
pVec2: type: class std::shared_ptr<class std::vector<...
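The ownership rules behind the two smart pointers can be sketched as follows. Copying a `std::shared_ptr` adds an owner of the same heap object, while a `std::unique_ptr` can only be moved. Function names are hypothetical.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Copying a shared_ptr copies the handle; both handles own one vector.
long owners_while_shared() {
    auto p1 = std::make_shared<std::vector<int>>(std::vector<int>{ 1, 2, 3 });
    auto p2 = p1;
    p2->push_back(4);                // visible through p1 as well
    return (p1->size() == 4) ? p1.use_count() : -1;
}

// unique_ptr cannot be copied; moving it transfers ownership and
// leaves the source holding nullptr.
bool unique_ptr_transfers_ownership() {
    auto a = std::make_unique<int>(42);
    auto b = std::move(a);
    return a == nullptr && *b == 42;
}
```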
2.0 Copy Operations
2.1 Copy Operations for Primitives
/*-- copy operations for primitives --*/
/*-------------------------------------------------------
All code for output has been elided
*/
/*-- primitive copy construction - bit-wise copy --*/
int ival = 42; // initialization
int jval = ival; // here's the copy construction
/*-- copy assignment, output elided --*/
double dval = 3.1415927; // create dst instance
double eval = 1.33333; // create src instance
dval = eval; // copy assignment operation
Output
------------------------------------------
copy operations for primitives
--------------------------------------------------
--- copy construction ---
ival: 42
ival: type: int
ival: address: 0000002E4DEFDAF4
--- int jval = ival ---
jval: 42
jval: type: int
jval: address: 0000002E4DEFDC04
---------------------------------------------
Addresses of ival and jval are different,
demonstrating copy, as expected.
---------------------------------------------
--- copy assignment ---
dval: 3.141593
dval: type: double
dval: address: 0000002E4DEFDD18
eval: 1.333330
eval: type: double
eval: address: 0000002E4DEFDDD8
--- dval = eval ---
dval: 1.333330
dval: type: double
dval: address: 0000002E4DEFDD18
---------------------------------------------
Addresses of dval and eval are different
demonstrating copy.
---------------------------------------------
2.2 Copy Operations for Std::library Types
/*-- copy operations for std::library types --*/
/*-------------------------------------------------------
All code for output has been elided
*/
/*-- vector copy construction --*/
std::vector<double> vec { 1.0, 1.5, 2.0 };
auto uec = vec; // copy construction
/*-- vector copy assignment --*/
std::vector<double> tec { 1.0, 1.5, 2.0, 2.5 };
uec = tec; // copy assignment
Output
--------------------------------------------------
copy operations for std::library types
--------------------------------------------------
--- vector copy construction ---
vec: [ 1.00, 1.50, 2.00 ]
vec: address: 0000002E4DEFDFF8
vec[0]: address: 0000016E4A967950
--- auto uec = vec ---
uec: [ 1.00, 1.50, 2.00 ]
uec: address: 0000002E4DEFE198
uec[0]: address: 0000016E4A967A10
--------------------------------------------------
Note:
Both uec and vec and their resources are unique.
That's because the vector copy constructor
copies each element of vec to uec.
Managed languages copy handles to instances,
not the instances themselves, so construction
does not create a new instance in those
languages. Resources are shared.
--------------------------------------------------
--- vector copy assignment ---
tec: [ 1.00, 1.50, 2.00, 2.50 ]
tec: address: 0000002E4DEFE338
tec[0]: address: 0000016E4A967770
--- uec = tec ---
uec: [ 1.00, 1.50, 2.00, 2.50 ]
uec: address: 0000002E4DEFE198
uec[0]: address: 0000016E4A9673B0
--- original source vec has not changed ---
vec: [ 1.00, 1.50, 2.00 ]
vec: address: 0000002E4DEFDFF8
vec[0]: address: 0000016E4A967950
--------------------------------------------------
Note:
Both uec and tec and their resources are unique.
That's because vector copy assignment operator
copies each element of tec to uec.
Managed languages copy handles to instances,
not the instances themselves, so assignment
causes sharing of resources in those languages.
--------------------------------------------------
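The contrast drawn in the notes above, between copying an instance and copying a handle, can be reproduced within C++ itself. In this sketch a `std::shared_ptr` plays the role of a managed-language handle; that analogy and the function names are assumptions made for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Value copy: growing the copy leaves the original untouched.
std::size_t original_size_after_value_copy() {
    std::vector<double> vec{ 1.0, 1.5, 2.0 };
    auto uec = vec;                  // copies every element
    uec.push_back(2.5);
    return vec.size();
}

// Handle copy: both handles refer to the same vector, so a change
// made through one is seen through the other.
std::size_t original_size_after_handle_copy() {
    auto vec = std::make_shared<std::vector<double>>(
        std::vector<double>{ 1.0, 1.5, 2.0 }
    );
    auto uec = vec;                  // copies only the handle
    uec->push_back(2.5);
    return vec->size();
}
```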
3.0 Move Operations
Move std::string
/*-- move temporary string --*/
auto first = std::string("first part");
auto mid = std::string(" and ");
auto last = std::string("last part");
auto aggr = first + mid + last;
/*-----------------------------------------------------
first + mid + last is a temporary string. Before C++17
it is moved to aggr using the move constructor (or the
move is elided). Since C++17 the temporary initializes
aggr directly.
------------------------------------------------------*/
/*-- forced string move --*/
auto src = std::string("src string");
auto dst = std::move(src);
/*-----------------------------------------------------
After the move, src is left in a valid but unspecified
state. It can safely be assigned to or destroyed, but
its contents should not be relied on.
------------------------------------------------------*/
Output
--- move temporary string ---
first: "first part"
mid: " and "
last: "last part"
--- auto aggr = first + mid + last ---
aggr: "first part and last part"
--------------------------------------------------
first + mid + last is a temporary string that
is moved to aggr using move constructor.
--------------------------------------------------
--- forced string move ---
src: "src string"
src: type: class std::basic_string<char,struct std:...
--- auto dst=std::move(src) ---
dst: "src string"
dst: type: class std::basic_string<char,struct std:...
src: "" // DON'T DO THIS
src: type: class std::basic_string<char,struct std:...
--------------------------------------------------
There is no guarantee that src will have a valid
state after move, so the src display, above, has
undefined behavior - just happens to work on MSVC.
--------------------------------------------------
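For standard library types, a moved-from object can be given a new value and then used normally. The sketch below shows that pattern for `std::string`; the function name is hypothetical.

```cpp
#include <string>
#include <utility>

// After the move, src is assigned a new value before it is read,
// so nothing depends on its unspecified moved-from contents.
std::string reuse_after_move() {
    std::string src{ "src string" };
    std::string dst = std::move(src);   // dst takes over src's resources
    src = "new value";                  // src has a known state again
    return dst + " | " + src;
}
```

Reading `src` between the move and the assignment is what the "DON'T DO THIS" remark in the output warns against.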
4.0 Pass-by-value & Pass-by-reference
/*-------------------------------------------
All code is shown for this example, e.g.,
includes formatting code and output.
-------------------------------------------*/
template<typename T>
void pass_seq_by_value(T t) {
std::cout << formatOutput<T>(
t, " passed", seq_collectionToString<T>
);
std::cout << formatAddress<T>(t, " passed");
std::cout << formatAddress<typename T::value_type>(
t[0], "passed[0]"
);
}
template<typename T>
void pass_seq_by_reference(T& t) {
std::cout << formatOutput<T>(
t, " passed", seq_collectionToString<T>
);
std::cout << formatAddress<T>(t, " passed");
std::cout << formatAddress<typename T::value_type>(
t[0], "passed[0]"
);
}
void pass_by_value_and_ref() {
showLabel("pass-by-value and pass-by-reference");
nl();
showOp("std::vector<int> pass-by-value");
nl();
using VECI = std::vector<int>;
auto v = VECI{ 1, 2, 3 };
std::cout << formatOutput<VECI>(
v, " call",
seq_collectionToString<VECI>
);
std::cout << formatAddress<VECI>(v, " call");
std::cout << formatAddress<int>(v[0], " call[0]");
pass_seq_by_value(v);
showLabel(
"passed has the same value as call.\n"
" call and its resources are different from\n"
" passed and its resources. That demonstrates\n"
" passed was copy constructed from call."
);
nl();
showOp("std::vector<int> pass-by-reference");
nl();
auto rv = VECI{ 1, 2, 3 };
std::cout << formatOutput<VECI>(
rv, " call",
seq_collectionToString<VECI>
);
std::cout << formatAddress<VECI>(rv, " call");
std::cout << formatAddress<int>(rv[0], " call[0]");
pass_seq_by_reference(rv);
showLabel(
"call and its resources are the same as\n"
" passed and its resources. That demonstrates\n"
" that only reference was copied."
);
}
Output
--------------------------------------------------
pass-by-value and pass-by-reference
--------------------------------------------------
--- std::vector<int> pass-by-value ---
call: [ 1, 2, 3 ]
call: type: class std::vector<int,class std::allocat...
call: address: 0000004A4276F3C8
call[0]: address: 00000202AD1B4B80
passed: [ 1, 2, 3 ]
passed: type: class std::vector<int,class std::allocat...
passed: address: 0000004A4276F760
passed[0]: address: 00000202AD1B4AE0
--------------------------------------------------
passed has the same value as call.
call and its resources are different from
passed and its resources. That demonstrates
passed was copy constructed from call.
--------------------------------------------------
--- std::vector<int> pass-by-reference ---
call: [ 1, 2, 3 ]
call: type: class std::vector<int,class std::allocat...
call: address: 0000004A4276F578
call[0]: address: 00000202AD1B45E0
passed: [ 1, 2, 3 ]
passed: type: class std::vector<int,class std::allocat...
passed: address: 0000004A4276F578
passed[0]: address: 00000202AD1B45E0
--------------------------------------------------
call and its resources are the same as
passed and its resources. That demonstrates
that only reference was copied.
--------------------------------------------------
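The address comparison above shows where the argument lives. The observable consequence is whether a change made inside the function reaches the caller. A stripped-down sketch, with hypothetical function names:

```cpp
#include <cstddef>
#include <vector>

// Receives a copy: the push_back affects only the local copy.
void append_by_value(std::vector<int> v) { v.push_back(99); }

// Receives a reference: the push_back affects the caller's vector.
void append_by_reference(std::vector<int>& v) { v.push_back(99); }

// const reference: no copy is made and the callee cannot modify v.
std::size_t count_by_const_reference(const std::vector<int>& v) {
    return v.size();
}

std::size_t size_after_value_call() {
    std::vector<int> v{ 1, 2, 3 };
    append_by_value(v);
    return count_by_const_reference(v);
}

std::size_t size_after_reference_call() {
    std::vector<int> v{ 1, 2, 3 };
    append_by_reference(v);
    return count_by_const_reference(v);
}
```

Passing by const reference is a common default for read-only access to containers, since it avoids the copy without exposing the caller's data to modification.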
5.0 Display Functions
/*-- convert scalar to string ----------------------*/
template <typename T>
std::string scalarToString(const T& scalar) {
/* format in-memory stringstream so formats
are temporary */
std::ostringstream out;
out.precision(3);
out << std::showpoint;
out << std::boolalpha;
out << scalar;
return out.str();
}
/*-- convert sequential collection to string -------*/
template <typename C>
std::string seq_collectionToString(const C& coll) {
/* format in-memory stringstream so formats
are temporary */
std::ostringstream out;
out.precision(3);
out << std::showpoint;
out << std::boolalpha;
out << "[ ";
/*-- show comma only in interior of sequence --*/
bool first = true;
for(const auto& item : coll) {
if(first) {
out << item;
first = false;
}
else {
out << ", " << item;
}
}
out << " ]";
return out.str();
}
/*-- several conversion functions elided --*/
/*-- return formatted output string ----------------
- third arg is lambda (aka closure) std::function
- intent is to pass in format converter function
customized for type T
- examples of converter functions are given above
*/
template <typename T>
std::string formatOutput(
const T& t, // format t
const std::string& nm, // caller name
std::function<std::string(const T&)> f, // format
bool showtype = true // default to show type
){
std::stringstream out;
out << " " << nm + ": "
<< f(t) << "\n";
if(showtype) {
out << getType(t, nm);
out << " " << nm + ": size = "
<< sizeof(t) << "\n";
}
return out.str();
}
/*--------------------------------------------------
return formatted address as string
*/
static const size_t WIDTH = 8;
/*-- return address of t --*/
template<typename T>
std::string formatAddress(
const T& t, const std::string& nm
) {
const T* ptrToArg = &t;
std::stringstream out;
out.precision(7);
out << " " << std::showpoint;
out << std::boolalpha;
out << std::setw(WIDTH) << std::left
<< nm + ": " << "address: ";
out << std::showbase << std::hex
<< ptrToArg << "\n";
return out.str();
}
/*-----------------------------------------------
truncate string to length n
- does nothing if string length is less than n
*/
inline std::string truncate(
const std::string& str, size_t n = 40
) {
std::string tmp = str;
if(tmp.size() < n) {
return tmp;
}
tmp.resize(n);
tmp += "...";
return tmp;
}
/*-----------------------------------------------
return type of t
*/
template<typename T>
std::string getType(T t, const std::string &nm) {
std::ostringstream out;
out << " "
<< nm + ": "; // show name at call site
out << "type: "
<< truncate(typeid(t).name()); // show type
out << "\n";
return out.str();
}
/*-----------------------------------------------
display text delimited with "---"
*/
inline void showOp(
const std::string& text,
const std::string& suffix = ""
) {
std::cout << " --- "
<< text
<< " ---"
<< std::endl << suffix;
}
/*-----------------------------------------------
display text with lines above and below
*/
inline void showLabel(
const std::string& text, size_t n = 50
) {
auto line = std::string(n, '-');
std::cout << line << std::endl;
std::cout << " " << text << std::endl;
std::cout << line << std::endl;
}
/*-----------------------------------------------
send a text string to std::cout
*/
inline void print(const std::string& txt) {
std::cout << "\n " << txt;
}
/*-----------------------------------------------
send a line of text to std::cout
*/
inline void println(const std::string& txt) {
std::cout << "\n " << txt << "\n";
}
/*-----------------------------------------------
emit newline
*/
static void nl() {
std::cout << std::endl;
}
The code above contains functions for converting scalars and collections to strings, and a formatOutput function accepting an object and one of the converter functions. That function returns a formatted string representing the first argument. The first code block also shows a function used to format pointer addresses for display.
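The converter-passing design can be reduced to a few lines. In this sketch `join_items` and `label_value` are simplified, hypothetical stand-ins for seq_collectionToString and formatOutput. They are not the originals.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Simplified converter: renders a sequence as "[ a, b, c ]".
template <typename C>
std::string join_items(const C& coll) {
    std::ostringstream out;
    out << "[ ";
    bool first = true;
    for (const auto& item : coll) {
        if (!first) {
            out << ", ";
        }
        out << item;
        first = false;
    }
    out << " ]";
    return out.str();
}

// Simplified formatter: the caller supplies the converter for type T.
template <typename T>
std::string label_value(
    const T& t, const std::string& nm,
    std::function<std::string(const T&)> f
) {
    return nm + ": " + f(t);
}

// T is named explicitly at the call site because it cannot be deduced
// by matching a function template name against std::function<...>.
std::string demo_label() {
    std::vector<int> v{ 1, 2, 3 };
    return label_value<std::vector<int>>(
        v, "v", join_items<std::vector<int>>
    );
}
```

The same reasoning explains why the calls on this page are written as `formatOutput<T>(...)` with an explicit template argument.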
6.0 Build
C:\github\JimFawcett\Bits\Cpp\Cpp_Data\build
cmake --build .
MSBuild version 17.5.1+f6fdcf537 for .NET Framework
Bits_Data.cpp
Bits_DataAnalysis.cpp
Generating Code...
Cpp_Data.vcxproj ->
C:\github\JimFawcett\Bits\Cpp\Cpp_Data\build\Debug\Cpp_Data.exe
C:\github\JimFawcett\Bits\Cpp\Cpp_Data\build
where the executable name, here CppData.exe, is defined in CMakeLists.txt.
6.1 CMakeLists.txt
cmake_minimum_required(VERSION 3.25)
project(CppData)
#---------------------------------------------------
set(CMAKE_BUILD_TYPE Debug)
#---------------------------------------------------
# CppData dir
# -- CMakeLists.txt (this file)
# -- src dir
# -- Bits_Data.cpp
# -- Bits_DataAnalysis.h
# -- Bits_DataAnalysis.cpp
# -- build directory
# -- Debug directory
# -- Cpp_Data.exe
# -- ...
#---------------------------------------------------
# Wasn't able to get std::library modules to work with CMake.
# - does work in Visual Studio, preview edition, non CMake project
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}
/experimental:module /std:c++latest /EHsc /MD"
)
#-- Things I tried to get CMake to find str module --
# set(CMAKE_MODULE_PATH "C:/Users/Public/std_modules")
# set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /std:c++20")
# set_property(TARGET $CppData PROPERTY CXX_STANDARD 20)
# target_compile_features(CppData PUBLIC CXX_STANDARD 20)
# set(CMAKE_CXX_FLAGS "${/experimental:module /std:c++latest}")
# set(CMake_CXX_STANDARD 20)
# set(CMAKE_CXX_STANDARD_REQUIRED ON)
# set(CMAKE_CXX_EXTENSIONS OFF)
#---------------------------------------------------
# build Bits_Data.obj, Bits_DataAnalysis.obj
# in folder build/Cpp_Data.dir/debug
#---------------------------------------------------
set(SRC
src/Bits_Data.cpp
)
include_directories(src)
add_executable(CppData.exe ${SRC})
#---------------------------------------------------
7.0 VS Code View
8.0 References
Reference | Description |
---|---|
C++ Story | E-book with thirteen chapters covering most of intermediate C++ |
C++ Bites | Relatively short feature discussions |