Idioms and Patterns Story
Iteration

Iteration and Iteration Adapters

Iteration through Strings and Byte Arrays


Iteration is a basic operation on collections. The underlying operations are primitive, but languages keep adding more powerful ways to manipulate collections. Iterating through a Rust String is complicated by its UTF-8 encoding: each character occupies 1 to 4 bytes, so a String cannot be indexed by character position. Rust provides a String iterator, chars(), that detects the byte sequences marking UTF-8 character boundaries. Rust also defines an Iterator trait with many useful adapters; we demonstrate a few of them here. C++ std::string is simpler: it holds fixed-size char elements, so it can be indexed directly. As of C++20, the ranges library defines views and adapters that are similar in concept to the Rust adapters. The code below shows how to use both Rust and C++ iterators and adapters.
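As a minimal sketch of why byte positions and character positions differ in a Rust String, the snippet below compares byte length with character count and lists the byte offset where each character starts. The helper names (byte_and_char_counts, char_start_offsets, first_n_chars) and the sample text are hypothetical, chosen only for illustration; the library calls are standard str methods.

```rust
/// Returns (byte length, character count) for a string slice.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Returns the byte offset at which each character starts.
fn char_start_offsets(s: &str) -> Vec<usize> {
    s.char_indices().map(|(offset, _ch)| offset).collect()
}

/// Returns the first `n` characters without splitting a multi-byte character.
fn first_n_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}

fn main() {
    // hypothetical sample text: 'a' is 1 byte, U+00E9 is 2 bytes,
    // U+20AC is 3 bytes, and U+1F600 is 4 bytes in UTF-8
    let sample = "a\u{e9}\u{20ac}\u{1f600}";

    let (bytes, chars) = byte_and_char_counts(sample);
    println!("{} bytes, {} chars", bytes, chars);
    println!("chars start at byte offsets {:?}", char_start_offsets(sample));
    println!("first two chars: {:?}", first_n_chars(sample, 2));

    // get(range) returns None instead of panicking when the byte range
    // does not fall on character boundaries
    println!("bytes 0..2 -> {:?}", sample.get(0..2));
    println!("bytes 0..3 -> {:?}", sample.get(0..3));
}
```

Using get(range) rather than &s[range] is one way to avoid the panic that direct slicing raises when a range cuts through a multi-byte character.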
C++

    /////////////////////////////////////////////////
    // idioms::iteration_cpp::Iteration.cpp        //
    //   - idioms are styles of writing snippets   //
    //     in ways that are valued by language     //
    //     community                               //
    //   - this idiom code focuses on iteration    //
    // Jim Fawcett, 24 Jan 2021                    //
    /////////////////////////////////////////////////
    /*
      C++ std::string
      ----------------
      C++ std::string is a container of ascii chars
      with null terminator.
      - ascii characters are all 1 byte, so string
        instances can be indexed.

      References:
      -----------
      std::string
        https://en.cppreference.com/w/cpp/string/basic_string
      std::ranges
        https://en.cppreference.com/w/cpp/ranges
      std::string_view
        https://en.cppreference.com/w/cpp/string/basic_string_view
      std::vector
        https://en.cppreference.com/w/cpp/container/vector
      std::distance
        https://en.cppreference.com/w/cpp/iterator/distance
      STL std::algorithms
        https://jimfawcett.github.io/CppStory_LibrarySTL.html#algor
      std::all_of ...
        https://en.cppreference.com/w/cpp/algorithm/all_any_none_of
    */
    #include <string>
    #include <ranges>
    #include <iostream>
    #include <locale>
    #include <cctype>      // std::isalpha, std::isalnum, std::isdigit
    #include <algorithm>
    #include <string_view>
    #include <typeinfo>
    #include <iterator>
    #include <vector>

    /*-- helper function --*/

    void putln(size_t num = 1) {
      for(size_t i=0; i<num; ++i)
        std::cout << "\n";
    }
    /*-- basic string iteration demos --*/

    void string_iteration() {
      std::string test_string = "a test string";
      std::cout << "\n  ascii characters from "
                << test_string << "\n    ";
      std::string::iterator iter = test_string.begin();
      while(iter != test_string.end()) {
        char ch = *iter;
        std::cout << ch << " ";
        ++iter;
      }
      std::cout << "\n  test_string: " << test_string;
      putln();
    }
    /*-- idiomatic string iteration demos --*/

    void idiomatic_string_iteration() {
      std::string test_string = "another test string";
      std::cout << "\n  idiomatic ascii chars from:\n    "
                << test_string << "\n    ";
      for(auto ch : test_string) {
        std::cout << ch << " ";
      }
      putln();
      /*-- chars are 1 byte, so indexing is valid --*/
      char ch2 = test_string[1];
      std::cout << "\n  2nd character of "
                << test_string << " is " << ch2;
      std::cout << "\n  test_string: " << test_string;
      putln();
    }
    /*-----------------------------------------------
      demonstrate all_of(...), is_alphabetic, is_...,
      ranges, and string_view.
    */
    /*-- helper function show collection items --*/

    template<typename C>
    void put_coll(
      C& coll, const std::string& prefix = ""
    ) {
      std::cout << prefix;
      for(auto item : coll) {
        std::cout << item;
      }
    }
    /*-- helper function, displays test results --*/

    void test(
      bool pred, const std::string& src,
      const std::string& category
    ) {
      if(pred) {
        std::cout << "\n  " << src << " is "
                  << category;
      }
      else {
        std::cout << "\n  " << src << " is not "
                  << category;
      }
    }
    /*-- demonstrate string adapter functions --*/

    void string_adapters() {
      std::string ls = "abc123";

      /*-- are all chars alphabetic --*/
      /*-- cast avoids undefined behavior for negative char values --*/
      auto is_alpha = [](char ch) -> bool {
        return std::isalpha(static_cast<unsigned char>(ch));
      };
      test(
        std::all_of(ls.begin(), ls.end(), is_alpha),
        ls, "alphabetic"
      );
      /*-- are all chars alphanumeric --*/
      auto is_alnum = [](char ch) -> bool {
        return std::isalnum(static_cast<unsigned char>(ch));
      };
      test(
        std::all_of(ls.begin(), ls.end(), is_alnum),
        ls, "alphanumeric"
      );
      /*-- are all chars ascii --*/
      auto is_ascii = [](char ch) -> bool {
        return 0 <= ch && ch < 128;
      };
      test(
        std::all_of(ls.begin(), ls.end(), is_ascii),
        ls, "ascii"
      );
      /*-- are all chars numeric --*/
      auto is_num = [](char ch) -> bool {
        return std::isdigit(static_cast<unsigned char>(ch));
      };
      test(
        std::all_of(ls.begin(), ls.end(), is_num),
        ls, "numeric"
      );
      putln();

      /*-- using range::view with pipe --*/
      auto r = ls | std::views::filter(is_num);
      put_coll(r, "\n  r is ");
      /*-- numeric if numeric range, r, is same size as ls --*/
      test(
        static_cast<size_t>(
          std::distance(r.begin(), r.end())
        ) == ls.size(),
        ls, "numeric"
      );
      putln();

      /*-- display chars from str slice --*/
      ls = "abc123";
      /*-- non-owning view --*/
      std::string_view slice{ ls };
      slice.remove_prefix(2);
      slice.remove_suffix(2);
      std::cout << "\n  third and fourth chars of "
                << ls << " are " << slice;
      put_coll(slice, "\n  slice is ");
      std::cout << "\n  ls is still " << ls;
      putln();

      /*---------------------------------------------
        Form string from numeric chars in source, ls.
        Uses std::ranges adapter std::views::filter.
      */
      auto results = ls | std::views::filter(is_num);
      std::cout << "\n  numeric chars of "
                << ls << " are ";
      for(auto ch : results) {
        std::cout << ch;
      }
      put_coll(results, "\n  ");
      /*---------------------------------------------
        The results item has a very ugly type.
        Uncomment the line below to see it. That means
        that std::cout << results; will fail to compile.
      */
      // std::cout << typeid(results).name();
      putln();
    }
    /*-----------------------------------------------
      Define and iterate through byte array
      - byte is an alias for short int, not an 8-bit
        type; std::cout prints its values as numbers
    */
    using byte = short int;
    using Iter = byte*;

    void define_and_iterate_byte_array() {
      byte ba[] = { 1, 2, 3, 4, 5 };
      std::cout << "\n  ";
      for(
        Iter it=std::begin(ba); it != std::end(ba); ++it
      )
        std::cout << *it << " ";
      putln();
    }

    void idiomatic_define_and_iterate_byte_array() {
      short int ba[] = { 1, 2, 3, 4, 5 };
      std::cout << "\n  ";
      for(auto i : ba) {
        std::cout << i << " ";
      }
      put_coll(ba, "\n  ");
      putln();
      /*-- print all but last with trailing commas --*/
      std::cout << "\n  [";
      auto temp = ba | std::views::take(4);
      for(auto i : temp) {
        std::cout << i << ", ";
      }
      /*-- print last one without trailing comma --*/
      auto iter = std::end(ba);
      auto last = *(--iter);
      std::cout << last << "]";
    }

    int main() {
      std::cout << "\n  -- demonstrate iteration --\n";

      std::cout << "\n  -- string iteration --";
      string_iteration();
      idiomatic_string_iteration();

      std::cout << "\n  -- string iteration adapters --";
      string_adapters();

      std::cout << "\n\n  -- byte array iteration --";
      define_and_iterate_byte_array();
      std::cout << "\n  -- idiomatic byte array iter'n --";
      idiomatic_define_and_iterate_byte_array();

      std::cout << "\n\n  That's all Folks!\n\n";
    }

Output

      -- demonstrate iteration --

      -- string iteration --
      ascii characters from a test string
        a   t e s t   s t r i n g
      test_string: a test string

      idiomatic ascii characters from:
        another test string
        a n o t h e r   t e s t   s t r i n g

      2nd character of another test string is n
      test_string: another test string

      -- string iteration adapters --
      abc123 is not alphabetic
      abc123 is alphanumeric
      abc123 is ascii
      abc123 is not numeric

      r is 123
      abc123 is not numeric

      third and fourth chars of abc123 are c1
      slice is c1
      ls is still abc123

      numeric chars of abc123 are 123
      123

      -- byte array iteration --
      1 2 3 4 5

      -- idiomatic byte array iteration --
      1 2 3 4 5
      12345

      [1, 2, 3, 4, 5]

      That's all Folks!

Rust

    /////////////////////////////////////////////////
    // idioms::iteration::main.rs                  //
    //   - idioms are styles of writing snippets   //
    //     in ways that are valued by language     //
    //     community                               //
    //   - this idiom code focuses on iteration    //
    // Jim Fawcett, 24 Jan 2021                    //
    /////////////////////////////////////////////////
    /*
      Rust std::String
      ----------------
      Rust std::String is a container of utf8 chars
      with no null terminator.
      - utf8 characters may consist of 1 up to 4
        bytes, so String instances can not be indexed.
        Character boundaries are defined by specific
        byte sequences, used by String's chars()
        iterator to return characters.

      References:
      -----------
      String
        https://doc.rust-lang.org/std/string/struct.String.html
      chars()
        https://doc.rust-lang.org/std/string/struct.String.html#method.chars
      slice
        https://doc.rust-lang.org/std/slice/index.html
      iterator
        https://doc.rust-lang.org/std/iter/trait.Iterator.html
      array
        https://doc.rust-lang.org/std/primitive.array.html
      Vector
        https://doc.rust-lang.org/std/vec/struct.Vec.html
    */

    fn string_iteration() {
        let test_string = String::from("a test string");
        /* chars() returns iter over utf8 chars */
        let mut iter = test_string.chars();
        print!(
            "\n  utf8 characters from {:?}:\n    ",
            &test_string
        );
        loop {
            /*-- iter returns std::Option<char> --*/
            let opt = iter.next();
            if opt.is_none() {
                break;
            }
            /*-- unwrap char from Some(char) --*/
            print!("{} ", opt.unwrap());
        }
        let ls = test_string.as_str();
        print!("\n  test_string: {:?}", ls);
        println!();
    }

    fn idiomatic_string_iteration() {
        let test_string = String::from("another test string");
        print!(
            "\n  idiomatic utf8 chars from {:?}:\n    ",
            &test_string
        );
        for ch in test_string.chars() {
            /*-- option handling implicit here --*/
            print!("{} ", ch);
        }
        /*-- nth(i) returns Option --*/
        let i = 1;
        let rslt = test_string.chars().nth(i);
        if let Some(ch) = rslt {
            print!(
                "\n  at index {} char of {:?} is {}",
                i, test_string, ch
            );
        }
        else {
            print!("\n  index {} out of range", i);
        }
        let ls = test_string.as_str();
        print!("\n  test_string: {:?}", ls);
        println!();
    }
    /*-----------------------------------------------
      demonstrate chars(), is_alphabetic, is_...,
      for_each, filter, and collect

      There are many iterator adapters. These are
      some of the most often used.
    */
    fn string_adapters() {
        let ls = "abc123";
        /*-- are all chars alphabetic? --*/
        print!(
            "\n  {:?} is alphabetic {}", ls,
            ls.chars().all(|c| c.is_alphabetic())
        );
        /*-- are all chars alphanumeric? --*/
        print!(
            "\n  {:?} is alphanumeric {}", ls,
            ls.chars().all(|c| c.is_alphanumeric())
        );
        /*-- are all chars numeric? --*/
        print!(
            "\n  {:?} is numeric {}", ls,
            ls.chars().all(|c| c.is_numeric())
        );
        /*-- are all chars ascii? --*/
        print!(
            "\n  {:?} is ascii {}", ls,
            ls.chars().all(|c| c.is_ascii())
        );
        /*-- display chars from str slice --*/
        /*-- min and max are byte offsets, valid here
             because ls holds only 1 byte chars --*/
        let (min, max) = (2usize, 4usize);
        if min <= ls.len() && max <= ls.len() && min <= max {
            let slice = &ls[min..max];
            print!(
                "\n  3rd and 4th chars of {:?} are: ", ls
            );
            slice.chars()
                 .for_each(|c| print!("{}", c));
        }
        else {
            print!(
                "\n  invalid {} and {} for {}",
                min, max, ls
            )
        }
        /*-- form string from numeric chars in source, ls --*/
        print!(
            "\n  numeric chars of {:?} are: {:?}", ls,
            ls.chars()
              .filter(|c| c.is_numeric())
              .collect::<String>()
        );
        println!();
    }
    /*
      Rust byte arrays
      ----------------
      Rust arrays have sizes that must be determined
      at compile-time, even those created on the heap.

      Rust Vectors have sizes that can be determined
      at run-time, and they will readily give access
      to their internal heap-based arrays by taking
      slices.

      This is perfectly data-safe, because:
      - slices have a len() function
      - even if you index past the end of the array,
        you can't read or write that memory, because
        a panic occurs immediately.
    */
    fn define_and_iterate_byte_array() {
        let ba: [u8; 5] = [1, 2, 3, 4, 5];
        // size is determined at compile-time, even for
        // arrays created on the heap (unlike C++)
        let max = ba.len();
        print!("\n  bytes from byte array:\n  [");
        /*-- display all but last --*/
        for i in 0..max-1 {
            print!("{}, ", ba[i]);
        }
        /*-- display last byte --*/
        print!("{}]", ba[max-1]);
    }

    fn idiomatic_define_and_iterate_byte_array() {
        let v: Vec<u8> = vec![5, 4, 3, 2, 1];
        let ba: &[u8] = &v[..];
        /*-------------------------------------------
          type of slice of Vec<u8> is byte slice: &[u8]
          - slices implement len() function
          - &v[..] slice of all elements of v
          - &v[m..n] slice of elements m up to,
            but not including n
          - Length of slice &v[..] determined by
            length of v, which can be determined
            at run-time.
        */
        print!("\n  idiomatic bytes from byte array:");
        print!(
            "\n  length of byte slice: {}", ba.len()
        );
        let max = ba.len();
        /*-- print all but the last --*/
        print!("\n  [");
        for item in ba.iter().take(max-1) {
            print!("{}, ", item);
        }
        /*-- print last one without trailing comma --*/
        print!("{}]", ba[max - 1]);
        print!(
            "\n  printing array with implicit iteration:"
        );
        print!("\n  {:?}", ba);
    }

    fn main() {
        print!("\n  -- demonstrate iteration --\n");

        print!("\n  -- string iteration --");
        string_iteration();
        idiomatic_string_iteration();

        print!("\n  -- string iteration adapters --");
        string_adapters();

        print!("\n\n  -- byte array iteration --");
        define_and_iterate_byte_array();
        idiomatic_define_and_iterate_byte_array();

        print!("\n\n  That's all Folks!\n");
    }

Output

      -- demonstrate iteration --

      -- string iteration --
      utf8 characters from "a test string":
        a   t e s t   s t r i n g
      test_string: "a test string"

      idiomatic utf8 chars from "another test string":
        a n o t h e r   t e s t   s t r i n g
      at index 1 char of "another test string" is n
      test_string: "another test string"

      -- string iteration adapters --
      "abc123" is alphabetic false
      "abc123" is alphanumeric true
      "abc123" is numeric false
      "abc123" is ascii true
      third and fourth chars of "abc123" are: c1
      numeric chars of "abc123" are: "123"

      -- byte array iteration --
      bytes from byte array:
      [1, 2, 3, 4, 5]
      idiomatic bytes from byte array:
      length of byte slice: 5
      [5, 4, 3, 2, 1]
      printing array with implicit iteration:
      [5, 4, 3, 2, 1]

      That's all Folks!

C#

    To appear eventually
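The Rust byte-array demos print every element but the last with a trailing comma, then index the last element with max - 1, which would fail on an empty slice. The sketch below shows two alternative formulations that also handle the empty case. The function names format_bytes and format_bytes_split_last are hypothetical and are not part of the original demo; map, collect, join, and split_last are standard library calls.

```rust
/// Formats a byte slice as "[1, 2, 3]"; an empty slice yields "[]".
/// Uses the map and collect adapters, then joins with a separator.
fn format_bytes(bytes: &[u8]) -> String {
    let items: Vec<String> = bytes.iter().map(|b| b.to_string()).collect();
    format!("[{}]", items.join(", "))
}

/// Same result, but keeps the "all but last, then last" structure of the
/// demo. split_last returns None for an empty slice, so no index arithmetic
/// is needed.
fn format_bytes_split_last(bytes: &[u8]) -> String {
    match bytes.split_last() {
        None => String::from("[]"),
        Some((last, rest)) => {
            let mut out = String::from("[");
            for b in rest {
                out.push_str(&format!("{}, ", b));
            }
            out.push_str(&format!("{}]", last));
            out
        }
    }
}

fn main() {
    let v: Vec<u8> = vec![5, 4, 3, 2, 1];
    println!("{}", format_bytes(&v));
    println!("{}", format_bytes_split_last(&v[..]));
    // an empty slice is handled without panicking
    println!("{}", format_bytes(&[]));
}
```

The first version allocates a temporary Vec<String>, which is fine for small demos; the second avoids that allocation at the cost of a little more code.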

4. Epilogue

The following pages provide sequences of code examples for idioms and principles in each of the three languages cited here: C#, C++, and Rust. Object model differences are often pointed out in comments within the code blocks.