The Code/Projects/ folder holds multi-language implementations of software
tools, each built to the same design and command-line interface so the languages can be
compared side-by-side. Currently two project families are included.
| Family | Languages | Description |
|---|---|---|
| TextFinder | C++23, C#, Python, Rust, Rust (opt) | Walk a directory tree and report files whose content matches a regular expression |
| PageValidator | Rust, C++23, C#, Python | Validate HTML files for structural correctness against eight rules |

    CommandLine   DirNav   Output
           \        |        /
                EntryPoint

Options are given as `/Key [Value]` tokens from argv:

| Option | Meaning | Default |
|---|---|---|
| `/P <path>` | Root directory to search | `.` (current directory) |
| `/p <ext,...>` | File extensions to include | all files |
| `/r <regex>` | Regular expression matched against file content | `.` (any) |
| `/s` | Recurse into subdirectories | true |
| `/H` | Show only directories with matches | true |
| `/h` | Print help and exit | |
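
As an illustration of the `/Key [Value]` convention in the table above, here is a minimal Python sketch of an option parser. The function name `parse_options`, the `DEFAULTS` dictionary, and the rule that a key without a value becomes `"true"` are assumptions made for this example, not the projects' actual code.

```python
from typing import Dict, List

# Defaults copied from the option table; storing them as strings in a dict
# is an assumption made for this sketch.
DEFAULTS: Dict[str, str] = {"P": ".", "p": "", "r": ".", "s": "true", "H": "true"}


def _is_key(token: str) -> bool:
    """A key is a slash followed by exactly one character, e.g. /P or /h."""
    return len(token) == 2 and token.startswith("/")


def parse_options(argv: List[str]) -> Dict[str, str]:
    """Parse /Key [Value] tokens; a key with no value is stored as 'true'."""
    options = dict(DEFAULTS)
    i = 0
    while i < len(argv):
        token = argv[i]
        if not _is_key(token):
            raise ValueError(f"expected /Key, got {token!r}")
        key = token[1]
        # The next token is this key's value unless it is itself a key.
        if i + 1 < len(argv) and not _is_key(argv[i + 1]):
            options[key] = argv[i + 1]
            i += 2
        else:
            options[key] = "true"
            i += 1
    return options


if __name__ == "__main__":
    print(parse_options(["/P", "/usr/src", "/p", "rs,py", "/r", "TODO", "/H"]))
```

One limitation of this sketch: a value that is itself a slash plus one character (for example the path `/a`) would be read as a key, so a real parser needs a stricter rule.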

TextFinder timing results:

| Variant | Min (s) | Median (s) | Max (s) |
|---|---|---|---|
| PyTextFinder | 0.222 | 0.281 | 0.715 |
| RustTextFinderOpt | 0.536 | 0.610 | 1.034 |
| CppTextFinder | 0.568 | 0.647 | 0.706 |
| CsTextFinder | 0.827 | 1.053 | 1.456 |
| RustTextFinder | 0.873 | 0.905 | 1.402 |
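
Python is the fastest variant in this run, plausibly because its hot path
(`os.scandir`, `re.search`) runs in C.
C++ trails it partly because common `std::regex` implementations are slow
backtracking matchers. Python's `re` is also a backtracking engine, but a heavily
optimized one written in C, while the `regex` crate commonly used in Rust is built
on finite automata.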
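
The hot path mentioned above can be sketched in a few lines of Python. This is an illustration only, not the PyTextFinder source: the function name `find_matches` and its parameters are hypothetical, and files are read as UTF-8 with undecodable bytes ignored.

```python
import os
import re
from typing import Iterator, Sequence


def find_matches(root: str, pattern: str, exts: Sequence[str] = (),
                 recurse: bool = True) -> Iterator[str]:
    """Yield paths of files under root whose content matches pattern."""
    regex = re.compile(pattern)
    suffixes = tuple("." + e for e in exts)  # empty tuple means "all files"
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    if recurse:
                        stack.append(entry.path)
                elif entry.is_file():
                    if suffixes and not entry.name.endswith(suffixes):
                        continue
                    try:
                        with open(entry.path, encoding="utf-8",
                                  errors="ignore") as handle:
                            text = handle.read()
                    except OSError:
                        continue  # unreadable file: skip it
                    if regex.search(text):
                        yield entry.path


if __name__ == "__main__":
    for path in find_matches(".", r"TODO", exts=["py"]):
        print(path)
```

Both `os.scandir` and the compiled pattern's `search` method are implemented in C in CPython, so the interpreter mostly runs the thin loop between them.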
Tokenizer ← Lexer ← Validator ← EntryPoint
| Rule ID | Description |
|---|---|
| doctype | Document begins with `<!DOCTYPE html>` |
| root-element | Exactly one `<html>` element wraps the entire document |
| head-required | `<head>` is present and contains at least one `<title>` |
| body-required | `<body>` is present |
| tag-nesting | Every open tag has a matching close tag in correct stack order |
| void-elements | Void elements (br, hr, img, input, link, meta, ...) carry no close tag |
| attr-quotes | All attribute values are enclosed in quotes |
| duplicate-id | The id attribute value is unique within the document |
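
To make three of the rules above concrete (tag-nesting, void-elements, duplicate-id), here is a small Python sketch built on the standard library's `html.parser`. It does not follow the Tokenizer/Lexer/Validator design of the actual projects. The class `RuleChecker`, the function `check`, and the message format are hypothetical, and the `VOID` set is the usual HTML void-element list rather than one taken from the source.

```python
from html.parser import HTMLParser
from typing import List, Set

# Standard HTML void elements (an assumption; the table above elides its list).
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class RuleChecker(HTMLParser):
    """Collects violations of tag-nesting, void-elements and duplicate-id."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[str] = []      # currently open non-void tags
        self.seen_ids: Set[str] = set()
        self.errors: List[str] = []

    def _record_ids(self, attrs) -> None:
        for name, value in attrs:
            if name == "id":
                if value in self.seen_ids:
                    self.errors.append(f"duplicate-id: {value}")
                self.seen_ids.add(value)

    def handle_starttag(self, tag, attrs):
        self._record_ids(attrs)
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: nothing is pushed on the stack.
        self._record_ids(attrs)

    def handle_endtag(self, tag):
        if tag in VOID:
            self.errors.append(f"void-elements: </{tag}>")
        elif self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"tag-nesting: unexpected </{tag}>")


def check(html: str) -> List[str]:
    """Return a list of 'rule-id: detail' strings for the given document."""
    checker = RuleChecker()
    checker.feed(html)
    checker.close()
    checker.errors.extend(f"tag-nesting: unclosed <{t}>" for t in checker.stack)
    return checker.errors


if __name__ == "__main__":
    print(check("<div><span></div>"))
```

The stack is what enforces "correct stack order": a close tag is accepted only if it matches the most recently opened tag.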

PageValidator timing results:

| Validator | Min (s) | Median (s) | Max (s) |
|---|---|---|---|
| C++ (Release) | 0.521 | 0.645 | 2.729 |
| Rust (Release) | 0.901 | 0.936 | 1.972 |
| C# (Release) | 1.127 | 1.290 | 1.538 |
| Python | 2.766 | 2.846 | 3.029 |

Three Python scripts in Code/Projects/ support timing and metrics collection.

    python tf_timer.py <program> [--runs N] [TextFinder options ...]

Program names: PyTextFinder, CsTextFinder, CppTextFinder,
RustTextFinder, RustTextFinderOpt.
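
The Min / Median / Max columns in the timing tables can be produced by a harness along the following lines. This is a hedged sketch, not the contents of `tf_timer.py`: the function name `time_command` and the choice to discard the program's output are assumptions.

```python
import statistics
import subprocess
import sys
import time
from typing import List, Sequence, Tuple


def time_command(cmd: Sequence[str], runs: int = 5) -> Tuple[float, float, float]:
    """Run cmd `runs` times and return (min, median, max) wall time in seconds."""
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so terminal I/O does not dominate the measurement.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        samples.append(time.perf_counter() - start)
    return min(samples), statistics.median(samples), max(samples)


if __name__ == "__main__":
    # Stand-in command: start the Python interpreter and exit immediately.
    low, mid, high = time_command([sys.executable, "-c", "pass"], runs=3)
    print(f"min {low:.3f}  median {mid:.3f}  max {high:.3f}")
```

Reporting the median alongside min and max keeps a single slow outlier run from distorting the comparison.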

    python pa_timer.py [--site PATH] [--runs N]

    python code_metrics.py [path] [--html] [--html-only] [--no-recurse]

Recognized extensions: .py .cs .cpp .c .h .hpp .ixx .rs .js .ts .jsx .tsx .java .go
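
The source does not say which metrics `code_metrics.py` collects. As one plausible example, the sketch below counts lines per recognized extension. The function name `count_lines` and the choice of line count as the metric are assumptions; only the extension list comes from the text above.

```python
import os
from collections import Counter
from typing import Dict

# Extension list taken from the "Recognized extensions" line above.
EXTENSIONS = {".py", ".cs", ".cpp", ".c", ".h", ".hpp", ".ixx", ".rs",
              ".js", ".ts", ".jsx", ".tsx", ".java", ".go"}


def count_lines(path: str = ".", recurse: bool = True) -> Dict[str, int]:
    """Return total line counts keyed by file extension."""
    totals: Counter = Counter()
    for dirpath, dirnames, filenames in os.walk(path):
        if not recurse:
            dirnames.clear()  # stops os.walk from descending any further
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in EXTENSIONS:
                full = os.path.join(dirpath, name)
                with open(full, encoding="utf-8", errors="ignore") as handle:
                    totals[ext] += sum(1 for _ in handle)
    return dict(totals)


if __name__ == "__main__":
    for ext, lines in sorted(count_lines(".").items()):
        print(f"{ext:6} {lines}")
```

The `recurse` parameter mirrors the idea behind the `--no-recurse` flag; how the real script implements that flag is not shown in the source.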