PyTextFinder

recursive, regex-based file-content search tool

Concept:

A recursive, regex-based file-content search tool written in Python 3.10+. Given a root directory, a set of file extensions, and a regular expression, it walks the directory tree and prints every file whose content contains a match. No build step required — run directly with Python.

Packages:

The project consists of four packages: three independent libraries and an entry point, which is the only place they are wired together.

Package      Kind        Role
CommandLine  library     Parses /Key [Value] or -Key [Value] command-line tokens into a key-value map
DirNav       library     Depth-first directory walker; fires callable delegates for each directory and file
Output       library     Reads file content, tests it against a compiled regex, and writes matching results to the console
EntryPoint   executable  Application: wires the packages, parses options, and drives the search

Quick Start:

# Search the current directory tree for Python files containing "def "
python EntryPoint/PyTextFinder.py /P . /p py /r "def "

# Git Bash / MINGW users: use the - prefix, because the shell rewrites arguments that start with / as Windows paths
python EntryPoint/PyTextFinder.py -P . -p py -r "def "

Command-Line Options:

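Options accept either a / prefix (Windows / PowerShell / cmd) or a - prefix (bash / MINGW); both are equivalent. Option keys are case-sensitive: /P (root path) and /p (extensions) are different options, as are /H and /h.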

Option  Argument    Default      Meaning
/P      path        .            Root directory for the search
/p      extensions  (all files)  Comma-separated file extensions to include, e.g. py,txt
/r      regex       . (any)      Regular expression matched against file content
/s      true/false  true         Recurse into subdirectories
/H      true/false  true         true: print a directory only when it has a matching file (clean output). false: print every directory as it is entered (real-time progress).
/v      (flag)      off          Print all resolved options before searching
/h      (flag)      off          Print help and exit
-T      (flag)      off          Run the built-in test suite and exit

Examples:
# Find all Python files containing "class" under the repo root
python EntryPoint/PyTextFinder.py /P . /p py /r "class"

# Search source and markdown files for a TODO comment, show all directories
python EntryPoint/PyTextFinder.py /P . /p "py,md" /r "TODO" /H false

# Verbose output - shows resolved path, extensions, and regex before searching
python EntryPoint/PyTextFinder.py /P .. /p py /r "def " /v
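
The defaults listed in the options table could be applied to a parsed key-value map roughly as follows. This is a minimal sketch, not the project's code: the function name resolve_options and the result keys (path, extensions, and so on) are hypothetical, and it assumes the parser stores keys without their / or - prefix.

```python
def resolve_options(raw: dict[str, str]) -> dict[str, object]:
    """Apply the documented defaults to a parsed option map (hypothetical helper)."""

    def as_bool(key: str, default: bool) -> bool:
        # Option values arrive as strings such as "true" or "false".
        return raw.get(key, str(default)).strip().lower() == "true"

    ext_text = raw.get("p", "")
    return {
        "path": raw.get("P", "."),
        # "py,md" -> ["py", "md"]; an empty list stands for "all files".
        "extensions": [e.strip().lstrip(".") for e in ext_text.split(",") if e.strip()],
        "regex": raw.get("r", "."),
        "recurse": as_bool("s", True),
        "hide_empty": as_bool("H", True),
        "verbose": "v" in raw,
    }


if __name__ == "__main__":
    print(resolve_options({"p": "py,md", "H": "false"}))
```

Keeping the extension list empty when /p is absent lets the walker treat "no filter" and "all files" as the same case.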

Output:

Matching files are grouped under their containing directory:
  PyTextFinder
 ==============
  searching path: "."
  extensions: ["py"]
  matching files with regex: "def "

  ./DirNav
      "dir_nav.py"

  ./EntryPoint
      "PyTextFinder.py"

  2 files matched out of 8 visited in 4 dirs

Directories that contain no matching files are hidden by default (/H true). Set /H false to print every directory as it is entered.

Design:

CommandLine

Parses raw argv tokens into a dictionary of option keys to string values. A token starting with / or - is treated as a key; the next token (if it does not itself start with / or -) becomes its value, otherwise the value is set to "true".
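
The parsing rule above can be sketched in a few lines. The function name parse_args is hypothetical, and stripping the prefix from the stored key (so that /P and -P land on the same entry) is an assumption; the source only states that both prefixes are equivalent.

```python
def parse_args(tokens: list[str]) -> dict[str, str]:
    """Map option keys to string values (illustrative, not the project's API)."""
    options: dict[str, str] = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if len(token) > 1 and token[0] in ("/", "-"):
            key = token[1:]  # assumption: keys are stored without the prefix
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt is not None and nxt[:1] not in ("/", "-"):
                options[key] = nxt  # the next token is this key's value
                i += 1              # consume the value
            else:
                options[key] = "true"  # bare flag
        i += 1
    return options


if __name__ == "__main__":
    print(parse_args(["/P", ".", "-p", "py", "/r", "def ", "/v"]))
```

One consequence of the rule as stated: a value that itself begins with / or -, such as an absolute POSIX path, would be read as a key rather than a value.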

DirNav

Depth-first directory walker. Callers register callables:
on_dir(path: str)   # called when entering a directory
on_file(path: str)  # called for each file whose extension is in the filter list
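DirNav calls on_dir once per directory, as it is entered, and on_file for each file that passes the extension filter (every file when no extensions are given). Build-output and VCS directories are skipped automatically and never entered; the names are listed under Excluded directories.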
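
A depth-first walker with these two callbacks might look like the sketch below. The function name walk, its parameters, and the choice to visit files before subdirectories are assumptions for illustration; the excluded-directory check is left out to keep the sketch short.

```python
import os
from typing import Callable


def walk(root: str, extensions: list[str],
         on_dir: Callable[[str], None],
         on_file: Callable[[str], None],
         recurse: bool = True) -> None:
    """Depth-first traversal that reports directories and matching files."""
    on_dir(root)
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)  # stable, predictable order
    subdirs = []
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            subdirs.append(entry.path)
        elif entry.is_file():
            ext = os.path.splitext(entry.name)[1].lstrip(".")
            # An empty extension list means "all files".
            if not extensions or ext in extensions:
                on_file(entry.path)
    if recurse:
        for sub in subdirs:
            walk(sub, extensions, on_dir, on_file, recurse)


if __name__ == "__main__":
    walk(".", ["py"], lambda d: print("dir: ", d), lambda f: print("file:", f))
```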

Output

Holds a compiled regex pattern and exposes on_dir(path) and on_file(path) methods matching the DirNav callable signatures. on_file reads the file content and applies the regex. When a match is found the filename is printed; the containing directory header is buffered and flushed on the first match (implementing the hide-empty-dirs logic for /H true).
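
The read-and-match step inside on_file can be sketched as a small helper. The name file_matches is hypothetical, and both the UTF-8 decoding with replacement characters and the choice to treat unreadable files as non-matching are assumptions rather than documented behaviour.

```python
import re


def file_matches(path: str, pattern: re.Pattern[str]) -> bool:
    """Return True if the file's text contains at least one regex match."""
    try:
        # errors="replace" keeps binary or mis-encoded files from raising.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            return pattern.search(handle.read()) is not None
    except OSError:
        return False  # assumption: unreadable files count as non-matching


if __name__ == "__main__":
    print(file_matches(__file__, re.compile(r"def ")))
```

Using pattern.search rather than pattern.match lets the expression hit anywhere in the content, which fits the "content contains a match" description.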

EntryPoint (PyTextFinder.py)

Wires the three libraries together. PyTextFinder.py parses options via the CommandLine package, constructs an Output instance with the resolved regex, then constructs a DirNav instance and registers output.on_dir and output.on_file as the walk callables before starting the traversal. The three libraries never import each other; all coupling flows through EntryPoint.
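
The wiring pattern can be illustrated with two stand-in classes that know nothing about each other. Walker, Reporter, and run_search are hypothetical names used only for this sketch; the real DirNav and Output classes may expose different constructors and methods.

```python
import os
import re
from typing import Callable


class Walker:
    """Stand-in for DirNav: knows nothing about regexes or printing."""

    def __init__(self) -> None:
        self.on_dir: Callable[[str], None] = lambda path: None
        self.on_file: Callable[[str], None] = lambda path: None

    def visit(self, root: str) -> None:
        for dirpath, _dirnames, filenames in os.walk(root):
            self.on_dir(dirpath)
            for name in sorted(filenames):
                self.on_file(os.path.join(dirpath, name))


class Reporter:
    """Stand-in for Output: knows nothing about directory traversal."""

    def __init__(self, pattern: re.Pattern[str]) -> None:
        self.pattern = pattern
        self.matches: list[str] = []

    def on_dir(self, path: str) -> None:
        pass  # header handling omitted in this sketch

    def on_file(self, path: str) -> None:
        with open(path, encoding="utf-8", errors="replace") as handle:
            if self.pattern.search(handle.read()):
                self.matches.append(path)


def run_search(root: str, regex: str) -> list[str]:
    """Entry-point role: the only place where both classes meet."""
    reporter = Reporter(re.compile(regex))
    walker = Walker()
    walker.on_dir = reporter.on_dir    # bound methods serve as the callables
    walker.on_file = reporter.on_file
    walker.visit(root)
    return reporter.matches


if __name__ == "__main__":
    print(run_search(".", r"def "))
```

Because the walker only sees plain callables, either side can be replaced or tested in isolation.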

Hide/show directory logic

When /H true (the default), a directory header is buffered rather than printed immediately. It is flushed to output only when the first matching file in that directory is found. Directories with no matches are never printed. When /H false, each header is printed as soon as the directory is entered.
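
The buffering described here needs only one pending header at a time. The sketch below is illustrative: the class name DirHeaderBuffer, the on_match method, and the injectable write function are assumptions, not the project's Output API.

```python
from typing import Callable, Optional


class DirHeaderBuffer:
    """Print a directory header lazily, on the first match inside it."""

    def __init__(self, hide_empty: bool = True,
                 write: Callable[[str], None] = print) -> None:
        self.hide_empty = hide_empty
        self.write = write
        self.pending: Optional[str] = None

    def on_dir(self, path: str) -> None:
        if self.hide_empty:
            # Buffer the header; an unflushed header from a previous
            # directory is simply overwritten and never printed.
            self.pending = path
        else:
            self.write(path)

    def on_match(self, filename: str) -> None:
        if self.pending is not None:
            self.write(self.pending)  # first match: flush the header
            self.pending = None
        self.write(f'    "{filename}"')


if __name__ == "__main__":
    buf = DirHeaderBuffer(hide_empty=True)
    buf.on_dir("./Empty")
    buf.on_dir("./DirNav")
    buf.on_match("dir_nav.py")
```

In this run the ./Empty header is dropped because no match arrives before the next directory is entered.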

Excluded directories

Language / tool  Skipped names
C# / .NET        bin, obj
Rust             target
C++              build, out
Python           __pycache__, .venv, venv, dist
VCS / IDE        .git, .vs, .idea
Archives         archive
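
One way to guarantee that excluded directories are never entered is to prune them before descending. The sketch below uses the standard library's os.walk instead of the project's DirNav, so it shows the technique rather than the actual implementation; the names EXCLUDED_DIRS and walk_pruned are hypothetical, and exact, case-sensitive name matching is an assumption.

```python
import os
from typing import Iterator

# Names taken from the table above.
EXCLUDED_DIRS = frozenset({
    "bin", "obj", "target", "build", "out",
    "__pycache__", ".venv", "venv", "dist",
    ".git", ".vs", ".idea", "archive",
})


def walk_pruned(root: str) -> Iterator[tuple[str, list[str]]]:
    """Yield (dirpath, filenames) without descending into excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk (top-down mode)
        # which subdirectories to skip entirely.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        yield dirpath, sorted(filenames)


if __name__ == "__main__":
    for dirpath, filenames in walk_pruned("."):
        print(dirpath, len(filenames))
```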

Running the Tool:

No build step is required. Python 3.10+ must be installed.
# Verify Python version
python --version    # should show 3.10 or later

# Run directly from the repo root
python EntryPoint/PyTextFinder.py -P . -p py -r "def "

# PowerShell / cmd.exe can use / prefix
python EntryPoint/PyTextFinder.py /P . /p py /r "def "

Testing:

# Run the built-in test suite via the -T flag
python EntryPoint/PyTextFinder.py -T

# Or use the standard unittest runner
python -m unittest discover -s . -p "test_*.py"

# Functional smoke test - search this repo for Python files containing "def "
python EntryPoint/PyTextFinder.py -P . -p py -r "def " -v

External Dependencies:

Module       Source         Used by                  Purpose
re           Python stdlib  Output                   Compile and match regular expressions against file content
os, os.path  Python stdlib  DirNav                   Directory enumeration and path manipulation
sys          Python stdlib  CommandLine, EntryPoint  Command-line argument access
unittest     Python stdlib  test modules             Unit testing framework

All functionality uses the Python standard library only. No pip packages required.