Brainstorming session with Gemini, 2026 03 21
Smart Regex (//) Engine Specification v1.0
A Context-Aware Stream-Editing Pre-Processor for Code Refactoring
1. Core Philosophy
The Smart Regex engine translates programmer intent into robust POSIX/PCRE regular expressions. It operates on syntactic tokens rather than literal characters.
- Intent Over Literal: It parses identifiers, values, and structural boundaries.
- Whitespace Agnostic: A literal space in the search string automatically matches any amount of horizontal whitespace (`[ \t]*`).
- Format Preserving: Operations on captured tokens (like math or slicing) preserve the radix, prefix, and delimiter formatting of the original source text.
- Lexical Safety: Smart Atoms will not match inside string literals or comment blocks unless explicitly instructed to do so.
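The translation step described above can be sketched roughly as follows. This is a minimal illustration, not the engine's implementation: the `ATOMS` table, the `translate` function, and the exact expansions (including the `0b` binary prefix) are assumptions, and Lexical Safety is not modelled.

```python
import re

# Hypothetical expansions for two atoms; the real engine may differ.
ATOMS = {
    r"\i": r"\b[A-Za-z_][A-Za-z0-9_]*\b",
    r"\n": r"(?:0[xX][0-9A-Fa-f]+|\$[0-9A-Fa-f]+|0[bB][01]+|\d+)",
}

def translate(smart: str) -> str:
    """Turn a Smart search string into an ordinary regex string."""
    out = []
    i = 0
    while i < len(smart):
        two = smart[i:i + 2]
        if two in ATOMS:
            # Each atom becomes its own capture group.
            out.append("(" + ATOMS[two] + ")")
            i += 2
        elif smart[i] == " ":
            # Whitespace Agnostic: one literal space matches any horizontal run.
            out.append(r"[ \t]*")
            i += 1
        else:
            out.append(re.escape(smart[i]))
            i += 1
    return "".join(out)

rx = re.compile(translate(r"\i = \n"))
m = rx.search("counter_A   =\t0x0A")
print(m.group(1), m.group(2))
```

Under these assumptions the same pattern also matches `PORT_1=$0A`, since each space in the search string may match zero characters.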
2. Invocation (The Triggers)
Smart Mode is explicitly triggered to prevent parser ambiguity and preserve standard vi compatibility.
- Cursor/Visual Mode: Triggered by `//`
  - Example: `//\i = \n`
- Ex Command Mode: Triggered by a Double Delimiter immediately following the substitute command.
  - Example: `:s--\i=\n--...-` (Smart Mode ON, delimiter is `-`)
  - Example: `:s/\i=\n/.../` (Smart Mode OFF, standard regex)
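A rough sketch of how the Ex-mode trigger might be detected. The function name `parse_substitute` and its return shape are hypothetical; the sketch handles only an optional `%` range and does not address how a doubled `/` would be told apart from vi's empty-pattern form.

```python
def parse_substitute(cmd: str):
    """Return (smart_mode, delimiter, body) for an ex substitute command."""
    rest = cmd.lstrip(":")
    if rest.startswith("%"):
        # Only the whole-file range is recognised in this sketch.
        rest = rest[1:]
    if not rest.startswith("s") or len(rest) < 2:
        raise ValueError("not a substitute command")
    delim = rest[1]
    # Smart Mode is ON when the delimiter is immediately repeated.
    smart = len(rest) > 2 and rest[2] == delim
    body = rest[3:] if smart else rest[2:]
    return smart, delim, body

print(parse_substitute(r":s--\i=\n--...-"))
print(parse_substitute(r":s/\i=\n/.../"))
```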
3. Search Atoms (Left-Hand Side)
| Atom | Name | Description | Example Match |
|---|---|---|---|
| `\i` | Identifier | Valid C/Assembly identifier. Respects word boundaries. | `counter_A`, `PORT_1` |
| `\n` | Radix Number | Multi-base numeric value (Hex, Dec, Bin). | `10`, `$0A`, `0x0A` |
| `\q` | Quote/String | Complete string/char literal. Handles escaped quotes (`\"`). | `"Error: %d"`, `'c'` |
| `\c` | Comment | Complete comment block based on active filetype. | `// note`, `/* ... */`, `; init` |
| `\_` | Magic Blank | Bridges vertical whitespace and skips hexdump address gutters. | Spaces crossing a line break. |
| `(...)` | Enclosed List | Captures balanced, nested lists. Performs Depth-Zero split. | `func(a, b(c), d)` |
| `\...` | Naked List | Captures variadic lists until a terminator/newline. | `MACRO arg1, arg2` |
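The Depth-Zero split performed by the Enclosed List atom could look something like this. It is a sketch under simple assumptions: the helper name `depth_zero_split` is made up, the separator is a comma, and separators inside string literals are not handled.

```python
def depth_zero_split(text: str, sep: str = ","):
    """Split on sep only where bracket nesting depth is zero."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
    # The final element has no trailing separator.
    parts.append(text[start:].strip())
    return parts

# Contents of the parentheses in func(a, b(c), d)
print(depth_zero_split("a, b(c), d"))
```

The nested call `b(c)` stays a single element because its comma-free interior, and any commas it might contain, sit at depth one.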
4. Substitution Grammar (Right-Hand Side)
Captured groups can be referenced literally ($1, $2) or manipulated using an Evaluation Block: ${...}.
4.1 Array Slicing & Filtering
If a group contains a list (captured via `(...)` or `\...`), it can be sliced. The engine automatically re-joins elements using the original delimiter.
- Reorder: `${2:1,0,2}` (Swaps the first two arguments).
- Ranges: `${2:1-3,5-}` (Selects elements 1, 2, 3, and 5 to the end).
- Drop Mask (`!`): Starting the slice with `!` creates a "Keep All Except" boolean mask.
  - Example: `${2:!1}` drops the second argument (index 1) and heals the commas.
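The slice forms above can be sketched as a small interpreter over an already-split list. The function name `apply_slice` is hypothetical, the spec string is the part after the colon, open-start ranges such as `-3` are not handled, and re-joining with `", "` stands in for "the original delimiter".

```python
def apply_slice(items, spec):
    """Apply a slice spec such as '1,0,2', '1-3,5-' or '!1' to a list."""
    n = len(items)
    drop = spec.startswith("!")
    if drop:
        spec = spec[1:]
    picked = []
    for term in spec.split(","):
        term = term.strip()
        if "-" in term:
            lo, hi = term.split("-", 1)
            # An empty upper bound means "to the end".
            last = int(hi) if hi else n - 1
            picked.extend(range(int(lo), last + 1))
        else:
            picked.append(int(term))
    if drop:
        excluded = set(picked)
        return [x for i, x in enumerate(items) if i not in excluded]
    return [items[i] for i in picked if 0 <= i < n]

args = ["D", "OBSOLETE", "S"]
print(", ".join(apply_slice(args, "!1")))
```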
4.2 Radix-Aware Arithmetic
Perform math on captured \n tokens. The output string adopts the prefix and zero-padding of the original capture.
- Stateless Math: `${1 + 0x10}` shifts the captured value by a fixed offset.
- Stateful Sequencing: The `#` iterator tracks sequence state across a single global substitution command (e.g., `:%s--`).
  - Example: `${#}` outputs the sequence number, auto-incrementing on each match.
  - Example: `${#=100}` seeds the sequence at 100 for the first match.
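One way the format-preserving arithmetic and the `#` iterator might behave, sketched under assumptions: the names `shift` and `number_ports` are invented for illustration, binary is assumed to use a `0b` prefix, and the default seed of 0 is inferred from the port-numbering recipe rather than stated by the spec.

```python
import itertools
import re

# Assumed token shapes: 0x.. and $.. for hex, 0b.. for binary, bare digits for decimal.
NUM = re.compile(r"(0[xX]|\$|0[bB])?([0-9A-Fa-f]+)")

def shift(token, delta):
    """Add delta to a numeric token, keeping its prefix, width, and hex case."""
    m = NUM.fullmatch(token)
    if m is None:
        raise ValueError("not a numeric token: " + token)
    prefix, digits = m.group(1) or "", m.group(2)
    if prefix.lower() in ("0x", "$"):
        base, spec = 16, ("X" if digits == digits.upper() else "x")
    elif prefix.lower() == "0b":
        base, spec = 2, "b"
    else:
        base, spec = 10, "d"
    value = int(digits, base) + delta
    # Zero-pad back to the width of the original digits.
    return prefix + format(value, spec).zfill(len(digits))

def number_ports(text):
    """Mimic '#define PORT_${#} ${0x2100 + #}' with one counter per command."""
    seq = itertools.count(0)
    def repl(match):
        n = next(seq)  # the same value is used for both '#' occurrences
        return "#define PORT_%d %s" % (n, shift("0x2100", n))
    return re.sub(r"#define PORT_\w+", repl, text)

print(shift("0x0A", 0x10))
print(number_ports("#define PORT_X\n#define PORT_Y"))
```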
4.3 The Spatial Slice Matrix
Extract raw text spans and "glue" relative to the captured groups. The first character defines the Left Boundary; the second defines the Right Boundary.
- `><` (The Gap): `${1><2}` extracts only the literal text between the end of `$1` and the start of `$2`.
- `=<` (Left-Inclusive): `${1=<3}` extracts from the start of `$1` up to the start of `$3`.
- `>=` (Right-Inclusive): `${1>=3}` extracts from the end of `$1` up to the end of `$3`.
- `==` (Total Span): `${1==3}` extracts from the start of `$1` to the end of `$3`.
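The four operators map naturally onto the start and end offsets of capture groups. The sketch below uses Python's `re` match spans; the helper name `spatial` and the stand-in regex are assumptions, including the choice that group 3 contains the parentheses.

```python
import re

def spatial(m, a, op, b):
    """Return the text selected by ${a op b} for a regex match m."""
    # First character: '=' includes group a, '>' starts after it.
    left = m.start(a) if op[0] == "=" else m.end(a)
    # Second character: '=' includes group b, '<' stops before it.
    right = m.end(b) if op[1] == "=" else m.start(b)
    return m.string[left:right]

m = re.search(r"auto (\w+)\s*=\s*(\w+)(\(.*?\))", "auto result = calc(a, b)")
for op, b in (("><", 2), ("=<", 3), (">=", 3), ("==", 3)):
    print(repr(spatial(m, 1, op, b)))
```

Because the slice is taken from the source text rather than rebuilt from the groups, the spacing around `=` survives unchanged.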
5. Practical Recipes
1. C++ Argument Collapse (Spatial Slice)
Collapse a function call while keeping the assignment operator and exact spacing intact.
- Search: `:s--auto \i=\i(\...)--`
- Replace: `int ${1=<3}`
- Result: `auto result = calc(a, b)` $\rightarrow$ `int result = calc`
2. Assembly Sequence Generation (Math Evaluator)
Generate sequentially numbered #define hardware ports from a template.
- Search: `:%s--#define PORT_\i--`
- Replace: `#define PORT_${#} ${0x2100 + #}`
- Result: `#define PORT_0 0x2100` ... `#define PORT_1 0x2101`
3. Variadic Macro Refactoring (List Slicing)
Remove a deprecated middle argument (index 1) from an assembly macro list.
- Search: `:s--\i \... \_ \c--`
- Replace: `${1} ${2:!1} ${3}`
- Result: `MAC D, OBSOLETE, S ; note` $\rightarrow$ `MAC D, S ; note`
4. Safely Modifying Strings (Quote Atom)
Wrap all raw string literals in a localization macro.
- Search: `:%s--\q--`
- Replace: `_T(${1})`
- Result: `print("Hello \"world\"");` $\rightarrow$ `print(_T("Hello \"world\""));`
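For comparison, the last recipe can be approximated with a conventional regex. This is only a sketch of what `\q` might expand to: the `QUOTE` pattern and `wrap_strings` name are assumptions, and unlike the Smart Atom it makes no attempt to skip quotes that appear inside comments.

```python
import re

# Double- or single-quoted literal; a backslash escapes the following character.
QUOTE = re.compile(r'"(?:[^"\\]|\\.)*"' + r"|'(?:[^'\\]|\\.)*'")

def wrap_strings(src: str) -> str:
    """Wrap every quoted literal in _T(...)."""
    return QUOTE.sub(lambda m: "_T(" + m.group(0) + ")", src)

print(wrap_strings('print("Hello \\"world\\"");'))
```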