diff --git a/ISSUE_BACKLOG.md b/ISSUE_BACKLOG.md new file mode 100644 index 0000000..917d220 --- /dev/null +++ b/ISSUE_BACKLOG.md @@ -0,0 +1,115 @@ +# Issue Backlog + +The following items are written so they can be turned into GitHub issues later with minimal rewriting. + +## 1. Specify the template language grammar + +Define the concrete grammar for variables, expressions, control flow, includes, comments, inline mode, and block mode so lexer and parser work can start on a stable spec. + +## 2. Specify CLI behavior, inspection commands, and exit codes + +Document flags, positional input handling, stdout/stderr behavior, inspection-only commands, and the exit-code matrix. + +## 3. Create the base project structure for CLI, engine, parser, evaluator, and config + +Set up the initial module layout for CLI, lexer, parser, AST/IR, evaluator, renderer, rules, settings, and cache. + +## 4. Implement input auto-detection for string, file, directory, and stdin + +Implement the positional-input resolution order and stdin fallback behavior. + +## 5. Implement output routing for stdout, output file, and output directory + +Support `-o` for single-output and directory-output modes, including usage errors for invalid combinations. + +## 6. Implement the variable store with deterministic last-write-wins behavior + +Create the variable-resolution layer that merges values from multiple CLI sources in argument order. + +## 7. Implement parsing for `-Dkey=value` + +Support direct scalar variable injection from the CLI. + +## 8. Implement parsing for `-Dkey=@file` and `@@` escaping + +Load variable values from files and treat `@@` as a literal `@` escape sequence. + +## 9. Implement parsing for `-d ` multi-variable input files + +Support reading multiple `key=value` pairs from a variable-definition file. + +## 10. 
Specify and implement external structured variable input + +Define how arrays, objects, and nested values are passed in from CLI and variable files, then implement that behavior. + +## 11. Implement the rule model and parser for global rules + +Add support for `-r key=value`, validation, and normalized internal rule representation. + +## 12. Implement selector rules for file extensions and exact relative paths + +Support `.md::key=value` and `README.md::key=value` forms. + +## 13. Implement deterministic rule precedence resolution + +Apply exact-path-over-extension-over-global precedence, with later definitions winning at equal specificity. + +## 14. Define and implement JSON rule import for `-r rules.json` + +Create the JSON schema for rule files and merge imported rules into the same precedence pipeline as CLI rules. + +## 15. Implement the settings namespace for `-s` / `--settings` + +Separate parser/engine/cache/CLI settings from template rules while preserving deterministic override behavior. + +## 16. Implement a simple high-performance lexer + +Tokenize template input with low overhead and support the selected delimiter configuration. + +## 17. Implement the recursive-descent parser for expressions and control flow + +Parse variable output, expressions, `if`, `else if`, `else`, `for`, `while`, and nesting into a stable internal representation. + +## 18. Implement expression evaluation for nested variables, member access, and indexing + +Support structured data access such as `user.name`, `items[0]`, and deeper nested expressions. + +## 19. Implement inline and block rendering semantics + +Add rendering behavior that clearly separates same-line substitution from multi-line control-flow output. + +## 20. Implement include processing with preprocessing and cycle detection + +Allow templates to include local files, preprocess them through the same engine, and fail cleanly on circular includes. + +## 21. 
Implement built-in function-like variables and runtime metadata helpers + +Add a reserved built-in namespace for values such as time, date, current file metadata, and loop metadata. + +## 22. Implement the core renderer for strings and single files + +Build the main rendering pipeline with variable substitution, expression evaluation, control flow, and output encoding. + +## 23. Implement directory rendering with relative path preservation + +Render full directory trees into a target directory while keeping relative input structure intact. + +## 24. Implement in-memory caching for parsed templates and include dependencies + +Add cache keys, cache invalidation, and correctness checks for repeated renders of unchanged templates. + +## 25. Evaluate and implement optional persistent cache support + +Design and, if justified, implement a disk-backed cache for repeated CLI executions. + +## 26. Implement diagnostics: `--benchmark`, `--list_rules`, `--explain`, and `-X` + +Add benchmark output, rule inspection, setting/rule explanations, and debug logging without corrupting render output. + +## 27. Build the automated test matrix for precedence, includes, control flow, and cache correctness + +Create tests for input detection, variable precedence, rule precedence, include behavior, control-flow rendering, diagnostics, and cache invalidation. + +## 28. Benchmark and optimize hot paths in lexer, parser, evaluator, and renderer + +Measure cold and warm performance, identify bottlenecks, and optimize the most important execution paths. diff --git a/LANGUAGE_SPEC.md b/LANGUAGE_SPEC.md new file mode 100644 index 0000000..b956e25 --- /dev/null +++ b/LANGUAGE_SPEC.md @@ -0,0 +1,529 @@ +# Prebyte Language Specification + +Version: 0.1-draft +Date: 2026-04-07 +Status: Proposed + +## 1. Purpose + +This document defines the proposed syntax and runtime semantics of the Prebyte template language. 
+ +It exists to make lexer, parser, evaluator, and renderer implementation possible without keeping language behavior vague. + +This is still a draft spec, but unlike `REQUIREMENTS.md`, it is intentionally concrete. + +## 2. Design Goals + +- Keep the grammar simple enough for a fast lexer and recursive-descent parser. +- Support enough logic for realistic templating and scaffolding. +- Keep evaluation deterministic. +- Avoid arbitrary code execution. +- Preserve predictable whitespace and newline behavior. +- Treat missing data safely by default. + +## 3. Core Concepts + +The language supports two main template construct categories: + +- inline expressions +- block directives + +Inline expressions are intended for same-line substitution. + +Block directives are intended for multi-line control flow and template structure. + +The engine must support configurable delimiters, but this spec uses the default examples below: + +- inline expression delimiter: `{{ ... }}` +- block directive delimiter: `{% ... %}` + +Note: + +- The implementation may also support alternate delimiter pairs such as `%% ... %%` through configuration. +- Even when delimiters are configurable, the semantic distinction between inline expressions and block directives must remain. + +## 4. Data Types + +The evaluator must support the following value types: + +- string +- integer +- float +- boolean +- null +- array +- object +- missing + +`missing` is a first-class internal evaluation state used for safe access propagation. + +## 5. Variable and Path Access + +### 5.1 Identifiers + +Identifiers should support: + +- ASCII letters +- digits after the first character +- underscore + +Example: + +- `name` +- `project_name` +- `user1` + +### 5.2 Member Access + +Object fields are accessed with dot notation. + +Example: + +```text +{{ project.author.name }} +``` + +### 5.3 Index Access + +Arrays are accessed with bracket notation. 
+ +Example: + +```text +{{ users[0].name }} +``` + +### 5.4 Safe Missing Propagation + +Missing access must be safe by default. + +Required semantics: + +- If a root variable does not exist, the result is `missing`. +- If a member is accessed on `missing`, the result remains `missing`. +- If an index is accessed on `missing`, the result remains `missing`. +- If an array index is out of bounds, the result is `missing`. +- If an object key does not exist, the result is `missing`. +- Chained access such as `test[0].hello.name` must not crash evaluation. + +Example: + +```text +{{ test[0].hello.name }} +``` + +If `test[0]` does not exist: + +- evaluation result is `missing` +- rendering does not hard-fail by default +- strict failure happens only if active rules require it + +## 6. Output Semantics + +### 6.1 Inline Expression Output + +Inline expressions evaluate to a value and write that value into the current output position. + +Example: + +```text +Hello {{ user.name }} +``` + +### 6.2 Missing Inline Output + +When an inline expression evaluates to `missing`: + +- if `strict_variables=false`, it is treated as unresolved output according to active rules +- if `default_variable_value` exists, that fallback may be used +- if `strict_variables=true`, rendering fails + +The exact precedence between `default_variable_value` and strict mode must follow the engine rules defined in `REQUIREMENTS.md`. + +## 7. Expressions + +Inline expressions and directive conditions may use expressions. 
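Path expressions depend on the safe missing propagation described in 5.4. The following is a minimal Python sketch of that behavior, not the engine's implementation: the names `MISSING`, `access`, and `resolve_path` are hypothetical, and treating negative indexes as `missing` is an assumption the spec does not state.

```python
# Hypothetical sentinel for the spec's `missing` evaluation state.
MISSING = object()


def access(value, step):
    """Apply one member (str) or index (int) access without raising."""
    if value is MISSING:
        # Access on `missing` stays `missing`.
        return MISSING
    if isinstance(step, int):
        # Out-of-bounds (and, by assumption, negative) indexes yield `missing`.
        if isinstance(value, list) and 0 <= step < len(value):
            return value[step]
        return MISSING
    if isinstance(value, dict) and step in value:
        return value[step]
    # Unknown key, or member access on a non-object value.
    return MISSING


def resolve_path(variables, root, steps):
    """Resolve a chain such as test[0].hello.name as ('test', [0, 'hello', 'name'])."""
    value = variables.get(root, MISSING)
    for step in steps:
        value = access(value, step)
    return value
```

With `{"test": []}` as input, `resolve_path(data, "test", [0, "hello", "name"])` returns `MISSING` instead of raising, which leaves the strict-mode decision to the rule layer.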
+ +### 7.1 Supported Expression Features + +- variable reference +- member access +- index access +- literals +- unary operators +- binary operators +- comparison operators +- boolean operators +- parenthesized expressions +- built-in functions or built-in values + +### 7.2 Literals + +Supported literals: + +- strings +- integers +- floats +- booleans: `true`, `false` +- `null` + +Examples: + +```text +{{ "hello" }} +{{ 42 }} +{{ 3.14 }} +{{ true }} +{{ null }} +``` + +### 7.3 Operators + +Planned operator classes: + +- unary: `!`, unary `-` +- arithmetic: `+`, `-`, `*`, `/`, `%` +- comparison: `==`, `!=`, `<`, `<=`, `>`, `>=` +- boolean: `&&`, `||` + +### 7.4 Truthiness + +The evaluator should use the following truthiness rules: + +- `false` is false +- `null` is false +- `missing` is false +- empty string is false +- empty array is false +- empty object is false +- numeric zero is false +- all other values are true + +This rule set keeps control-flow predictable and useful for templating. + +## 8. Control Flow + +Control flow is expressed with block directives. 
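Directive conditions are evaluated with the truthiness rules from 7.4. A minimal Python sketch of those rules follows; `MISSING` and `is_truthy` are hypothetical names, and mapping template values onto Python's `None`, `bool`, numbers, `str`, `list`, and `dict` is an assumption made only for illustration.

```python
# Hypothetical sentinel for the spec's `missing` evaluation state.
MISSING = object()


def is_truthy(value):
    """Return the truthiness of a template value per the 7.4 rule list."""
    if value is MISSING or value is None:
        return False
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # Numeric zero is false; every other number is true.
        return value != 0
    if isinstance(value, (str, list, dict)):
        # Empty string, empty array, and empty object are false.
        return len(value) > 0
    # All other values are true.
    return True
```

Note that under these rules the string `"0"` is true because it is a non-empty string, while the integer `0` is false.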
+ +### 8.1 If / Else If / Else + +Example syntax: + +```text +{% if user.is_admin %} +Admin +{% else if user.is_guest %} +Guest +{% else %} +User +{% end %} +``` + +Rules: + +- `if` starts a conditional block +- `else if` adds an additional conditional branch +- `else` provides the fallback branch +- `end` closes the block + +### 8.2 For Loops + +Example syntax: + +```text +{% for item in items %} +- {{ item.name }} +{% end %} +``` + +Rules: + +- `for` iterates arrays or other iterable values supported by the engine +- the loop body may contain nested expressions and nested directives +- empty collections produce no loop body output + +### 8.3 While Loops + +Example syntax: + +```text +{% while has_more %} +{{ next_value }} +{% end %} +``` + +Rules: + +- `while` is allowed by language design +- implementation must prevent unsafe infinite-loop behavior + +Constraint: + +- v1 should define an engine setting for maximum loop iterations or equivalent safety guard + +### 8.4 Nesting + +All control-flow blocks may be nested. + +Example: + +```text +{% for user in users %} +{% if user.active %} +{{ user.name }} +{% end %} +{% end %} +``` + +## 9. Inline vs Block Behavior + +The language must distinguish same-line substitution from multi-line structural rendering. + +### 9.1 Inline Mode + +Inline mode is intended for substitutions that remain on the same line. + +Example: + +```text +status={{ if enabled }}on{{ else }}off{{ end }} +``` + +Semantic rule: + +- inline constructs must not unexpectedly insert leading newlines + +### 9.2 Block Mode + +Block mode is intended for directives that control whole regions of output. 
+ +Example: + +```text +{% if enabled %} +feature = true +{% end %} +``` + +Semantic rules: + +- block constructs may emit multi-line output +- surrounding newline handling must be predictable +- omitted blocks must not leave broken syntax artifacts unless the template author explicitly wrote them + +Note: + +- The final inline shorthand syntax may still change, but the semantic distinction itself is required. + +## 10. Includes + +Includes insert another template into the current render flow. + +### 10.1 Basic Include + +Example syntax: + +```text +{% include "partials/header.txt" %} +``` + +Behavior: + +- the included file is loaded from the local filesystem +- the included file is lexed, parsed, and rendered with the same engine +- included content may itself contain expressions, control flow, and includes + +### 10.2 Include Resolution + +Resolution order should be: + +1. current file-relative path, if allowed by engine rules +2. configured `include_path`, if present +3. fail with include-resolution error + +### 10.3 Include Safety + +Requirements: + +- local files only in v1 +- no network sources +- include cycles must be detected +- cycle errors must report the include chain clearly + +## 11. Built-Ins + +Built-ins are reserved names or callable helpers exposed by the engine. + +### 11.1 Built-In Namespace + +To avoid collisions, built-ins should use a reserved namespace. + +Recommended direction: + +- `__TIME__` +- `__DATE__` +- `__FILE__` +- `__DIR__` +- `__INDEX__` + +or a namespaced function style such as: + +- `sys.time()` +- `sys.date()` +- `file.path()` + +The final naming convention still needs to be frozen. 
+ +### 11.2 Built-In Categories + +Planned built-in categories: + +- time/date +- input path metadata +- current file metadata +- current directory metadata +- loop metadata +- render metadata + +### 11.3 Built-In Usage + +Built-ins may be used inside: + +- inline output expressions +- conditions +- loops +- include arguments, if supported by the final grammar + +## 12. Comments + +Template comments are controlled by the `allow_comments` rule. + +Proposed syntax: + +```text +{# this is a comment #} +``` + +Behavior: + +- comments produce no output +- comments may appear between text and directives + +## 13. Whitespace and Newline Handling + +Whitespace behavior must be predictable because the engine is intended for code generation and document rendering. + +Relevant rule properties: + +- `trim` +- `strip` +- `trim_spaces` +- `replace_tabs` +- `tab_size` + +Current rule: + +- the exact whitespace policy is not fully frozen yet +- the engine must keep whitespace behavior deterministic +- block omission must not produce surprising duplicated blank lines unless implied by the template text itself + +## 14. Error Semantics + +### 14.1 Parse Errors + +Parse errors must include: + +- file context where available +- line number +- column number +- a useful message + +### 14.2 Evaluation Errors + +Evaluation errors include: + +- strict missing-variable failures +- invalid operator usage +- invalid loop input types +- include resolution failures +- include cycle failures + +### 14.3 Safe Missing Access vs Errors + +Safe missing access is not itself an evaluation error. + +It only becomes an error when: + +- strict variable rules require failure +- a specific construct explicitly requires a resolved value + +## 15. 
Example Snippets + +### 15.1 Simple Variable + +```text +Hello {{ name }} +``` + +### 15.2 Nested Access + +```text +Author: {{ project.author.name }} +``` + +### 15.3 Safe Missing Access + +```text +Hello {{ users[0].profile.name }} +``` + +If `users[0]` is missing, the whole expression evaluates to `missing`. + +### 15.4 Conditional Block + +```text +{% if project.private %} +visibility = private +{% else %} +visibility = public +{% end %} +``` + +### 15.5 Loop + +```text +{% for file in files %} +- {{ file.name }} +{% end %} +``` + +### 15.6 Include + +```text +{% include "partials/footer.txt" %} +``` + +### 15.7 Built-In + +```text +Generated at {{ __TIME__ }} +``` + +## 16. Open Spec Points + +- Final delimiter strategy for expression vs block constructs +- Final shorthand syntax for inline conditionals +- Final built-in naming scheme +- Final structured external variable input format +- Exact whitespace-control semantics +- Exact loop safety configuration for `while` + +## 17. Summary + +This spec defines a template language that is: + +- expressive enough for real template generation +- safe by default for missing nested access +- deterministic in evaluation +- compatible with a simple lexer and recursive-descent parser +- performance-oriented by design diff --git a/REQUIREMENTS.md b/REQUIREMENTS.md new file mode 100644 index 0000000..61b4da7 --- /dev/null +++ b/REQUIREMENTS.md @@ -0,0 +1,603 @@ +# Prebyte Template Engine Requirements + +Version: 0.2-draft +Date: 2026-04-07 +Status: Draft + +## 1. Vision + +Prebyte is a high-performance template engine and CLI designed to render: + +- inline strings +- single files +- entire directory trees + +The product goal is to deliver a faster, non-Python alternative to tools like Cookiecutter while keeping the internal architecture simple, predictable, and performance-oriented. 
+ +The core implementation should favor: + +- a simple lexer +- a recursive-descent parser +- an efficient intermediate representation +- a fast rendering pipeline +- low-overhead CLI behavior + +## 2. Product Goals + +- Provide extremely fast template rendering for strings, files, and directories. +- Support a template language that is expressive enough for real scaffolding and text generation. +- Keep CLI behavior deterministic and script-friendly. +- Support scoped rules and engine settings with explicit precedence. +- Avoid Python runtime dependencies completely. +- Make preprocessing, parsing, and rendering efficient enough for repeated CLI and automation usage. + +## 3. Non-Goals + +- Full Cookiecutter compatibility +- Python hooks or Python-based execution +- GUI or TUI in v1 +- Network-based includes or template fetching in v1 +- Plugin system in v1 +- Arbitrary code execution inside templates + +## 4. MoSCoW Prioritization + +### Must Have + +- Render inline strings, single files, and entire directory trees +- Auto-detect whether positional input is a file, a directory, or a literal string +- Read from `stdin` when no positional input is provided +- Write to `stdout` when no output target is provided +- Support `-o` for file and directory output +- Support variables via `-Dkey=value` +- Support variables via `-Dkey=@file` +- Support variable files via `-d ` +- Support `@@` as an escape for literal `@` +- Support rule definitions via `-r` / `--rule` +- Support rule scope at global, file extension, and exact relative path level +- Define deterministic rule precedence +- Keep `-s` / `--settings` separate from template rules +- Implement a simple lexer and a recursive-descent parser +- Support control flow in template expressions and template blocks +- Support nested variables, structs/objects, arrays, indexing, and member access +- Support includes that load and preprocess other files +- Support built-in function-like variables such as time/date/system 
metadata equivalents +- Add caching to reduce repeated parsing/render overhead +- Prioritize speed across the entire pipeline + +### Should Have + +- Rule loading from JSON files via `-r rules.json` +- `--benchmark` +- `--list_rules` +- `-e` / `--explain` +- `-X` for debug output +- Good parse/render error messages with file context +- Cache invalidation behavior that is explicit and predictable + +### Could Have + +- Structured benchmark output, for example JSON +- Rule origin diagnostics per file +- Extended ignore-pattern syntax beyond simple wildcards +- Optional persistent cache on disk in addition to in-memory cache + +### Won't Have in v1 + +- Remote includes +- Python-based template hooks +- Full Cookiecutter compatibility +- Plugin runtime +- Arbitrary embedded scripting languages + +## 5. Functional Requirements + +### 5.1 Input Detection + +The CLI accepts zero or one positional input. + +Input resolution order: + +1. If no positional input is provided, read from `stdin`. +2. If the positional input points to an existing file, use file mode. +3. If the positional input points to an existing directory, use directory mode. +4. Otherwise, treat the positional input as a literal string. + +Acceptance criteria: + +- `prebyte` reads from `stdin`. +- `prebyte template.txt` uses file mode if `template.txt` exists. +- `prebyte templates/` uses directory mode if `templates/` exists. +- `prebyte foo/bar.txt` treats `foo/bar.txt` as a literal string if the path does not exist. + +### 5.2 Output Behavior + +Rules: + +1. Without `-o`, string and file input write to `stdout`. +2. With `-o `, string and file input write to the target file. +3. With directory input, `-o ` is required. +4. Directory rendering must preserve relative input structure. + +Acceptance criteria: + +- `prebyte "Hello {{name}}" -Dname=World` writes to `stdout`. +- `prebyte template.txt -o out.txt` writes to `out.txt`. 
+- `prebyte templates/ -o out/` renders into `out/` while preserving relative paths. +- `prebyte templates/` without `-o` exits with a usage error. + +### 5.3 Variable Sources + +Supported forms: + +- `-Dkey=value` +- `-Dkey=@value.txt` +- `-d vars.txt` + +Semantics: + +- `-Dkey=value` sets `key` directly. +- `-Dkey=@value.txt` loads the entire contents of `value.txt` into `key`. +- `-Dkey=@@value.txt` sets the literal value `@value.txt`. +- `-d vars.txt` loads multiple variables from a file. + +Assumption for `-d ` in v1: + +- UTF-8 encoded file +- one `key=value` pair per line +- blank lines allowed +- lines starting with `#` are comments + +Variable precedence: + +1. Variable definitions are applied in CLI order. +2. For duplicate keys, last write wins. +3. If `allow_env=true`, environment variables may be used as fallback only. +4. Explicit CLI-defined variables override environment values. +5. If a variable still cannot be resolved, `default_variable_value` may apply. +6. If it still cannot be resolved and `strict_variables=true`, rendering fails. + +Missing-value evaluation semantics: + +- Accessing a missing root variable evaluates to a missing value. +- Accessing a member on a missing value evaluates to a missing value. +- Accessing an array index on a missing value evaluates to a missing value. +- Accessing an out-of-range array index evaluates to a missing value. +- Accessing a missing object key evaluates to a missing value. +- Missing-value propagation continues through chained access, for example `test[0].hello.name`. +- Missing-value propagation becomes an error only when the active rule set requires strict failure. + +Acceptance criteria: + +- `-Dname=Alice -Dname=Bob` resolves `name` to `Bob`. +- `-Dbody=@body.txt` uses the contents of `body.txt`. +- `-Dbody=@@body.txt` uses the literal string `@body.txt`. +- Accessing `test[0].hello.name` when `test[0]` does not exist evaluates to a missing value instead of crashing evaluation. 
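The variable-source semantics in this section (direct values, `@file` loading, the `@@` escape, and last-write-wins ordering) can be sketched in Python as follows. This is an illustration under stated assumptions, not the engine's code: `parse_define`, `build_variable_store`, and `read_text_file` are hypothetical names, each argument is assumed to be the text after `-D`, and the `@@` escape is only handled at the start of the value because the requirements do not say how `@@` behaves elsewhere.

```python
from typing import Callable, Dict, Iterable, Tuple


def read_text_file(path: str) -> str:
    """Default loader for -Dkey=@file values (UTF-8 is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def parse_define(arg: str, read_file: Callable[[str], str] = read_text_file) -> Tuple[str, str]:
    """Parse the text after -D, for example 'name=World' or 'body=@body.txt'."""
    key, sep, raw = arg.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {arg!r}")
    if raw.startswith("@@"):
        # '@@' escapes a literal leading '@': '@@value.txt' -> '@value.txt'.
        return key, raw[1:]
    if raw.startswith("@"):
        # '@path' loads the entire file contents as the value.
        return key, read_file(raw[1:])
    return key, raw


def build_variable_store(defines: Iterable[str],
                         read_file: Callable[[str], str] = read_text_file) -> Dict[str, str]:
    """Apply definitions in CLI order; for duplicate keys the last write wins."""
    store: Dict[str, str] = {}
    for arg in defines:
        key, value = parse_define(arg, read_file)
        store[key] = value
    return store
```

Passing the file loader as a parameter keeps the parsing logic testable without touching the filesystem; environment fallback and `default_variable_value` are deliberately left out of this sketch.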
+ +### 5.4 Data Model in Templates + +Templates must support structured data and nested access. + +Required data capabilities: + +- scalar values: string, integer, float, boolean, null +- arrays/lists +- structs/objects/maps +- nested data structures +- member access, for example `user.name` +- index access, for example `users[0]` +- nested expressions, for example `users[0].profile.name` + +Acceptance criteria: + +- A variable tree such as `project.author.name` can be resolved from structured input. +- Array indexing is supported in expressions. +- Nested object and array access works inside control-flow expressions and output expressions. +- Missing nested access such as `test[0].hello.name` is treated as a missing value, not as an immediate hard failure, unless strict mode requires an error. + +### 5.5 Template Language + +The template language must support both expression output and control flow. + +Supported delimiter behavior: + +- variable/expression delimiters are configurable through `variable_prefix` and `variable_suffix` +- examples include `{{ ... }}` and `%% ... %%` + +The language must support: + +- plain variable output +- expressions within delimiters +- conditional logic: `if`, `else if`, `else` +- loops: `for`, `while` +- nested control flow +- nested variable references +- includes +- built-in function-like values or built-in expression functions + +Open syntax point: + +- The exact concrete grammar is still to be specified, but the engine must support these language features. + +### 5.6 Inline vs Block Control Flow + +Control flow and preprocessing must support two rendering modes: + +- inline mode: replacement stays on the same output line +- block mode: replacement may emit content that starts on a new line or spans multiple lines + +Requirements: + +- The language must distinguish constructs intended for inline substitution from constructs intended to control multi-line output. 
+- The render semantics must preserve predictable newline behavior. +- Block replacements must not accidentally collapse or duplicate surrounding newlines. + +Open point: + +- The exact syntax that differentiates inline constructs from block constructs still needs to be specified. + +Acceptance criteria: + +- Inline conditions can render alternative values on the same line. +- Block conditions can include or omit full line blocks without corrupting surrounding formatting. +- Loop output in block mode preserves intended line structure. + +### 5.7 Includes + +Templates must support including other files. + +Requirements: + +- Includes may load another file or that file's contents into the current render flow. +- Included content must be preprocessed and rendered using the same engine pipeline. +- Includes must respect active rules and relevant file-scoped behavior. +- Includes are disabled unless `allow_includes=true`. +- Include resolution starts from `include_path` if configured. + +Security and determinism requirements: + +- Includes must be local-file only in v1. +- Include path resolution must be deterministic. +- Recursive include loops must be detected and fail with a clear error. + +Acceptance criteria: + +- A template can include another file and render the included content. +- Included templates can themselves contain variables and control flow. +- Circular includes fail with a specific error. + +### 5.8 Built-In Variables and Built-In Functions + +The engine must support built-in function-like values, for example equivalents of `__TIME__`, `__DATE__`, or similar runtime metadata helpers. + +Requirements: + +- Built-ins must be clearly namespaced or reserved to avoid collisions with user variables. +- Built-ins may expose time/date, file metadata, path metadata, render metadata, or engine metadata. +- Built-ins must be deterministic where possible, or clearly documented when dynamic. +- Built-ins must be usable inside expressions and control flow. 
+ +Examples of desired capabilities: + +- current timestamp +- current date/time in formatted form +- input path or current file path +- current relative directory +- render iteration metadata in loops + +Open point: + +- The exact built-in naming scheme and final built-in set remain to be specified. + +### 5.9 Rules + +Rules are set via `-r` or `--rule`. + +Supported forms: + +- global: `-r strict_variables=true` +- file extension scope: `-r .md::strict_variables=true` +- exact relative path scope: `-r README.md::strict_variables=true` +- file-based import: `-r rules.json` + +Rule precedence for a specific target file: + +1. exact relative path +2. file extension +3. global + +If two rules have the same specificity, the later declaration wins. + +Additional rule: + +- For literal string input and `stdin` without file context, only global rules apply. + +Acceptance criteria: + +- `-r strict_variables=false -r .md::strict_variables=true` makes markdown files use `strict_variables=true`. +- `-r strict_variables=false -r .md::strict_variables=true -r README.md::strict_variables=false` makes only `README.md` use `false`. + +### 5.10 Settings + +`-s` / `--settings` is reserved for parser, engine, cache, or CLI settings and is separate from template rules. + +Requirements: + +- Settings use a separate namespace. +- Settings resolve conflicts with last write wins. +- Settings must not implicitly override template rule properties. 
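As a sketch of the rule precedence described in 5.9 (exact relative path over file extension over global, with the later declaration winning at equal specificity), the following Python illustrates one possible resolution strategy. The names `Rule`, `parse_rule`, `specificity`, and `resolve` are hypothetical, values are kept as raw strings, and the `-r rules.json` import form is not handled here because its schema is still open.

```python
import posixpath
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    selector: Optional[str]  # None means global scope
    key: str
    value: str


def parse_rule(arg: str) -> Rule:
    """Parse 'key=value', '.md::key=value', or 'README.md::key=value'."""
    head, eq, value = arg.partition("=")
    if not eq:
        raise ValueError(f"expected key=value, got {arg!r}")
    # Only a '::' before the first '=' is treated as a selector separator.
    selector, sep, key = head.rpartition("::")
    if not key:
        raise ValueError(f"missing rule key in {arg!r}")
    return Rule(selector if sep else None, key, value)


def specificity(rule: Rule, rel_path: Optional[str]) -> int:
    """Return 2 for exact path, 1 for extension, 0 for global, -1 if not applicable."""
    if rule.selector is None:
        return 0
    if rel_path is None:
        # Literal string or stdin input: only global rules apply.
        return -1
    if rule.selector == rel_path:
        return 2
    if rule.selector.startswith(".") and posixpath.splitext(rel_path)[1] == rule.selector:
        return 1
    return -1


def resolve(rules: List[Rule], key: str, rel_path: Optional[str] = None) -> Optional[str]:
    """Pick the most specific applicable rule; later declarations win ties."""
    best = None  # (specificity, declaration index, value)
    for index, rule in enumerate(rules):
        if rule.key != key:
            continue
        level = specificity(rule, rel_path)
        if level < 0:
            continue
        if best is None or (level, index) >= best[:2]:
            best = (level, index, rule.value)
    return None if best is None else best[2]
```

Applied to the second acceptance criterion above, `resolve` returns `false` for `README.md`, `true` for any other `.md` file, and `false` for everything else, including input without file context.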
+ +### 5.11 Rule and Property Catalog + +Core rule properties for v1: + +- `strict_variables` +- `case_sensitive_variables` +- `default_variable_value` +- `variable_prefix` +- `variable_suffix` +- `max_variable_length` +- `replace_tabs` +- `tab_size` +- `trim` +- `strip` +- `trim_spaces` +- `allow_includes` +- `include_path` +- `output_encoding` +- `allow_comments` +- `allow_env` + +Additional CLI-adjacent functions: + +- `-i` / `--ignore` +- `--benchmark` +- `-e` / `--explain` +- `-X` +- `--list_rules` + +#### `strict_variables` + +- Type: boolean +- Meaning: fail rendering if a variable or expression dependency cannot be resolved after normal missing-value propagation rules have been applied. + +#### `case_sensitive_variables` + +- Type: boolean +- Meaning: variable lookup is case-sensitive. + +#### `default_variable_value` + +- Type: string +- Meaning: fallback value when a variable is missing and rendering is allowed to continue. + +#### `variable_prefix` + +- Type: string +- Meaning: start delimiter for expressions or variable processing. + +#### `variable_suffix` + +- Type: string +- Meaning: end delimiter for expressions or variable processing. + +#### `max_variable_length` + +- Type: integer +- Meaning: maximum allowed variable identifier length. + +#### `replace_tabs` + +- Type: boolean +- Meaning: replace tab characters in output. + +#### `tab_size` + +- Type: integer +- Meaning: number of spaces used when replacing tabs. + +#### `trim` + +- Type: boolean +- Meaning: trim leading and trailing whitespace in defined contexts. +- Open point: exact scope and application points still need specification. + +#### `strip` + +- Type: boolean +- Meaning: strip specific whitespace or line-boundary content according to template semantics. +- Open point: must be clearly distinguished from `trim`. + +#### `trim_spaces` + +- Type: boolean +- Meaning: remove spaces according to a defined output policy. 
+- Open point: must clarify whether only literal spaces or broader whitespace are affected. + +#### `allow_includes` + +- Type: boolean +- Meaning: enable include processing. + +#### `include_path` + +- Type: path/string +- Meaning: default base path for resolving includes. + +#### `output_encoding` + +- Type: string +- Meaning: output file encoding, for example `utf-8`. + +#### `allow_comments` + +- Type: boolean +- Meaning: enable template comment syntax. + +#### `allow_env` + +- Type: boolean +- Meaning: permit environment variable lookup as fallback. + +#### `error_on_false_input` + +- Status: open +- Reason: the intended semantics are still underspecified. + +### 5.12 Ignore Semantics + +`-i` / `--ignore` defines variables or patterns that should not be resolved. + +Assumption for v1: + +- exact names +- simple `*` wildcards + +Behavior: + +- ignored variables are not replaced +- ignored variables do not trigger strict-variable failures + +### 5.13 Benchmark, Explain, Debug, and Rule Listing + +#### `--benchmark` + +- Prints runtime and basic statistics. +- Diagnostic output goes to `stderr`. + +#### `-e` / `--explain ` + +- Explains a rule or setting. +- Should include type, meaning, scope, and allowed values. + +#### `-X` + +- Enables debug output. +- Debug output goes to `stderr`. + +#### `--list_rules` + +- Prints normalized active rules. +- Should show scope, property, value, and origin. + +## 6. Caching Requirements + +Caching is a core requirement because the engine is explicitly optimized for speed. + +The implementation must support at least in-memory caching for: + +- parsed templates +- token streams or equivalent parse inputs when useful +- include resolution metadata when safe +- reusable rule resolution artifacts when safe + +Caching requirements: + +- Cache behavior must be deterministic. +- Cache invalidation must account for input file changes and included file changes. 
+- Cached parse results must not leak state across independent render operations. +- String input caching may use the literal input text and relevant settings as cache key components. +- File-based caching must account for file identity and modification changes. + +Should-have extension: + +- optional persistent cache on disk for repeated CLI runs + +Acceptance criteria: + +- Repeated rendering of the same template set with unchanged inputs is measurably faster after warm-up. +- Changing an included file invalidates dependent cached render units. +- Cache-enabled renders produce byte-identical output to cold renders. + +## 7. Non-Functional Requirements + +### 7.1 Performance + +The engine must be built for efficiency first. + +Initial target metrics for a measurable first version: + +- `--help` in release builds should typically complete under 100 ms. +- String rendering of roughly 10 KB with 100 variables should complete in low single-digit milliseconds inside the engine. +- Single-file rendering of roughly 1 MB should complete well under 100 ms in normal cases. +- Directory rendering of roughly 100 files totaling 10 MB should complete under 1.5 s locally in normal cases. +- Warm-cache repeated runs should outperform cold-cache repeated runs in benchmark mode. + +Note: + +- Exact benchmark methodology and reference hardware still need to be specified. + +### 7.2 Architecture + +- The lexer must stay intentionally simple. +- The parser must be implemented as a recursive-descent parser. +- CLI, parser, evaluator, rule resolution, and renderer should be clearly separated. +- The internal representation should optimize the hot render path. +- Control-flow and expression evaluation must avoid arbitrary runtime execution. + +### 7.3 Robustness + +- Equal inputs with equal rules and equal settings must produce deterministic output. +- Parse and render failures must be clear and actionable. +- Errors should include file path and line/column when relevant. 
+- Circular include detection must be explicit. +- Invalid control-flow syntax must fail fast. +- Safe missing-value propagation must be deterministic and must not crash chained evaluation. + +### 7.4 Security + +- No arbitrary code execution in templates +- No network access in v1 +- Environment variable access disabled by default +- Includes disabled by default +- Include resolution restricted to local filesystem paths +- Built-ins must not expose sensitive process information by default + +### 7.5 Portability + +- Linux: must support +- macOS: should support +- Windows: should support +- No Python runtime dependency + +## 8. CLI Semantics and Exit Codes + +### stdout / stderr + +- Render output goes to `stdout` unless a target file or directory is used. +- Errors go to `stderr`. +- Debug and benchmark output go to `stderr`. +- `--list_rules` and `--explain` should behave as inspection commands and must not corrupt render output. + +### Exit Codes + +- `0`: success +- `2`: CLI or usage error +- `3`: I/O error +- `4`: parse or render error +- `5`: invalid rule or setting configuration +- `6`: include resolution or include cycle error +- `7`: cache consistency or cache configuration error + +## 9. Assumptions and Open Points + +- The exact concrete grammar for variables, blocks, includes, comments, and expressions is still open. +- The syntax distinction between inline and block control flow is still open. +- The JSON schema for `-r rules.json` is still open. +- The exact behavior of `trim`, `strip`, and `trim_spaces` still needs clarification. +- The meaning of `error_on_false_input` is still open. +- Symlink behavior during directory rendering and include resolution is still open. +- The exact built-in naming convention and built-in catalog are still open. +- The structured input format for complex variables from CLI/files may need an additional formal spec if objects and arrays are passed externally. + +## 10.
Definition of Done for v1 + +- All Must-Have requirements are implemented. +- Rule precedence and variable precedence are covered by automated tests. +- Structured data access, control flow, and include behavior are covered by automated tests. +- stdout/stderr behavior is stable and documented. +- Circular include handling is tested. +- Cache correctness and cache invalidation are tested. +- CLI help documents all relevant flags and examples. +- Key performance targets are measurable and benchmarked. +- No open critical defects remain in string, file, or directory rendering paths.
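
The rule-precedence item above could be exercised by a test along the following lines. This is only a minimal sketch, assuming the selector forms from the backlog (`key=value`, `.md::key=value`, `README.md::key=value`) and the exact-path-over-extension-over-global ordering, with later definitions winning at equal specificity. The names `Rule`, `parse_rule`, `specificity`, and `resolve` are hypothetical, values are kept as raw strings, and Python is used purely for illustration because the engine itself has no Python runtime dependency.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Rule:
    selector: str  # "" = global, ".ext" = extension, otherwise exact relative path
    key: str
    value: str


def parse_rule(text: str) -> Rule:
    """Parse 'key=value' or 'selector::key=value' (hypothetical helper)."""
    head, eq, value = text.partition("=")
    if not eq:
        raise ValueError(f"invalid rule: {text!r}")
    # Only a '::' before the first '=' counts as a selector separator.
    selector, _, key = head.rpartition("::")
    if not key:
        raise ValueError(f"invalid rule: {text!r}")
    return Rule(selector, key, value)


def specificity(rule: Rule, path: str):
    """Return 2 for an exact path, 1 for an extension, 0 for global, None if no match."""
    if rule.selector == "":
        return 0
    # Assumption: a selector that starts with '.' and has no '/' is an extension.
    if rule.selector.startswith(".") and "/" not in rule.selector:
        return 1 if PurePosixPath(path).suffix == rule.selector else None
    return 2 if rule.selector == path else None


def resolve(rules, path: str) -> dict:
    """Pick one value per key: higher specificity wins, later rules win ties."""
    best = {}  # key -> (specificity, value)
    for rule in rules:
        level = specificity(rule, path)
        if level is None:
            continue
        current = best.get(rule.key)
        # '>=' lets a later rule replace an earlier one at equal specificity.
        if current is None or level >= current[0]:
            best[rule.key] = (level, rule.value)
    return {key: value for key, (_, value) in best.items()}


rules = [parse_rule(text) for text in [
    "trim=false",
    ".md::trim=true",
    "README.md::trim=false",
    ".md::tab_size=2",
    ".md::tab_size=4",
    "tab_size=8",
]]
```

With these rules, `README.md` takes `trim` from the exact-path rule, any other `.md` file takes it from the extension rule, and the later global `tab_size=8` does not override the more specific `.md::tab_size=4`.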