feat: lazy per-bundle loading for vault, driver, datastore, and report registries#1089
feat: lazy per-bundle loading for vault, driver, datastore, and report registries#1089
Conversation
Heads up: merge conflicts with #1103 + #1104PRs #1103 and #1104 (fixing #1100 — S3 datastore bundle path resolution) landed on main and touch the same files this PR modifies. Here's what changed and what you'll need to account for when rebasing: What changedAll 4 extension loaders ( constructor(denoRuntime: DenoRuntime, repoDir: string | null = null)to: constructor(
denoRuntime: DenoRuntime,
repoDir: string | null = null,
datastoreResolver?: DatastorePathResolver,
)Except What this PR needs to do after rebasing
Key constraint
|
…t registries Extend the lazy per-bundle loading pattern (from #1053) to all remaining extension registries. Each registry now supports: - LazyEntry types for catalog-indexed but not-yet-imported types - setTypeLoader/ensureTypeLoaded for on-demand single-bundle imports - registerLazy/promoteFromLazy for lazy entry lifecycle - buildIndex on loaders for mtime-based catalog population - loadSingleType on loaders for on-demand bundle import The shared ExtensionCatalogStore (SQLite at .swamp/_extension_catalog.db) is passed from CLI startup to all loaders. Search commands benefit from zero-import catalog reads on subsequent runs. Before/after on vault type search with @swamp/aws-sm pulled: - Old binary: 1.56s (full import every time) - New binary (cached): 0.70s (catalog read, no bundle import) Closes #1062 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After rebasing onto main (which added DatastorePathResolver in #1103), update the new catalog-related methods (populateCatalogFromRegistry, getVaultBundlePath, getDriverBundlePath, getReportBundlePath) to use this.resolveBundlePath() instead of hardcoded join() paths. This ensures lazy loading works correctly with S3 datastores. The datastore loader intentionally keeps hardcoded local paths due to bootstrap ordering constraints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oaders Port fixes from the model loader to vault, driver, datastore, and report loaders to prevent regressions of known bugs: - #1107: Add sourceDirsFingerprint() and catalog invalidation when extension source directories change between runs. Uses per-kind fingerprint keys so loaders don't overwrite each other. - #1094: Update findStaleFiles() to check transitive dependency mtimes via resolveLocalImports(), not just the entry point file. Catches edits to imported .ts files that don't touch the entry point. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
86a0f18 to
4302aee
Compare
…lib skip Three fixes to align the report loader with vault/driver/datastore loaders: 1. Freshness check failure now falls through to rebundle attempt instead of immediately returning stale cache 2. bundleExtension() wrapped in try/catch with cache fallback + isExpectedBundleFailure logging (critical for pulled extensions lacking a local deno.json import map) 3. Async discoverFiles skips _-prefixed directories (e.g., _lib/) matching the sync version and all other loaders Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three improvements across all extension loaders: - importBundle now accepts baseDir and includes bundleNamespace in the bundle path, matching bundleWithCache. Previously the file URL import always failed (wrong path) and fell through to data URL overhead. - Extract sourceDirsFingerprint to a shared export in extension_catalog_store.ts, removing identical copies from all 5 loader files (model, vault, driver, datastore, report). - Update model loader to use the shared function for consistency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Well-structured PR that extends the lazy per-bundle loading pattern from models to the remaining four extension registries (vault, driver, datastore, report). The shared ExtensionCatalogStore is correctly hoisted to CLI startup and passed down, the sourceDirsFingerprint utility is properly extracted for reuse, and all four registries follow the same consistent pattern.
Blocking Issues
None.
Suggestions
-
High code duplication across loaders: The
buildIndex,findStaleFiles,rebundleAndUpdateCatalog,populateCatalogFromRegistry,populateCatalogFromDir, anddiscoverFilesSyncmethods are nearly identical across all four loaders (vault, driver, datastore, report), differing only in the kind string, export name, schema, and registry calls. This is a significant amount of copy-paste (~400 lines × 4). A shared base class or a generic helper parameterized by kind/schema/registry would reduce the maintenance surface considerably. Not blocking since the pattern is proven from the model loader and all four copies are consistent. -
Report loader
bundleWithCacherefactor: The report loader'sbundleWithCachemethod received a significant rewrite in this PR (the "try rebundle, fall back to cache" logic withisExpectedBundleFailure). This looks like a separate improvement that could have been its own commit for easier bisection, but the logic itself is sound. -
vault_type_search.tsimports from internal path:src/cli/commands/vault_type_search.ts:33importsgetVaultTypesfrom../../domain/vaults/vault_types.tsrather than from../../libswamp/mod.ts. This predates this PR, so not blocking here, but worth noting for future cleanup per the CLAUDE.md libswamp import boundary rule.
DDD Assessment
The changes correctly keep domain logic (registries, lazy entries, type promotion) in the domain layer and infrastructure concerns (catalog store, SQLite) in the infrastructure layer. The LazyEntry types are proper value objects (immutable metadata with no identity). The registries continue to function as domain services. The ExtensionCatalogStore remains a clean infrastructure concern. Good separation.
Test Coverage
All four registries have comprehensive lazy loading tests covering: registerLazy, ensureTypeLoaded (happy path, already loaded, unknown type), concurrent dedup, and retry after failure. The vault registry has additional tests for promoteFromLazy, getAllLazy, and registerLazy skip-if-already-loaded. Good coverage.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
-
ReportRegistry case normalization mismatch —
src/domain/reports/report_registry.ts:65-68vsuser_report_loader.ts:363registerLazyFromCatalogstores lazy entries keyed byentry.type_normalized(lowercased frompopulateCatalogFromDir). ButpromoteFromLazy(parsed.data.name, ...)(line 363 of the loader) uses the original-case report name. SincepromoteFromLazydoesthis.lazyTypes.delete(name)with the original-case key, it will miss the lowercased lazy entry if the name contains uppercase characters.Breaking example: A report named
"Cost-Report"is cataloged as"cost-report"(lowercased).registerLazystores it under key"cost-report". After loading,promoteFromLazy("Cost-Report", ...)triesthis.lazyTypes.delete("Cost-Report")— no-op, since the key is"cost-report". The lazy entry is orphaned. IfensureTypeLoaded("cost-report")is later called, it sees the orphaned lazy, triggers re-import, andimportAndRegisterBundlechecksreportRegistry.get(entry.type_normalized)=get("cost-report")which misses"Cost-Report"inreports— causing a redundant bundle import.Not currently triggered since no production code calls
reportRegistry.ensureTypeLoaded()yet, but this is a latent bug waiting to surface.Suggested fix: Either normalize the name in
promoteFromLazyandensureTypeLoaded(using.toLowerCase()like all other registries), or store lazy entries with the original-case key. The former is more consistent with the other registries. -
listAvailableTypesdoesn't include lazy types —src/libswamp/vaults/create.ts:88listAvailableTypes()callsvaultTypeRegistry.getAll()which only returns fully loaded types, not lazy ones. When vault creation fails with "Unknown vault type", the error message listing available types will omit extension vault types that are indexed but not yet imported. Not a correctness issue — the error message is just incomplete.
Low
-
getVaultBundlePath/getDriverBundlePath/getDatastoreBundlePath/getReportBundlePathreturn empty string whenrepoDiris null — e.g.user_vault_loader.tsaround thegetVaultBundlePathmethod. The empty string is then passed tocatalog.upsertasbundle_path. IfloadSingleTypelater reads this entry,Deno.readTextFile("")would throw. In practice,repoDiris always set whenbuildIndexis called (the code path requires a repo), so this is theoretical. -
discoverFilesSynchas unbounded recursion — All four loader implementations recursively descend directories with no depth limit. A symlink loop (even accidental) could cause a stack overflow. The asyncdiscoverFileshas the same pattern, so this is pre-existing.
Verdict
PASS — Well-structured extension of the existing lazy-loading pattern. The four registries and loaders follow a consistent template with good test coverage (lazy registration, concurrent dedup, retry-after-failure). The report registry case mismatch (Medium #1) is latent and not yet triggered in production code. The catalog store's per-kind fingerprint approach is backward-compatible with the existing model loader. The bundleWithCache refactor in the report loader improves resilience by attempting a rebundle before falling back to cache.
PR #1089 introduced lazy per-bundle loading for vault, datastore, driver, and report registries. The get() method only returns fully-loaded types, returning undefined for lazy entries — but many consumer code paths call get() without first calling ensureTypeLoaded() to promote lazy entries. This caused extension types like @swamp/s3-datastore and @swamp/aws-sm to be treated as unknown, breaking 11 UATs across vault and datastore operations. The fix adds await registry.ensureTypeLoaded(type) before every .get() call across all four registries, matching the pattern already used by resolveModelType() in extension_auto_resolver.ts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds 10 tests verifying the lazy registry promotion contract introduced by PR #1089 and fixed by PR #1116: Category 1 (8 tests, fresh instances): Documents the registry invariant that get() returns undefined for lazy entries and ensureTypeLoaded() must be called first to promote them. Covers all four registries (vault, datastore, driver, report). Category 2 (2 tests, global singletons): Verifies that resolveVaultType() and resolveDatastoreType() promote lazy entries before returning, so callers can safely use get() afterwards. These tests would have caught the PR #1089 regression — they fail without the ensureTypeLoaded() calls added in PR #1116. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
LazyEntrytypes,setTypeLoader/ensureTypeLoadedfor on-demand single-bundle imports, andregisterLazy/promoteFromLazylifecyclebuildIndex()(mtime-based catalog population without importing) andloadSingleType()(on-demand single bundle import)ExtensionCatalogStore(SQLite at.swamp/_extension_catalog.db) passed from CLI startup to all loadersPerformance
Before/after on
swamp vault type search --jsonwith@swamp/aws-smpulled:Test Plan
@swamp/s3-datastore(datastore),@swamp/aws-sm(vault),@john/tensorlake(driver),@webframp/aws/cost-report(report)Closes #1062
🤖 Generated with Claude Code