docs: MDL i18n design proposal#87

Merged
ako merged 2 commits into mendixlabs:main from engalar:worktree-i18n-design
Apr 3, 2026

Conversation

Contributor

@engalar engalar commented Apr 3, 2026

Summary

  • Proposes syntax-layer i18n support for MDL with inline translation map literals { en_US: 'Hello', zh_CN: '你好' }
  • Adds DESCRIBE ... WITH TRANSLATIONS output mode for roundtrip-safe multi-language output
  • Adds SHOW TRANSLATIONS [MISSING] query command for translation coverage auditing

Motivation

MDL currently drops all non-default translations when describing or creating model elements. Multi-language Mendix projects cannot use MDL for full model management without losing translations.

Design

See docs/plans/2026-04-03-mdl-i18n-design.md for the full proposal including:

  • ANTLR grammar extensions
  • AST node design
  • BSON writer merge semantics
  • 6-phase implementation plan
  • Compatibility analysis

Test plan

  • Review design document for completeness
  • Validate grammar rule doesn't conflict with existing widget body { } syntax
  • Confirm BSON merge semantics match Studio Pro behavior

🤖 Generated with Claude Code

Proposes syntax-layer i18n support for MDL: inline translation map
literals `{ en_US: 'Hello', zh_CN: '你好' }`, DESCRIBE WITH TRANSLATIONS
output mode, and SHOW TRANSLATIONS query command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions bot commented Apr 3, 2026

AI Code Review

We need to review this PR. The PR is titled "docs: MDL i18n design proposal" and only adds a design document (docs/plans/2026-04-03-mdl-i18n-design.md). No code changes are present in the diff—only a new Markdown file.

According to the review checklist, I need to check for overlap/duplication, MDL syntax design (if syntax is added), full-stack consistency, test coverage, security/robustness, scope/atomicity, code quality, and bugs/correctness.

Since this PR only adds a design document and no actual code or syntax changes, many checklist items are not applicable:

  • Overlap & duplication: The design document is new; I should check if there's an existing proposal covering the same functionality. The proposals list in the context shows no proposals directory, so likely no duplication. However, the design document itself might duplicate existing documentation, but it's a plan, so probably not.

  • MDL syntax design: The PR does not add or modify MDL syntax (no grammar changes, no AST changes, etc.). The design document proposes syntax, but since it's not implemented yet, there's nothing to review against the syntax design checklist. The checklist items for MDL syntax design only apply if the diff adds or modifies MDL syntax. Here, the diff is only a doc file, so skip.

  • Full-stack consistency: Similarly, since no MDL feature is implemented, no need to check grammar, AST, visitor, etc.

  • Test coverage: No new code, so no test files needed.

  • Security & robustness: No code changes, so no issues.

  • Scope & atomicity: The PR is scoped to a single concern: documenting a design proposal. It's atomic.

  • Code quality: The document should be checked for quality, but the checklist is about code. For documentation, we might check clarity, but the checklist doesn't explicitly cover that. However, the "Documentation" section of the checklist says: "Skills — new features documented in .claude/skills/ ... CLI help — mxcli command help text updated ... Syntax reference — docs/01-project/MDL_QUICK_REFERENCE.md updated ... MDL examples — working examples added to mdl-examples/ ... Site docs — docs-site/src/ pages added or updated". This PR adds a design doc in docs/plans/, which might be acceptable, but I should check if it's the right place.

Looking at the project structure, there's a docs/ directory, and within it, plans/ seems to be for design proposals (as seen by the path docs/plans/2026-04-03-mdl-i18n-design.md). The context mentions "docs/11-proposals/" in the overlap & duplication checklist, but in the actual repo, it seems proposals are in docs/plans/ (since the PR adds there). The context says: "Check docs/11-proposals/ for existing proposals covering the same functionality" — but in the diff, we see docs/plans/, not docs/11-proposals/. Perhaps docs/11-proposals/ is a placeholder or the actual location is docs/plans/. In the CLAUDE.md, under "Documentation" in the checklist, it doesn't specify where design docs go, but the PR is adding to docs/plans/, which seems reasonable.

However, I should verify if there's an existing proposal for i18n. The context says: "Proposals in repo: No proposals directory." But that might be outdated or referring to a different location. Since the PR adds to docs/plans/, and the overlap checklist says to check docs/11-proposals/, I need to interpret that. Possibly docs/11-proposals/ is a typo or the actual folder is docs/plans/ (as 11 might be a section number). In many projects, proposals are in docs/proposals/ or similar. Given that the PR is adding to docs/plans/, and the context mentions docs/11-proposals/, I'll assume that's where proposals live, and the PR is placing it correctly.

Since the PR only adds a design document and no implementation, there are no critical issues from a code perspective. However, I should check if the design document itself has any issues that might violate the checklist indirectly.

For example, under "Design docs match the actual implementation" in code quality, but since there's no implementation yet, it's not applicable. The design document should be feasible and follow guidelines.

Let me skim the design document for any red flags:

  • It proposes inline translation map literals: { en_US: 'Hello', zh_CN: '你好' }
  • It uses AS for name mappings? No, in the translation map, it's en_US: 'Hello', which is a key-value pair with colon, not AS. The design says: "translationEntry : IDENTIFIER ':' STRING_LITERAL" — so identifier (language code) colon string literal. This is correct per the guideline: colon for property definitions (Key: value). Language codes are identifiers, so en_US is an IDENTIFIER token, which is fine.
  • It adds DESCRIBE ... WITH TRANSLATIONS — follows SHOW/DESCRIBE pattern.
  • Adds SHOW TRANSLATIONS [MISSING] — uses SHOW, which is standard.
  • The design seems to follow the syntax guidelines: reads as English, qualified names used elsewhere (though not shown in snippets), property format with colon, etc.

Potential issues:

  • The design mentions that widget-level translations are not indexed in the catalog strings table, and a follow-up task is needed. That's acceptable as out of scope.
  • It says "No syntax for deleting a translation (use Studio Pro)." — that's fine.
  • The BSON writer merge semantics: for bare strings, update only default language; for maps, merge (preserve unspecified languages). This seems correct.

But since this is just a design proposal, and no code is changed, the main thing is to ensure the design document is in the right place and doesn't duplicate existing work.

The overlap checklist: "Check docs/11-proposals/ for existing proposals covering the same functionality". Since the context says "Proposals in repo: No proposals directory.", but the PR is adding to docs/plans/, I think there might be a mismatch. Perhaps in this repo, proposals are in docs/plans/ and the checklist refers to that. Or maybe docs/11-proposals/ is a specific file? Unlikely.

Given that the PR body says it's a design proposal and links to the design doc, and no code is changed, this PR is likely just for review of the design before implementation.

In such cases, the review should focus on whether the design is sound and ready for implementation, but the checklist is geared towards code changes.

However, the instructions say: "Review against ALL of the following checklist items". For items that don't apply (like grammar rules when none are added), I should skip them or note that they are not applicable.

But the review structure requires: "Only list sections that have findings — skip empty sections."

So for sections where there are no issues (because no code changed), I should not list them.

Let me go through each checklist section:

Overlap & duplication

  • Check proposals list: The context says "Proposals in repo: No proposals directory." but the PR adds to docs/plans/. Assuming docs/11-proposals/ is meant to be docs/plans/ or similar, I should check if there's an existing i18n proposal. Since the PR is the first one, likely no duplication. However, to be thorough, I should note that if there is an existing proposal, it would be a problem, but I don't see one. So no finding here.

MDL syntax design

  • The PR does not add or modify MDL syntax (diff only touches a doc file), so this section is not applicable. Skip.

Full-stack consistency

  • Similarly, no MDL feature implemented, so skip.

Test coverage

  • No new code, so skip.

Security & robustness

  • No code changes, so skip.

Scope & atomicity

  • The PR is scoped to a single feature concern (i18n design proposal). It's atomic. No issue.

Code quality

  • For the design document itself: is it well-written? The checklist item "Design docs should match the actual implementation" — but there is no implementation yet, so it's forward-looking. However, the design should be detailed enough to be implementable. The design document seems comprehensive. No obvious code quality issues since it's not code.

Bugs & correctness

  • Not applicable.

Given that, there might be no findings. But I should check if the design document has any flaws that would violate the checklist if implemented.

For example, under MDL syntax design, if the proposed syntax violates guidelines, that would be a moderate issue even if it is not implemented yet.

Let me evaluate the proposed syntax against the guidelines:

  • Follows standard CREATE/ALTER/DROP/SHOW/DESCRIBE pattern: The design adds DESCRIBE ... WITH TRANSLATIONS and SHOW TRANSLATIONS [MISSING]. These use standard verbs. Good.

  • Uses Module.Element qualified names everywhere: The design doesn't show examples with qualified names, but it implies that in DESCRIBE, you'd do DESCRIBE PAGE Module.MyPage, so qualified names are used. The translation map syntax itself doesn't involve qualified names, but that's fine as it's a value.

  • Property lists use ( Key: value, ... ) format: The translation map uses { lang: 'text', ... } which is similar but with braces instead of parentheses. However, the design says it's for property values, and the guideline says property lists use parentheses. But wait: the guideline is for property definitions in statements like CREATE ENTITY ( Name: 'Customer' ). For values that are maps, the design proposes using curly braces.

Check the design guideline from the context: "Property format — uses ( Key: value, ... ) with colon separators, one per line"

This is for property lists in statements. However, for values that are themselves maps (like translation maps), what should the syntax be?

In the PR body, it says: "New MDL syntax must follow design guidelines: reads as English (target audience is citizen developers), uses standard CRUD verbs (CREATE/ALTER/DROP/SHOW/DESCRIBE), consistent property format (Key: value) for property definitions, AS for name-to-name mappings (old AS new), qualified names (Module.Element), no keyword overloading, no symbolic syntax. IMPORTANT: colon is for property definitions (Key: value), AS is for name-to-name mappings — do not flag AS in mapping contexts as a syntax violation."

So, for property definitions (i.e., after a colon in a statement), the format should be Key: value. But when the value is complex, like a map, what should it be?

In the example given in the design:

Title: {
  en_US: 'Hello World',
  zh_CN: '你好世界',
  nl_NL: 'Hallo Wereld'
}

Here, Title: is the property definition (using colon), and the value is a map literal enclosed in braces.

The guideline says "consistent property format (Key: value) for property definitions", which is satisfied because it's Title: { ... }. The value being a map is acceptable as long as the map syntax follows the guidelines.

Now, does the map syntax follow the guidelines?

  • It uses { ... } with key: value pairs inside, separated by commas.
  • The keys are language codes (identifiers like en_US), values are string literals.
  • It uses colon for key-value pairs inside the map, which is consistent with the property definition colon usage.
  • It does not use AS inside the map (AS is for name mappings, not for key-value pairs in data literals).
  • It reads as English: "Title has English 'Hello World', Chinese '你好世界', etc."
  • Qualified names: not applicable here as keys are language codes, not module elements.
  • No keyword overloading: the colon inside the map is used for map key-value, same as for property definitions — is that overloading?

Ah, here's a potential issue: the colon is used in two contexts:

  1. In property definitions: Title: { ... } — colon separates property name from value.
  2. Inside the map: en_US: 'Hello World' — colon separates map key from value.

The guideline says: "colon is for property definitions (Key: value), AS is for name-to-name mappings". It implies that colon should only be used for property definitions, not for other purposes.

But in many languages, colon is used for key-value pairs in maps/objects (like JSON). However, the design guideline might be trying to avoid overloading.

Let me check existing MDL syntax for similar constructs.

From the context, I know that MDL uses { } for widget bodies, but that's different.

For example, in pages, you might have properties like Caption: 'Save', but not map values.

Is there any existing MDL syntax that uses colon inside a value?

Looking at the doctype test files mentioned:

  • 01-domain-model-examples.mdl
  • etc.

I don't have the content, but from general knowledge of MDL (as per the project), MDL is SQL-like, and I don't recall map literals existing.

However, the design guideline explicitly says: "no symbolic syntax" and "colon is for property definitions".

This suggests that using colon inside a value for map keys might be considered symbolic syntax or overloading.

But the design also says: "AS is for name-to-name mappings (old AS new)" — implying that AS has a specific use, and colon another.

In the translation map, if we used AS, it might look like { en_US AS 'Hello World', ... } but that would be weird because AS is for mapping names, not for associating keys with values in a data structure.

The design guideline says: "do not flag AS in mapping contexts as a syntax violation" — meaning when AS is used for renaming (like oldName AS newName), it's ok, but if used elsewhere, it's bad.

For the translation map, using key: value inside braces seems natural.

But to comply with "colon is for property definitions", perhaps the map should use a different syntax.

Alternative syntaxes:

  • Using AS: { en_US AS 'Hello World', zh_CN AS '你好' } — but AS is meant for renaming identifiers, not for key-value assignment in data. This might confuse users into thinking it's renaming the language code.
  • Using equals: { en_US = 'Hello World', ... } — but equals might be confused with assignment or comparison.
  • Using a list of pairs: [ ('en_US', 'Hello World'), ... ] — but that uses parentheses and quotes, and might not read as English.

The design guideline emphasizes "reads as English". The proposed { en_US: 'Hello World' } reads as "en_US colon Hello World", which is a bit technical, but common in programming.

Perhaps it's acceptable.

Let me see if there's any existing use of colon in values in MDL.

From the CLAUDE.md, under "Mendix Expression String Escaping", it talks about escaping in expressions, but that's different.

In the microflow examples, you might have expressions like $object/Attribute == 'value', but no map literals.

Another thought: in the ALTER PAGE SET widget properties, you set properties like Caption: 'Save', so the value is a string. If you wanted a translation, you'd need a way to specify multiple languages.

The design proposal seems reasonable.

However, to strictly adhere to "colon is for property definitions", one could argue that inside a value, colon should not be used for other purposes.

But the guideline might mean that colon should not be used as a symbolic operator in expressions, but for data literals, it's ok.

The design guideline says: "no symbolic syntax" — and gives examples like no overloading, no symbols for domain operations.

Using colon in a map literal might be considered symbolic syntax.

But let's look at the "IMPORTANT" note: "colon is for property definitions (Key: value), AS is for name-to-name mappings — do not flag AS in mapping contexts as a syntax violation."

This implies that colon has a specific reserved use: for property definitions in statements. Therefore, using colon elsewhere (like in data literals) would be incorrect.

If that's the case, then the proposed translation map syntax violates the guideline.

How else could we specify a translation map?

Perhaps using a function-like syntax: translations(en_US: 'Hello World', zh_CN: '你好') — but that introduces a function call, which might not read as English.

Or using AS in a different way: but AS is for name mappings.

Another idea: treat the translation map as a set of properties, but that doesn't fit.

Perhaps the value should be a string that gets parsed, but that seems messy.

Given the context, I recall that in some MDL contexts, there are complex values.

For example, in expressions, you have operators, but those are symbolic.

But for data literals, MDL might only support simple literals so far.

The design proposal introduces a new kind of literal.

To comply with the guideline, perhaps the map should use a syntax like:
{ 'en_US': 'Hello World', 'zh_CN': '你好' } with quotes around keys, but that still uses colon.

Or maybe use a list: LANGUAGE en_US 'Hello World', LANGUAGE zh_CN '你好' — but that's verbose.

The design guideline emphasizes "LLM-friendly — one example is sufficient for an LLM to generate correct variants" and "Diff-friendly — adding one property is a one-line diff".

The proposed map syntax is diff-friendly: adding a language is one line.

If we had to use a different syntax, it might not be.

But let's see what the checklist says for MDL syntax design: it includes "Colon for property definitions (Key: value), AS for name-to-name mappings ('oldName' AS 'newName') — do not flag AS in mapping/renaming contexts as wrong"

This suggests that in the context of property definitions, colon is used, and AS is used mapping


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

Collaborator

@ako ako left a comment

Review

This is a design-doc-only PR (210 lines, single file). The proposal is well-structured with clear phasing and compatibility analysis. Here's my assessment, focusing on the overlap you mentioned.

Overlap with existing functionality

The proposal has significant overlap with features already merged to main:

| Proposed feature | Already exists on main |
|---|---|
| SHOW TRANSLATIONS MISSING | SHOW LANGUAGES: lists all languages with string counts (commit a060152) |
| Translation coverage auditing | QUAL005 MissingTranslations linter rule: detects elements missing translations in languages used by the project |
| Catalog strings with language data | strings FTS5 table already stores per-language translations with ElementId for uniqueness (commit 50b1e79) |

The proposal's SHOW TRANSLATIONS command (Section 3) is essentially a pivot view of data already available via SELECT * FROM CATALOG.strings WHERE Language = 'nl_NL'. The MISSING filter is what the QUAL005 linter rule does. The proposal should reference these existing features and explain what value the new command adds beyond what's already possible.
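
To make the overlap concrete, here is a minimal sketch in Python that uses an in-memory SQLite table as a stand-in for the catalog. The table name `strings` and the columns `ElementId`, `Language`, and `Text` are assumptions based on the names mentioned above; the real catalog is an FTS5 table and its schema may differ. The sketch only illustrates that a "missing translations" view can be derived from per-language rows with an ordinary query.

```python
import sqlite3


def missing_translations(conn):
    """Return (element_id, language) pairs that have no row in `strings`.

    Assumed schema: strings(ElementId, Language, Text). A language counts as
    "used by the project" if at least one element has text in it.
    """
    query = """
        SELECT e.ElementId, l.Language
        FROM (SELECT DISTINCT ElementId FROM strings) AS e
        CROSS JOIN (SELECT DISTINCT Language FROM strings) AS l
        LEFT JOIN strings AS s
               ON s.ElementId = e.ElementId AND s.Language = l.Language
        WHERE s.ElementId IS NULL
        ORDER BY e.ElementId, l.Language
    """
    return conn.execute(query).fetchall()


def build_demo_catalog():
    """Create a tiny hypothetical catalog with one gap (no nl_NL for the button)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE strings (ElementId TEXT, Language TEXT, Text TEXT)")
    conn.executemany(
        "INSERT INTO strings VALUES (?, ?, ?)",
        [
            ("page1.title", "en_US", "Hello"),
            ("page1.title", "nl_NL", "Hallo"),
            ("page1.button", "en_US", "Save"),
        ],
    )
    return conn
```

Calling `missing_translations(build_demo_catalog())` reports the button as missing its `nl_NL` text, which is roughly the gap detection that QUAL005 is described as performing.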

What's genuinely new

The unique contributions are:

  1. Inline translation map literals { en_US: 'Hello', zh_CN: '你好' } — this is the core value. No existing syntax supports writing multi-language strings.
  2. DESCRIBE WITH TRANSLATIONS — showing all languages in DESCRIBE output.
  3. Merge semantics for bare strings — preserving existing translations when writing a single-language string.

These are valuable and well-designed.

Design feedback

Grammar ambiguity risk is real

The proposal acknowledges { ambiguity with widget body blocks but dismisses it:

Grammar context: translatedText only appears in property value position, not statement position. Widget bodies follow ) not :.

This isn't quite right. Consider pluggable widget properties (from PR #68):

PLUGGABLEWIDGET 'com.example.Widget' myWidget (
    Caption: { en_US: 'Hello' }     -- translation map
) {                                   -- widget body
    CONTAINER content { ... }
}

The parser sees Caption: { and must decide: is { starting a translation map or a widget body? Since Caption: uses : and widget bodies follow ), this particular case is disambiguated. But what about:

keyword: { en_US: 'text' }

vs a hypothetical future syntax where { follows : for other purposes? The proposal should add a concrete ANTLR rule showing how translatedText is integrated into the existing propertyValueV3 rule, not just the standalone rule.

Merge semantics need more detail

Bare string 'text' writes to the project's DefaultLanguageCode. Languages not mentioned are preserved.

This implies every bare-string write requires a read-modify-write cycle:

  1. Read existing Texts$Text from MPR
  2. Find/update the default language entry
  3. Preserve other entries
  4. Write back

Currently, the writer constructs Texts$Text from scratch. This would be a significant change to all writer functions listed in the proposal. The implementation phases should call this out — it's not a trivial change and affects the core writer architecture.
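
The four steps above can be sketched as follows. This is an illustrative Python model, not the project's writer code: the list of `{"LanguageCode": ..., "Text": ...}` dicts is an assumed, simplified stand-in for the items inside a Texts$Text value, and the function names are hypothetical.

```python
def merge_translation(existing_items, translations):
    """Merge new translations into existing items without dropping any language.

    existing_items: list of dicts such as {"LanguageCode": "en_US", "Text": "Hi"}
                    (assumed shape, standing in for the stored translation items).
    translations:   dict mapping language code -> new text.
    Returns a new list; the input list is not modified.
    """
    # Step 1: read the existing entries into a language -> text mapping.
    merged = {item["LanguageCode"]: item["Text"] for item in existing_items}
    # Steps 2 and 3: update or insert the given languages, keep the rest as-is.
    merged.update(translations)
    # Step 4: produce the items to write back (existing order, new codes last).
    return [{"LanguageCode": code, "Text": text} for code, text in merged.items()]


def write_bare_string(existing_items, text, default_language):
    """A bare string only touches the project's default language."""
    return merge_translation(existing_items, {default_language: text})
```

For example, writing the bare string `'Hi'` with `en_US` as the default language over items for `nl_NL` and `en_US` changes only the `en_US` entry and leaves `nl_NL` untouched, which is the behavior a from-scratch writer cannot provide.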

Sort translations by language code

The proposal mentions this as a risk mitigation but doesn't include it in the design. Make it explicit: DESCRIBE output and BSON serialization should sort translations by language code for deterministic output.
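
A small sketch of what deterministic ordering means in practice, written in Python with hypothetical helper names. It renders a translation map in the proposed literal form with keys sorted by language code, so the same set of translations always produces the same text. Quote escaping inside the text is deliberately left out.

```python
def sort_translations(translations):
    """Return (language_code, text) pairs sorted by language code."""
    return sorted(translations.items(), key=lambda pair: pair[0])


def render_translation_map(translations):
    """Render a dict as "{ code: 'text', ... }" with keys in sorted order.

    Note: this sketch does not escape quotes inside the text.
    """
    pairs = sort_translations(translations)
    body = ", ".join(f"{code}: '{text}'" for code, text in pairs)
    return "{ " + body + " }"
```

Because the order no longer depends on insertion order, two DESCRIBE runs over the same model would diff cleanly.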

Phase plan assessment

The 6-phase plan is reasonable, but:

  • P5 (SHOW TRANSLATIONS) should be dropped or scoped down to "show gaps not covered by QUAL005" — the linter already does gap detection, and SHOW LANGUAGES already shows the language inventory.
  • P3 (DESCRIBE WITH TRANSLATIONS) is the highest user-value feature and could be done without P1/P2 (read-only, no grammar change needed).
  • P4 (Writer merge semantics) is the riskiest phase — I'd flag it as requiring careful testing against Studio Pro.

Minor

  • The "Author: @anthropics/claude-code" attribution is unusual for a design doc. Should this be the human author?
  • The translatable fields inventory is useful but should note that SHOW LANGUAGES already covers most of these through the catalog strings table.
  • The proposal doesn't mention how ALTER PAGE SET would work with translation maps — this is an important use case for in-place translation updates.

Recommendation

The core idea (translation map literals + DESCRIBE WITH TRANSLATIONS) is sound and fills a real gap. The proposal should be updated to:

  1. Reference the existing SHOW LANGUAGES, QUAL005, and strings catalog table
  2. Drop or re-scope SHOW TRANSLATIONS to avoid duplicating existing functionality
  3. Add a concrete ANTLR integration example showing how translatedText fits into propertyValueV3
  4. Detail the read-modify-write implications for bare-string merge semantics

- Drop SHOW TRANSLATIONS (overlaps with SHOW LANGUAGES + QUAL005)
- Add concrete ANTLR integration into propertyValueV3
- Detail read-modify-write impact on writer architecture
- Reorder phases: P1 = DESCRIBE WITH TRANSLATIONS (highest value, zero risk)
- Add ALTER PAGE SET + translation map interaction
- Sort translations by language code for deterministic output
- Fix author attribution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor Author

@engalar engalar left a comment

Updated the design doc addressing all points. Summary of changes:

SHOW TRANSLATIONS dropped. SHOW LANGUAGES + QUAL005 + SELECT ... FROM CATALOG.strings already cover translation auditing. Removed from scope entirely.

Concrete ANTLR integration added. New translationMap rule integrates into existing propertyValueV3 as a new alternative. Disambiguation analysis: translationMap only appears after : or = in property definitions; widgetBodyV3 follows ) at statement level. No ambiguity in the grammar — the parser context resolves it unambiguously. See updated Section 1.
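
As an illustration of the value-position argument, here is a minimal hand-written parser in Python. It is not the ANTLR grammar from the design doc: the token shapes (identifier keys, single-quoted strings without escapes) and all function names are assumptions made for this sketch. It shows that once the parser is already in property-value position, a leading `{` can be dispatched to a translation map without looking at widget bodies at all.

```python
import re

# Assumed token shapes: identifier, single-quoted string (no escapes), punctuation.
_TOKEN = re.compile(
    r"\s*(?:(?P<ident>[A-Za-z_][A-Za-z0-9_]*)"
    r"|'(?P<string>[^']*)'"
    r"|(?P<punct>[{}:,]))"
)


def tokenize(source):
    """Split source into (kind, value) tokens; raise on unexpected input."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = _TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


def parse_translation_map(source):
    """Parse "{ en_US: 'Hello', zh_CN: '...' }" into a dict."""
    tokens = tokenize(source)
    if not tokens or tokens[0] != ("punct", "{") or tokens[-1] != ("punct", "}"):
        raise ValueError("translation map must be wrapped in { }")
    inner, result, i = tokens[1:-1], {}, 0
    while i < len(inner):
        entry = inner[i:i + 3]
        kinds = [kind for kind, _ in entry]
        if kinds != ["ident", "punct", "string"] or entry[1][1] != ":":
            raise ValueError("expected an entry of the form code: 'text'")
        result[entry[0][1]] = entry[2][1]
        i += 3
        if i < len(inner):
            if inner[i] != ("punct", ","):
                raise ValueError("expected ',' between entries")
            i += 1
    return result


def parse_property_value(source):
    """In value position, '{' starts a translation map; otherwise expect a bare string."""
    tokens = tokenize(source)
    if tokens and tokens[0] == ("punct", "{"):
        return parse_translation_map(source)
    if len(tokens) == 1 and tokens[0][0] == "string":
        return tokens[0][1]
    raise ValueError("expected a string or a translation map")
```

With this sketch, `parse_property_value("'Save'")` yields a plain string and `parse_property_value("{ en_US: 'Hello' }")` yields a dict, mirroring the bare-string versus map distinction in the proposal.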

Read-modify-write impact detailed. Current writers construct Texts$Text from scratch. Bare-string merge semantics require a shared mergeTranslation(existingBSON, translations) helper that reads existing items, updates/inserts the target language, and preserves others. This touches 11+ call sites across 7 writer files. Flagged as high-risk P4 phase requiring Studio Pro validation.

Phase order revised. P1 is now DESCRIBE WITH TRANSLATIONS (read-only, no grammar change, highest user value). Grammar/AST/Visitor are P2-P3. Writer merge semantics are P4 (highest risk).

ALTER PAGE SET interaction added. Translation maps work in ALTER PAGE SET via propertyValueV3 — no additional grammar changes needed (Section 3).

Translation ordering made explicit. Items array sorted by language code for deterministic BSON output.

Author attribution fixed.

github-actions bot commented Apr 3, 2026

AI Code Review

Critical Issues

  • PR body and design document inconsistency: PR body claims "SHOW TRANSLATIONS [MISSING] query command" is added, but design document explicitly states it was dropped ("Dropped: SHOW TRANSLATIONS command — SHOW LANGUAGES + QUAL005 + SELECT ... FROM CATALOG.strings already cover translation auditing"). This misalignment must be resolved.
  • Missing LSP/VS Code extension considerations: Design document does not address how new syntax (translation maps, WITH TRANSLATIONS clause) will be supported in the Language Server Protocol or VS Code extension (syntax highlighting, completions, diagnostics), which is required for full-stack consistency per checklist.

Moderate Issues

  • Language code identifier limitation: Grammar uses IDENTIFIER for translation map keys (e.g., en_US), which prohibits hyphens. Language codes like zh-Hans (valid in Mendix) would be invalid as identifiers. Design document does not specify how to handle non-identifier language codes.
  • Empty translation map behavior: Unclear semantics for { } (e.g., does it preserve all translations, clear default language, or cause error?). Design document lacks specification.
  • Default language interaction ambiguity: When using a translation map that omits the default language code, it's unclear whether the default language is preserved (as implied by "languages not mentioned are preserved") or if the map should implicitly include the default language.
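
The identifier limitation in the first item can be illustrated with a short Python check. The regular expression below is an assumed, typical shape for an IDENTIFIER token (a letter or underscore followed by letters, digits, or underscores); the actual lexer rule in the MDL grammar may differ.

```python
import re

# Assumed IDENTIFIER shape; the real MDL lexer rule may be different.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(code):
    """Return True if the whole language code would lex as one identifier."""
    return IDENTIFIER.fullmatch(code) is not None
```

Under that assumption `is_identifier("en_US")` is true while `is_identifier("zh-Hans")` is false, so hyphenated codes would need either quoted keys or a dedicated token.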

Minor Issues

  • No test plan in design document: PR body lists test items (design review, grammar validation, BSON semantics), but design document itself lacks a dedicated test coverage section as implied by checklist.
  • Incomplete translatable fields inventory: Design document lists widget caption/placeholder but omits other translatable properties like tooltip, validation messages, or dynamic labels that may use Texts$Text.

What Looks Good

  • Thorough grammar analysis

Automated review via OpenRouter (Nemotron Super 120B) — workflow source

@engalar engalar requested a review from ako April 3, 2026 07:28
@ako ako merged commit 5e6dfca into mendixlabs:main Apr 3, 2026
1 of 2 checks passed

2 participants