Is your feature request related to a problem? Please describe.
Currently, CLDK's support for analyzing Python source code via codeanalyzer-python is limited in comparison to the Java capabilities available in codeanalyzer-java. While the Java backend supports deep semantic extraction including symbol tables, call graphs, control/data flow graphs, CRUD operation detection, and framework-specific analyses, the Python counterpart lacks many of these features. This asymmetry makes it hard to build language-agnostic tooling and limits reuse of the CLDK analysis stack for Python-heavy projects.
Describe the solution you'd like
I’d like CLDK to offer parity between its Java and Python analysis capabilities, ideally exposing the same abstraction surfaces. This includes:
- A fully-featured PyApplication schema with semantic depth matching JApplication
- Robust symbol resolution using jedi, LSP, and other tools to handle module/function/class-level scoping, type inference, imports, and dynamic features
- Generation of call graphs (direct + transitive), with support for async and dynamic callsites
- Control and data flow graphs at the function and module level
- Basic interprocedural analysis across modules and packages
- CRUD operation detection and annotation in frameworks like Django, SQLAlchemy, FastAPI, etc.
- Support for incremental/eager analysis and caching, as in the Java analyzer
- Compatibility with existing CLDK pipelines like test generation, transformation, and migration agents
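To make the request more concrete, here is a minimal sketch of what a slice of such a schema could look like, using only the standard library `ast` module. Everything except the name `PyApplication` is an assumption made for illustration: the `PyCallable` class, its fields, the `analyze` helper, and the name-based edge matching are hypothetical and are not the existing codeanalyzer-python API.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class PyCallable:
    """Hypothetical schema node describing one function or method."""
    module: str
    qualified_name: str
    start_line: int
    is_async: bool
    call_sites: list[str] = field(default_factory=list)  # callee text, unresolved


@dataclass
class PyApplication:
    """Hypothetical top-level schema; the fields are illustrative only."""
    symbol_table: dict[str, PyCallable] = field(default_factory=dict)

    def call_graph(self) -> set[tuple[str, str]]:
        """Direct edges whose callee text names a known callable in the same module."""
        edges = set()
        for caller in self.symbol_table.values():
            for callee in caller.call_sites:
                target = f"{caller.module}.{callee}"
                if target in self.symbol_table:
                    edges.add((caller.qualified_name, target))
        return edges


class _Collector(ast.NodeVisitor):
    def __init__(self, module: str) -> None:
        self.module = module
        self.scope: list[str] = []
        self.callables: dict[str, PyCallable] = {}

    def _visit_scope(self, node):
        # Track class/function nesting so names come out as module.Class.method.
        self.scope.append(node.name)
        if not isinstance(node, ast.ClassDef):
            name = ".".join([self.module, *self.scope])
            self.callables[name] = PyCallable(
                self.module, name, node.lineno, isinstance(node, ast.AsyncFunctionDef)
            )
        self.generic_visit(node)
        self.scope.pop()

    visit_ClassDef = visit_FunctionDef = visit_AsyncFunctionDef = _visit_scope

    def visit_Call(self, node):
        # Attribute the call to the innermost enclosing function, if there is one.
        owner = ".".join([self.module, *self.scope])
        if owner in self.callables:
            self.callables[owner].call_sites.append(ast.unparse(node.func))
        self.generic_visit(node)


def analyze(source: str, module: str) -> PyApplication:
    collector = _Collector(module)
    collector.visit(ast.parse(source))
    return PyApplication(symbol_table=collector.callables)


SOURCE = '''
async def fetch(url):
    return await download(url)

def download(url):
    return url

class Repo:
    def save(self, item):
        download(item)
        self.flush()
    def flush(self):
        pass
'''

app = analyze(SOURCE, "demo")
for edge in sorted(app.call_graph()):
    print(edge)
```

Note that `self.flush()` is recorded as a call site but produces no edge, because matching on callee text cannot tell what `self` refers to. That gap is what the symbol resolution and type inference items in the list above are meant to close.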
Describe alternatives you've considered
- Wrapping third-party static analysis tools like Pyright, Pyre, or mypy: these often lack a consistent, unified output schema
- Rewriting analyses from scratch for Python: not ideal due to duplication and maintenance burden
- Falling back on AST-based pattern matching without semantic resolution: brittle and low precision
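The brittleness of the last alternative can be shown with a small sketch. The helper `find_calls_by_name` is hypothetical and written only for this illustration; it matches calls by the literal text of the callee, which is roughly what a purely syntactic pattern matcher does.

```python
import ast


def find_calls_by_name(source: str, dotted_name: str) -> list[int]:
    """Return line numbers of calls whose callee text equals dotted_name exactly."""
    return [
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and ast.unparse(node.func) == dotted_name
    ]


SNIPPET = '''
import os
from os import remove as rm

os.remove("a.txt")
rm("b.txt")
delete = os.remove
delete("c.txt")
'''

hits = find_calls_by_name(SNIPPET, "os.remove")
print(hits)  # [5]: only the literal spelling is found
```

All three calls invoke `os.remove`, but the import alias and the local rebinding are missed because nothing resolves the names back to their definitions.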
Additional context
Reference implementations and design can be found in:
The goal is to bring codeanalyzer-python up to parity with codeanalyzer-java in CLDK’s analysis infrastructure. This will enable unified workflows across Java and Python ecosystems and allow tool developers to target a shared intermediate representation for static analysis tasks.