Add feature to make studies from embedded images#538
Add feature to make studies from embedded images#538pradeeban merged 4 commits intoControlCore-Project:devfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds first-class support for generating Concore studies directly from embedded-image artifacts (PNG/JPG/JPEG) by extracting embedded GraphML and wiring that support into the mkconcore/makestudy workflow, addressing #537.
Changes:
- Introduces
tools/extract_graphml.pyto extract embedded GraphML from PNG chunks (tEXt/zTXt/iTXt) or raw XML spans in JPEGs. - Extends
mkconcore.pyto accept image inputs, extract GraphML to a temporary.graphml, and proceed with the existing pipeline. - Updates
makestudyandmakestudy.batto better handle double-extensions like*.graphml.png/*.graphml.jpgwhen deriving the study directory name.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tools/extract_graphml.py | New GraphML-extraction utility for PNG/JPEG embedded metadata/content. |
| mkconcore.py | Adds image-input handling by extracting embedded GraphML to a temp file before processing. |
| makestudy.bat | Allows .png/.jpg/.jpeg inputs and adjusts name handling for *.graphml.* filenames. |
| makestudy | Adjusts study directory name derivation for double-extension inputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tools/extract_graphml.py
Outdated
| def _extract_from_png(data: bytes) -> str | None: | ||
| #Return the GraphML string embedded in a PNG file, or None |
There was a problem hiding this comment.
pyproject.toml sets requires-python = ">=3.9", but this module uses the PEP 604 union syntax (str | None), which is a Python 3.10+ feature and will raise a SyntaxError on 3.9. Please switch these annotations to Optional[str] / Optional[...] (or add from __future__ import annotations and avoid |), so the script remains runnable on the supported Python versions.
There was a problem hiding this comment.
I have fixed this right now as per the copilot's suggestion, but later when we decide to update the python version, the current syntax will be better in my opinion
tools/extract_graphml.py
Outdated
| def extract_graphml_to_file(image_path: str, output_path: str | None = None) -> str | None: | ||
| graphml = extract_graphml(image_path) |
There was a problem hiding this comment.
Same Python version-compatibility issue here: output_path: str | None / return str | None requires Python 3.10+, but the project supports Python 3.9+. Use Optional[str] etc. to avoid syntax errors on 3.9.
mkconcore.py
Outdated
| _tool_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "tools") | ||
| sys.path.insert(0, _tool_dir) | ||
| from extract_graphml import extract_graphml as _extract_graphml |
There was a problem hiding this comment.
This change imports tools/extract_graphml.py by constructing a tools path relative to mkconcore.py. In the packaged distribution, pyproject.toml only declares py-modules = ["concore", "concoredocker", "concore_base", "mkconcore"] (no tools), so an installed mkconcore is likely to fail with ModuleNotFoundError when passed a PNG/JPEG. Consider moving extract_graphml.py into a shipped module/package (or adding it to py-modules / package data) and importing it without relying on a non-packaged tools/ directory.
mkconcore.py
Outdated
| _tool_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "tools") | ||
| sys.path.insert(0, _tool_dir) | ||
| from extract_graphml import extract_graphml as _extract_graphml |
There was a problem hiding this comment.
Modifying sys.path at runtime (sys.path.insert(0, _tool_dir)) can have unintended side effects on later imports in this long-running script (it changes module resolution globally and can shadow stdlib/third-party modules). Prefer importing extract_graphml via importlib.util.spec_from_file_location using an explicit file path, or at least remove the inserted path after the import.
There was a problem hiding this comment.
nice catch, fixed!
| _tmp_graphml_file = tempfile.NamedTemporaryFile( | ||
| mode="w", suffix=".graphml", delete=False, encoding="utf-8" | ||
| ) | ||
| _tmp_graphml_file.write(_graphml_str) | ||
| _tmp_graphml_file.close() | ||
| GRAPHML_FILE = _tmp_graphml_file.name | ||
| import atexit as _atexit | ||
| _atexit.register(os.unlink, GRAPHML_FILE) |
There was a problem hiding this comment.
The temp file cleanup uses atexit.register(os.unlink, GRAPHML_FILE). If the file is already removed (or never created due to a partial failure after creation), os.unlink will raise at process exit and can produce noisy shutdown output. Consider registering a small cleanup wrapper that ignores FileNotFoundError.
There was a problem hiding this comment.
This issue exists before this PR also for normal graphml, I can raise a separate PR for all together, not necessary to fix right now
| # --- Image input support: extract embedded GraphML from PNG/JPG --- | ||
| _IMAGE_EXTS = {".png", ".jpg", ".jpeg"} | ||
| _tmp_graphml_file = None # keep reference so the temp file isn't deleted early | ||
| if os.path.splitext(GRAPHML_FILE)[1].lower() in _IMAGE_EXTS: |
There was a problem hiding this comment.
Now that image inputs are supported, the script’s usage: py mkconcore.py file.graphml ... message (earlier in the file) is misleading. Consider updating the usage/help text to mention .png/.jpg/.jpeg so users don’t assume only .graphml works.
There was a problem hiding this comment.
I will document this properly later in guide.md.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
@pradeeban @Rahuljagwani @mayureshkothare please have a look, I have gone through all of the copilot's suggestion and implemented those which are impactful! |
tools/extract_graphml.py: Added a new utility to detect bytes and safely extract GraphML from tEXt, zTXt, iTXt PNG structures or raw JPEG streams (basically this will decode the embedded images and extract graphml)
updated makestudy.bat so that it accepts .png, .jpg, and .jpeg sequences as primary mkconcore.py targets.
This fixes #537