feat(backend): add /api/diagnostics endpoint for environment and dependency health#623
SxBxcoder wants to merge 2 commits into AOSSIE-Org:main
📝 Walkthrough
Added a new Flask GET endpoint
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant Server as Flask Server
participant OS as OS/System
participant Torch as PyTorch (optional)
Client->>Server: GET /api/diagnostics
Server->>OS: gather platform, release, arch, python_version, cpu_count
alt PyTorch import succeeds
Server->>Torch: import torch
Torch-->>Server: torch.__version__, cuda.is_available()
else PyTorch import fails
Torch-->>Server: ModuleNotFoundError / Exception
end
Server-->>Client: JSON diagnostics (200) or 403 if disabled
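The system-information step in the diagram needs nothing beyond the standard library. The following is a minimal sketch, not the PR's actual code: the function name `gather_system_info` is hypothetical, and the key names are assumed to match the expected output quoted in the PR description.

```python
import os
import platform


def gather_system_info() -> dict:
    """Collect the host fields named in the diagram (hypothetical helper)."""
    return {
        "os": platform.system(),              # e.g. "Windows", "Linux", "Darwin"
        "release": platform.release(),
        "architecture": platform.machine(),   # e.g. "AMD64", "x86_64", "arm64"
        "python_version": platform.python_version(),
        "cpu_count": os.cpu_count(),          # may be None if it cannot be determined
    }
```

Note that `os.cpu_count()` can return `None`, so a consumer of the JSON should not assume an integer.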
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 495-528: The diagnostics endpoint system_diagnostics currently
returns detailed host/runtime metadata; gate it behind a configuration flag and
tighten production behavior: add a config check (e.g.,
app.config.get("DIAGNOSTICS_ENABLED", False) or ENV != "production") at the top
of system_diagnostics and return a minimal 403/404 or a small
{"status":"unavailable"} response when disabled, and in production ensure only
non-identifying info (or just status) is returned; update any references to
platform/sys/os/torch usage to only run when diagnostics are enabled to avoid
leaking environment details.
- Around line 518-525: The current try/except only catches ImportError for the
torch import block and can still crash on OSError/RuntimeError or attribute
access failures; update the diagnostics logic around the torch import and
attribute reads (the block that sets
diagnostics["ml_environment"]["pytorch_available"],
diagnostics["ml_environment"]["torch_version"],
diagnostics["ml_environment"]["cuda_available"]) to catch broad exceptions
(catch Exception) and on any failure set diagnostics["status"] = "degraded
(pytorch error)" and record the exception message into diagnostics (e.g.,
diagnostics["ml_environment"]["pytorch_error"]) so broken installs return a
degraded status instead of raising; also guard attribute access
(torch.__version__, torch.cuda.is_available) inside the same exception scope to
avoid unhandled errors.
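The two findings above can be sketched together as a pair of framework-free helpers. This is only an illustration under stated assumptions, not the code in `backend/server.py`: `probe_pytorch`, `build_diagnostics`, and the injectable `importer` parameter are hypothetical names introduced here so the failure paths can be exercised without PyTorch installed. The `"degraded (pytorch error)"` status, the `pytorch_error` key, and the `{"status": "unavailable"}` response follow the wording of the review.

```python
import importlib


def probe_pytorch(importer=importlib.import_module) -> dict:
    """Report PyTorch availability without ever raising (hypothetical helper)."""
    ml = {"pytorch_available": False, "cuda_available": False, "torch_version": None}
    try:
        torch = importer("torch")
        # Attribute reads stay inside the try: a broken install can fail here too.
        version = str(torch.__version__)
        cuda = bool(torch.cuda.is_available())
    except Exception as exc:  # ImportError, OSError, RuntimeError, AttributeError, ...
        ml["pytorch_error"] = f"{type(exc).__name__}: {exc}"
        return ml
    ml.update(pytorch_available=True, cuda_available=cuda, torch_version=version)
    return ml


def build_diagnostics(enabled: bool, importer=importlib.import_module):
    """Return (payload, http_status); probe nothing when diagnostics are disabled."""
    if not enabled:
        return {"status": "unavailable"}, 403
    ml = probe_pytorch(importer)
    status = "degraded (pytorch error)" if "pytorch_error" in ml else "healthy"
    return {"status": status, "ml_environment": ml}, 200
```

In a Flask route, the `enabled` argument could come from `app.config.get("DIAGNOSTICS_ENABLED", False)` as the review suggests, with the returned pair passed to `jsonify` and used as the response status. Returning early before any probing keeps host details out of the response when the flag is off.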
Addressed CodeRabbit's security and resilience feedback in the latest commit.
Pipeline is green and this is ready for maintainer review whenever you have a chance @yatikakain @Aditya062003
Addressed Issues:
Fixes N/A (Proactive backend infrastructure addition to assist maintainers in triaging local setup and dependency failures).
Screenshots/Recordings:
N/A (Backend JSON Endpoint).
Expected Output from GET /api/diagnostics:
{
  "status": "healthy",
  "system": {
    "os": "Windows",
    "release": "10",
    "architecture": "AMD64",
    "python_version": "3.10.11",
    "cpu_count": 8
  },
  "ml_environment": {
    "pytorch_available": true,
    "cuda_available": false,
    "torch_version": "2.1.2+cpu"
  }
}
Additional Notes:
This PR introduces a lightweight /api/diagnostics endpoint to backend/server.py.
Currently, when new contributors (especially those on Windows or machines with 8GB RAM) experience backend crashes during onboarding, maintainers have to guess if the issue is a Python version mismatch, a missing PyTorch wheel, or a CPU/VRAM bottleneck.
This endpoint requires zero new dependencies and provides an instant snapshot of the host's environment. Moving forward, when a user reports a crash in the Discord, maintainers can simply ask them to ping /api/diagnostics and share the output, drastically reducing triage time and friction for GSoC applicants.
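As a rough illustration of the triage workflow described above, a maintainer could paste the shared JSON into a small helper that highlights the likely problem. The sketch below is an assumption layered on top of this PR, not part of it: `triage_hints` is a hypothetical name, and it relies only on the field names shown in the expected output.

```python
import json


def triage_hints(raw: str) -> list:
    """Turn a shared /api/diagnostics payload into short hints (hypothetical helper)."""
    data = json.loads(raw)
    system = data.get("system", {})
    ml = data.get("ml_environment", {})
    hints = []
    if data.get("status") != "healthy":
        hints.append(f"status is {data.get('status')!r}")
    if ml and not ml.get("pytorch_available"):
        hints.append("PyTorch is not importable")
    elif ml and not ml.get("cuda_available"):
        hints.append(f"PyTorch {ml.get('torch_version')} is running without CUDA")
    if system:
        hints.append(
            f"Python {system.get('python_version')} on "
            f"{system.get('os')} {system.get('release')}"
        )
    return hints
```

For the sample payload in this description, the helper reports a CPU-only PyTorch build and the Python and OS versions, which is the information maintainers currently have to ask for piece by piece.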
AI Usage Disclosure:
We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.
Check one of the checkboxes below:
I have used the following AI models and tools: TODO