Fix: robust timeout handling for ML endpoints with multiprocessing and parallel execution #596
mahek2016 wants to merge 23 commits into AOSSIE-Org:main
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds a thread-based timeout wrapper for model/external calls, global request-size and per-route rate limits, centralized silent JSON parsing with input validation and standardized error payloads, concurrent generation for /get_problems, yt-dlp-based transcript fetching with subprocess timeouts, and updates to multiple endpoints to return consistent JSON shapes.
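The subprocess-timeout piece of the change can be sketched roughly as follows. This is an illustrative helper, not the PR's actual code: the name `run_with_timeout` and the `(result, error)` return shape simply mirror the tuple convention described later in this review.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout):
    """Run an external command (e.g. yt-dlp) and kill it after `timeout` seconds.

    Returns (stdout, None) on success or (None, error_string) on failure.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        if proc.returncode != 0:
            return None, proc.stderr.strip() or "subprocess_error"
        return proc.stdout, None
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising
        return None, "timeout"

# Example with a harmless stand-in command instead of yt-dlp:
out, err = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=10)
```

`subprocess.run(..., timeout=...)` terminates the child on expiry, which is what makes this approach safe for external tools, unlike the in-process thread timeouts discussed below.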
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client as Client
    participant Flask as Flask App
    participant Limiter as Rate Limiter
    participant Executor as ThreadPool Executor
    participant Worker as Worker (generator / subprocess)
    participant External as ML Model / yt-dlp / External API
    Client->>Flask: POST /get_mcq or /get_problems
    Flask->>Limiter: check per-endpoint rate limit
    alt rate limited
        Limiter-->>Client: 429 JSON
    else allowed
        Flask->>Executor: execute_with_timeout(func, timeout, args...)
        Executor->>Worker: run generator or subprocess
        Worker->>External: call ML model or yt-dlp
        alt success
            External-->>Worker: result
            Worker-->>Executor: (result, None)
            Executor-->>Flask: (result, None)
            Flask-->>Client: 200 JSON (structured success)
        else timeout
            Worker-->>Executor: (None, "timeout")
            Executor-->>Flask: (None, "timeout")
            Flask-->>Client: 504 JSON (standardized)
        else internal error
            Worker-->>Executor: (None, error)
            Executor-->>Flask: (None, error)
            Flask-->>Client: 500 JSON (standardized)
        end
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/server.py (2)
483-487: ⚠️ Potential issue | 🔴 Critical: OAuth interactive flow will fail in server/production context.
`tools.run_flow()` opens a browser for interactive OAuth consent. This:
- Blocks the request thread indefinitely waiting for user interaction
- Fails entirely in headless server environments
- Creates race conditions if multiple requests trigger auth simultaneously
For server-to-server auth, use a service account or pre-authorize and store refresh tokens.
🔧 Suggested approach using service account

```python
from google.oauth2 import service_account

SCOPES = ["https://www.googleapis.com/auth/forms.body"]
creds = service_account.Credentials.from_service_account_file(
    "service_account_key.json", scopes=SCOPES
)
form_service = discovery.build("forms", "v1", credentials=creds, ...)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 483 - 487, The current code uses the interactive OAuth flow (client.flow_from_clientsecrets and tools.run_flow) with file.Storage and creds, which will block/fail in server environments; replace this with non-interactive server auth by loading service account credentials (use google.oauth2.service_account.Credentials.from_service_account_file with the SCOPES) and pass that creds into discovery.build when creating the Forms API client, removing the tools.run_flow/client.flow_from_clientsecrets logic and file.Storage usage; alternatively, if using a user account, load a pre-authorized refresh token from secure storage and refresh it programmatically instead of calling tools.run_flow.
644-672: ⚠️ Potential issue | 🟠 Major: Missing timeout protection and input validation on `/get_shortq_hard` (and similar `_hard` endpoints).
These endpoints call `qg.generate()` directly without `execute_with_timeout`, contradicting the PR's objective to add timeout protection to all ML endpoints. They also lack the input length validation (10-50000 chars) applied to other endpoints. Additionally, `num_questions=len(input_questions)` will be 0 if `input_questions` is empty, potentially causing unexpected behavior.
🔧 Proposed fix for /get_shortq_hard (apply similarly to _mcq_hard and _boolq_hard)
```diff
 def get_shortq_hard():
     data = request.get_json(silent=True)
     if data is None or not isinstance(data, dict):
         return jsonify({...}), 400

     input_text = data.get("input_text", "")
     use_mediawiki = data.get("use_mediawiki", 0)
+    input_questions = data.get("input_question", [])
+
+    if not isinstance(input_text, str):
+        return jsonify({"error": "input_text must be string"}), 400
+
+    input_text = input_text.strip()
+
+    if len(input_text) < 10:
+        return jsonify({"error": "Input too short"}), 400
+
+    if len(input_text) > 50000:
+        return jsonify({"error": "Input too long"}), 400
+
+    if not input_questions:
+        return jsonify({"error": "input_question is required"}), 400

     input_text = process_input_text(input_text, use_mediawiki)
-    input_questions = data.get("input_question", [])

-    output = qg.generate(...)
+    output, error = execute_with_timeout(
+        qg.generate,
+        60,
+        article=input_text,
+        num_questions=len(input_questions),
+        answer_style="sentences"
+    )
+
+    if error == "timeout":
+        return jsonify({"error": "Request timed out"}), 504
+    if error:
+        return jsonify({"error": error}), 500
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 644 - 672, The /get_shortq_hard handler calls qg.generate directly and lacks input length checks and protection against zero num_questions; wrap the qg.generate(...) call in the existing execute_with_timeout(...) helper (use the same timeout semantics used by other ML endpoints), validate processed input_text length is between 10 and 50000 characters and return a 400 JSON error if outside that range, and ensure num_questions is never zero (e.g., num_questions = max(1, len(input_questions)) or derive a sensible default) before passing it to qg.generate; preserve post-processing with make_question_harder and propagate execute_with_timeout errors the same way other endpoints do.
🧹 Nitpick comments (4)
backend/server.py (3)
580-588: Simplify redundant key check. The static analyzer flagged this pattern; using `.get()` is cleaner.
♻️ Proposed fix

```diff
-        if "options" in qapair and qapair["options"]:
-            options = qapair.get("options") or []
+        options = qapair.get("options")
+        if options:
             valid_options = [opt for opt in options if opt]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 580 - 588, The current code redundantly checks "if 'options' in qapair and qapair['options']"; replace that pattern by always reading options = qapair.get("options") or [] and proceed — compute valid_options = [opt for opt in options if opt], build choices = [qapair["answer"], *valid_options[:3]] (or ensure answer-only if no options), shuffle and create choices_list as before; update the block around qapair, options, valid_options, choices and choices_list to remove the explicit key presence check and rely on .get().
456-464: Overly broad exception handling obscures root cause. Catching bare `Exception` returns a generic 400 error for all failures, but different errors warrant different responses:
- Authentication/permission errors → 403
- Document not found → 404
- Network/server errors → 500
♻️ Proposed improvement

```diff
     try:
         content = docs_service.get_document_content(doc_id)
         return jsonify({"content": content})
-
-    except Exception:
+    except HttpError as e:
+        if e.resp.status == 404:
+            return jsonify({"error": "Document not found", "code": "not_found"}), 404
+        elif e.resp.status == 403:
+            return jsonify({"error": "Permission denied", "code": "forbidden"}), 403
+        return jsonify({"error": "Failed to retrieve document", "code": "document_fetch_error"}), e.resp.status
+    except Exception as e:
         return jsonify({
-            "error": "Failed to retrieve document content",
+            "error": f"Failed to retrieve document content: {type(e).__name__}",
             "code": "document_fetch_error"
-        }), 400
+        }), 500
```

This requires importing `HttpError` from `googleapiclient.errors`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 456 - 464, Replace the broad except Exception around the docs_service.get_document_content call with specific handlers: import HttpError from googleapiclient.errors and catch HttpError first, inspect the HTTP status (e.g., e.resp.status or e.status_code) to return 403 for 403 responses, 404 for 404 responses, and 500 for other HttpError statuses; also add explicit except FileNotFoundError to return 404 and except PermissionError to return 403, finally keep a generic except Exception that returns a 500 and logs the exception for unexpected errors so the endpoint (docs_service.get_document_content) returns appropriate status codes and preserves root-cause details.
365-378: Consider adding timeout protection to QA model calls. The `/get_mcq_answer`, `/get_shortq_answer`, and `/get_boolean_answer` endpoints call `qa_model()` and `answer.predict_boolean_answer()` without timeout protection. While these are typically faster than generation, they can still hang on pathological inputs. For consistency with the PR's timeout protection goals, consider wrapping these calls.
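A minimal thread-based wrapper in the spirit of this suggestion might look like the sketch below. The names `call_with_timeout` and `_qa_pool` are hypothetical; the `(result, error)` shape just mirrors the `execute_with_timeout` convention used elsewhere in the PR.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeout

# Shared pool so each request does not pay thread-creation cost
_qa_pool = ThreadPoolExecutor(max_workers=4)

def call_with_timeout(fn, timeout, *args, **kwargs):
    """Run fn in a worker thread, returning (result, None) or (None, error)."""
    future = _qa_pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout), None
    except FuturesTimeout:
        return None, "timeout"
    except Exception as e:
        # Surface model errors as strings, matching the tuple convention
        return None, f"{type(e).__name__}: {e}"
```

Note that a timed-out thread keeps running in the background: this bounds the client's wait, not the model's CPU use.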
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 365 - 378, Wrap calls to qa_model(...) and answer.predict_boolean_answer(...) with a timeout so they cannot hang indefinitely; specifically, in the loop that calls qa_model (the block using variables input_questions, input_options and computing options_with_answer) and in the handlers for the /get_mcq_answer, /get_shortq_answer, and /get_boolean_answer endpoints, execute the model calls via a timeout-enabled executor (e.g., concurrent.futures with future.result(timeout=...) or asyncio.wait_for) and handle TimeoutError by logging the timeout and returning a safe fallback (skip the question, return a default/empty answer, or a 504 response) so the server remains responsive. Ensure you catch and log the timeout and other exceptions around qa_model(...) and answer.predict_boolean_answer(...) and keep existing similarity-selection logic (vectorizer, cosine_similarity) intact when the call succeeds.

requirements.txt (1)
35-36: Pin `yt-dlp` to a specific version for reproducibility.
`yt-dlp` is unpinned while `Flask-Limiter==3.5.0` is pinned. Since `yt-dlp` is actively developed with frequent breaking changes, an unpinned version could cause unexpected failures in production. Given that the latest version is 2026.3.13, consider updating to a recent pinned version:
📦 Proposed fix

```diff
 Flask-Limiter==3.5.0
-yt-dlp
+yt-dlp==2026.3.13
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@requirements.txt` around lines 35 - 36, The requirements file currently pins Flask-Limiter but leaves yt-dlp unpinned (the lines containing "Flask-Limiter==3.5.0" and "yt-dlp"); pin yt-dlp to a specific, recent stable version (for example yt-dlp==2026.3.13 or another vetted release) to ensure reproducible installs, update the requirements.txt entry for "yt-dlp" to the chosen pinned version, and commit the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 323-336: The current logic assumes mcq_out, bool_out, and
short_out are dicts and calls .get() even when execute_with_timeout may return
None (e.g., returns (None, "unknown_error")), causing AttributeError; update the
checks around the response assembly to defensively handle None/empty outputs by
using a safe accessor (e.g., replace mcq_out.get(...) with (mcq_out or
{}).get(...) or explicitly check isinstance(mcq_out, dict)) and do the same for
bool_out and short_out so that when mcq_err/bool_err/short_err is falsy but the
corresponding *_out is None or missing keys you fall back to empty lists instead
of calling .get() on None; reference the variables mcq_out, bool_out, short_out
and the function execute_with_timeout when making the change.
- Around line 76-79: The limiter is using get_remote_address so behind a reverse
proxy all requests will show the proxy IP; update the rate-limiting key to use
the real client IP by either applying werkzeug.middleware.proxy_fix.ProxyFix to
app.wsgi_app (e.g., wrap app with ProxyFix(x_for=1, ...) and mark your proxy as
trusted) or replace the key_func passed to Limiter with a small function that
returns request.headers.get('X-Forwarded-For', request.remote_addr). Ensure you
modify the Limiter instantiation (the limiter variable) to use the new key
function instead of get_remote_address and document/configure trusted proxy
behavior accordingly.
- Around line 42-68: The current execute_with_timeout spawns a new Process with
a bound method (e.g., MCQGen.generate_mcq, BoolQGen.generate_boolq,
ShortQGen.generate_shortq) which forces pickling of heavy model state
per-request; instead refactor to a worker pool pattern so models are initialized
once per worker process and tasks call those existing instances: create a Pool
or long-lived worker processes with an initializer that constructs generator
instances (MCQGen/BoolQGen/ShortQGen) in-process, then submit jobs via
apply_async or a task queue with a timeout on the result; alternatively, if you
must keep in-process concurrency, replace the per-call Process in
execute_with_timeout with thread-based timeouts or use a dedicated worker
process that receives lightweight task payloads (not bound methods) and returns
results, ensuring p.terminate is no longer used to kill model-bearing processes
to avoid GPU/resource leaks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 05c0db9e-a33a-4e5c-9b3a-28c2c3c18e48
📒 Files selected for processing (2)
backend/server.py, requirements.txt
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/server.py (1)
365-378: ⚠️ Potential issue | 🟡 Minor: Missing timeout protection and input validation for QA endpoints.
The `/get_mcq_answer`, `/get_shortq_answer`, and `/get_boolean_answer` endpoints call `qa_model()` and `answer.predict_boolean_answer()` directly without timeout protection or input text length validation. A malicious or oversized input could cause these endpoints to hang. Consider applying consistent input validation and timeout handling across all ML-backed endpoints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 365 - 378, Add input validation and timeout protection around all ML-backed endpoints (/get_mcq_answer, /get_shortq_answer, /get_boolean_answer): validate and reject overly large or empty input_text and input_questions before calling qa_model or answer.predict_boolean_answer, and wrap calls to qa_model(...) and answer.predict_boolean_answer(...) in a bounded-time execution (e.g., asyncio.wait_for or concurrent.futures with a timeout) so slow or malicious inputs raise a timeout and return a 4xx/5xx response. Update the handlers that call qa_model and answer.predict_boolean_answer to catch validation errors and timeout exceptions and return a clear error response instead of hanging.
♻️ Duplicate comments (1)
backend/server.py (1)
42-68: ⚠️ Potential issue | 🔴 Critical: Multiprocessing approach cannot efficiently handle bound methods with ML model state.
This `execute_with_timeout` implementation passes bound methods (e.g., `MCQGen.generate_mcq`) to `multiprocessing.Process`, which requires pickling the entire generator instance. The `MCQGenerator`, `BoolQGenerator`, and `ShortQGenerator` classes (from `backend/Generator/main.py`) all load heavy objects in `__init__`:
- T5ForConditionalGeneration models
- spaCy NLP pipelines
- Sense2Vec models
- NLTK FreqDist
This causes:
- Multi-GB serialization overhead per request, negating timeout benefits
- Potential `PicklingError`: PyTorch models may fail to pickle
- Resource leaks: `p.terminate()` (line 57) sends SIGTERM but doesn't guarantee GPU memory or file handle cleanup

Consider a worker pool pattern where generators are initialized once per worker process, or use thread-based timeouts with signals for CPU-bound operations.
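The suggested worker-pool pattern can be sketched with `ProcessPoolExecutor` and an initializer. The heavy generator is faked here with a trivial callable, since the real `MCQGenerator` etc. are project-specific; only the shape of the pattern is the point.

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeout

_model = None  # one per worker process, populated by the initializer

def _init_worker():
    # In the real server this would construct MCQGenerator / BoolQGenerator /
    # ShortQGenerator once per worker; a trivial stand-in keeps this runnable.
    global _model
    _model = str.upper

def _run_task(text):
    # Only the lightweight payload crosses the process boundary, not the model
    return _model(text)

def submit_with_timeout(pool, timeout, payload):
    future = pool.submit(_run_task, payload)
    try:
        return future.result(timeout=timeout), None
    except FuturesTimeout:
        return None, "timeout"

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2, initializer=_init_worker) as pool:
        result, err = submit_with_timeout(pool, 30, "hello")
```

One trade-off to note: unlike `p.terminate()`, a timed-out task here keeps its worker busy, so a production version would need to recycle stuck workers or keep a hard-kill path for runaway jobs.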
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 42 - 68, The current execute_with_timeout wrapper spawns a new multiprocessing.Process with a bound method (e.g., MCQGen.generate_mcq) which forces pickling of heavy model-laden generator instances (MCQGenerator, BoolQGenerator, ShortQGenerator), causing huge serialization overhead, PicklingError risks, and unsafe resource cleanup on terminate; fix by replacing this pattern with a persistent worker pool where each worker process initializes its generator instances once at startup and exposes a lightweight task interface (e.g., task IDs or plain data inputs) that execute_with_timeout (or a new submit_with_timeout) uses to enqueue work, or alternatively switch to a thread-based timeout/signal approach for CPU-bound tasks—ensure you stop passing bound instances into Process/wrapper and instead reference worker-local generators by name, and update execute_with_timeout/wrapper to accept only serializable data payloads and return status via IPC (Queue) without pickling model objects.
🧹 Nitpick comments (4)
backend/server.py (4)
460-464: Broad exception catch obscures failure reasons. Catching bare `Exception` hides whether the failure was due to authentication, network issues, or an invalid document. Consider logging the actual exception for debugging while returning a generic message to the client:

```diff
-    except Exception:
+    except Exception as e:
+        app.logger.error(f"Document fetch failed for doc_id={doc_id}: {e}")
         return jsonify({
             "error": "Failed to retrieve document content",
             "code": "document_fetch_error"
         }), 400
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 460 - 464, The bare except in the document retrieval handler masks root causes; modify the except Exception block that returns jsonify({"error":"Failed to retrieve document content","code":"document_fetch_error"}) to catch/log the real exception: add logging (e.g., use the existing logger or app.logger) to record the exception and stacktrace, and optionally narrow catches to known error types (auth/network/NotFound) while still returning the same generic JSON error to the client; ensure you reference the same exception handling block around the document fetch logic so the logged error includes exception info and context (document id/request params).
47-48: Consider preserving exception type for better diagnostics. Catching bare `Exception` loses the exception type and traceback. For debugging production issues, consider capturing more context:

```diff
 except Exception as e:
-    q.put(("error", str(e)))
+    q.put(("error", f"{type(e).__name__}: {e}"))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 47 - 48, The except block that currently does `except Exception as e: q.put(("error", str(e)))` should preserve exception type and traceback for diagnostics: import the traceback module and change the payload sent to the queue from a single string to include the exception class name and the formatted traceback (e.g., include type(e).__name__ and traceback.format_exc()) so callers receiving `q.put(...)` (the error message producer in backend/server.py) get full context for debugging.
308-316: ThreadPoolExecutor wrapping multiprocessing adds unnecessary overhead. Each thread just spawns a subprocess via `execute_with_timeout`. Consider using `multiprocessing.Pool` or `concurrent.futures.ProcessPoolExecutor` directly to avoid the extra threading layer. However, this is secondary to the fundamental pickling issue with ML model instances.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 308 - 316, The ThreadPoolExecutor is just spawning subprocesses via execute_with_timeout, adding unnecessary threading overhead and exposing pickling problems for ML model instances; replace the ThreadPoolExecutor block that submits run_mcq/run_boolq/run_shortq with a process-based executor (e.g., concurrent.futures.ProcessPoolExecutor or multiprocessing.Pool) and ensure the target call uses execute_with_timeout in the worker process. Also make the tasks picklable by loading any ML model instances inside the worker process (move model initialization out of module/global scope into the body of run_mcq/run_boolq/run_shortq or implement custom __getstate__/__setstate__ for model objects) so the ProcessPoolExecutor can serialize the tasks correctly. Ensure you reference/modify run_mcq, run_boolq, run_shortq, and execute_with_timeout accordingly.
581-588: Simplify redundant key check. The condition already verifies `qapair["options"]` is truthy, making the subsequent `or []` unnecessary:

```diff
-        if "options" in qapair and qapair["options"]:
-            options = qapair.get("options") or []
-            valid_options = [opt for opt in options if opt]
+        options = qapair.get("options")
+        if options:
+            valid_options = [opt for opt in options if opt]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 581 - 588, The code redundantly uses qapair.get("options") or [] after already checking "options" in qapair and qapair["options"] is truthy; replace that line with a direct access (options = qapair["options"]) to simplify the logic and keep behavior identical, leaving the rest of the block (valid_options, choices, choices_list) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 663-667: The qg.generate(...) call in the hard-question endpoints
(/get_shortq_hard, /get_mcq_hard, /get_boolq_hard) lacks timeout protection and
input-length validation; wrap the qg.generate invocation with the same
execute_with_timeout utility used elsewhere (use the same timeout value) and
enforce the same input text length checks (10-50000 chars) before calling it;
update the handler functions that call qg.generate to validate the incoming
article text and then call execute_with_timeout(() => qg.generate(...)) so the
endpoint returns a timeout/error consistently instead of hanging.
- Around line 456-458: The endpoint currently passes a raw doc_id to
get_document_content which expects a full document URL (since
extract_document_id(url) looks for '/document/d/{id}'); construct the full URL
that matches that pattern before calling get_document_content (e.g. build a
string like f"{<docs host>}/document/d/{doc_id}" or use
docs_service.base_url/format helper if available) and pass that URL to
docs_service.get_document_content(document_url) instead of the raw doc_id.
---
Outside diff comments:
In `@backend/server.py`:
- Around line 365-378: Add input validation and timeout protection around all
ML-backed endpoints (/get_mcq_answer, /get_shortq_answer, /get_boolean_answer):
validate and reject overly large or empty input_text and input_questions before
calling qa_model or answer.predict_boolean_answer, and wrap calls to
qa_model(...) and answer.predict_boolean_answer(...) in a bounded-time execution
(e.g., asyncio.wait_for or concurrent.futures with a timeout) so slow or
malicious inputs raise a timeout and return a 4xx/5xx response. Update the
handlers that call qa_model and answer.predict_boolean_answer to catch
validation errors and timeout exceptions and return a clear error response
instead of hanging.
---
Duplicate comments:
In `@backend/server.py`:
- Around line 42-68: The current execute_with_timeout wrapper spawns a new
multiprocessing.Process with a bound method (e.g., MCQGen.generate_mcq) which
forces pickling of heavy model-laden generator instances (MCQGenerator,
BoolQGenerator, ShortQGenerator), causing huge serialization overhead,
PicklingError risks, and unsafe resource cleanup on terminate; fix by replacing
this pattern with a persistent worker pool where each worker process initializes
its generator instances once at startup and exposes a lightweight task interface
(e.g., task IDs or plain data inputs) that execute_with_timeout (or a new
submit_with_timeout) uses to enqueue work, or alternatively switch to a
thread-based timeout/signal approach for CPU-bound tasks—ensure you stop passing
bound instances into Process/wrapper and instead reference worker-local
generators by name, and update execute_with_timeout/wrapper to accept only
serializable data payloads and return status via IPC (Queue) without pickling
model objects.
---
Nitpick comments:
In `@backend/server.py`:
- Around line 460-464: The bare except in the document retrieval handler masks
root causes; modify the except Exception block that returns
jsonify({"error":"Failed to retrieve document
content","code":"document_fetch_error"}) to catch/log the real exception: add
logging (e.g., use the existing logger or app.logger) to record the exception
and stacktrace, and optionally narrow catches to known error types
(auth/network/NotFound) while still returning the same generic JSON error to the
client; ensure you reference the same exception handling block around the
document fetch logic so the logged error includes exception info and context
(document id/request params).
- Around line 47-48: The except block that currently does `except Exception as
e: q.put(("error", str(e)))` should preserve exception type and traceback for
diagnostics: import the traceback module and change the payload sent to the
queue from a single string to include the exception class name and the formatted
traceback (e.g., include type(e).__name__ and traceback.format_exc()) so callers
receiving `q.put(...)` (the error message producer in backend/server.py) get
full context for debugging.
- Around line 308-316: The ThreadPoolExecutor is just spawning subprocesses via
execute_with_timeout, adding unnecessary threading overhead and exposing
pickling problems for ML model instances; replace the ThreadPoolExecutor block
that submits run_mcq/run_boolq/run_shortq with a process-based executor (e.g.,
concurrent.futures.ProcessPoolExecutor or multiprocessing.Pool) and ensure the
target call uses execute_with_timeout in the worker process. Also make the tasks
picklable by loading any ML model instances inside the worker process (move
model initialization out of module/global scope into the body of
run_mcq/run_boolq/run_shortq or implement custom __getstate__/__setstate__ for
model objects) so the ProcessPoolExecutor can serialize the tasks correctly.
Ensure you reference/modify run_mcq, run_boolq, run_shortq, and
execute_with_timeout accordingly.
- Around line 581-588: The code redundantly uses qapair.get("options") or []
after already checking "options" in qapair and qapair["options"] is truthy;
replace that line with a direct access (options = qapair["options"]) to simplify
the logic and keep behavior identical, leaving the rest of the block
(valid_options, choices, choices_list) unchanged.
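The traceback-preserving payload described in the first item above could be sketched like this; `run_task` is a hypothetical worker body standing in for the real error-producing code in backend/server.py:

```python
import traceback

def run_task(q, func, *args):
    # Hypothetical worker body: enqueue the exception class name and the
    # formatted traceback instead of only str(e), so the consumer of the
    # queue gets full context for debugging.
    try:
        q.put(("ok", func(*args)))
    except Exception as e:
        q.put(("error", {
            "type": type(e).__name__,
            "message": str(e),
            "traceback": traceback.format_exc(),
        }))
```

Any queue-like object with a `put` method works here; the real code would pass the existing multiprocessing queue.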
Actionable comments posted: 3
♻️ Duplicate comments (1)
backend/server.py (1)
42-52: ⚠️ Potential issue | 🔴 Critical — Per-request multiprocessing with bound model methods is still a high-risk execution model.
`execute_with_timeout` spins up a new `Process` for each request while passing model-bound callables. This is expensive and can break under spawn-based start methods. Prefer long-lived worker processes (initialized once with models) and submit lightweight jobs with per-task timeouts.

```bash
#!/bin/bash
set -euo pipefail

echo "1) Locate timeout executor and process primitives"
rg -n "def execute_with_timeout|Process\(|Queue\(|set_start_method|get_context" backend/server.py
echo
echo "2) Locate generator singletons and timeout callsites"
rg -n "MCQGen =|BoolQGen =|ShortQGen =|qg =|execute_with_timeout\(" backend/server.py
echo
echo "3) AST check: nested function used as Process target"
python - <<'PY'
import ast, pathlib

p = pathlib.Path("backend/server.py")
tree = ast.parse(p.read_text())
for fn in [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef) and n.name == "execute_with_timeout"]:
    nested = [n.name for n in fn.body if isinstance(n, ast.FunctionDef)]
    print("nested defs in execute_with_timeout:", nested)
    for n in ast.walk(fn):
        if isinstance(n, ast.Call) and isinstance(n.func, ast.Name) and n.func.id == "Process":
            print("Process() call line:", n.lineno)
PY
```

Also applies to: 141-145, 186-190, 232-236, 288-306, 677-683, 737-743, 796-802
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 42 - 52, The current execute_with_timeout creates a new Process per request using a nested wrapper and passes bound model methods (wrapper, Process, Queue), which is expensive and unsafe with spawn-based start; refactor to a long-lived worker pool or persistent worker Process(es) that are initialized once with model instances and accept lightweight task messages (use a manager Queue or multiprocessing.Pool/ProcessPoolExecutor with apply_async) instead of spawning per call, remove the nested wrapper as a Process target and instead send task descriptors (function name + args) or callables that the worker resolves against its local model, and update callsites (MCQGen, BoolQGen, ShortQGen, qg and all execute_with_timeout usages) to submit tasks to the worker pool and enforce per-task timeouts via the pool’s timeout support or Future.result(timeout=...) rather than creating new Process objects each request.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 47-48: Replace instances where the code enqueues raw exception
text via q.put(("error", str(e))) with a stable public error indicator (for
example q.put(("error", {"code":"internal_server_error","message":"An internal
error occurred"}))) and log the full exception details server-side using
logging.exception or logger.exception so stack traces stay in server logs only;
locate all occurrences of q.put(("error", str(e))) in server.py (the q variable
and its error enqueue sites) and apply the same mapping so API responses never
contain raw exception strings but the server logs retain the detailed exception
information.
- Around line 461-465: The error handler in the /get_content endpoint currently
catches all exceptions and returns a 400, which hides server errors; update the
exception handling in the function that serves /get_content (the try/except
block around document retrieval) to explicitly catch expected client/validation
errors (e.g., ValueError, werkzeug.exceptions.BadRequest) and return a 400 with
the existing error JSON, and for all other unexpected exceptions log the full
exception (including stack trace) and return a 500 response with code
"internal_server_error" using jsonify; ensure you reference the same response
shape and use the existing logger instead of swallowing the exception.
- Around line 338-343: The handler that returns the "All generators failed"
response currently always uses HTTP 500; update the logic in the /get_problems
response where mcq_err, bool_err, and short_err are checked so that if all three
indicate a timeout (e.g., mcq_err == "timeout" && bool_err == "timeout" &&
short_err == "timeout" or using your TIMEOUT_ERROR constant), return the JSON
error with HTTP 504 instead of 500; otherwise keep the existing 500 return path.
Reference variables: mcq_err, bool_err, short_err and the existing
jsonify({...}) return.
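For the first inline comment above, a sketch of the stable public payload plus server-side logging; the payload shape comes from the review, but `safe_enqueue` and its names are illustrative:

```python
import logging

logger = logging.getLogger("server")

PUBLIC_ERROR = {"code": "internal_server_error",
                "message": "An internal error occurred"}

def safe_enqueue(q, func, *args):
    # Hedged sketch: the enqueued (and eventually API-visible) payload
    # stays generic, while the full stack trace goes to server logs only.
    try:
        q.put(("ok", func(*args)))
    except Exception:
        logger.exception("worker failed")  # stack trace stays server-side
        q.put(("error", PUBLIC_ERROR))
```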
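The status-code rule from the third inline comment reduces to a small helper; `TIMEOUT_ERROR` mirrors the constant the comment mentions, and the argument names follow the review's `mcq_err`/`bool_err`/`short_err`:

```python
TIMEOUT_ERROR = "timeout"

def failure_status(mcq_err, bool_err, short_err):
    # Sketch of the suggested /get_problems change: 504 only when every
    # generator timed out, otherwise keep the existing 500 path.
    errors = (mcq_err, bool_err, short_err)
    if all(e == TIMEOUT_ERROR for e in errors):
        return 504
    return 500
```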
---
Duplicate comments:
In `@backend/server.py`:
- Around line 42-52: The current execute_with_timeout creates a new Process per
request using a nested wrapper and passes bound model methods (wrapper, Process,
Queue), which is expensive and unsafe with spawn-based start; refactor to a
long-lived worker pool or persistent worker Process(es) that are initialized
once with model instances and accept lightweight task messages (use a manager
Queue or multiprocessing.Pool/ProcessPoolExecutor with apply_async) instead of
spawning per call, remove the nested wrapper as a Process target and instead
send task descriptors (function name + args) or callables that the worker
resolves against its local model, and update callsites (MCQGen, BoolQGen,
ShortQGen, qg and all execute_with_timeout usages) to submit tasks to the worker
pool and enforce per-task timeouts via the pool’s timeout support or
Future.result(timeout=...) rather than creating new Process objects each
request.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/server.py (2)
475-647: ⚠️ Potential issue | 🟠 Major — Validate `question_type` and `qa_pairs` before indexing into them.
This handler only checks that the body is a dict. After that it assumes every `qapair` has the keys required by the selected branch (`question`, `answer`, and sometimes `options`), and unknown `question_type` values silently fall into the mixed-question path. Payloads like `{"qa_pairs":[{}]}` or `{"qa_pairs":"oops"}` will still become `KeyError`/`TypeError` 500s instead of the structured 400s the rest of this PR is adding.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 475 - 647, The handler currently trusts request.get_json and indexes qa_pairs/question_type directly, causing KeyError/TypeError for malformed payloads; validate that qa_pairs is a list and question_type is a string, and validate each item in qa_pairs is a dict containing required keys before building requests_list (check 'question' present for all, and for MCQ paths ensure 'answer' exists and 'options' is a list), return a 400 JSON error when validation fails; also treat unknown question_type values explicitly (return 400) instead of falling into the generic else branch so only supported types ("get_shortq","get_mcq","get_boolq") are accepted; update code around qa_pairs/question_type parsing and before the branches that reference qa_pairs (symbols: qa_pairs, question_type, get_shortq, get_mcq, get_boolq, requests_list, NEW_QUESTION).
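The validation the prompt asks for could start from a helper like this; the supported type names come from the review, while the return shape and messages are assumptions:

```python
SUPPORTED_TYPES = {"get_shortq", "get_mcq", "get_boolq"}

def validate_qa_payload(data):
    # Sketch: reject malformed bodies before building requests_list,
    # returning (ok, error) so the handler can emit a structured 400.
    if not isinstance(data, dict):
        return False, "body must be a JSON object"
    question_type = data.get("question_type")
    qa_pairs = data.get("qa_pairs")
    if not isinstance(question_type, str) or question_type not in SUPPORTED_TYPES:
        return False, "unsupported question_type"
    if not isinstance(qa_pairs, list):
        return False, "qa_pairs must be a list"
    for pair in qa_pairs:
        if not isinstance(pair, dict) or "question" not in pair:
            return False, "each qa_pair must be an object with a question"
        if question_type == "get_mcq":
            if "answer" not in pair or not isinstance(pair.get("options"), list):
                return False, "MCQ pairs need an answer and an options list"
    return True, None
```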
346-377: ⚠️ Potential issue | 🟠 Major — The answer endpoints still lack timeout protection and field type validation.
These handlers have top-level JSON validation, but `qa_model(...)` and `answer.predict_boolean_answer(...)` are called directly without the `execute_with_timeout` wrapper available elsewhere in the codebase. The input fields (`input_text`, `input_question`, `input_options`) are not type-checked before use. Slow model inference can pin worker threads indefinitely, and malformed arrays can trigger unhandled 500 errors instead of the 400/504 behavior implemented in other endpoints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 346 - 377, Validate and type-check incoming fields before use: ensure request.get_json(...) returns a dict, input_text is a string, input_questions is a list of strings, and input_options is a list of lists of strings and that len(input_questions) == len(input_options); if validation fails return 400 with code "invalid_request". Wrap any blocking model calls (qa_model(...) and any answer.predict_boolean_answer(...) usages) with the existing execute_with_timeout(...) helper and catch TimeoutError to return 504 with code "model_timeout"; also catch other exceptions from the wrapped call and return a 500/appropriate error. Use the same response key consistently (outputs vs output) when returning results. Reference: request.get_json, qa_model, answer.predict_boolean_answer, execute_with_timeout.
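A hedged sketch of the field validation described in the prompt above; the key names mirror the review, but the exact error strings and `(ok, error)` return shape are assumptions:

```python
def validate_answer_payload(data):
    # Hypothetical validator for the boolean-answer endpoint: checks that
    # input_text is a string, input_question is a list of strings, and
    # input_options is a list of lists of strings of matching length.
    if not isinstance(data, dict):
        return False, "body must be a JSON object"
    text = data.get("input_text")
    questions = data.get("input_question")
    options = data.get("input_options")
    if not isinstance(text, str):
        return False, "input_text must be a string"
    if not (isinstance(questions, list)
            and all(isinstance(q, str) for q in questions)):
        return False, "input_question must be a list of strings"
    if not (isinstance(options, list)
            and all(isinstance(o, list)
                    and all(isinstance(s, str) for s in o)
                    for o in options)):
        return False, "input_options must be a list of lists of strings"
    if len(questions) != len(options):
        return False, "input_question and input_options lengths differ"
    return True, None
```

On failure the handler would return 400 with code "invalid_request"; the model calls themselves would then go through `execute_with_timeout` as the prompt describes.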
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 653-688: The request handler currently uses len(input_questions)
directly for num_questions (passed to execute_with_timeout -> qg.generate)
without validating input_question; ensure you validate and sanitize it: check
that input_questions (from data.get("input_question")) is either missing/None or
a list, return a 400 if it is present but not a list (or coerce to an empty
list), validate that each entry is of an expected type (e.g., str) if
applicable, and enforce a hard cap like MAX_NUM_QUESTIONS (e.g., 20) before
computing num_questions; then pass this safe, bounded num_questions into
execute_with_timeout/qg.generate instead of raw len(input_questions).
- Around line 452-454: The endpoint handling the Google Doc import changed to
return {"content": ...}, breaking the extension which expects the response body
to be the raw string; update the /get_content handler (the block that builds
document_url and calls docs_service.get_document_content) to preserve backward
compatibility by returning both forms — include the nested key ({"content":
content}) and also return the raw content as the top-level JSON value (so the
previous client code that reads the response as the string still receives the
same value) or otherwise restore the previous raw-string response; ensure the
response uses docs_service.get_document_content(document_url) value in both
places so callers using data or data.content continue to work.
- Around line 131-137: The call to process_input_text(input_text, use_mediawiki)
performs a synchronous mediawikiapi.summary() outside the timeout boundary,
risking a hung Flask worker; wrap the MediaWiki enrichment in a timeout or move
the enrichment inside the execute_with_timeout call so it runs under the same
deadline. Specifically, either (A) create a small wrapper function (e.g.,
enrich_with_mediawiki) that calls process_input_text / mediawikiapi.summary and
invoke it via execute_with_timeout before calling MCQGen.generate_mcq, or (B)
expand the task passed to execute_with_timeout to perform both enrichment and
MCQGen.generate_mcq in one call; update all handlers that call
process_input_text (the occurrences around MCQGen.generate_mcq and
execute_with_timeout) to use one of these patterns so mediawikiapi.summary is
protected by a timeout.
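Option B above — folding enrichment and generation into the single task handed to `execute_with_timeout` — can be sketched as a plain function; `enrich` and `generate` stand in for `process_input_text` and `MCQGen.generate_mcq`, whose real signatures are assumptions here:

```python
def enrich_and_generate(input_text, use_mediawiki, enrich, generate, payload):
    # Both steps run inside one task, so when this function is passed to
    # execute_with_timeout the MediaWiki call shares the same deadline as
    # model generation. The callables are injected purely for illustration;
    # the real handler would call its own module-level helpers.
    if use_mediawiki:
        input_text = enrich(input_text, use_mediawiki)
    task_payload = dict(payload, input_text=input_text)
    return generate(task_payload)
```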
---
Outside diff comments:
In `@backend/server.py`:
- Around line 475-647: The handler currently trusts request.get_json and indexes
qa_pairs/question_type directly, causing KeyError/TypeError for malformed
payloads; validate that qa_pairs is a list and question_type is a string, and
validate each item in qa_pairs is a dict containing required keys before
building requests_list (check 'question' present for all, and for MCQ paths
ensure 'answer' exists and 'options' is a list), return a 400 JSON error when
validation fails; also treat unknown question_type values explicitly (return
400) instead of falling into the generic else branch so only supported types
("get_shortq","get_mcq","get_boolq") are accepted; update code around
qa_pairs/question_type parsing and before the branches that reference qa_pairs
(symbols: qa_pairs, question_type, get_shortq, get_mcq, get_boolq,
requests_list, NEW_QUESTION).
- Around line 346-377: Validate and type-check incoming fields before use:
ensure request.get_json(...) returns a dict, input_text is a string,
input_questions is a list of strings, and input_options is a list of lists of
strings and that len(input_questions) == len(input_options); if validation fails
return 400 with code "invalid_request". Wrap any blocking model calls
(qa_model(...) and any answer.predict_boolean_answer(...) usages) with the
existing execute_with_timeout(...) helper and catch TimeoutError to return 504
with code "model_timeout"; also catch other exceptions from the wrapped call and
return a 500/appropriate error. Use the same response key consistently (outputs
vs output) when returning results. Reference: request.get_json, qa_model,
answer.predict_boolean_answer, execute_with_timeout.
```python
document_url = f"https://docs.google.com/document/d/{doc_id}/edit"
content = docs_service.get_document_content(document_url)
return jsonify({"content": content})
```
This `/get_content` contract change breaks the extension client.
This route now returns `{"content": ...}`, but `extension/src/pages/text_input/TextInput.jsx:78-80` still does `setText(data || "Error in retrieving")`. Until that caller is updated to use `data.content` (or this endpoint preserves backward compatibility during rollout), Google Doc imports in the extension will stop populating the text field correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/server.py` around lines 452 - 454, The endpoint handling the Google
Doc import changed to return {"content": ...}, breaking the extension which
expects the response body to be the raw string; update the /get_content handler
(the block that builds document_url and calls docs_service.get_document_content)
to preserve backward compatibility by returning both forms — include the nested
key ({"content": content}) and also return the raw content as the top-level JSON
value (so the previous client code that reads the response as the string still
receives the same value) or otherwise restore the previous raw-string response;
ensure the response uses docs_service.get_document_content(document_url) value
in both places so callers using data or data.content continue to work.
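Since a JSON body cannot be both a raw string and an object at once, one pragmatic reading of the suggestion is a transitional switch that keeps the legacy raw-string body as the default until clients migrate; the `legacy` flag below is purely illustrative, not existing server code:

```python
import json

def build_content_response(content, legacy=True):
    # Hedged sketch of a rollout-safe /get_content body: the legacy shape
    # serializes the document text as a bare JSON string (what the
    # extension reads today); the new shape nests it under "content".
    body = content if legacy else {"content": content}
    return json.dumps(body)
```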
Actionable comments posted: 3
♻️ Duplicate comments (1)
backend/server.py (1)
565-568: ⚠️ Potential issue | 🟠 Major — `/get_content` response-shape change is likely breaking existing clients.
The endpoint now returns `{"content": ...}`, but the provided frontend usage (eduaid_web/src/pages/Text_Input.jsx) sets text from `data` directly. That path will receive an object instead of a string.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 565 - 568, The endpoint now wraps the document text in {"content": ...} which breaks callers expecting the raw string; change the /get_content handler so it returns the document text in the original shape (a plain string or the exact JSON previously used). Specifically, after computing document_url and calling docs_service.get_document_content(document_url) (and in the handler function that uses document_url and docs_service.get_document_content), return the content as the original value (e.g., return jsonify(content) or return Response(content, mimetype="text/plain")) instead of jsonify({"content": content}) so existing frontend code that reads data as a string continues working.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 68-71: The limiter currently trusts the X-Forwarded-For header via
the custom key_func (lambda using request.headers.get("X-Forwarded-For",
request.remote_addr)), which allows header spoofing; instead install and
configure Werkzeug's ProxyFix middleware for your Flask app to normalize
X-Forwarded-For from trusted proxies and replace the custom key_func with
flask_limiter.util.get_remote_address when constructing the Limiter (refer to
Limiter and get_remote_address), ensuring you set ProxyFix's x_for to the
correct number of trusted proxies for your deployment.
- Around line 41-50: The timeout wrapper execute_with_timeout currently uses
ThreadPoolExecutor as a context manager which calls shutdown(wait=True) on exit
and blocks when a TimeoutError occurs; update execute_with_timeout to avoid
blocking on return by creating the ThreadPoolExecutor without the with-statement
(or explicitly call executor.shutdown(wait=False, cancel_futures=True) on
timeout), then submit the task to future and on catching TimeoutError return
immediately after cancelling/triggering non-waiting shutdown; reference
ThreadPoolExecutor, future, future.cancel(), executor.shutdown(wait=False,
cancel_futures=True) (or use cancel_futures=True for Python 3.9+) to ensure the
function does not hang when a timeout occurs.
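The ProxyFix wiring from the first bullet could look like this, assuming exactly one trusted reverse proxy in front of the app (adjust `x_for` to your deployment); this is a configuration sketch, not the current server code:

```python
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
from werkzeug.middleware.proxy_fix import ProxyFix

app = Flask(__name__)
# Let Werkzeug normalize X-Forwarded-For from one trusted proxy, so
# clients cannot spoof their rate-limit identity via headers.
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1)

limiter = Limiter(
    get_remote_address,  # reads the ProxyFix-normalized remote address
    app=app,
    default_limits=["60 per minute"],
)
```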
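The non-blocking wrapper from the second bullet can be sketched as follows; the `(result, error)` return shape is an assumption about how callers in backend/server.py consume it:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def execute_with_timeout(func, timeout, *args, **kwargs):
    # No with-statement: the context manager would call shutdown(wait=True)
    # on exit and block until the stuck task finishes, defeating the timeout.
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        result = future.result(timeout=timeout)
        executor.shutdown(wait=False)
        return result, None
    except FutureTimeout:
        future.cancel()
        # Python 3.9+: drop any queued work without waiting on the worker.
        executor.shutdown(wait=False, cancel_futures=True)
        return None, "timeout"
    except Exception as e:
        executor.shutdown(wait=False)
        return None, str(e)
```

Note the timed-out worker thread still runs to completion in the background; this only stops the request handler from blocking on it.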
---
Duplicate comments:
In `@backend/server.py`:
- Around line 565-568: The endpoint now wraps the document text in {"content":
...} which breaks callers expecting the raw string; change the /get_content
handler so it returns the document text in the original shape (a plain string or
the exact JSON previously used). Specifically, after computing document_url and
calling docs_service.get_document_content(document_url) (and in the handler
function that uses document_url and docs_service.get_document_content), return
the content as the original value (e.g., return jsonify(content) or return
Response(content, mimetype="text/plain")) instead of jsonify({"content":
content}) so existing frontend code that reads data as a string continues
working.
Addressed Issues:
Fixes #592
Screenshots/Recordings:
N/A (Backend improvement – no UI changes)
Additional Notes:
This PR introduces robust timeout handling for ML endpoints to prevent indefinite execution and resource exhaustion.
Key improvements include:
This ensures better performance, stability, and user experience.
AI Usage Disclosure:
I have used the following AI models and tools:
Checklist
Summary by CodeRabbit
New Features
Improvements