Proposal: Improving pipeline performance and scalability #9

@kunalbhardwaj2006

Description

Problem

Currently, the LibrEd generator pipeline faces performance bottlenecks due to sequential LLM calls in:

  • Question classification
  • Theory/explanation generation

This leads to long execution times and inefficient resource usage.


Proposed Improvements

1. Async Batching (Partially Implemented)

  • Async batching is already implemented for question classification
  • This cut the classification stage's execution time substantially
  • Plan to extend the same approach to theory generation (see the sketch below)
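
For reference, the classification batching follows the standard asyncio pattern: launch all requests with `asyncio.gather` instead of awaiting each response before sending the next. This is a minimal illustrative sketch; `classify_question` and its simulated latency stand in for whatever LLM client LibrEd actually uses.

```python
import asyncio

async def classify_question(question: str) -> str:
    # Stand-in for LibrEd's real classification call; asyncio.sleep
    # simulates the network latency of one LLM request.
    await asyncio.sleep(0.1)
    return f"category-for:{question}"

async def classify_batch(questions: list[str]) -> list[str]:
    # Fire every classification request concurrently rather than
    # one at a time.
    return await asyncio.gather(*(classify_question(q) for q in questions))

if __name__ == "__main__":
    print(asyncio.run(classify_batch(["q1", "q2", "q3"])))
```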

2. Async Theory Generation

  • Convert sequential theory generation into async batches
  • Stop waiting for one LLM response before sending the next
  • Bound concurrency with semaphores (sketched after this list)
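
A minimal sketch of the semaphore approach, assuming a hypothetical `generate_theory` coroutine in place of the real theory-generation call; `MAX_CONCURRENCY` is an arbitrary value to tune against the provider's rate limits:

```python
import asyncio

MAX_CONCURRENCY = 5  # assumed limit; tune to the provider's rate limits

async def generate_theory(question: str) -> str:
    # Hypothetical stand-in for the theory-generation LLM call.
    await asyncio.sleep(0.2)
    return f"theory-for:{question}"

async def generate_all(questions: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)

    async def bounded(q: str) -> str:
        # The semaphore caps how many requests are in flight at once,
        # so we get concurrency without hammering the API.
        async with sem:
            return await generate_theory(q)

    return await asyncio.gather(*(bounded(q) for q in questions))

if __name__ == "__main__":
    print(asyncio.run(generate_all([f"q{i}" for i in range(12)])))
```

The semaphore keeps throughput high while bounding in-flight requests, which matters once batch sizes grow beyond what rate limits allow.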

3. Caching Layer

  • Store LLM responses (classification + theory) in SQLite
  • Use hash-based lookup to avoid repeated computation
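
A possible shape for the cache, using only the standard library; the table schema and the SHA-256 key derivation are assumptions, not a finalized design:

```python
import hashlib
import sqlite3

def _key(stage: str, prompt: str) -> str:
    # Hash the stage name plus prompt so classification and theory
    # responses for the same prompt never collide.
    return hashlib.sha256(f"{stage}:{prompt}".encode()).hexdigest()

def open_cache(path: str = "llm_cache.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, response TEXT)"
    )
    return conn

def cache_get(conn: sqlite3.Connection, stage: str, prompt: str) -> str | None:
    row = conn.execute(
        "SELECT response FROM cache WHERE key = ?", (_key(stage, prompt),)
    ).fetchone()
    return row[0] if row else None

def cache_put(conn: sqlite3.Connection, stage: str, prompt: str, response: str) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, response) VALUES (?, ?)",
        (_key(stage, prompt), response),
    )
    conn.commit()
```

The pipeline would call `cache_get` before each LLM request and `cache_put` after, so reruns over the same dataset skip already-computed prompts.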

4. Retry & Failure Handling

  • Add retry mechanism for failed LLM calls
  • Handle partial failures gracefully
  • Save intermediate results
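
One way the retry could look, sketched with exponential backoff; the retry count, delays, and broad `except` are placeholders to adapt to the client's real exception types:

```python
import asyncio
import logging

async def call_with_retry(coro_fn, *args, retries: int = 3, base_delay: float = 1.0):
    # Retry an async LLM call with exponential backoff; coro_fn is any
    # coroutine function (e.g. a classify or theory-generation call).
    for attempt in range(1, retries + 1):
        try:
            return await coro_fn(*args)
        except Exception as exc:  # narrow this to the client's real error types
            if attempt == retries:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logging.warning("LLM call failed (%s); retry %d/%d in %.1fs",
                            exc, attempt, retries, delay)
            await asyncio.sleep(delay)
```

For partial failures across a batch, `asyncio.gather(..., return_exceptions=True)` lets successful results through so they can be saved as intermediate output while only the failed items are retried.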

5. Performance Metrics & Logging

  • Track execution time per pipeline stage
  • Log batch-level processing details
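
Per-stage timing can be as simple as a context manager wrapped around each stage; the stage name below is illustrative:

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)

@contextmanager
def timed_stage(name: str):
    # Log wall-clock time for each pipeline stage so slow steps stand out.
    start = time.perf_counter()
    try:
        yield
    finally:
        logging.info("stage %s took %.2fs", name, time.perf_counter() - start)

# usage:
# with timed_stage("classification"):
#     results = asyncio.run(classify_batch(questions))
```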

Goal

  • Reduce total pipeline runtime significantly
  • Improve scalability for large datasets
  • Make pipeline more robust and production-ready

Note

I have already created a working prototype demonstrating async processing and caching:
[link your repo]

I plan to integrate these improvements directly into the LibrEd codebase.

Would appreciate feedback on this direction before proceeding further.
