Problem
Currently, the LibrEd generator pipeline faces performance bottlenecks due to sequential LLM calls in:
- Question classification
- Theory/explanation generation
This leads to long execution times and inefficient resource usage.
Proposed Improvements
1. Async Batching (Partially Implemented)
- Already implemented async batching for classification
- Reduced execution time significantly
- Plan to extend similar approach to theory generation
2. Async Theory Generation
- Convert sequential theory generation → async batches
- Avoid waiting for one LLM response before sending the next
- Use controlled concurrency (semaphores)
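The batched approach in items 1 and 2 could be sketched as follows. This is a minimal illustration, not LibrEd's actual client: `call_llm` is a hypothetical stand-in for the real LLM call, and the concurrency limit is an arbitrary example value.

```python
import asyncio

# Hypothetical stub for the real LLM call; simulates network latency.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"

async def generate_batch(prompts: list[str], max_concurrency: int = 5) -> list[str]:
    # The semaphore caps in-flight requests so the API is not overwhelmed,
    # while still avoiding the one-at-a-time sequential bottleneck.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded_call(prompt: str) -> str:
        async with sem:
            return await call_llm(prompt)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded_call(p) for p in prompts))

results = asyncio.run(generate_batch([f"q{i}" for i in range(10)]))
```

With this shape, theory generation reuses the same batching helper that classification already uses, only the prompt-building step differs.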
3. Caching Layer
- Store LLM responses (classification + theory) in SQLite
- Use hash-based lookup to avoid repeated computation
4. Retry & Failure Handling
- Add retry mechanism for failed LLM calls
- Handle partial failures gracefully
- Save intermediate results
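One common way to implement the retry mechanism is exponential backoff with jitter, sketched below; the flaky stub only simulates transient LLM failures and is not part of the proposal itself.

```python
import asyncio
import random

async def call_with_retry(coro_fn, *args, retries=3, base_delay=0.5):
    # Retry transient failures with exponential backoff plus jitter.
    for attempt in range(retries + 1):
        try:
            return await coro_fn(*args)
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the failure to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)

# Flaky stub: fails twice, then succeeds, mimicking a transient API error.
calls = {"n": 0}
async def flaky(prompt):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")
    return f"ok: {prompt}"

result = asyncio.run(call_with_retry(flaky, "theory for q1", base_delay=0.01))
```

Wrapping each batched call this way means one failing item no longer aborts the whole batch, and results completed so far can still be written out.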
5. Performance Metrics & Logging
- Track execution time per pipeline stage
- Log batch-level processing details
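Per-stage timing can be captured with a small context manager; stage names and sleeps below are placeholders for the real pipeline stages.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

timings: dict[str, float] = {}

@contextmanager
def stage(name: str):
    # Record wall-clock time per pipeline stage and log it on exit.
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[name] = elapsed
        log.info("stage %s took %.3fs", name, elapsed)

with stage("classification"):
    time.sleep(0.02)  # placeholder for the real classification batch
with stage("theory"):
    time.sleep(0.03)  # placeholder for theory generation
```

The collected `timings` dict can then be dumped alongside batch-level logs to show where the remaining runtime goes.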
Goal
- Reduce total pipeline runtime significantly
- Improve scalability for large datasets
- Make pipeline more robust and production-ready
Note
I have already created a working prototype demonstrating async processing and caching:
[link your repo]
I plan to integrate these improvements directly into the LibrEd codebase.
Would appreciate feedback on this direction before proceeding further.