perf: Defer query-time imports in conversation_base #236
KRRT7 wants to merge 3 commits into microsoft:main
black is only used at runtime in two cold formatting paths:
- create_context_prompt() in answers.py (LLM debug context)
- format_code()/pretty_print() in utils.py (developer terminal output)
Both format Python data structures, which is exactly what pprint does. Replace black.format_str with pprint.pformat + ast.literal_eval, eliminating the runtime dependency entirely. Move black from dependencies to the dev dependency-group; it remains available for make format/check but is no longer required by library consumers.
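A minimal sketch of the pprint.pformat + ast.literal_eval replacement described above. The function name and the line_width parameter are assumptions for illustration; the actual format_code() in utils.py may have a different signature and fallback behavior.

```python
import ast
from pprint import pformat


def format_code(code: str, line_width: int = 80) -> str:
    """Pretty-print a string holding a Python literal (hypothetical helper).

    Assumption: the input is the repr of plain data (dicts, lists, strings,
    numbers). Anything that is not a literal is returned unchanged.
    """
    try:
        # literal_eval only accepts literals, so no arbitrary code is run.
        value = ast.literal_eval(code)
    except (ValueError, SyntaxError, TypeError):
        return code
    # pformat wraps the structure once its repr exceeds `width` columns.
    return pformat(value, width=line_width)
```

Unlike black, this only handles data literals, which matches the two call sites named in the commit message but would not format arbitrary Python source.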
answers, search_query_schema, searchlang, and answer_response_schema are only used in the query() method. Move their imports from module level into query() and use TYPE_CHECKING + __future__.annotations for the type hints. These modules pull in search, query, and schema initialization that isn't needed when creating or indexing conversations.
gvanrossum
left a comment
I don't follow precisely what games you are playing with from __future__ import annotations, but I don't want it in my code -- while it's not yet deprecated, eventually it will be, and I don't want to add a future burden. If that means the whole optimization can't happen, so be it -- as I wrote before, these optimizations aren't all that valuable given the typical LLM roundtrip.
KRRT7
left a comment
Good point — I can drop from __future__ import annotations and use string literals for the four annotations that reference the deferred modules instead:
    _query_translator: "typechat.TypeChatJsonTranslator[search_query_schema.SearchQuery] | None" = None
    _answer_translator: "typechat.TypeChatJsonTranslator[answer_response_schema.AnswerResponse] | None" = None
and in the query() signature:
    async def query(
        self,
        question: str,
        search_options: "searchlang.LanguageSearchOptions | None" = None,
        answer_options: "answers.AnswerContextOptions | None" = None,
    ) -> str:
The TYPE_CHECKING block + runtime import inside query() stay the same; only the future import goes away. Type checkers still resolve the string annotations against the TYPE_CHECKING imports. I'll push that update.
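For reference, the overall pattern (TYPE_CHECKING import, string-literal annotations, runtime import inside the method) can be sketched in a self-contained way. This is an illustration only: the stdlib decimal module stands in for the deferred query-time modules, the method is synchronous for brevity, and the class and parameter names are hypothetical rather than the ones in conversation_base.py.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; never executed at runtime.
    import decimal


class Conversation:
    # String-literal annotation: not evaluated at import time, so the
    # deferred module does not need to be loaded here.
    _default_options: "decimal.Context | None" = None

    def query(self, question: str, options: "decimal.Context | None" = None) -> str:
        # Runtime import deferred until the first query() call.
        import decimal

        ctx = options if options is not None else decimal.Context(prec=4)
        # Stand-in for real query work: round the input to ctx.prec digits.
        return str(ctx.create_decimal(question))
```

Creating a Conversation does not touch the deferred module; only query() pays the import cost, and repeated calls hit the sys.modules cache.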
On the value question — I was honestly hesitant to open import-time optimizations at all, which is why I asked first on #229 before going further. That said, if you'd rather keep the code straightforward and not defer these, I'm happy to close this one.
Stack: depends on #235. Merge #235 first, then this PR.
conversation_base.py eagerly imported four query-time modules at module level:
- answers (20 ms cumulative, pulls in search + query)
- search_query_schema (10 ms)
- searchlang (3 ms)
- answer_response_schema (1 ms)
These are only used in the query() method. Moved the imports inside query() and used TYPE_CHECKING + from __future__ import annotations for the type hints.
Benchmark
Azure Standard_D2s_v5 — 2 vCPU, 8 GiB RAM, Python 3.13
Import Time (hyperfine, warmup 5, min-runs 30)
import typeagent
Cumulative since baseline (main):
import typeagent
Upstream opportunity: pydantic_ai transitive imports
pydantic_ai contributes ~161 ms cumulative to import typeagent through its transitive dependency chain:
- logfire
- griffe
- pydantic_ai.messages
- genai_prices
This is the single largest remaining import cost and would need to be addressed upstream in pydantic-ai.
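One way to obtain per-module cumulative figures like these is CPython's -X importtime flag, which writes one "import time: self [us] | cumulative | imported package" row per module to stderr. The sketch below is an assumption about how such numbers could be gathered, not necessarily the method used for this PR; the helper names are hypothetical.

```python
import subprocess
import sys


def parse_importtime(stderr_text: str) -> list[tuple[str, int]]:
    """Return (module, cumulative_microseconds) pairs, largest first."""
    rows = []
    prefix = "import time:"
    for line in stderr_text.splitlines():
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            cumulative_us = int(parts[1].strip())
        except ValueError:
            continue  # header row: "self [us] | cumulative | imported package"
        rows.append((parts[2].strip(), cumulative_us))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def measure(module: str) -> list[tuple[str, int]]:
    """Import `module` in a fresh interpreter and parse the timing report."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(proc.stderr)
```

For example, measure("typeagent")[:10] would list the ten modules with the largest cumulative cost, assuming typeagent is installed in the current environment.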
Generated by codeflash optimization agent