
fix: harden SetModelResponseTool fallback to prevent infinite loops#5091

Open
vietnamesekid wants to merge 1 commit into google:main from vietnamesekid:fix/harden-set-model-response-fallback

Conversation

@vietnamesekid

Summary

This is a follow-up to #5057. It improves the fallback behavior of SetModelResponseTool to avoid infinite loops, especially when flash models (like gemini-2.5-flash and gemini-3-flash) ignore set_model_response and keep calling other tools.

The changes come from the investigation and discussion in #5054.

Changes

  • Stronger instructions for simple types (_output_schema_processor.py): primitive schemas like str and int are easy for models to ignore, so we now give clearer guidance in these cases.
  • Deterministic tool restriction: on the second-to-last round (N-1), we force the model to only call set_model_response using tool_config, so we can guarantee structured output.
  • Hard cutoff: at round N (_MAX_TOOL_ROUNDS=25), we stop execution completely to avoid runaway loops and unnecessary API usage.
  • Early return after success (base_llm_flow.py): once set_model_response succeeds, we skip extra steps like transfer_to_agent.

Test plan

  • All unit tests pass (19/19), including 6 new tests
  • All integration tests pass (4/4), covering BaseModel and str schemas across GOOGLE_AI and Vertex AI
  • No regressions in existing tests
  • Formatting and lint checks pass

Related

@adk-bot added the core label ([Component] This issue is related to the core interface and implementation) on Apr 1, 2026
@rohityan self-assigned this on Apr 1, 2026
@vietnamesekid force-pushed the fix/harden-set-model-response-fallback branch from a27127b to 67307bf on April 2, 2026 02:37
Flash models (gemini-2.5-flash, gemini-3-flash) can ignore
set_model_response and loop indefinitely when output_schema is used
with tools. This adds a layered defense:

1. Type-aware instruction: primitive schemas (str, int) get a stronger
   prompt since their trivial tool signature is easily ignored by flash
   models.

2. Deterministic tool_choice guard: on round N-1 (_MAX_TOOL_ROUNDS-1),
   restrict the model to only call set_model_response via tool_config.

3. Hard cutoff: on round N, terminate the invocation entirely to
   prevent runaway API costs.

4. Early return after set_model_response: skip unnecessary
   transfer_to_agent processing in base_llm_flow.py after structured
   output is successfully produced.

Based on analysis by @surfai, @nino-robotfutures-co, and
@surajksharma07 on google#5054.
@vietnamesekid force-pushed the fix/harden-set-model-response-fallback branch from 3e45d49 to 4dde4c1 on April 2, 2026 02:39


3 participants