AutoGen Python v0.6.2 Adds Streaming Tools and Tool Loop

What Shipped in v0.6.2

Microsoft AutoGen Python v0.6.2 delivers three headline changes for agent and tool orchestration. First, AgentTool and TeamTool gain streaming support through a new run_json_stream interface. Second, ChatCompletionClient receives a tool_choice parameter on its create and create_stream methods. Third, AssistantAgent gains an inner tool-calling loop controlled by a new max_tool_iterations constructor parameter. The release also adds OpenTelemetry GenAI traces aligned with the GenAI Semantic Convention, covering create_agent, invoke_agent, and execute_tool spans [1].

How Streaming Tools Work

Prior to v0.6.2, calling an AgentTool or TeamTool from a parent agent returned only a final result, leaving inner agent and team events invisible to the caller. The new run_json_stream interface changes that behavior. When AssistantAgent detects that a tool supports streaming, it calls run_json_stream instead of the standard execution path, and the inner events produced by the nested agent or team are surfaced through the parent agent’s own run_stream output [1].

This means operators running nested agent pipelines can now observe intermediate steps, partial outputs, and sub-agent activity in real time rather than waiting for a terminal result. Because run_json_stream is invoked only when a tool is detected to support streaming, tools without streaming support follow the existing execution path [1].
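The dispatch described above can be sketched in plain Python. This is a minimal illustration, not AutoGen's implementation: PlainTool, StreamingTool, and execute_tool are hypothetical names, the method signatures are simplified, and the detection check (looking for a callable run_json_stream attribute) is an assumption about one way such detection could work.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


class PlainTool:
    """Hypothetical tool that only returns a final result."""

    async def run_json(self, args: Dict[str, Any]) -> str:
        return f"plain result for {args['task']}"


class StreamingTool(PlainTool):
    """Hypothetical tool that also yields inner events before its result."""

    async def run_json_stream(self, args: Dict[str, Any]) -> AsyncIterator[str]:
        yield "inner event: sub-agent started"
        yield "inner event: partial output"
        # Convention in this sketch only: the last item is the final result.
        yield f"streamed result for {args['task']}"


async def execute_tool(tool: PlainTool, args: Dict[str, Any]) -> AsyncIterator[str]:
    """Yield everything the caller should see for one tool call."""
    stream_method = getattr(tool, "run_json_stream", None)
    if callable(stream_method):
        # Streaming path: forward each inner event as it is produced.
        async for item in stream_method(args):
            yield item
    else:
        # Standard path: only the final result is visible.
        yield await tool.run_json(args)


async def collect(tool: PlainTool) -> List[str]:
    return [item async for item in execute_tool(tool, {"task": "summarize"})]


plain_output = asyncio.run(collect(PlainTool()))
streamed_output = asyncio.run(collect(StreamingTool()))
print(plain_output)
print(streamed_output)
```

In this sketch the caller's loop is identical for both tools. Only the number of items it receives differs, which mirrors how tools without streaming support stay on the existing path.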

Extending the Framework: BaseStreamTool and StreamWorkbench

Developers who build custom tools or workbenches have two new base classes available in v0.6.2. To create a custom streaming tool, a developer subclasses autogen_core.tools.BaseStreamTool and implements the run_stream method. To create a custom streaming workbench, the entry point is autogen_core.tools.StreamWorkbench, which requires implementing call_tool_stream [1].

Both base classes follow the same subclassing pattern already established in AutoGen’s tool architecture, so teams familiar with building standard tools should find the migration path straightforward. The framework handles the plumbing that connects a streaming tool’s output to the parent agent’s event stream, provided the subclass correctly implements the required method.
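The subclassing pattern can be illustrated with a stdlib-only stand-in. SketchStreamTool and CountdownTool below are hypothetical classes written for this example. They are not autogen_core.tools.BaseStreamTool, and the real class's constructor arguments and method signatures are not reproduced here. The sketch only shows the division of labor: the subclass writes run_stream, and the base class exposes it through run_json_stream.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, AsyncIterator, Dict, List


class SketchStreamTool(ABC):
    """Hypothetical stand-in for a streaming tool base class."""

    @abstractmethod
    def run_stream(self, args: Dict[str, Any]) -> AsyncIterator[Any]:
        """Subclasses yield intermediate items, then the final result last."""

    async def run_json_stream(self, args: Dict[str, Any]) -> AsyncIterator[Any]:
        # The "plumbing": the base class forwards whatever the subclass
        # yields, so subclass authors only need to write run_stream.
        async for item in self.run_stream(args):
            yield item


class CountdownTool(SketchStreamTool):
    """Example subclass: streams a countdown, then a final message."""

    async def run_stream(self, args: Dict[str, Any]) -> AsyncIterator[Any]:
        for remaining in range(args["start"], 0, -1):
            yield remaining
        yield "done"


async def main() -> List[Any]:
    tool = CountdownTool()
    return [item async for item in tool.run_json_stream({"start": 3})]


items = asyncio.run(main())
print(items)  # prints [3, 2, 1, 'done']
```

A custom workbench would follow the same shape, with call_tool_stream as the method to implement instead of run_stream.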

tool_choice Parameter and the Inner Tool-Calling Loop

The tool_choice parameter, added to ChatCompletionClient’s create and create_stream methods, gives operators finer control over whether and how a model selects tools during a completion call [1]. This is useful in scenarios where a pipeline needs to force tool use, suppress it, or allow the model to decide, without restructuring the agent configuration.
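The three behaviors mentioned above (force, suppress, let the model decide) can be modeled with a small helper. This is a sketch under stated assumptions: plan_tool_request is a hypothetical function, and the mode names "auto", "required", and "none" are borrowed from common chat-completion APIs. The release notes summarized here do not list the values that AutoGen's tool_choice accepts, so check the package documentation before relying on them.

```python
from typing import Dict, List


def plan_tool_request(tool_names: List[str], tool_choice: str = "auto") -> Dict[str, object]:
    """Decide which tools to offer the model and whether a call is mandatory.

    Hypothetical helper for illustration; not part of AutoGen.
    """
    if tool_choice == "none":
        # Suppress tool use: offer nothing.
        return {"offered": [], "must_call": False}
    if tool_choice == "auto":
        # Let the model decide whether to call any offered tool.
        return {"offered": list(tool_names), "must_call": False}
    if tool_choice == "required":
        # Force the model to call some tool.
        if not tool_names:
            raise ValueError("tool_choice='required' needs at least one tool")
        return {"offered": list(tool_names), "must_call": True}
    # Any other value is treated as the name of one specific tool to force.
    if tool_choice not in tool_names:
        raise ValueError(f"unknown tool: {tool_choice!r}")
    return {"offered": [tool_choice], "must_call": True}


tools = ["search", "calculator"]
for choice in ("auto", "required", "none", "calculator"):
    print(choice, plan_tool_request(tools, choice))
```

Keeping the decision in a single per-call argument is what lets a pipeline change tool behavior for one completion without rebuilding the agent.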

The max_tool_iterations parameter on AssistantAgent addresses a different problem: repeated tool invocation within a single agent turn. Setting this constructor parameter enables an inner loop in which AssistantAgent alternates between calling the model and executing the requested tools until the model stops generating tool calls or the iteration ceiling is reached [1]. The release notes describe this change as simplifying AssistantAgent usage; previously, orchestrating multi-step tool sequences required external logic or custom agent subclasses.

The loop terminates on either condition, whichever comes first, giving operators a safety bound on runaway tool execution while still allowing the model to chain tool calls autonomously.
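The two exit conditions can be made concrete with a toy loop. Everything here is hypothetical: run_tool_loop, scripted_model, and the tuple-based reply format are invented for this sketch and do not mirror AssistantAgent internals. In particular, returning the last tool result when the ceiling is hit is an assumption made for simplicity.

```python
from typing import Any, Callable, Dict, List, Tuple

# In this sketch a model reply is ("tool", name, argument) or ("text", message).
ModelReply = Tuple[Any, ...]


def run_tool_loop(
    model: Callable[[List[str]], ModelReply],
    tools: Dict[str, Callable[[int], int]],
    prompt: str,
    max_tool_iterations: int,
) -> Tuple[str, List[str]]:
    """Alternate model calls and tool execution, bounded by a ceiling."""
    history = [prompt]
    for _ in range(max_tool_iterations):
        reply = model(history)
        if reply[0] != "tool":
            # Exit condition 1: the model stopped requesting tools.
            return reply[1], history
        _, name, arg = reply
        result = tools[name](arg)
        history.append(f"{name}({arg}) -> {result}")
    # Exit condition 2: ceiling reached. Returning the last recorded
    # entry is an assumption of this sketch.
    return history[-1], history


def scripted_model(history: List[str]) -> ModelReply:
    """Hypothetical model: requests two tool calls, then answers in text."""
    calls_made = len(history) - 1
    if calls_made < 2:
        return ("tool", "double", calls_made + 1)
    return ("text", f"finished after {calls_made} tool calls")


tools = {"double": lambda x: x * 2}
answer_free, history_free = run_tool_loop(scripted_model, tools, "start", 5)
answer_capped, history_capped = run_tool_loop(scripted_model, tools, "start", 1)
print(answer_free)
print(answer_capped)
```

With a ceiling of 5 the scripted model finishes on its own after two tool calls. With a ceiling of 1 the loop stops after the first tool call, which shows the safety bound taking effect.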

Who Benefits and Practical Use Cases

The streaming tools addition most directly benefits teams building hierarchical or nested agent pipelines, where a top-level orchestrator delegates tasks to sub-agents. Before v0.6.2, those teams had no standard mechanism to stream inner events upward. With AgentTool and TeamTool now implementing run_json_stream, that visibility is available without custom wiring [1].

The tool_choice parameter serves teams that need explicit control over model tool selection at inference time, particularly in production pipelines where unpredictable tool selection introduces latency or cost variance. The max_tool_iterations loop simplifies agent configurations that previously required wrapper logic to handle sequences of dependent tool calls.

FAQ

Q. Do existing AgentTool and TeamTool implementations break with this update? The changes are additive. AssistantAgent invokes run_json_stream only for tools detected to support streaming, and tools without it follow the standard execution path, so existing code does not need modification to keep working [1].

Q. What happens if max_tool_iterations is not set on AssistantAgent? The release notes describe max_tool_iterations as a constructor parameter that enables the inner loop when set. They do not specify a default value or the behavior when the parameter is omitted, so operators should consult the package documentation for the default [1].

Q. How does a developer know whether a tool supports streaming? AssistantAgent detects streaming support automatically. When a tool supports run_json_stream, the agent uses that interface; otherwise it falls back to the standard execution path. Developers signal streaming support by subclassing BaseStreamTool and implementing run_stream [1].

Q. Is tool_choice available on all ChatCompletionClient subclasses? The release notes state that tool_choice is added to ChatCompletionClient and its subclasses, covering both create and create_stream methods. Compatibility with specific backend clients should be verified against each subclass’s implementation [1].

Q. Who contributed the tool_choice parameter? The release notes credit the tool_choice parameter to @copilot-swe-agent and identify it as that contributor’s first pull request to the project [1].

Key Takeaways

AgentTool and TeamTool now expose inner agent and team events to parent agents via run_json_stream, enabling real-time visibility into nested agent pipelines [1].
Custom streaming tools and workbenches can be built by subclassing autogen_core.tools.BaseStreamTool or autogen_core.tools.StreamWorkbench and implementing the required methods [1].
The new tool_choice parameter on ChatCompletionClient gives operators explicit control over model tool selection at inference time [1].
AssistantAgent’s max_tool_iterations constructor parameter enables a bounded inner tool-calling loop, removing the need for external orchestration logic in multi-step tool sequences [1].
OpenTelemetry GenAI traces for create_agent, invoke_agent, and execute_tool were also added in this release, aligned with the GenAI Semantic Convention [1].