What is Programmatic Tool Calling?
Programmatic tool calling lets an AI agent write code that invokes tools inside a sandboxed execution environment — instead of requiring a separate model round-trip for every tool call. The agent writes a script, the runtime executes it, and tool calls happen directly from the code. Only the final result is returned to the model's context window.
Why It Matters
Traditional tool use requires the model to generate a tool call, wait for the result, then decide the next step — one round-trip at a time. Programmatic tool calling eliminates this overhead for multi-step workflows.
Fewer Round-Trips
Call multiple tools in a single code execution instead of one model turn per tool.
Lower Token Usage
Intermediate results stay in the sandbox — only the summary enters the context window.
Data Filtering
Process and filter large tool outputs in code before they reach the model.
Native Control Flow
Use loops, conditionals, and error handling — the model writes real code, not just JSON calls.
How It Works
The flow involves four steps between the agent, sandbox, and your tool server.
Agent Writes Code
The model generates a Python script that calls your tools as async functions.
Sandbox Executes
The code runs in a sandboxed container. When a tool function is called, execution pauses.
Tool Runs Externally
Your server receives the tool call, executes it, and returns the result to the sandbox.
Result to Model
Once the script finishes, only the final output is added to the model's context.
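The four steps above can be sketched in code. Everything here is an illustrative assumption, not a real API: `execute_on_server` stands in for the runtime's transport to your tool server (in a real sandbox, this is where execution pauses until your server returns a result), and `query_database` shows how a tool can be exposed to agent-written code as an ordinary async function.

```python
import asyncio

async def execute_on_server(tool_name: str, arguments: dict) -> dict:
    # Stand-in for the real transport: a real runtime would pause the
    # sandbox here and resume it when your tool server returns a result.
    await asyncio.sleep(0)  # simulate the external round-trip
    return {"tool": tool_name, "ok": True, "args": arguments}

async def query_database(sql: str) -> dict:
    # From the agent's perspective this is just a local async function,
    # but the actual work happens outside the sandbox.
    return await execute_on_server("query_database", {"sql": sql})

async def agent_script():
    # Agent-written code: the tool call looks local, runs externally.
    return await query_database("SELECT 1")

print(asyncio.run(agent_script()))
```

The key point is that only what the script ultimately prints or returns is added to the model's context; the round-trips between sandbox and tool server never consume model turns.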
Traditional vs Programmatic
See how the two approaches differ for a task that queries three database regions.
Traditional Tool Use
Each tool call requires a full model round-trip. For N tool calls, that's N+1 inference passes: one per call, plus one for the final answer.
Model → tool call → result → Model → tool call → result → Model → tool call → result → Model → final answer
4 inference passes
Programmatic Tool Calling
The agent writes one script that calls all N tools; a second pass turns the returned summary into the final answer. Two inference passes total, regardless of N.
Model → code (3 tool calls + aggregation) → final answer
2 inference passes
Example: Programmatic Database Query
The agent writes Python that loops over regions, calls a database tool, and aggregates results — all in one execution.
regions = ["West", "East", "Central", "North", "South"]
results = {}
for region in regions:
    data = await query_database(
        f"SELECT SUM(revenue) as total FROM sales WHERE region='{region}'"
    )
    results[region] = data[0]["total"]

# Aggregate in code — only the summary reaches the model
top_region = max(results, key=results.get)
print(f"Top region: {top_region} ({results[top_region]:,.0f})")
print(f"All regions total: {sum(results.values()):,.0f}")
Use Cases
Programmatic tool calling shines when agents need to do more than one-shot tool calls.
Batch Processing
Query a database for each of 50 regions in a loop, aggregate results, and return a summary — all in one execution.
Conditional Logic
Check file size first, then decide whether to read the full file or just a summary. No wasted round-trips.
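A minimal sketch of that pattern, with hypothetical tools (`stat_file`, `read_file`, `summarize_file` are stubbed here so the example runs; none of them is a real API):

```python
import asyncio

async def stat_file(path: str) -> dict:
    return {"size_bytes": 5_000_000}  # stub: pretend the file is large

async def read_file(path: str) -> str:
    return "full contents"  # stub

async def summarize_file(path: str) -> str:
    return "summary"  # stub

async def smart_read(path: str, limit: int = 1_000_000) -> str:
    # One cheap metadata call decides which expensive call to make,
    # without an extra model round-trip in between.
    info = await stat_file(path)
    if info["size_bytes"] > limit:
        return await summarize_file(path)
    return await read_file(path)

print(asyncio.run(smart_read("report.csv")))
```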
Data Filtering
Fetch 10,000 log entries, filter to only errors, and return the last 10 — keeping the context window clean.
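Sketched below with a hypothetical `fetch_logs` tool (stubbed so the example runs). The 10,000-entry payload exists only inside the sandbox; just the filtered tail would reach the model:

```python
import asyncio

async def fetch_logs(limit: int) -> list:
    # Stub: pretend the tool returned `limit` log entries,
    # with an ERROR every 100th entry.
    return [
        {"id": i, "level": "ERROR" if i % 100 == 0 else "INFO"}
        for i in range(limit)
    ]

async def recent_errors() -> list:
    entries = await fetch_logs(10_000)  # large payload stays in the sandbox
    errors = [e for e in entries if e["level"] == "ERROR"]
    return errors[-10:]  # only these ten entries reach the model

print(asyncio.run(recent_errors()))
```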
Early Termination
Check endpoints in sequence and stop as soon as a healthy one is found. No need to check all of them.
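A sketch of early termination, again with a hypothetical tool (`check_endpoint` is stubbed so the example runs):

```python
import asyncio

async def check_endpoint(url: str) -> bool:
    # Stub: pretend only the second endpoint is healthy.
    return url.endswith("-2")

async def first_healthy(endpoints: list):
    for url in endpoints:
        if await check_endpoint(url):
            return url  # stop early: later endpoints are never checked
    return None

print(asyncio.run(first_healthy(["api-1", "api-2", "api-3"])))
```

With traditional tool use, the model would have to issue each health check as a separate turn; here the loop exits after the first success without any further model involvement.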
The allowed_callers Concept
When defining tools, you specify which contexts can invoke them. This controls whether a tool can only be called directly by the model, only from within code execution, or both.
For clarity, it's best to choose one mode per tool rather than enabling both. This gives the model clearer guidance on how to use each tool.
Direct Only
The model calls the tool directly via the standard tool-use flow. This is the default.
"allowed_callers": ["direct"]
Code Execution Only
The tool can only be invoked from within a sandboxed code execution environment.
"allowed_callers": ["code_execution"]
Both Modes
The tool can be called either directly or from code. Use sparingly — it can confuse tool selection.
"allowed_callers": ["direct", "code_execution"]
Key Takeaways
1. Programmatic tool calling lets agents write code that calls tools, eliminating per-tool round-trips.
2. Intermediate results stay in the sandbox and never enter the model's context window, saving tokens.
3. Use allowed_callers to control whether tools are invoked directly, from code, or both.
4. Best for batch processing, conditional workflows, data filtering, and multi-step tool chains.
5. Multiple AI providers are implementing this pattern as a way to make agents faster and more efficient.