Tools & Function Calling
In December 2022, ChatGPT could write poetry, explain quantum physics, and generate code—but it couldn't tell you the current weather. It was a brain in a jar: infinitely knowledgeable about the past, yet completely blind to the present and utterly incapable of doing anything in the real world.
Then came function calling. This single capability transformed LLMs from impressive text generators into genuine decision-making systems that could interact with the world. Today, when you ask an AI assistant to "book a meeting with Sarah next Tuesday at 3pm," it doesn't just generate text about booking meetings—it actually calls your calendar API and creates the event.
This chapter covers the engineering fundamentals of tool use: how to give your AI agents hands to act in the world.
1. The Bridge to Action
The "Brain in a Jar" Problem
Consider what an LLM cannot do on its own:
- Check the current stock price
- Query your database
- Send an email
- Read a file on your computer
- Know what time it is right now
These limitations exist because LLMs are frozen in time—they only know what was in their training data. They're also isolated—they have no mechanism to reach out and touch external systems.
Function calling solves this by creating a contract between the probabilistic model and your deterministic code. The model reasons about what tool to use and what arguments to pass; your code executes the actual operation and returns the result.
This is the fundamental pattern you'll implement hundreds of times as an agent engineer.
Think of function calling as giving an LLM a phone directory. You describe what each "contact" (function) can do. When the model needs help, it tells you who to call and what to ask. You make the call and report back.
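That directory metaphor maps directly onto code. A minimal sketch (the `get_weather` and `get_time` tools here are hypothetical stand-ins): the model picks a name and arguments; your code looks up the callable and runs it.

```python
from datetime import datetime, timezone

# Hypothetical tools: the deterministic side of the contract
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 25, "condition": "Sunny"}

def get_time() -> dict:
    return {"utc": datetime.now(timezone.utc).isoformat()}

# The "phone directory": tool name -> callable
TOOLS = {"get_weather": get_weather, "get_time": get_time}

def dispatch(name: str, args: dict) -> dict:
    """Execute the tool the model asked for and return its result."""
    if name not in TOOLS:
        return {"error": f"Unknown tool '{name}'"}
    return TOOLS[name](**args)

# Simulate a model's structured call request
result = dispatch("get_weather", {"city": "Tokyo"})
print(result)  # {'city': 'Tokyo', 'temp': 25, 'condition': 'Sunny'}
```

The model never touches `TOOLS` directly; it only emits the name and arguments, and your code stays in control of what actually runs.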
2. The Function Calling Mechanism
Before using any framework, you need to understand what's actually happening. The function calling loop has five steps:
- Define: You describe a function's name, purpose, and parameters in a schema
- Invoke: The model decides it needs that function and outputs a structured call request
- Execute: Your code runs the actual function
- Result: You send the function's output back to the model
- Respond: The model incorporates the result into its final answer
The Raw API Loop
Here's the complete pattern with the Gemini API—no frameworks, just the protocol:
```python
from google import genai
from google.genai import types

# Step 1: DEFINE the function schema
get_weather_declaration = {
    "name": "get_weather",
    "description": "Gets the current weather for a given city. Use this when the user asks about weather conditions.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city name, e.g., 'Tokyo' or 'New York'"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit. Defaults to celsius."
            }
        },
        "required": ["city"]
    }
}

# The ACTUAL function that does the work
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Mock implementation - replace with real API call."""
    return {"city": city, "temp": 25, "unit": unit, "condition": "Sunny"}

# Step 2: Send to model with function declarations
client = genai.Client()
tools = types.Tool(function_declarations=[get_weather_declaration])
config = types.GenerateContentConfig(tools=[tools])

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What's the weather like in Tokyo?",
    config=config
)

# Step 3: Check if model wants to call a function
if response.candidates[0].content.parts[0].function_call:
    function_call = response.candidates[0].content.parts[0].function_call
    print(f"Model wants to call: {function_call.name}")
    print(f"With arguments: {function_call.args}")

    # Step 4: EXECUTE the function
    if function_call.name == "get_weather":
        result = get_weather(**function_call.args)

    # Step 5: Send result back to model for final response
    function_response = types.Part.from_function_response(
        name=function_call.name,
        response={"result": result}
    )

    # Build conversation history
    contents = [
        types.Content(role="user", parts=[types.Part(text="What's the weather like in Tokyo?")]),
        response.candidates[0].content,  # Model's function call
        types.Content(role="user", parts=[function_response])  # Our function result
    ]

    # Get final response
    final_response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=contents,
        config=config
    )
    print(final_response.text)
    # Output: "It's currently sunny and 25°C in Tokyo!"
```

This is verbose, but it reveals everything:
- You define the schema that tells the model what tools exist
- The model decides when to use them (and with what arguments)
- You execute the actual code and return results
- The model weaves everything into a coherent response
Understanding this loop is essential—every framework just automates these steps.
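To make that concrete, here is a framework-agnostic sketch of the loop, with a stubbed model function standing in for the real API call (the message format here is invented for illustration):

```python
def agent_loop(model_call, tools, user_message, max_steps=10):
    """Generic tool loop: call model, execute requested tools, repeat.

    model_call(history) must return either
      {"function_call": {"name": ..., "args": {...}}} or {"text": ...}.
    """
    history = [{"role": "user", "text": user_message}]
    for _ in range(max_steps):
        reply = model_call(history)
        call = reply.get("function_call")
        if call is None:
            return reply["text"]  # model answered directly; loop ends
        result = tools[call["name"]](**call["args"])
        history.append({"role": "model", "function_call": call})
        history.append({"role": "tool", "name": call["name"], "result": result})
    return "Stopped: too many steps"

# Stub model: requests the weather once, then answers from the result
def fake_model(history):
    if history[-1]["role"] == "user":
        return {"function_call": {"name": "get_weather", "args": {"city": "Tokyo"}}}
    temp = history[-1]["result"]["temp"]
    return {"text": f"It's {temp}°C in Tokyo."}

tools = {"get_weather": lambda city: {"city": city, "temp": 25}}
print(agent_loop(fake_model, tools, "Weather in Tokyo?"))  # It's 25°C in Tokyo.
```

Swap `fake_model` for a real API call and `tools` for real functions, and you have the skeleton that ADK and every other framework automate for you.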
3. Your First Tool with ADK
Now let's see how the Google Agent Development Kit (ADK) eliminates the boilerplate. With ADK, you just write normal Python functions:
```python
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Gets the current weather for a given city.

    Args:
        city: The city name, e.g., 'Tokyo' or 'New York'
        unit: Temperature unit, either 'celsius' or 'fahrenheit'

    Returns:
        A dictionary with temperature and conditions.
    """
    # Your actual implementation
    return {"city": city, "temp": 25, "unit": unit, "condition": "Sunny"}
```

ADK reads your type hints and docstring, then automatically generates the JSON schema. No manual schema writing!
The Simplest Agent
Here's a complete working agent:
```python
from google.adk.agents import Agent

# Create an Agent with your function as a tool
weather_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful weather assistant.",
    tools=[get_weather]  # Just pass the function!
)
```

That's it. A few lines to create a tool-using agent.
Testing with adk web
The fastest way to test your agent is with ADK's built-in dev UI:
1. Create your project structure:
```
my_agent/
    __init__.py
    agent.py
```

2. Define your agent in agent.py:
```python
# my_agent/agent.py
from google.adk.agents import Agent

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Gets the current weather for a given city."""
    return {"city": city, "temp": 25, "unit": unit, "condition": "Sunny"}

root_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful weather assistant. Use the get_weather tool when asked about weather.",
    tools=[get_weather]
)
```

3. Add the init file:
```python
# my_agent/__init__.py
from . import agent
```

4. Launch the dev UI:
```shell
cd parent_of_my_agent
adk web
```

5. Open http://localhost:8000 and chat with your agent!
Try: "What's the weather in Tokyo?"
You'll see the agent call your get_weather function and respond with the result. The adk web UI also shows you the function calls in the Events tab—perfect for debugging.
ADK converts your Python function signature into the JSON schema automatically:
- `city: str` → `{"type": "string"}`
- `unit: str = "celsius"` → optional parameter with a default
- Docstring → function description
- Docstring Args → parameter descriptions
No more writing schemas by hand!
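If you're curious what that conversion looks like, here is a simplified sketch using the standard `inspect` module. The real ADK implementation does more (docstring Args parsing, `Literal` handling, nested types), but the core idea is this:

```python
import inspect

# Minimal Python-annotation -> JSON-schema type mapping
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}

def make_declaration(fn) -> dict:
    """Derive a minimal function declaration from a signature and docstring."""
    sig = inspect.signature(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => required parameter
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Gets the current weather for a given city."""
    ...

decl = make_declaration(get_weather)
print(decl["parameters"]["required"])  # ['city']
```

The schema you hand-wrote in section 2 falls out of the signature automatically; that is the entire trick.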
4. Built-in Tools: Quick Wins
Before building custom tools, know that ADK provides powerful pre-built tools. These give you capabilities immediately, without writing code.
4.1 Google Search (Grounding)
Give your agent access to real-time web information:
```python
from google.adk.agents import Agent
from google.adk.tools import google_search

research_agent = Agent(
    name="research_agent",
    model="gemini-2.0-flash",
    instruction="You are a research assistant. Search the web to answer questions about current events.",
    tools=[google_search]  # Built-in Google Search
)
```

The model automatically decides when to search. Ask about recent news, stock prices, or current events—it just works.
Combining built-in and custom tools:
```python
from google.adk.agents import Agent
from google.adk.tools import google_search

def save_note(title: str, content: str) -> dict:
    """Saves important information for later reference."""
    print(f"Saving: {title}")
    return {"status": "saved", "title": title}

# Agent with both capabilities
research_agent = Agent(
    name="research_assistant",
    model="gemini-2.0-flash",
    instruction="Search for information and save important findings.",
    tools=[google_search, save_note]  # Mix and match!
)
```

4.2 Code Execution
Let the model write and run Python code:
```python
from google.adk.agents import Agent
from google.adk.tools import built_in_code_execution

coding_agent = Agent(
    name="coding_agent",
    model="gemini-2.0-flash",
    instruction="You can write and run Python code to solve problems.",
    tools=[built_in_code_execution]
)
```

Ask "Calculate compound interest on $10,000 at 5% for 10 years" and the agent will write Python, execute it, and return the result.
4.3 What Else?
The ecosystem is expanding rapidly:
- Anthropic Computer Use: Control desktop GUIs (clicking, typing)
- Browser Use: Navigate websites, fill forms
- MCP Servers: Connect to databases, APIs, file systems (covered in Chapter 7)
The same function-calling pattern underlies them all.
5. Designing Effective Tools
Here's a truth that surprises most engineers: the quality of your tool descriptions matters more than the quality of your tool code. The model can only use tools it understands.
Anthropic's research team improved Claude's tool use accuracy by 40%+ just by improving tool descriptions. Let's break down what makes a tool great.
5.1 The Golden Rule: Think Like a New Hire
Imagine hiring a brilliant new engineer. They're smart, fast, and eager—but they don't know your codebase, your domain, or your conventions. Would you hand them a function called query(q: str) and expect them to use it correctly?
Of course not. You'd explain when to use it, what format the input should be, and what comes back. The model is exactly this new hire—infinitely capable, but dependent on your documentation.
Bad:
```python
def query(q: str) -> list:
    """Runs a query."""
    ...
```

Good:
```python
def search_customers(
    query: str,
    status: str = "active",
    limit: int = 10
) -> list[dict]:
    """Searches the customer database by name or email.

    Use this tool when the user asks about specific customers,
    their orders, or account status. Do NOT use for aggregate
    statistics—use get_customer_analytics instead.

    Args:
        query: Search term. Matches against customer name or email.
            Example: "john" or "john@example.com"
        status: Filter by account status. One of: "active", "inactive", "all"
        limit: Maximum results to return. Default 10, max 100.

    Returns:
        List of customer objects with keys: id, name, email, status, created_at
    """
    ...
```

The good version tells the model:
- When to use the tool (and when not to)
- What each parameter means with concrete examples
- What to expect in the response
This pattern—explicit context, concrete examples, clear boundaries—applies to every tool you write.
5.2 Schema Best Practices
Beyond descriptions, the structure of your parameters matters. The model uses type hints to constrain its outputs, and clear naming to decide which arguments to fill.
| Principle | Bad | Good |
|---|---|---|
| Use enums for fixed values | status: str | status: Literal["active", "inactive", "pending"] |
| Provide examples | "The date" | "The date in YYYY-MM-DD format, e.g., '2026-01-15'" |
| Name unambiguously | id: str | customer_id: str |
| Mark required vs optional | Implicit | Explicit in docstring |
These small changes dramatically reduce tool-calling errors. The model can match Literal["active", "inactive", "pending"] exactly, but will often hallucinate invalid values for an unconstrained str.
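Here is what the constrained version looks like in practice. The tool below is hypothetical; the point is that `Literal` becomes an enum in the generated schema, so the model can't invent a fourth status:

```python
from typing import Literal

def update_customer_status(
    customer_id: str,
    status: Literal["active", "inactive", "pending"],
) -> dict:
    """Updates a customer's account status.

    Args:
        customer_id: The customer's ID, e.g., "cust_12345".
        status: The new status. One of: "active", "inactive", "pending".
    """
    allowed = ("active", "inactive", "pending")
    if status not in allowed:  # defensive runtime check; Literal is not enforced at runtime
        return {"error": True, "message": f"status must be one of {allowed}"}
    return {"customer_id": customer_id, "status": status, "updated": True}

print(update_customer_status("cust_12345", "active"))
```

Note the belt-and-suspenders check inside the body: `Literal` constrains the schema the model sees, but Python does not enforce it at runtime, so the tool still validates its input.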
5.3 Fewer Tools Are Often Better
With great power comes great responsibility—and great temptation. Once you realize how easy it is to add tools, you'll want to wrap every API endpoint. Resist this urge.
Every tool you add is another option the model must evaluate. More options mean more opportunities for wrong choices. Anthropic's engineering team found that agents with 5-10 focused tools consistently outperform those with 20+ granular tools.
"More tools don't always lead to better outcomes. A common error is tools that merely wrap existing API endpoints."
Don't create a tool for every API endpoint. Create tools for workflows:
```python
# ❌ Too granular (3 tools, 3 calls needed)
def list_users() -> list: ...
def list_events() -> list: ...
def create_event(user_id: str, event: dict) -> dict: ...

# ✅ Workflow-oriented (1 tool, 1 call)
def schedule_meeting(
    attendee_names: list[str],
    topic: str,
    preferred_time: str
) -> dict:
    """Finds attendees by name, checks availability, and creates a calendar event."""
    ...
```

The workflow-oriented tool handles the complexity internally. The model just says what it wants; your code figures out how.
5.4 Return Meaningful Context
What your tool returns is just as important as how it's called. The model will use the return value to decide what to do next—or to answer the user directly. Help it succeed.
```python
# ❌ The model can't use this easily
{"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}

# ✅ Human-readable AND machine-usable
{
    "customer_id": "cust_12345",
    "name": "Jane Smith",
    "summary": "Premium customer since 2023, 47 orders"
}
```

Notice the summary field. The model can quote it directly to the user without needing to interpret raw data. This is progressive disclosure—enough detail to act, enough simplicity to communicate.
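A tool can assemble that summary itself from raw fields. A sketch with invented field names:

```python
def format_customer(record: dict) -> dict:
    """Wrap a raw DB record (hypothetical fields) with a model-friendly summary."""
    summary = (
        f"{record['tier'].title()} customer since {record['since']}, "
        f"{record['order_count']} orders"
    )
    return {
        "customer_id": record["id"],
        "name": record["name"],
        "summary": summary,  # the model can quote this verbatim
    }

raw = {"id": "cust_12345", "name": "Jane Smith", "tier": "premium",
       "since": 2023, "order_count": 47}
print(format_customer(raw)["summary"])  # Premium customer since 2023, 47 orders
```

A few lines of formatting inside the tool saves the model a reasoning step on every call.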
5.5 Error Handling That Teaches
Tools fail. APIs time out. Users ask for data that doesn't exist. When this happens, your error response becomes the model's learning signal.
When tools fail, help the model self-correct:
```python
# ❌ Unhelpful
raise ValueError("Invalid input")

# ✅ Actionable
return {
    "error": True,
    "message": "Customer not found with email 'jon@example.com'. "
               "Did you mean 'john@example.com'? "
               "Try searching by name instead."
}
```

The model reads this, learns, and tries a different approach.
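You can enforce this convention across every tool with a small wrapper that converts exceptions into structured, actionable results. This decorator is a sketch, not part of ADK; `days_until` is a hypothetical tool used to demonstrate it:

```python
import functools
from datetime import date, datetime

def teaching_errors(fn):
    """Turn raised exceptions into structured results the model can act on."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as e:
            return {
                "error": True,
                "message": f"{fn.__name__} failed: {e}. "
                           "Check the argument formats in the tool description "
                           "and try again with corrected values.",
            }
    return wrapper

@teaching_errors
def days_until(date_str: str) -> dict:
    """Days from 2026-01-01 to the given YYYY-MM-DD date (demo baseline date)."""
    target = datetime.strptime(date_str, "%Y-%m-%d").date()
    return {"days": (target - date(2026, 1, 1)).days}

print(days_until("2026-01-31"))  # {'days': 30}
print(days_until("Jan 31"))      # {'error': True, 'message': 'days_until failed: ...'}
```

The agent loop never crashes; every failure comes back as data the model can read and recover from.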
- Write for the new hire — Explain when to use, when not to, with examples
- Constrain the schema — Enums, examples, clear naming
- Fewer is better — Workflow-oriented tools over granular endpoints
- Return context — Human-readable summaries alongside machine data
- Errors that teach — Guide the model toward recovery
6. Advanced Tool Patterns
6.1 Parallel Tool Calling
Modern models can call multiple tools simultaneously:
# User: "What's the weather in Tokyo and the stock price of GOOG?"
# Model generates TWO function calls at once:
[
{"name": "get_weather", "args": {"city": "Tokyo"}},
{"name": "get_stock_price", "args": {"symbol": "GOOG"}}
]ADK executes these in parallel automatically, cutting latency in half.
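If you execute tool calls yourself rather than through ADK, `asyncio.gather` gets you the same latency win. A sketch with stubbed async tools standing in for real network calls:

```python
import asyncio

# Stubbed async tools; real versions would await HTTP calls
async def get_weather(city: str) -> dict:
    await asyncio.sleep(0.1)  # simulate network latency
    return {"city": city, "temp": 25}

async def get_stock_price(symbol: str) -> dict:
    await asyncio.sleep(0.1)
    return {"symbol": symbol, "price": 180.0}

TOOLS = {"get_weather": get_weather, "get_stock_price": get_stock_price}

async def run_parallel(calls: list[dict]) -> list[dict]:
    """Execute all requested tool calls concurrently, preserving order."""
    tasks = [TOOLS[c["name"]](**c["args"]) for c in calls]
    return await asyncio.gather(*tasks)

calls = [
    {"name": "get_weather", "args": {"city": "Tokyo"}},
    {"name": "get_stock_price", "args": {"symbol": "GOOG"}},
]
results = asyncio.run(run_parallel(calls))  # ~0.1s total instead of ~0.2s sequential
print(results)
```

`gather` returns results in the same order as the requests, which makes it easy to pair each result back with its originating function call.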
6.2 Sequential (Compositional) Tool Calling
Some tasks require chaining tools:
```
User: "If it's warmer than 20°C in London, set my thermostat to 20°C"

Step 1: get_weather(city="London") → {"temp": 25}
Step 2: Model reasons: 25 > 20, so I should set thermostat
Step 3: set_thermostat(temperature=20) → {"status": "success"}
Step 4: Model: "It's 25°C in London, so I've set your thermostat to 20°C"
```

ADK handles this automatically—the agent loop continues until the task is complete.
6.3 Function Calling Modes
You can force or disable tool use:
```python
from google.adk.agents import RunConfig
from google.genai import types

# Force the model to use a tool (no plain text response)
run_config = RunConfig(
    tool_config=types.ToolConfig(
        function_calling_config=types.FunctionCallingConfig(
            mode="ANY",  # Must use a tool
            allowed_function_names=["search_customers"]  # Only this tool
        )
    )
)
```

Modes:
- AUTO (default): Model decides
- ANY: Must use a tool
- NONE: Disable all tools
6.4 The Bash Meta-Tool
Here's an insight that separates toy agents from production ones:
Instead of building 100 specific tools, give the agent one universal tool: bash.

An agent that can run shell commands can:

- Read and write files
- Make HTTP requests (`curl`)
- Process data (`jq`, `awk`, `grep`)
- Manage git repositories
- Do almost anything a developer can do

This is the secret behind Devin and Claude Code.
```python
import subprocess

def run_bash(command: str, timeout: int = 30) -> dict:
    """Executes a bash command and returns the output.

    Use this for file operations, git commands, running scripts,
    or any task that can be accomplished via command line.

    Args:
        command: The bash command to execute.
        timeout: Maximum seconds to wait. Default 30.

    Returns:
        {"stdout": "...", "stderr": "...", "return_code": 0}

    Examples:
        - "ls -la /project" - list files
        - "cat README.md" - read a file
        - "git status" - check git state
    """
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
        return {
            "stdout": result.stdout,
            "stderr": result.stderr,
            "return_code": result.returncode
        }
    except subprocess.TimeoutExpired:
        return {"error": f"Command timed out after {timeout} seconds"}
```

Never run agent-generated bash commands on your production server. Use sandboxed environments:
- E2B (e2b.dev): Cloud sandboxes with full OS access
- Docker: Containerized execution with resource limits
- Modal/Fly.io: Ephemeral VMs for untrusted code
The agent writes the command; your sandbox runs it.
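As a sketch of the Docker option (the image name and resource limits are illustrative, and Docker must be installed), the agent's command runs inside a disposable container with no network access and capped resources:

```python
import subprocess

def build_docker_cmd(command: str) -> list[str]:
    """Build the docker argv for a locked-down, throwaway container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",    # no network access
        "--memory", "256m",     # cap memory
        "--cpus", "0.5",        # cap CPU
        "python:3.12-slim",     # example image; pick one with your tooling
        "bash", "-c", command,
    ]

def run_bash_sandboxed(command: str, timeout: int = 30) -> dict:
    """Run an agent-generated command inside a disposable Docker container."""
    try:
        result = subprocess.run(
            build_docker_cmd(command),
            capture_output=True, text=True, timeout=timeout,
        )
        return {"stdout": result.stdout, "stderr": result.stderr,
                "return_code": result.returncode}
    except subprocess.TimeoutExpired:
        return {"error": f"Command timed out after {timeout} seconds"}

print(build_docker_cmd("ls -la")[:3])  # ['docker', 'run', '--rm']
```

Because `--rm` discards the container after each call, every command starts from a clean slate; persist anything important via the tool's return value, not the container filesystem.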
7. 🔨 Project: Multi-Tool Agent
Let's build an agent that can answer: "How many days until the next US Presidential Election?"
This requires three capabilities:
- Know the current date (system time)
- Find when the election is (web search)
- Calculate the difference (math)
Create the agent structure:
```shell
mkdir -p election_agent && touch election_agent/__init__.py
```

Now create election_agent/agent.py:
```python
from datetime import datetime
from google.adk.agents import Agent
from google.adk.tools import google_search

# Tool 1: System Time
def get_current_date() -> dict:
    """Returns the current date and time.

    Use this to know what "today" or "now" means.
    """
    now = datetime.now()
    return {
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H:%M:%S"),
        "day_of_week": now.strftime("%A")
    }

# Tool 2: Days Calculator
def days_between(start_date: str, end_date: str) -> dict:
    """Calculates the number of days between two dates.

    Args:
        start_date: Start date in YYYY-MM-DD format
        end_date: End date in YYYY-MM-DD format
    """
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d").date()
        end = datetime.strptime(end_date, "%Y-%m-%d").date()
        delta = (end - start).days
        return {"days": delta, "start": start_date, "end": end_date}
    except ValueError as e:
        return {"error": f"Invalid date format: {e}"}

# Create the agent
root_agent = Agent(
    name="election_countdown",
    model="gemini-2.0-flash",
    instruction="""You help with date-related questions.

    1. Use get_current_date to know today's date
    2. Use Google Search to find event dates
    3. Use days_between to calculate differences

    Show your work.""",
    tools=[get_current_date, days_between, google_search]
)
```

Run with adk web:
```shell
adk web  # run from the directory containing election_agent/
```

Then ask: "How many days until the next US Presidential Election?"
What happens in the browser:
- Agent calls `get_current_date()` → learns today's date
- Agent uses Google Search → finds the election date (November 7, 2028)
- Agent calls `days_between("2026-01-08", "2028-11-07")` → gets the countdown
- Agent responds with the final answer
Watch the "Function Calls" panel in adk web to see each step as it happens.
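One more benefit of ADK's design: tools are plain functions, so you can sanity-check them with no model in the loop. The tool is redefined here so the snippet is self-contained:

```python
from datetime import datetime

def days_between(start_date: str, end_date: str) -> dict:
    """Days between two YYYY-MM-DD dates (copy of the agent's tool)."""
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d").date()
        end = datetime.strptime(end_date, "%Y-%m-%d").date()
        return {"days": (end - start).days, "start": start_date, "end": end_date}
    except ValueError as e:
        return {"error": f"Invalid date format: {e}"}

# Happy path: 2028 is a leap year, so February contributes 29 days
assert days_between("2028-02-01", "2028-03-01") == {
    "days": 29, "start": "2028-02-01", "end": "2028-03-01"
}
# Error path: the tool returns a teachable error instead of raising
assert "error" in days_between("Feb 1", "2028-03-01")
print("tool checks passed")
```

Running these checks before wiring the function into an agent catches bugs where they are cheapest to fix, long before a model starts calling the tool.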
This project demonstrates:
- Multi-tool orchestration: Agent decides which tools to use and in what order
- Built-in + custom tools: Google Search alongside your custom functions
- Iterative development: `adk web` for rapid prototyping and debugging
Summary
Function calling transforms an LLM from a text generator into an agent. You learned:
- The Mechanism: The 5-step loop (Define → Invoke → Execute → Result → Respond)
- ADK Simplicity: Type hints + docstrings = automatic schemas
- Built-in Tools: Google Search and Code Execution for quick wins
- Tool Design: The 5 pillars—new hire mindset, constrained schemas, fewer tools, meaningful returns, teaching errors
- Advanced Patterns: Parallel calls, sequential chains, the Bash meta-tool
The quality of your tools directly determines the quality of your agent. Invest time in clear descriptions and thoughtful schemas.
References
- Writing Effective Tools for Agents — Anthropic Engineering
- Function Calling with the Gemini API — Google AI
- Google ADK — Quickstart
- Google ADK — Custom Tools
- Tool Use Overview — Anthropic Docs
Next Chapter: We'll explore MCP (Model Context Protocol)—the universal standard for connecting AI agents to tools, enabling you to write a tool once and use it everywhere.