Tools & Function Calling
In December 2022, ChatGPT could write poetry, explain quantum physics, and generate code—but it couldn't tell you the current weather. It was a brain in a jar: infinitely knowledgeable about the past, yet completely blind to the present and utterly incapable of doing anything in the real world.
Then came function calling. This single capability transformed LLMs from impressive text generators into genuine decision-making systems that could interact with the world. Today, when you ask an AI assistant to "book a meeting with Sarah next Tuesday at 3pm," it doesn't just generate text about booking meetings—it actually calls your calendar API and creates the event.
This chapter covers the engineering fundamentals of tool use: how to give your AI agents hands to act in the world.
1. The Bridge to Action
The "Brain in a Jar" Problem
Consider what an LLM cannot do on its own:
- Check the current stock price
- Query your database
- Send an email
- Read a file on your computer
- Know what time it is right now
These limitations exist because LLMs are frozen in time—they only know what was in their training data. They're also isolated—they have no mechanism to reach out and touch external systems.
Function calling solves this by creating a contract between the probabilistic model and your deterministic code. The model reasons about what tool to use and what arguments to pass; your code executes the actual operation and returns the result.
This is the fundamental pattern you'll implement hundreds of times as an agent engineer.
Think of function calling as giving an LLM a phone directory. You describe what each "contact" (function) can do. When the model needs help, it tells you who to call and what to ask. You make the call and report back.
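That directory metaphor maps directly onto code. A minimal sketch (the `get_weather` and `get_time` tools here are hypothetical stand-ins): the model picks a name and arguments; your code looks up the callable and runs it.

```python
from datetime import datetime, timezone

# Hypothetical tools: the deterministic side of the contract
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 25, "condition": "Sunny"}

def get_time() -> dict:
    return {"utc": datetime.now(timezone.utc).isoformat()}

# The "phone directory": tool name -> callable
TOOLS = {"get_weather": get_weather, "get_time": get_time}

def dispatch(name: str, args: dict) -> dict:
    """Execute the tool the model asked for and return its result."""
    if name not in TOOLS:
        return {"error": f"Unknown tool '{name}'"}
    return TOOLS[name](**args)

# Simulate a model's structured call request
result = dispatch("get_weather", {"city": "Tokyo"})
print(result)  # {'city': 'Tokyo', 'temp': 25, 'condition': 'Sunny'}
```

The model never touches `TOOLS` directly; it only emits the name and arguments, and your code stays in control of what actually runs.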
2. The Function Calling Mechanism
Before using any framework, you need to understand what's actually happening. The function calling loop has five steps:
- Define: You describe a function's name, purpose, and parameters in a schema
- Invoke: The model decides it needs that function and outputs a structured call request
- Execute: Your code runs the actual function
- Result: You send the function's output back to the model
- Respond: The model incorporates the result into its final answer
The Raw API Loop
Here's the complete pattern with the Gemini API—no frameworks, just the protocol:
```python
from google import genai
from google.genai import types

# Step 1: DEFINE the function schema
get_weather_declaration = {
    "name": "get_weather",
    "description": "Gets the current weather for a given city. Use this when the user asks about weather conditions.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city name, e.g., 'Tokyo' or 'New York'"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit. Defaults to celsius."
            }
        },
        "required": ["city"]
    }
}

# The ACTUAL function that does the work
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Mock implementation - replace with real API call."""
    return {"city": city, "temp": 25, "unit": unit, "condition": "Sunny"}

# Step 2: Send to model with function declarations
client = genai.Client()
tools = types.Tool(function_declarations=[get_weather_declaration])
config = types.GenerateContentConfig(tools=[tools])

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What's the weather like in Tokyo?",
    config=config
)

# Step 3: Check if model wants to call a function
if response.candidates[0].content.parts[0].function_call:
    function_call = response.candidates[0].content.parts[0].function_call
    print(f"Model wants to call: {function_call.name}")
    print(f"With arguments: {function_call.args}")

    # Step 4: EXECUTE the function
    if function_call.name == "get_weather":
        result = get_weather(**function_call.args)

    # Step 5: Send result back to model for final response
    function_response = types.Part.from_function_response(
        name=function_call.name,
        response={"result": result}
    )

    # Build conversation history
    contents = [
        types.Content(role="user", parts=[types.Part(text="What's the weather like in Tokyo?")]),
        response.candidates[0].content,  # Model's function call
        types.Content(role="user", parts=[function_response])  # Our function result
    ]

    # Get final response
    final_response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=contents,
        config=config
    )
    print(final_response.text)
    # Output: "It's currently sunny and 25°C in Tokyo!"
```

This is verbose, but it reveals everything:
- You define the schema that tells the model what tools exist
- The model decides when to use them (and with what arguments)
- You execute the actual code and return results
- The model weaves everything into a coherent response
Understanding this loop is essential—every framework just automates these steps.
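To make that concrete, here is a framework-agnostic sketch of the loop, with a stubbed model function standing in for the real API call (the message format here is invented for illustration):

```python
def agent_loop(model_call, tools, user_message, max_steps=10):
    """Generic tool loop: call model, execute requested tools, repeat.

    model_call(history) must return either
      {"function_call": {"name": ..., "args": {...}}} or {"text": ...}.
    """
    history = [{"role": "user", "text": user_message}]
    for _ in range(max_steps):
        reply = model_call(history)
        call = reply.get("function_call")
        if call is None:
            return reply["text"]  # model answered directly; loop ends
        result = tools[call["name"]](**call["args"])
        history.append({"role": "model", "function_call": call})
        history.append({"role": "tool", "name": call["name"], "result": result})
    return "Stopped: too many steps"

# Stub model: requests the weather once, then answers from the result
def fake_model(history):
    if history[-1]["role"] == "user":
        return {"function_call": {"name": "get_weather", "args": {"city": "Tokyo"}}}
    temp = history[-1]["result"]["temp"]
    return {"text": f"It's {temp}°C in Tokyo."}

tools = {"get_weather": lambda city: {"city": city, "temp": 25}}
print(agent_loop(fake_model, tools, "Weather in Tokyo?"))  # It's 25°C in Tokyo.
```

Swap `fake_model` for a real API call and `tools` for real functions, and you have the skeleton that ADK and every other framework automate for you.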
3. Your First Tool with ADK
Now let's see how the Google Agent Development Kit (ADK) eliminates the boilerplate. With ADK, you just write normal Python functions:
```python
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Gets the current weather for a given city.

    Args:
        city: The city name, e.g., 'Tokyo' or 'New York'
        unit: Temperature unit, either 'celsius' or 'fahrenheit'

    Returns:
        A dictionary with temperature and conditions.
    """
    # Your actual implementation
    return {"city": city, "temp": 25, "unit": unit, "condition": "Sunny"}
```

ADK reads your type hints and docstring, then automatically generates the JSON schema. No manual schema writing!
The Simplest Agent
Here's a complete working agent:
```python
from google.adk.agents import Agent

# Create an Agent with your function as a tool
weather_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful weather assistant.",
    tools=[get_weather]  # Just pass the function!
)
```

That's it. A few lines to create a tool-using agent.
Testing with adk web
The fastest way to test your agent is with ADK's built-in dev UI:
1. Create your project structure:
```
my_agent/
    __init__.py
    agent.py
```

2. Define your agent in agent.py:
```python
# my_agent/agent.py
from google.adk.agents import Agent

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Gets the current weather for a given city."""
    return {"city": city, "temp": 25, "unit": unit, "condition": "Sunny"}

root_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful weather assistant. Use the get_weather tool when asked about weather.",
    tools=[get_weather]
)
```

3. Add the init file:
```python
# my_agent/__init__.py
from . import agent
```

4. Launch the dev UI:
```shell
cd parent_of_my_agent
adk web
```

5. Open http://localhost:8000 and chat with your agent!
Try: "What's the weather in Tokyo?"
You'll see the agent call your get_weather function and respond with the result. The adk web UI also shows you the function calls in the Events tab—perfect for debugging.
ADK converts your Python function signature into the JSON schema automatically:
- `city: str` → `{"type": "string"}`
- `unit: str = "celsius"` → optional parameter with a default
- Docstring → function description
- Docstring Args → parameter descriptions
No more writing schemas by hand!
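If you're curious what that conversion looks like, here is a simplified sketch using the standard `inspect` module. The real ADK implementation does more (docstring Args parsing, `Literal` handling, nested types), but the core idea is this:

```python
import inspect

# Minimal Python-annotation -> JSON-schema type mapping
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}

def make_declaration(fn) -> dict:
    """Derive a minimal function declaration from a signature and docstring."""
    sig = inspect.signature(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => required parameter
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Gets the current weather for a given city."""
    ...

decl = make_declaration(get_weather)
print(decl["parameters"]["required"])  # ['city']
```

The schema you hand-wrote in section 2 falls out of the signature automatically; that is the entire trick.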
4. Built-in Tools: Quick Wins
Before building custom tools, know that ADK provides powerful pre-built tools. These give you capabilities immediately, without writing code.
4.1 Google Search (Grounding)
Give your agent access to real-time web information:
```python
from google.adk.agents import Agent
from google.adk.tools import google_search

research_agent = Agent(
    name="research_agent",
    model="gemini-2.0-flash",
    instruction="You are a research assistant. Search the web to answer questions about current events.",
    tools=[google_search]  # Built-in Google Search
)
```

The model automatically decides when to search. Ask about recent news, stock prices, or current events—it just works.
Combining built-in and custom tools:
```python
from google.adk.agents import Agent
from google.adk.tools import google_search

def save_note(title: str, content: str) -> dict:
    """Saves important information for later reference."""
    print(f"Saving: {title}")
    return {"status": "saved", "title": title}

# Agent with both capabilities
research_agent = Agent(
    name="research_assistant",
    model="gemini-2.0-flash",
    instruction="Search for information and save important findings.",
    tools=[google_search, save_note]  # Mix and match!
)
```

4.2 Code Execution
Let the model write and run Python code:
```python
from google.adk.agents import Agent
from google.adk.tools import built_in_code_execution

coding_agent = Agent(
    name="coding_agent",
    model="gemini-2.0-flash",
    instruction="You can write and run Python code to solve problems.",
    tools=[built_in_code_execution]
)
```

Ask "Calculate compound interest on $10,000 at 5% for 10 years" and the agent will write Python, execute it, and return the result.
4.3 What Else?
The ecosystem is expanding rapidly:
- Anthropic Computer Use: Control desktop GUIs (clicking, typing)
- Browser Use: Navigate websites, fill forms
- MCP Servers: Connect to databases, APIs, file systems (covered in Chapter 7)
The same function-calling pattern underlies them all.
5. Designing Effective Tools
Here's a truth that surprises most engineers: the quality of your tool descriptions matters more than the quality of your tool code. The model can only use tools it understands.
Anthropic's research team improved Claude's tool use accuracy by 40%+ just by improving tool descriptions. Let's break down what makes a tool great.
5.1 The Golden Rule: Think Like a New Hire
Imagine hiring a brilliant new engineer. They're smart, fast, and eager—but they don't know your codebase, your domain, or your conventions. Would you hand them a function called query(q: str) and expect them to use it correctly?
Of course not. You'd explain when to use it, what format the input should be, and what comes back. The model is exactly this new hire—infinitely capable, but dependent on your documentation.
Bad:
```python
def query(q: str) -> list:
    """Runs a query."""
    ...
```

Good:
```python
def search_customers(
    query: str,
    status: str = "active",
    limit: int = 10
) -> list[dict]:
    """Searches the customer database by name or email.

    Use this tool when the user asks about specific customers,
    their orders, or account status. Do NOT use for aggregate
    statistics—use get_customer_analytics instead.

    Args:
        query: Search term. Matches against customer name or email.
            Example: "john" or "john@example.com"
        status: Filter by account status. One of: "active", "inactive", "all"
        limit: Maximum results to return. Default 10, max 100.

    Returns:
        List of customer objects with keys: id, name, email, status, created_at
    """
    ...
```

The good version tells the model:
- When to use the tool (and when not to)
- What each parameter means with concrete examples
- What to expect in the response
This pattern—explicit context, concrete examples, clear boundaries—applies to every tool you write.
5.2 Schema Best Practices
Beyond descriptions, the structure of your parameters matters. The model uses type hints to constrain its outputs, and clear naming to decide which arguments to fill.
| Principle | Bad | Good |
|---|---|---|
| Use enums for fixed values | status: str | status: Literal["active", "inactive", "pending"] |
| Provide examples | "The date" | "The date in YYYY-MM-DD format, e.g., '2026-01-15'" |
| Name unambiguously | id: str | customer_id: str |
| Mark required vs optional | Implicit | Explicit in docstring |
These small changes dramatically reduce tool-calling errors. The model can match Literal["active", "inactive", "pending"] exactly, but will often hallucinate invalid values for an unconstrained str.
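Here is what the constrained version looks like in practice. The tool below is hypothetical; the point is that `Literal` becomes an enum in the generated schema, so the model can't invent a fourth status:

```python
from typing import Literal

def update_customer_status(
    customer_id: str,
    status: Literal["active", "inactive", "pending"],
) -> dict:
    """Updates a customer's account status.

    Args:
        customer_id: The customer's ID, e.g., "cust_12345".
        status: The new status. One of: "active", "inactive", "pending".
    """
    allowed = ("active", "inactive", "pending")
    if status not in allowed:  # defensive runtime check; Literal is not enforced at runtime
        return {"error": True, "message": f"status must be one of {allowed}"}
    return {"customer_id": customer_id, "status": status, "updated": True}

print(update_customer_status("cust_12345", "active"))
```

Note the belt-and-suspenders check inside the body: `Literal` constrains the schema the model sees, but Python does not enforce it at runtime, so the tool still validates its input.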
5.3 Fewer Tools Are Often Better
With great power comes great responsibility—and great temptation. Once you realize how easy it is to add tools, you'll want to wrap every API endpoint. Resist this urge.
Every tool you add is another option the model must evaluate. More options mean more opportunities for wrong choices. Anthropic's engineering team found that agents with 5-10 focused tools consistently outperform those with 20+ granular tools.
"More tools don't always lead to better outcomes. A common error is tools that merely wrap existing API endpoints."
Don't create a tool for every API endpoint. Create tools for workflows:
```python
# ❌ Too granular (3 tools, 3 calls needed)
def list_users() -> list: ...
def list_events() -> list: ...
def create_event(user_id: str, event: dict) -> dict: ...

# ✅ Workflow-oriented (1 tool, 1 call)
def schedule_meeting(
    attendee_names: list[str],
    topic: str,
    preferred_time: str
) -> dict:
    """Finds attendees by name, checks availability, and creates a calendar event."""
    ...
```

The workflow-oriented tool handles the complexity internally. The model just says what it wants; your code figures out how.
5.4 Return Meaningful Context
What your tool returns is just as important as how it's called. The model will use the return value to decide what to do next—or to answer the user directly. Help it succeed.
```python
# ❌ The model can't use this easily
{"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}

# ✅ Human-readable AND machine-usable
{
    "customer_id": "cust_12345",
    "name": "Jane Smith",
    "summary": "Premium customer since 2023, 47 orders"
}
```

Notice the summary field. The model can quote it directly to the user without needing to interpret raw data. This is progressive disclosure—enough detail to act, enough simplicity to communicate.
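A tool can assemble that summary itself from raw fields. A sketch with invented field names:

```python
def format_customer(record: dict) -> dict:
    """Wrap a raw DB record (hypothetical fields) with a model-friendly summary."""
    summary = (
        f"{record['tier'].title()} customer since {record['since']}, "
        f"{record['order_count']} orders"
    )
    return {
        "customer_id": record["id"],
        "name": record["name"],
        "summary": summary,  # the model can quote this verbatim
    }

raw = {"id": "cust_12345", "name": "Jane Smith", "tier": "premium",
       "since": 2023, "order_count": 47}
print(format_customer(raw)["summary"])  # Premium customer since 2023, 47 orders
```

A few lines of formatting inside the tool saves the model a reasoning step on every call.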
5.5 Error Handling That Teaches
Tools fail. APIs time out. Users ask for data that doesn't exist. When this happens, your error response becomes the model's learning signal.
When tools fail, help the model self-correct:
```python
# ❌ Unhelpful
raise ValueError("Invalid input")

# ✅ Actionable
return {
    "error": True,
    "message": "Customer not found with email 'jon@example.com'. "
               "Did you mean 'john@example.com'? "
               "Try searching by name instead."
}
```

The model reads this, learns, and tries a different approach.
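You can enforce this convention across every tool with a small wrapper that converts exceptions into structured, actionable results. This decorator is a sketch, not part of ADK; `days_until` is a hypothetical tool used to demonstrate it:

```python
import functools
from datetime import date, datetime

def teaching_errors(fn):
    """Turn raised exceptions into structured results the model can act on."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as e:
            return {
                "error": True,
                "message": f"{fn.__name__} failed: {e}. "
                           "Check the argument formats in the tool description "
                           "and try again with corrected values.",
            }
    return wrapper

@teaching_errors
def days_until(date_str: str) -> dict:
    """Days from 2026-01-01 to the given YYYY-MM-DD date (demo baseline date)."""
    target = datetime.strptime(date_str, "%Y-%m-%d").date()
    return {"days": (target - date(2026, 1, 1)).days}

print(days_until("2026-01-31"))  # {'days': 30}
print(days_until("Jan 31"))      # {'error': True, 'message': 'days_until failed: ...'}
```

The agent loop never crashes; every failure comes back as data the model can read and recover from.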
- Write for the new hire — Explain when to use, when not to, with examples
- Constrain the schema — Enums, examples, clear naming
- Fewer is better — Workflow-oriented tools over granular endpoints
- Return context — Human-readable summaries alongside machine data
- Errors that teach — Guide the model toward recovery
6. Advanced Tool Patterns
6.1 Parallel Tool Calling
Modern models can call multiple tools simultaneously:
# User: "What's the weather in Tokyo and the stock price of GOOG?"
# Model generates TWO function calls at once:
[
{"name": "get_weather", "args": {"city": "Tokyo"}},
{"name": "get_stock_price", "args": {"symbol": "GOOG"}}
]ADK executes these in parallel automatically, cutting latency in half.
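If you execute tool calls yourself rather than through ADK, `asyncio.gather` gets you the same latency win. A sketch with stubbed async tools standing in for real network calls:

```python
import asyncio

# Stubbed async tools; real versions would await HTTP calls
async def get_weather(city: str) -> dict:
    await asyncio.sleep(0.1)  # simulate network latency
    return {"city": city, "temp": 25}

async def get_stock_price(symbol: str) -> dict:
    await asyncio.sleep(0.1)
    return {"symbol": symbol, "price": 180.0}

TOOLS = {"get_weather": get_weather, "get_stock_price": get_stock_price}

async def run_parallel(calls: list[dict]) -> list[dict]:
    """Execute all requested tool calls concurrently, preserving order."""
    tasks = [TOOLS[c["name"]](**c["args"]) for c in calls]
    return await asyncio.gather(*tasks)

calls = [
    {"name": "get_weather", "args": {"city": "Tokyo"}},
    {"name": "get_stock_price", "args": {"symbol": "GOOG"}},
]
results = asyncio.run(run_parallel(calls))  # ~0.1s total instead of ~0.2s sequential
print(results)
```

`gather` returns results in the same order as the requests, which makes it easy to pair each result back with its originating function call.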
6.2 Sequential (Compositional) Tool Calling
Some tasks require chaining tools:
```
User: "If it's warmer than 20°C in London, set my thermostat to 20°C"

Step 1: get_weather(city="London") → {"temp": 25}
Step 2: Model reasons: 25 > 20, so I should set thermostat
Step 3: set_thermostat(temperature=20) → {"status": "success"}
Step 4: Model: "It's 25°C in London, so I've set your thermostat to 20°C"
```

ADK handles this automatically—the agent loop continues until the task is complete.
6.3 Function Calling Modes
You can force or disable tool use:
```python
from google.adk.agents import RunConfig
from google.genai import types

# Force the model to use a tool (no plain text response)
run_config = RunConfig(
    tool_config=types.ToolConfig(
        function_calling_config=types.FunctionCallingConfig(
            mode="ANY",  # Must use a tool
            allowed_function_names=["search_customers"]  # Only this tool
        )
    )
)
```

Modes:
- AUTO (default): Model decides
- ANY: Must use a tool
- NONE: Disable all tools
6.4 The Bash Meta-Tool
Here's an insight that separates toy agents from production ones:
Instead of building 100 specific tools, give the agent one universal tool: bash.

An agent that can run shell commands can:

- Read and write files
- Make HTTP requests (`curl`)
- Process data (`jq`, `awk`, `grep`)
- Manage git repositories
- Do almost anything a developer can do

This is the secret behind Devin and Claude Code.
```python
import subprocess

def run_bash(command: str, timeout: int = 30) -> dict:
    """Executes a bash command and returns the output.

    Use this for file operations, git commands, running scripts,
    or any task that can be accomplished via command line.

    Args:
        command: The bash command to execute.
        timeout: Maximum seconds to wait. Default 30.

    Returns:
        {"stdout": "...", "stderr": "...", "return_code": 0}

    Examples:
        - "ls -la /project" - list files
        - "cat README.md" - read a file
        - "git status" - check git state
    """
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
        return {
            "stdout": result.stdout,
            "stderr": result.stderr,
            "return_code": result.returncode
        }
    except subprocess.TimeoutExpired:
        return {"error": f"Command timed out after {timeout} seconds"}
```

Never run agent-generated bash commands on your production server. Use sandboxed environments:
- E2B (e2b.dev): Cloud sandboxes with full OS access
- Docker: Containerized execution with resource limits
- Modal/Fly.io: Ephemeral VMs for untrusted code
The agent writes the command; your sandbox runs it.
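As a sketch of the Docker option (the image name and resource limits are illustrative, and Docker must be installed), the agent's command runs inside a disposable container with no network access and capped resources:

```python
import subprocess

def build_docker_cmd(command: str) -> list[str]:
    """Build the docker argv for a locked-down, throwaway container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",    # no network access
        "--memory", "256m",     # cap memory
        "--cpus", "0.5",        # cap CPU
        "python:3.12-slim",     # example image; pick one with your tooling
        "bash", "-c", command,
    ]

def run_bash_sandboxed(command: str, timeout: int = 30) -> dict:
    """Run an agent-generated command inside a disposable Docker container."""
    try:
        result = subprocess.run(
            build_docker_cmd(command),
            capture_output=True, text=True, timeout=timeout,
        )
        return {"stdout": result.stdout, "stderr": result.stderr,
                "return_code": result.returncode}
    except subprocess.TimeoutExpired:
        return {"error": f"Command timed out after {timeout} seconds"}

print(build_docker_cmd("ls -la")[:3])  # ['docker', 'run', '--rm']
```

Because `--rm` discards the container after each call, every command starts from a clean slate; persist anything important via the tool's return value, not the container filesystem.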
7. 🔨 Project: Multi-Tool Agent
Let's build an agent that can answer: "How many days until the next US Presidential Election?"
This requires three capabilities:
- Know the current date (system time)
- Find when the election is (web search)
- Calculate the difference (math)
Create the agent structure:
```shell
mkdir -p election_agent && touch election_agent/__init__.py
```

Now create election_agent/agent.py:
```python
from datetime import datetime
from google.adk.agents import Agent
from google.adk.tools import google_search

# Tool 1: System Time
def get_current_date() -> dict:
    """Returns the current date and time.

    Use this to know what "today" or "now" means.
    """
    now = datetime.now()
    return {
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H:%M:%S"),
        "day_of_week": now.strftime("%A")
    }

# Tool 2: Days Calculator
def days_between(start_date: str, end_date: str) -> dict:
    """Calculates the number of days between two dates.

    Args:
        start_date: Start date in YYYY-MM-DD format
        end_date: End date in YYYY-MM-DD format
    """
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d").date()
        end = datetime.strptime(end_date, "%Y-%m-%d").date()
        delta = (end - start).days
        return {"days": delta, "start": start_date, "end": end_date}
    except ValueError as e:
        return {"error": f"Invalid date format: {e}"}

# Create the agent
root_agent = Agent(
    name="election_countdown",
    model="gemini-2.0-flash",
    instruction="""You help with date-related questions.

    1. Use get_current_date to know today's date
    2. Use Google Search to find event dates
    3. Use days_between to calculate differences

    Show your work.""",
    tools=[get_current_date, days_between, google_search]
)
```

Run with adk web:
```shell
adk web  # run from the directory containing election_agent/
```

Then ask: "How many days until the next US Presidential Election?"
What happens in the browser:
- Agent calls `get_current_date()` → learns today's date
- Agent uses Google Search → finds the election date (November 7, 2028)
- Agent calls `days_between("2026-01-08", "2028-11-07")` → gets the countdown
- Agent responds with the final answer
Watch the "Function Calls" panel in adk web to see each step as it happens.
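One more benefit of ADK's design: tools are plain functions, so you can sanity-check them with no model in the loop. The tool is redefined here so the snippet is self-contained:

```python
from datetime import datetime

def days_between(start_date: str, end_date: str) -> dict:
    """Days between two YYYY-MM-DD dates (copy of the agent's tool)."""
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d").date()
        end = datetime.strptime(end_date, "%Y-%m-%d").date()
        return {"days": (end - start).days, "start": start_date, "end": end_date}
    except ValueError as e:
        return {"error": f"Invalid date format: {e}"}

# Happy path: 2028 is a leap year, so February contributes 29 days
assert days_between("2028-02-01", "2028-03-01") == {
    "days": 29, "start": "2028-02-01", "end": "2028-03-01"
}
# Error path: the tool returns a teachable error instead of raising
assert "error" in days_between("Feb 1", "2028-03-01")
print("tool checks passed")
```

Running these checks before wiring the function into an agent catches bugs where they are cheapest to fix, long before a model starts calling the tool.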
This project demonstrates:
- Multi-tool orchestration: Agent decides which tools to use and in what order
- Built-in + custom tools: Google Search alongside your custom functions
- Iterative development: `adk web` for rapid prototyping and debugging
Summary
Function calling transforms an LLM from a text generator into an agent. You learned:
- The Mechanism: The 5-step loop (Define → Invoke → Execute → Result → Respond)
- ADK Simplicity: Type hints + docstrings = automatic schemas
- Built-in Tools: Google Search and Code Execution for quick wins
- Tool Design: The 5 pillars—new hire mindset, constrained schemas, fewer tools, meaningful returns, teaching errors
- Advanced Patterns: Parallel calls, sequential chains, the Bash meta-tool
The quality of your tools directly determines the quality of your agent. Invest time in clear descriptions and thoughtful schemas.
References
- Writing Effective Tools for Agents — Anthropic Engineering
- Function Calling with the Gemini API — Google AI
- Google ADK — Quickstart
- Google ADK — Custom Tools
- Tool Use Overview — Anthropic Docs
Next Chapter: We'll explore MCP (Model Context Protocol)—the universal standard for connecting AI agents to tools, enabling you to write a tool once and use it everywhere.