Friday, December 19, 2025

Tool Calling in AI: Turning Language Models into Action-Driven Agents

Large language models (LLMs) have transformed how humans interact with machines. However, the real breakthrough comes when these models stop being passive responders and start becoming active problem solvers. This transition is powered by a capability known as tool calling.

In this blog, we will explore what tool calling is, why it matters, how it works internally, and how it enables agentic AI systems capable of real-world action.

What Is Tool Calling?

At its core, tool calling refers to an AI model’s ability to interact with external tools, APIs, databases, or systems to extend its native capabilities.

Traditional LLMs operate purely on pretrained knowledge. They generate answers based on patterns learned during training. But this approach has a hard limit:

they cannot access real-time data, perform live computations, or take direct actions.

Tool calling removes this limitation.

With tool calling enabled, an AI system can:

  • Query live databases

  • Fetch real-time information (weather, stock prices, system status)

  • Execute functions or scripts

  • Trigger workflows and automation

  • Interact with enterprise systems

This capability is sometimes called function calling, and it is one of the foundational pillars of agentic AI.

Instead of merely answering questions, LLMs with tool calling can decide, act, and iterate—much like a digital agent.

NOTE: An agent is a system with an LLM at its core that is able to make decisions about what actions to take as it works to answer the prompt it received. The most common actions LLM agents can be built to take are: sending text or other media to the user, calling a tool to help answer the user, and calling another agent to help answer the user. Generally speaking, an LLM agent will also have a system prompt explaining what its role is and giving it rules about when to call tools and/or reply to the user. For most agents, the control flow follows the loop described in the Agent Workflow section below.



Why Is Tool Calling Important?

Limits of static knowledge: Even the most advanced LLMs are constrained by:

  • Training data cutoffs

  • Lack of real-time awareness

  • Inability to perform live computations

  • No direct access to user-specific systems

Early models such as GPT-2 were entirely static. They produced impressive text but had no concept of "now." Ask them about today's weather or current stock prices, and they simply could not answer accurately.

The Need for Real-World Interaction

As AI moved into production systems—finance, healthcare, DevOps, customer support—the need for:

  • Live data

  • External computation

  • User-specific actions

became unavoidable.

This led to the introduction of tool calling, where models are trained to:

  1. Recognize when external help is needed

  2. Select the correct tool

  3. Generate structured requests

  4. Interpret structured responses

Critically, tools often expect strict input schemas, not free-form text. Tool calling ensures model outputs conform to these schemas, making AI-system integration reliable and safe.
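
As a quick illustration (the get_weather tool name and its fields are invented for this example), the difference between free-form text and a schema-conforming call looks like this:

# Free-form text a tool cannot consume:
#   "Could you please fetch the weather for San Francisco in celsius?"

# Schema-conforming call the model is trained to emit instead:
tool_call = {
    "name": "get_weather",
    "arguments": {"city": "San Francisco", "units": "celsius"}  # matches the tool's declared schema
}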

How Does Tool Calling Work?

Modern LLMs such as Claude, Llama 3, Mistral, and IBM Granite all support tool calling, though implementation details vary.

At a high level, the process involves six steps.

Step 1: Recognizing the Need for a Tool

Imagine a user asks:

“What’s the weather in San Francisco right now?”

The model immediately understands:

  • This requires real-time data

  • The answer cannot come from its static training set

At this point, the model decides to invoke a tool.
A unique tool call ID is generated to track the request and its eventual response.

Step 2: Selecting the Right Tool

Next, the model chooses the most appropriate tool—perhaps a weather API.

Each tool is described using metadata, including:

  • Tool (or function) name

  • Description

  • Input parameters

  • Input and output data types

This metadata allows the model to reason about:

  • Which tool to use

  • What arguments it must provide

Tool selection is not random—it is a learned decision based on context.

Step 3: Preparing the Arguments (Args)

Once the tool is selected, the model constructs structured arguments (often called args).

For example:

  • City: San Francisco

  • Units: Celsius

  • Timestamp: current

These arguments must strictly match the tool’s expected schema.

To ensure consistency, developers often use templates or structured prompts that guide the model on:

  • Which tool to call

  • What arguments to pass

This is where tool calling differs from free-form prompting—it is contract-driven.
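
Because the call is contract-driven, many applications validate the model-generated arguments against the tool's schema before executing anything. A minimal sketch using the jsonschema package (the weather schema here is a stand-in for your real tool definition):

# pip install jsonschema
from jsonschema import validate, ValidationError

weather_args_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

args = {"city": "San Francisco", "units": "celsius"}  # arguments produced by the model

try:
    validate(instance=args, schema=weather_args_schema)  # raises if the args break the contract
except ValidationError as e:
    print(f"Rejecting tool call: {e.message}")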

Tool Calling + RAG: A Powerful Combination

Tool calling becomes even more effective when combined with Retrieval Augmented Generation (RAG).

With RAG:

  • The model retrieves relevant structured and unstructured data

  • Then uses that data to generate a grounded response

Benefits include:

  • Higher contextual accuracy

  • Reduced hallucinations

  • Lower API overhead

  • Greater flexibility across domains

Unlike rigid tool calls, RAG allows more fluid reasoning by blending retrieved knowledge with generation.
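
One straightforward way to combine the two is to expose retrieval itself as a tool, so the model decides when to ground an answer in your document store. A rough sketch (search_documents and the vector-store lookup are placeholders for your own retrieval stack):

rag_tool = {
    "type": "function",
    "function": {
        "name": "search_documents",
        "description": "Retrieve passages relevant to a query from the internal knowledge base",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Natural-language search query"},
                "top_k": {"type": "integer", "description": "Number of passages to return"}
            },
            "required": ["query"]
        }
    }
}

def search_documents(query, top_k=3):
    # Replace with a real vector-store lookup (FAISS, Pinecone, Weaviate, ...)
    passages = ["<retrieved passage 1>", "<retrieved passage 2>", "<retrieved passage 3>"]
    return {"query": query, "passages": passages[:top_k]}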

Step 4: Making the API Call

Each tool is backed by an API, documented via:

  • Endpoints

  • HTTP methods

  • Request/response formats

Many APIs require authentication via an API key.

Once arguments are prepared, the model (or orchestration layer) sends an HTTP request to the external system.

Step 5: Receiving and Processing the Response

The external tool returns structured data—commonly in JSON format.

For a weather API, this might include:

  • Temperature

  • Humidity

  • Wind speed

The AI then:

  • Parses the response

  • Filters relevant fields

  • Transforms raw data into a human-friendly explanation (see the sketch after this list)
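
Putting Steps 4 and 5 together, the orchestration layer typically makes the HTTP call and trims the JSON before handing it back to the model. A rough sketch against a hypothetical weather endpoint (the URL, query parameters, and field names are invented):

import requests  # pip install requests

def fetch_weather(city, api_key):
    # Step 4: call the external API with the model-supplied arguments
    resp = requests.get(
        "https://api.example-weather.com/v1/current",  # hypothetical endpoint
        params={"city": city, "key": api_key},
        timeout=5,
    )
    resp.raise_for_status()
    data = resp.json()

    # Step 5: keep only the fields the model needs to answer the user
    return {
        "temperature_c": data.get("temp_c"),
        "humidity": data.get("humidity"),
        "wind_kph": data.get("wind_kph"),
    }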

Step 6: Acting or Responding

Finally, the AI either:

  • Presents the information to the user, or

  • Confirms an action (e.g., “Your reminder has been scheduled.”)

If the user asks follow-up questions, the model can repeat the cycle with refined parameters—enabling iterative reasoning.


How Do LLMs Call Tools?

For an LLM (Large Language Model) to call a tool, it needs a structured way to specify which tool it wants to use and what arguments to pass. Since an LLM outputs plain text tokens, an external system must parse this output and execute the tool call. This means the LLM should produce structured or semi-structured data consistently.

Different APIs implement this differently, but the concept is the same across platforms. Let’s look at how the OpenAI Chat API handles this.

When using the OpenAI Chat API, you provide a list of tools the LLM can access. Each tool is defined with:

  • Name of the tool
  • Description of what it does
  • Parameters (including type, description, and whether they are required)

Here’s an example tool definition:

{
    "type": "function",
    "function": {
        "name": "calculate_distance",
        "description": "Calculate the distance between two cities",
        "parameters": {
            "type": "object",
            "properties": {
                "city_a": {
                    "type": "string",
                    "description": "Name of the first city, e.g., New York"
                },
                "city_b": {
                    "type": "string",
                    "description": "Name of the second city, e.g., Los Angeles"
                },
                "unit": {
                    "type": "string",
                    "enum": ["kilometers", "miles"]
                }
            },
            "required": ["city_a", "city_b"]
        }
    }
}

This JSON would be included in the API call so the LLM knows it can use calculate_distance. If you don’t include it, the LLM won’t know the tool exists.

How Does the LLM Decide to Call a Tool?

When the LLM responds, you check the tool_calls property in the response. For example, in Python:

response_message = response.choices[0].message
tool_calls = response_message.tool_calls

tool_calls will contain an array of tools the LLM wants to invoke, along with the arguments. Your system then executes the corresponding function or method with those arguments. This approach allows the LLM to reason about when to use a tool and provide structured arguments, while your application handles the actual execution.
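
To complete the round trip, your code iterates over tool_calls, parses the JSON arguments, and dispatches to the matching local function. A minimal sketch, assuming calculate_distance is implemented in your application:

import json

for call in tool_calls or []:
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    if call.function.name == "calculate_distance":
        result = calculate_distance(**args)  # your own implementation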


The agent workflow proceeds as follows:

  1. Receive the Query
    The agent gets a natural-language request or task from the user or an external system.

  2. Discover Available Tools
    It looks up internal metadata or a tool registry to find relevant tools, schemas, and capabilities.

  3. Select and Invoke the Right Tool
    The LLM processes the query along with tool metadata (such as function names, input types, and descriptions).
    It chooses the most appropriate tool, prepares the input arguments, and generates a structured function call.

  4. Execute the Tool
    The agent shell or tool runner runs the selected function and retrieves the output (e.g., API response, database value, or computation result).

  5. Return the Final Response
    The LLM incorporates the tool’s result into its prompt and produces a natural-language answer for the user (a compact sketch of this loop appears below).
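
A compact sketch of this loop using the OpenAI Chat API, with the tool registry and dispatch table reduced to plain Python structures (names are illustrative, not a specific framework):

import json

def run_agent(client, query, tools, tool_impls, model="gpt-4o-mini"):
    # 1. Receive the query
    messages = [{"role": "user", "content": query}]

    # 2-3. The model sees the tool metadata and may emit structured tool calls
    response = client.chat.completions.create(
        model=model, messages=messages, tools=tools, tool_choice="auto"
    )
    msg = response.choices[0].message
    if not msg.tool_calls:
        return msg.content  # no tool needed, answer directly

    # 4. Execute each requested tool and feed the results back
    messages.append(msg)
    for call in msg.tool_calls:
        fn = tool_impls[call.function.name]
        result = fn(**json.loads(call.function.arguments))
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": json.dumps(result)})

    # 5. Let the model turn the tool outputs into a natural-language answer
    final = client.chat.completions.create(model=model, messages=messages)
    return final.choices[0].message.content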

    =====================================

    Key Capabilities:

  • Dynamic Tool Selection
    Automatically picks the right tool based on the context of the task.

  • Schema-Aware Prompting
    Supports structured interfaces like OpenAPI, JSON Schema, and AWS function definitions for precise interactions.

  • Intelligent Output Handling
    Interprets results and chains outputs into logical reasoning for complex workflows.

  • Flexible Execution Modes
    Works in both stateless and session-aware environments.


Common Use Cases:

  • Virtual Assistants with External Data Access
    Enhance assistants by connecting them to APIs and real-time data sources.

  • Financial Calculators and Estimators
    Perform dynamic computations and provide accurate projections.

  • API-Driven Knowledge Workers
    Automate tasks that require pulling and processing data from multiple services.

  • LLM-Powered Integrations
    Invoke AWS Lambda, Amazon SageMaker endpoints, and SaaS tools for advanced functionality.

==================================================

LangChain and Tool Calling

LangChain is one of the most widely used frameworks for implementing tool calling.

It provides:

  • Tool registration

  • Argument parsing

  • Context-aware routing

  • Memory across multiple interactions

Unlike basic tool calling, LangChain can:

  • Chain multiple tools together

  • Store previous tool outputs

  • Enable complex, multi-step agent workflows

For example:

  1. Call a weather API

  2. Use results to trigger a clothing recommendation tool

  3. Generate a final personalized response

This is a practical implementation of agentic AI.
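
As a minimal sketch of the first step in that chain, LangChain's @tool decorator and bind_tools make the tool metadata available to the model (the weather values are stubbed, and the langchain-openai package plus an API key are assumed):

# pip install langchain-core langchain-openai
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> dict:
    """Return the current weather for a city."""
    # Stubbed result; swap in a real weather API call
    return {"city": city, "temp_c": 14, "condition": "foggy"}

llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([get_weather])
msg = llm.invoke("What should I wear in San Francisco today?")

# msg.tool_calls holds the structured calls the model wants to make, e.g.
# [{"name": "get_weather", "args": {"city": "San Francisco"}, "id": "..."}]
for call in msg.tool_calls:
    print(call["name"], call["args"])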

Common Types of Tool Calling Use Cases

While possibilities are endless, most applications fall into a few major categories.

1. Information Retrieval and Search

AI pulls real-time data from:

  • Web search engines

  • Financial markets

  • Academic databases

  • News sources

Example: Fetching live stock prices or breaking news inside a chatbot.

2. Code Execution and Computation

AI executes:

  • Mathematical calculations

  • Simulations

  • Scripts via Python or engines like Wolfram Alpha

Useful for analytics, engineering, and scientific domains.

3. Process Automation

AI automates workflows by integrating with:

  • Calendars

  • Email systems

  • CRM tools (Salesforce)

  • Finance platforms (QuickBooks)

This enables AI-driven business operations.

4. Smart Devices and IoT Control

Agentic systems can monitor and control:

  • Smart homes

  • Industrial sensors

  • Robotics platforms

This opens the door to fully autonomous, end-to-end workflows.


Final Thoughts

Tool calling is not just a feature; it is a paradigm shift.

It allows LLMs to:

  • Know when they don’t know

  • Reach outside themselves

  • Act in the real world

  • Continuously refine outcomes

As AI systems evolve, tool calling will be the foundation that turns language models into true digital agents—capable of reasoning, acting, and collaborating across complex environments. If language is intelligence, tool calling is agency.

--------------------------------------BACKUP INFO-------------------------------

Sample code:

Below is a complete Python example that shows:

  1. Defining tools
  2. Making a chat completion request
  3. Reading tool_calls and parsing JSON arguments
  4. Executing your local functions
  5. Returning tool outputs back to the model for a final answer
# pip install openai
from openai import OpenAI
import json

# Initialize client
client = OpenAI(api_key="YOUR_API_KEY")

# Define one tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_city_coordinates",
            "description": "Return approximate latitude and longitude for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

# Local function to execute
def get_city_coordinates(city):
    # Hardcoded example
    coords = {
        "New York": (40.7128, -74.0060),
        "Los Angeles": (34.0522, -118.2437),
        "Bangalore": (12.9716, 77.5946)
    }
    lat, lon = coords.get(city, (None, None))
    return {"city": city, "latitude": lat, "longitude": lon}

# User message
messages = [{"role": "user", "content": "Give me the coordinates of Bangalore"}]

# First call: model decides if it needs the tool
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # Use a tool-capable model
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

# Check if model requested a tool
tool_calls = resp.choices[0].message.tool_calls or []
tool_msgs = []
for call in tool_calls:
    # Parse JSON arguments
    args = json.loads(call.function.arguments)
    result = get_city_coordinates(args["city"])
    # Send tool result back to model
    tool_msgs.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result)
    })

# Second call: model uses tool result to finish answer
final = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages + [resp.choices[0].message] + tool_msgs
)

print(final.choices[0].message.content)

How It Works

  1. tools: Defines the schema so the model knows what arguments to provide.
  2. First API call: Model decides if it needs the tool and returns tool_calls.
  3. Parse arguments: call.function.arguments is a JSON string → json.loads().
  4. Execute local function: get_city_coordinates(city).
  5. Send result back: Add a message with role="tool" and tool_call_id.
  6. Second API call: Model uses the tool output to generate the final answer.
============================
EXAMPLE 2: How to add functions?


# pip install openai
from openai import OpenAI
import json

client = OpenAI(api_key="YOUR_API_KEY")

# 1) Define your tool schema. The model uses this to learn how to call your functions.
tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate_distance",
            "description": "Calculate great-circle distance between two cities",
            "parameters": {
                "type": "object",
                "properties": {
                    "city_a": {"type": "string", "description": "First city, e.g., New York"},
                    "city_b": {"type": "string", "description": "Second city, e.g., Los Angeles"},
                    "unit": {
                        "type": "string",
                        "enum": ["kilometers", "miles"],
                        "description": "Distance unit"
                    }
                },
                "required": ["city_a", "city_b"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_city_coordinates",
            "description": "Return approximate latitude and longitude for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

# 2) Implement the actual Python functions the model can call.
# In production, replace these with real logic (DB lookup, API calls, etc.)
CITY_DB = {
    "New York": (40.7128, -74.0060),
    "Los Angeles": (34.0522, -118.2437),
    "San Francisco": (37.7749, -122.4194),
    "Bangalore": (12.9716, 77.5946),
}

from math import radians, sin, cos, sqrt, atan2

def get_city_coordinates(city: str):
    if city not in CITY_DB:
        raise ValueError(f"Unknown city: {city}")
    lat, lon = CITY_DB[city]
    return {"city": city, "latitude": lat, "longitude": lon}

def calculate_distance(city_a: str, city_b: str, unit: str = "kilometers"):
    # Get coords
    lat1, lon1 = CITY_DB.get(city_a, (None, None))
    lat2, lon2 = CITY_DB.get(city_b, (None, None))
    if lat1 is None or lat2 is None:
        raise ValueError("One or both cities unknown")

    # Haversine
    R_km = 6371.0
    R_mi = 3958.8

    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)

    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    d_km = R_km * c
    d_mi = R_mi * c

    if unit == "miles":
        return {"city_a": city_a, "city_b": city_b, "distance": round(d_mi, 2), "unit": "miles"}
    else:
        return {"city_a": city_a, "city_b": city_b, "distance": round(d_km, 2), "unit": "kilometers"}

# 3) Start the conversation. The user asks something that likely requires tools.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": "What is the distance between New York and Los Angeles in miles? Also share coordinates for Los Angeles."
    },
]

# 4) First model call with tools listed. The model may decide to call one or more tools.
first_response = client.chat.completions.create(
    model="gpt-4o-mini",   # Choose a tool-capable model
    messages=messages,
    tools=tools,
    tool_choice="auto"      # Let the model decide whether and which tools to call
)

# 5) Check if the model wants to call tools.
# This is where the JSON arguments live: tool_call.function.arguments is a JSON string.
tool_calls = first_response.choices[0].message.tool_calls or []

# If there are tool calls, execute them and gather results.
tool_results_messages = []
for call in tool_calls:
    tool_name = call.function.name
    # The arguments come as a JSON string. Parse it to a dict.
    args = json.loads(call.function.arguments)

    try:
        if tool_name == "calculate_distance":
            result = calculate_distance(
                city_a=args.get("city_a"),
                city_b=args.get("city_b"),
                unit=args.get("unit", "kilometers"),
            )
        elif tool_name == "get_city_coordinates":
            result = get_city_coordinates(args.get("city"))
        else:
            result = {"error": f"Unknown tool {tool_name}"}
    except Exception as e:
        result = {"error": str(e)}

    # Add a tool result message. The role must be "tool" and the tool_call_id must match.
    tool_results_messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result)
    })

# 6) Send the tool outputs back to the model to let it finish the answer.
final_messages = messages + [first_response.choices[0].message] + tool_results_messages


final_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=final_messages
)

print(final_response.choices[0].message.content)

Key points to notice:

  • Where do the JSON input args go?
    The model returns them in tool_calls[i].function.arguments as a JSON string. You must json.loads(...) that string to get a Python dict to call your function.

  • Returning tool outputs back to the model
    You send a new message with role="tool", include the tool_call_id from the original call, and put your tool’s output in content (commonly JSON).

  • Finalization step
    After you add the tool result messages, call the model again so it can synthesize a natural language answer using the tool outputs.

====================

Example 3: Minimal example showing a remote API call as the “execute locally” step

workflow: 

define tools → let the model decide → parse arguments → execute locally → return results → finalize answer

NOTE: “Execute locally” means execute in your code; where your code reaches out is your choice. Your local code might call an external system, make a request to a remote database or API, or enqueue a job to a worker or serverless function.

from openai import OpenAI
import json
import requests  # pip install requests

client = OpenAI(api_key="YOUR_API_KEY")

# Tool schema that asks the model to provide a username
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_user_profile",
            "description": "Fetch a user's profile from a remote service",
            "parameters": {
                "type": "object",
                "properties": {
                    "username": {"type": "string", "description": "Login name"}
                },
                "required": ["username"]
            }
        }
    }
]

# Local function that calls an external API
def get_user_profile(username):
    try:
        # External call. This is the execute locally step in your app.
        resp = requests.get(
            f"https://api.example.com/users/{username}",
            timeout=5
        )
        resp.raise_for_status()
        data = resp.json()
        # Always return a JSON-serializable object
        return {
            "username": data.get("username"),
            "full_name": data.get("full_name"),
            "email": data.get("email"),
            "status": "ok"
        }
    except requests.exceptions.Timeout:
        return {"error": "timeout", "status": "failed"}
    except requests.exceptions.HTTPError as e:
        return {"error": f"http {e.response.status_code}", "status": "failed"}
    except Exception as e:
        return {"error": str(e), "status": "failed"}

messages = [{"role": "user", "content": "Show the profile for user alice"}]

# First call: model decides to call the tool and provides JSON args
first = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

tool_calls = first.choices[0].message.tool_calls or []
tool_messages = []

for call in tool_calls:
    args = json.loads(call.function.arguments)
    result = get_user_profile(args["username"])
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result)
    })

# Second call: model reads tool outputs and finalizes the answer
final_messages = messages + [first.choices[0].message] + tool_messages

final = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=final_messages
)

print(final.choices[0].message.content)

Example 4: Multi-agent setup with orchestration. This demonstrates a two-agent workflow using tool calls:
  • Agent 1 (DataAgent) calculates the distance between two cities using its tools
  • Agent 2 (ReportAgent) formats the result using its own tool
  • An orchestrator glues the two agents together

# pip install openai
from openai import OpenAI
import json
from math import radians, sin, cos, sqrt, atan2

# Initialize OpenAI client
client = OpenAI(api_key="YOUR_API_KEY")

# ---------------------------------------------
# Shared data for tools
# ---------------------------------------------
CITY_DB = {
    "Bangalore": (12.9716, 77.5946),
    "Los Angeles": (34.0522, -118.2437),
    "New York": (40.7128, -74.0060),
    "San Francisco": (37.7749, -122.4194),
}

# ---------------------------------------------
# Local tool implementations for Agent 1
# ---------------------------------------------
def get_city_coordinates(city: str):
    if city not in CITY_DB:
        raise ValueError(f"Unknown city: {city}")
    lat, lon = CITY_DB[city]
    return {"city": city, "latitude": lat, "longitude": lon}

def calculate_distance(city_a: str, city_b: str, unit: str = "kilometers"):
    lat1, lon1 = CITY_DB.get(city_a, (None, None))
    lat2, lon2 = CITY_DB.get(city_b, (None, None))
    if lat1 is None or lat2 is None:
        raise ValueError("One or both cities unknown")

    # Haversine formula
    R_km = 6371.0
    R_mi = 3958.8

    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)

    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    d_km = R_km * c
    d_mi = R_mi * c

    if unit == "miles":
        return {"city_a": city_a, "city_b": city_b, "distance": round(d_mi, 2), "unit": "miles"}
    else:
        return {"city_a": city_a, "city_b": city_b, "distance": round(d_km, 2), "unit": "kilometers"}

# ---------------------------------------------
# Local tool implementations for Agent 2
# ---------------------------------------------
def format_summary(city_a: str, city_b: str, distance: float, unit: str):
    bullets = [
        f"Route: {city_a} to {city_b}",
        f"Distance: {distance} {unit}",
        "Method: Great circle approximation",
        "Use case: Travel planning and logistics"
    ]
    conclusion = f"In summary, the distance between {city_a} and {city_b} is {distance} {unit}."
    return {"bullets": bullets, "conclusion": conclusion}

# ---------------------------------------------
# Tool schemas
# ---------------------------------------------
DATA_AGENT_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_city_coordinates",
            "description": "Return latitude and longitude for a known city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate_distance",
            "description": "Calculate great circle distance between two cities",
            "parameters": {
                "type": "object",
                "properties": {
                    "city_a": {"type": "string"},
                    "city_b": {"type": "string"},
                    "unit": {"type": "string", "enum": ["kilometers", "miles"]}
                },
                "required": ["city_a", "city_b"]
            }
        }
    }
]

REPORT_AGENT_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "format_summary",
            "description": "Create a short bullet list and conclusion from computed distance",
            "parameters": {
                "type": "object",
                "properties": {
                    "city_a": {"type": "string"},
                    "city_b": {"type": "string"},
                    "distance": {"type": "number"},
                    "unit": {"type": "string"}
                },
                "required": ["city_a", "city_b", "distance", "unit"]
            }
        }
    }
]

# ---------------------------------------------
# Agent class: runs a single tool-capable turn
# ---------------------------------------------
class Agent:
    def __init__(self, name: str, system_prompt: str, tools: list, tool_impls: dict):
        self.name = name
        self.system_prompt = system_prompt
        self.tools = tools
        self.tool_impls = tool_impls

    def run_turn(self, user_or_context_messages: list, model: str = "gpt-4o-mini"):
        """
        Takes incoming messages, lets the model decide tool calls,
        executes tools locally, returns final assistant text and the tool results.
        """
        # First call: let model decide whether to call tools
        first = client.chat.completions.create(
            model=model,
            messages=[{"role": "system", "content": self.system_prompt}] + user_or_context_messages,
            tools=self.tools,
            tool_choice="auto"
        )

        assistant_msg = first.choices[0].message
        tool_calls = assistant_msg.tool_calls or []
        tool_result_messages = []
        collected_results = []  # Keep structured results for orchestration

        # Execute each requested tool
        for call in tool_calls:
            fn_name = call.function.name
            args = json.loads(call.function.arguments)

            try:
                fn = self.tool_impls.get(fn_name)
                if fn is None:
                    result = {"error": f"Unknown tool {fn_name}"}
                else:
                    result = fn(**args)
            except Exception as e:
                result = {"error": str(e)}

            # Store the structured result for the orchestrator
            collected_results.append({"name": fn_name, "args": args, "result": result})

            # Return result to the model using role="tool" with matching tool_call_id
            tool_result_messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result)
            })

        # Second call: model reads tool outputs and produces a final assistant message
        final_messages = (
            [{"role": "system", "content": self.system_prompt}]
            + user_or_context_messages
            + [assistant_msg]
            + tool_result_messages
        )
        final = client.chat.completions.create(model=model, messages=final_messages)
        final_text = final.choices[0].message.content

        return final_text, collected_results

# ---------------------------------------------
# Instantiate two agents
# ---------------------------------------------
data_agent = Agent(
    name="DataAgent",
    system_prompt=(
        "You are a precise data agent. Use tools to compute distances and coordinates. "
        "Answer concisely and include structured context if helpful."
    ),
    tools=DATA_AGENT_TOOLS,
    tool_impls={
        "get_city_coordinates": get_city_coordinates,
        "calculate_distance": calculate_distance
    }
)

report_agent = Agent(
    name="ReportAgent",
    system_prompt=(
        "You are a clear technical writer. Format results with helpful bullets and a concise conclusion. "
        "Prefer calling the formatting tool when raw values are given."
    ),
    tools=REPORT_AGENT_TOOLS,
    tool_impls={
        "format_summary": format_summary
    }
)

# ---------------------------------------------
# Orchestration: two agent workflow
# ---------------------------------------------
def run_two_agent_workflow(city_a: str, city_b: str, unit: str = "miles"):
    # Step 1: User asks for a report that requires computation
    user_message = {
        "role": "user",
        "content": f"Compute the distance between {city_a} and {city_b} in {unit}. Then present a short professional report."
    }

    # Step 2: DataAgent computes using its tools
    data_text, data_results = data_agent.run_turn([user_message])

    # Extract the distance result from DataAgent tool outputs
    # We search for calculate_distance result
    distance_payload = None
    for item in data_results:
        if item["name"] == "calculate_distance" and "result" in item:
            distance_payload = item["result"]
            break

    if distance_payload is None:
        # Fallback if model did not call the tool
        # In a production system you would retry or ask the model to call the tool explicitly
        raise RuntimeError("DataAgent did not produce a distance result")

    # Step 3: Prepare input for ReportAgent
    # We provide both the narrative from DataAgent and the structured JSON payload
    report_messages = [
        {"role": "user", "content": "Create a short professional report from the following computed data."},
        {"role": "user", "content": json.dumps(distance_payload)}
    ]

    # Step 4: ReportAgent formats the result using its tool
    report_text, _ = report_agent.run_turn(report_messages)

    # Step 5: Return final report to the caller
    return {
        "data_agent_answer": data_text,
        "report_agent_answer": report_text,
        "distance_payload": distance_payload
    }

# ---------------------------------------------
# Demo run
# ---------------------------------------------
if __name__ == "__main__":
    result = run_two_agent_workflow("Bangalore", "Los Angeles", unit="miles")
    print("\n=== DataAgent output ===\n")
    print(result["data_agent_answer"])
    print("\n=== ReportAgent final report ===\n")
    print(result["report_agent_answer"])
    print("\n=== Raw computed payload ===\n")

How this works

  1. Each agent has its own system prompt, tool schema, and Python functions.
  2. For each agent, we make a first call with tools=... and tool_choice="auto".
  3. We parse assistant_msg.tool_calls[i].function.arguments which is a JSON string.
  4. We execute the requested local function and return a tool message with role="tool" and tool_call_id.
  5. We make a second call for the agent to finalize the answer.
  6. The orchestrator passes the computed JSON payload from Agent 1 to Agent 2, which formats the report via its own tool.
NOTE: This is a minimal synchronous pattern. In production you might add retries, timeouts, logging, and guardrails.
-------------------------------------------------

How LLM Agents Combine Memory, Planning, and Tools for Intelligent Task Execution


  1. Receive query and fetch memory

    • The agent takes the user query.
    • It retrieves relevant session memory or long-term memory (for user preferences, past decisions, cached results).
  2. Discover tools via MCP

    • The agent searches the MCP Tools Registry for available tools, schemas, and capabilities.
    • Examples: OpenAPI specs, JSON Schema, AWS Lambda functions, SageMaker endpoints, SaaS connectors.
  3. Planning, reflection, and tool choice

    • The LLM plans steps, decomposes goals, validates arguments, and routes the query.
    • Uses self-critique to ensure correct tool selection and schema compliance.
    • Planning references memory to avoid redundant calls and to personalize results.
  4. Execute tools and collect observations

    • The MCP Tool Runner executes the chosen function.
    • The agent receives tool outputs, checks units, ranges, and business rules.
    • If errors occur, planning adapts, retries, or switches tools.
  5. Respond and update memory

    • The LLM synthesizes a natural-language answer.
    • Optionally writes key facts or decisions back to memory for future use.

=======================================================

 Role of Memory in Tool Calling

Memory acts as the context backbone for the agent. It ensures that the LLM doesn’t operate in isolation but instead uses relevant historical and contextual data to make better decisions during planning and tool invocation.

Types of Memory

  1. Short-Term Memory (Session Memory)

    • Tracks the current conversation flow and intermediate steps.
    • Example: If the user asks, “Add a new test case for DAWR,” and later says, “Make it similar to the last one,” short-term memory recalls what “last one” refers to.
    • Stored in the agent’s working context (like a conversation buffer).
  2. Long-Term Memory

    • Stores persistent knowledge across sessions.
    • Example: Past tool calls, user preferences, previous bug reports, or test harness details.
    • Typically implemented using vector databases (e.g., Pinecone, Weaviate, FAISS) for semantic search.
    • Enables retrieval of relevant references during planning or content generation.

Where Memory Fits in the Flow

Referencing the agent workflow described earlier:

  • Step 1 (Receive Query):
    Memory is accessed immediately to enrich the query with historical context.
    Example: “User often works on Linux RAS components → prioritize related tools.”

  • Step 3 (Planning & Tool Choice):
    Memory helps the LLM plan better by recalling previous tool usage patterns, schema details, and user-specific constraints.
    Example: “Last time, the user preferred JSON schema-based prompts → use that format.”

  • Step 4 (Tool Execution):
    Memory can store execution results for future reuse.
    Example: Cache API responses or computed estimates to avoid redundant calls.

  • Step 5 (Response):
    Memory updates with new facts, decisions, and tool outputs for long-term learning.

Why Memory Matters

  • Personalization: Tailors responses based on user history.
  • Efficiency: Avoids repeated tool calls by caching results.
  • Accuracy: Provides richer context for reasoning and planning.
  • Scalability: Enables complex workflows by chaining past knowledge.

Practical Implementation

  • Short-Term: Conversation buffer in the agent shell.
  • Long-Term:
    • Vector DB for semantic retrieval.
    • Store tool metadata, execution logs, and user preferences.
    • Use embeddings to link queries with relevant past interactions (see the sketch below)
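
A bare-bones sketch of both layers, with long-term recall reduced to cosine similarity over embeddings (the embed_fn and the in-memory list stand in for a real embedding model and vector database):

import numpy as np

class AgentMemory:
    def __init__(self, embed_fn):
        self.buffer = []        # short-term: recent conversation turns
        self.store = []         # long-term: (embedding, fact) pairs
        self.embed = embed_fn   # e.g., an OpenAI or sentence-transformers embedder

    def remember_turn(self, role, content):
        self.buffer.append({"role": role, "content": content})

    def remember_fact(self, fact):
        self.store.append((np.asarray(self.embed(fact), dtype=float), fact))

    def recall(self, query, top_k=3):
        if not self.store:
            return []
        q = np.asarray(self.embed(query), dtype=float)
        scored = [(float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9), fact)
                  for v, fact in self.store]
        scored.sort(reverse=True)
        return [fact for _, fact in scored[:top_k]]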
========================================
Conclusion:

    LLMs can dynamically decide when to use a tool, pass the right arguments, and incorporate the results into their final response. This approach transforms LLMs from passive text generators into active problem-solvers that can query APIs, run computations, or fetch real-time data.

    Understanding this workflow—define tools → let the model decide → parse arguments → execute locally → return results → finalize answer—is key to building powerful AI-driven applications. Whether you’re integrating with APIs, automating workflows, or creating intelligent assistants, tool calling is the foundation for making LLMs truly useful in real-world scenarios.

