AI agents represent an exciting frontier in automation, offering the potential for intelligent systems that can reason, plan, and execute actions autonomously. As this field rapidly evolves, developers and researchers are exploring various approaches to building effective agent systems, each with its own trade-offs and considerations.
This article shares one practical approach to understanding AI agent fundamentals through hands-on examples. While there's no single "correct" way to build agents, we'll explore a straightforward methodology that can help you grasp the core concepts and build a simple but functional agent from scratch.
The techniques and insights presented here represent one perspective among many. By the end of this guide, you'll have hands-on experience with a working agent and the foundational knowledge to explore other approaches and optimization strategies.
What Are AI Agents?
AI agents are autonomous systems that can perceive their environment, reason about problems, and take actions to achieve specific goals. Unlike traditional software that follows predetermined paths, agents can adapt their behavior based on context and outcomes.
The key characteristics that define an AI agent include:
- Autonomy: Ability to operate independently without constant human intervention
- Reactivity: Responding appropriately to environmental changes
- Proactivity: Taking initiative to achieve goals
- Social ability: Interacting with other agents or humans when necessary
These capabilities make AI agents particularly powerful for tasks requiring decision-making, multi-step problem solving, and dynamic adaptation to changing conditions.
Understanding Agent Fundamentals
Before diving into optimization strategies, let's establish a clear understanding of how AI agents work. The diagram below illustrates the simplest agent processing workflow:
- Receive Request: User provides input or query
- Analyze & Plan: AI model determines what actions to take
- Execute Tools: Agent calls appropriate functions or APIs
- Return Results: Processed information is delivered to the user
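The four steps above can be sketched as a minimal loop in plain Python. This is only an illustration, not a real framework: the tool registry and the hard-coded `plan` function are stand-ins for what an actual agent delegates to an LLM.

```python
# Minimal sketch of the receive -> plan -> execute -> respond loop.
# The plan() function is a stub; a real agent would ask an LLM to
# choose the tool and its arguments.

def get_time(city: str) -> str:
    """A toy tool the agent can call."""
    return f"It is 12:00 in {city}"

TOOLS = {"get_time": get_time}

def plan(request: str) -> tuple[str, dict]:
    """Stand-in for the 'Analyze & Plan' step."""
    return "get_time", {"city": request.rstrip("?").split()[-1]}

def run_agent(request: str) -> str:
    tool_name, args = plan(request)          # Analyze & Plan
    observation = TOOLS[tool_name](**args)   # Execute Tools
    return f"Answer: {observation}"          # Return Results

print(run_agent("What time is it in Paris?"))
# -> Answer: It is 12:00 in Paris
```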
Note: this is only the most basic agent workflow. More comprehensive and powerful agent systems often adopt methods such as the ReAct (Reasoning and Acting) paradigm or Plan-and-Execute.
This simplified workflow masks the complexity that emerges when agents need to handle:
- Multiple tool calls in sequence
- Error handling and recovery
- Context management across conversations
- Dynamic decision making based on intermediate results
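Error handling and recovery, in particular, are easy to underestimate. As one hedged illustration (the retry policy and the `flaky_lookup` tool below are hypothetical, not part of any framework), a production agent typically wraps tool calls so that a transient failure becomes an observation the agent can reason about rather than a crash:

```python
import time

def call_tool_with_retry(tool, args, retries=2, delay=0.1):
    """Retry a flaky tool call; on final failure, return the error
    as text so the agent can reason about it instead of crashing."""
    for attempt in range(retries + 1):
        try:
            return tool(**args)
        except Exception as exc:
            if attempt == retries:
                return f"Tool failed after {retries + 1} attempts: {exc}"
            time.sleep(delay)

# A tool that fails on its first call, then succeeds.
calls = {"n": 0}
def flaky_lookup(user_id: str) -> str:
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient network error")
    return f"User {user_id}: Active"

print(call_tool_with_retry(flaky_lookup, {"user_id": "Bob"}))
# -> User Bob: Active
```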
A Simple Agent Example
Let's start with a basic example to understand where performance issues originate. We'll use LangGraph to demonstrate, but these principles apply to any agent framework:
Creating A Tool
First, let's set up our development environment and create a simple tool:
```python
import os
import time

from dotenv import load_dotenv
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

load_dotenv()

# Define a simple tool
@tool
def get_user_info(user_id: str) -> str:
    """
    Get basic information about a user.

    Parameters:
    - user_id: The ID or name of the user to look up.

    Returns:
    A string containing the user's information.
    """
    # Simulate an API call
    return f"User {user_id}: Active account, Premium tier"
```
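Conceptually, a tool decorator like this derives a schema from the function's name, docstring, and type-annotated parameters, and that schema is what the model sees when deciding whether to call the tool. The sketch below is a simplified, framework-free approximation of that idea (the `tool_schema` helper is hypothetical, not LangChain's actual implementation):

```python
import inspect

def tool_schema(fn) -> dict:
    """Simplified sketch: turn a function's name, docstring, and
    annotated parameters into a tool description for the model."""
    sig = inspect.signature(fn)
    params = {
        name: {"type": p.annotation.__name__}
        for name, p in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": params,
    }

def get_user_info(user_id: str) -> str:
    """Get basic information about a user."""
    return f"User {user_id}: Active account, Premium tier"

schema = tool_schema(get_user_info)
print(schema["name"], "->", list(schema["parameters"]))
# -> get_user_info -> ['user_id']
```

This is why clear docstrings and precise type hints matter: they are the only interface the model has to your function.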
Creating the Agent
Now let's create our basic agent using LangGraph's high-level API:
```python
# Create a basic agent using the ReAct pattern
model = ChatOpenAI(
    model=os.getenv("MODEL_NAME", "openai/gpt-4o"),
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=os.getenv("MODEL_BASE_URL"),
)

simple_agent = create_react_agent(
    model=model,
    tools=[get_user_info],
    prompt="You are a helpful assistant that can look up user information.",
)
```
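Since the code reads its configuration through `load_dotenv()`, a `.env` file in the project root might look like the following (the values shown are illustrative placeholders, not real credentials or endpoints):

```shell
MODEL_NAME=openai/gpt-4o
OPENAI_API_KEY=sk-...
MODEL_BASE_URL=https://your-provider.example/v1
```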
This agent is built using LangGraph's create_react_agent function, which implements the ReAct (Reasoning and Acting) pattern. This pattern allows the agent to alternate between reasoning about what to do and acting by calling tools.
Testing the Agent
Let's test our agent and examine its behavior:
```python
# Measure total time consumption
start_time = time.time()

# Test the agent
result = simple_agent.invoke({
    "messages": [("user", "What's the status of user Bob?")]
})

end_time = time.time()
elapsed = end_time - start_time

print(result)  # Print the full result
print(f"Total time consumption: {elapsed:.2f} seconds")
```
Understanding the Agent Output
When we run this code, the agent produces a structured response that reveals how it processes requests internally. Here are the key details of the output, with extraneous log information omitted for clarity:
```
{
    "messages": [
        HumanMessage(content="What's the status of user Bob?"),
        AIMessage(
            content="",
            response_metadata={
                "token_usage": {
                    "completion_tokens": 57,
                    "prompt_tokens": 176,
                    "total_tokens": 233,
                },
                "finish_reason": "tool_calls",
            },
            tool_calls=[
                {
                    "name": "get_user_info",
                    "args": {"user_id": "Bob"},
                }
            ],
        ),
        ToolMessage(content="User Bob: Active account, Premium tier"),
        AIMessage(
            content="User **Bob** currently has an **active** account and is on the **Premium tier**.",
            response_metadata={
                "token_usage": {
                    "completion_tokens": 62,
                    "prompt_tokens": 218,
                    "total_tokens": 280,
                },
                "finish_reason": "stop",
            },
        ),
    ]
}
Total time consumption: 3.79 seconds
```
Breaking Down the Agent's Reasoning Process
Let's analyze the output structure in detail. The agent's response consists of four main message types, each representing a step in the agent's reasoning and execution process:
- HumanMessage
  - This is the initial request from the user.
  - The agent receives this message and forwards it to the AI model for processing.
- First AIMessage
  - The AI model reviews the user's input and decides a tool call is required.
  - This message includes:
    - Tool call details: the selected tool's name and its arguments.
    - Metadata such as token usage (prompt, completion, and total tokens), model name, and other response information. In this example, the first request uses a total of 233 tokens.
- ToolMessage
  - The agent executes the specified tool (function) with the provided parameters.
  - The result of the tool execution is returned in this message.
- Second AIMessage
  - The AI model processes the tool's output and formulates the final response for the user.
  - This message includes:
    - The complete, user-facing answer, typically formatted for readability.
    - Updated token usage and metadata for this step. In this example, the second call uses a total of 280 tokens.
Summary:
This query completed in 3.79 seconds. The four-part message sequence in the response is standard for agent frameworks using the ReAct pattern. Token usage and response times remain reasonable at this stage.
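Because each model call reports its own token usage, accounting for the cost of a whole run means summing across calls. A small sketch, using plain dicts that mirror the `response_metadata` shown in the output above:

```python
# Token usage reported by the two model calls in the run above.
usage_per_call = [
    {"prompt_tokens": 176, "completion_tokens": 57, "total_tokens": 233},
    {"prompt_tokens": 218, "completion_tokens": 62, "total_tokens": 280},
]

def total_usage(calls: list[dict]) -> dict:
    """Aggregate token counts across every model call in a run."""
    keys = ("prompt_tokens", "completion_tokens", "total_tokens")
    return {k: sum(c[k] for c in calls) for k in keys}

print(total_usage(usage_per_call))
# -> {'prompt_tokens': 394, 'completion_tokens': 119, 'total_tokens': 513}
```

Note that the second call's prompt (218 tokens) is larger than the first (176) because it carries the conversation history plus the tool result: this growth is why token costs compound as agent runs get longer.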
Understanding What You've Built
Congratulations! You've successfully built a functional AI agent that demonstrates the core principles of autonomous systems. Your agent can:
- Interpret requests intelligently and decide when tools are needed
- Execute functions automatically with proper parameters
- Format responses in a user-friendly way
- Complete tasks efficiently with reasonable resource usage
This simple example illustrates the fundamental agent workflow: receive → reason → act → respond. While basic, this pattern scales to much more complex systems.
Key Insights for Agent Development
From building this agent, you've learned several essential concepts:
- Tool Integration: The `@tool` decorator and clear documentation enable the AI model to understand and use your functions effectively.
- ReAct Pattern: The alternating sequence of reasoning and acting allows agents to break down problems and execute solutions systematically.
- Performance Awareness: Monitoring token usage and response times helps you understand the computational costs of agent operations.
What's Next
This foundation prepares you to tackle more sophisticated challenges. In Part 2, we'll explore what happens when agents need multiple tools and complex decision-making—and why performance optimization becomes critical as systems grow.