AI agents in Azure Cosmos DB|Complete Guide (2025)

Rohit.Rs
Jan 5
31 min read

AI agents are intelligent systems designed to help users by performing specific tasks, answering questions, and automating various processes. These agents come in different forms, from simple chatbots to copilots and advanced AI assistants capable of managing complex workflows independently.

This article dives into the concept of AI agents, exploring their core features and providing practical examples to help you understand their implementation better.

What Are AI Agents?

AI agents are more than just standalone software or rule-based systems. They combine intelligence and adaptability, powered by features like:

Planning: AI agents can set goals and plan actions to achieve them. With advancements like Large Language Models (LLMs), their planning abilities have reached new heights, enabling them to handle complex tasks effectively.
Tool Usage: Sophisticated AI agents use tools such as code execution, data searches, and computational functions. They rely on these tools to complete tasks efficiently, often through techniques like function calling.
Perception: These agents can interpret information from their surroundings, whether it’s visual, auditory, or sensory data, making them interactive and aware of the context in which they operate.
Memory: AI agents remember past interactions, including their actions and the information they’ve processed. This memory allows them to learn from experience, reflect on their performance, and improve over time. Unlike traditional computer memory (RAM or hard drives), this type of memory focuses on continuity and decision-making.
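
To make the tool-usage point more concrete, here is a minimal sketch of function calling in Python. The tool names (add, search_cruises), the JSON shape of the tool call, and the hard-coded model output are all assumptions made for illustration; a real LLM API returns tool calls in its own format.

```python
import json

# Hypothetical tools the agent is allowed to call.
def add(a: float, b: float) -> float:
    return a + b

def search_cruises(keyword: str) -> list[str]:
    # Stand-in for a real data search over a tiny in-memory catalog.
    catalog = ["Tranquil Breeze Cruise", "Fantasy Seas Adventure Cruise"]
    return [name for name in catalog if keyword.lower() in name.lower()]

# Registry mapping tool names to Python callables.
TOOLS = {"add": add, "search_cruises": search_cruises}

def dispatch(tool_call_json: str):
    """Parse a tool call emitted by the model and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# In a real agent this string would come from the LLM, not be hard-coded.
model_output = '{"name": "search_cruises", "arguments": {"keyword": "breeze"}}'
result = dispatch(model_output)
```

The model never runs code itself: it only names a tool and supplies arguments, and the surrounding program decides whether and how to execute the call.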

Copilots: Your AI Partners

Copilots are a special type of AI agent designed to work alongside you rather than independently. They provide suggestions and assistance, acting as a supportive partner rather than taking full control.

For example, imagine you’re writing an email. A copilot might suggest phrases or even entire paragraphs to make your message clearer or more professional. It can also retrieve relevant information from your files or past emails to strengthen your message. You remain in control, choosing to accept, reject, or edit its suggestions.

Why AI Agents Matter

AI agents bring a sense of adaptability and intelligence to our daily lives. They aren’t just tools; they’re companions that learn from interactions, understand our needs, and help us achieve more with less effort.

By blending advanced planning, memory, and perception, these agents make technology feel more human, creating a seamless and engaging experience that evolves alongside us.

Understanding Autonomous Agents and Multi-Agent Systems

In today’s world of advanced technology, autonomous agents are like digital assistants that can work on their own, making decisions and completing tasks without constant human input. But what makes them truly special is their ability to handle complex tasks thoughtfully and efficiently, almost like a reliable team member you can trust.

Imagine having an autonomous agent help you with something as routine yet important as writing emails. Here’s how it could support you:

Understanding the Context: The agent can go through your existing emails, chats, files, and even public sources related to the topic. It doesn’t just skim but carefully analyzes the data to understand what’s most relevant.
Analyzing the Information: It performs both qualitative and quantitative analysis on the collected data, ensuring it draws meaningful insights.
Composing the Email: Using these insights, the agent drafts a complete email. It includes clear arguments, supports them with evidence, and ensures the tone matches your intent.
Attaching Files: It finds and attaches any relevant documents, saving you the trouble of hunting them down.
Fact-Checking: Before finalizing, the agent reviews everything to ensure accuracy and validity. It’s like having a meticulous proofreader by your side.
Choosing Recipients: The agent identifies the right recipients, whether they belong in the To, Cc, or Bcc field, and ensures their email addresses are correct.
Scheduling Delivery: It determines the perfect time to send the email, optimizing for when recipients are most likely to respond.
Following Up: If no response is received within a reasonable time, the agent can send polite follow-ups, ensuring nothing falls through the cracks.

You can customize these tasks, deciding whether the agent acts independently or seeks your approval at every step.

The Power of Multi-Agent Systems

Now, take this idea further with multi-agent systems, where multiple autonomous agents work together like a coordinated team. These agents, whether digital or robotic, can interact, share knowledge, and collaborate to achieve individual or collective goals. Think of it as having a group of experts, each with its own specialty, working together to solve a problem or complete a project.

Here’s what makes multi-agent systems so effective:

Autonomy: Each agent operates independently, making its own decisions without needing constant guidance.
Collaboration: Agents communicate and negotiate with one another, ensuring tasks are coordinated and no effort is wasted.
Goal-Oriented: Each agent focuses on specific objectives, either working solo or aligning with the team’s shared goals.
Distributed System: There’s no single point of control. This decentralized approach makes the system more robust, scalable, and efficient.

Why Multi-Agent Systems Stand Out

Compared to a single autonomous agent or even advanced tools like large language models (LLMs), multi-agent systems offer several unique advantages:

Dynamic Thinking: While single systems often follow linear reasoning, multi-agent systems explore multiple reasoning paths, adapting dynamically to challenges.
Handling Complexity: These systems break down large or complex problems into smaller tasks, distributing them among agents for faster and more thorough results.
Enhanced Memory: Multi-agent systems can retain and reference information over time, overcoming the limitations of LLMs’ context windows.

The Human Touch

What makes all this technology exciting is how it mirrors teamwork in real life. Just as humans collaborate, brainstorm, and rely on each other’s strengths, autonomous agents and multi-agent systems work in harmony. They’re not just tools; they’re partners, designed to make our lives easier by handling repetitive or complex tasks, so we can focus on what truly matters.

In a way, these systems are a glimpse into the future, where technology doesn’t replace us but complements us, amplifying our abilities and helping us achieve more than we ever thought possible.

Implementation of AI Agents: A Human Perspective

In the world of Artificial Intelligence (AI), reasoning and planning are the heart of what makes autonomous agents truly smart. These advanced capabilities allow AI to think critically, make decisions, and adapt to complex situations. Let’s break down how these fascinating systems work, adding a touch of human understanding to each method.

Reasoning and Planning: The Core of AI Agents

Imagine teaching a child not just to solve a problem but also to think, plan, and adjust their approach. That’s what reasoning and planning bring to AI agents. These systems are designed to approach tasks with methods that mimic human thought processes, giving them the ability to learn, adapt, and improve over time. Here are some ways this happens:

1. Self-Ask

Think of this as a moment of self-reflection for the AI. Before jumping to an answer, the system asks itself follow-up questions to explore the problem further. It’s like pausing to think, “Am I missing something?” This thoughtful approach helps the AI refine its response and improve accuracy.

2. Reason and Act (ReAct)

Imagine an AI keeping a diary of its thoughts while performing tasks. It jots down reasoning traces—notes about its thought process—and uses these to plan its next move. Alongside this, it takes actions like retrieving facts or interacting with external data sources. This combination of reasoning and action makes the AI more adaptable, helping it tackle unexpected challenges or gather extra information when needed.
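
The loop described above can be sketched in a few lines of Python. This is a toy illustration, not a production agent: scripted_model is a hypothetical stand-in for an LLM, lookup is a made-up tool backed by a one-entry fact table, and the Thought/Action/Observation labels are simply a common way of writing the trace.

```python
from typing import Callable

def lookup(topic: str) -> str:
    # Hypothetical tool: a tiny fact table standing in for an external source.
    facts = {"capital of France": "Paris"}
    return facts.get(topic, "unknown")

def scripted_model(history: list[str]) -> tuple[str, str, str]:
    """Stand-in for an LLM. Returns (thought, action, action_input)."""
    observations = [line for line in history if line.startswith("Observation:")]
    if not observations:
        return ("I need to look this up.", "lookup", "capital of France")
    answer = observations[-1].removeprefix("Observation: ")
    return ("I have what I need.", "finish", answer)

def react(question: str, model: Callable, tools: dict, max_steps: int = 5) -> tuple[str, list[str]]:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, action_input = model(history)
        history.append(f"Thought: {thought}")          # reasoning trace
        if action == "finish":
            return action_input, history
        observation = tools[action](action_input)      # act on the environment
        history.append(f"Action: {action}[{action_input}]")
        history.append(f"Observation: {observation}")
    raise RuntimeError("no answer within max_steps")

answer, trace = react("What is the capital of France?", scripted_model, {"lookup": lookup})
```

Each pass through the loop adds a thought to the trace, and every action produces an observation that the next thought can build on. The max_steps cap keeps a confused model from looping forever.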

3. Plan and Solve

This method is like making a to-do list. The AI breaks down a big task into smaller, manageable steps and addresses each one methodically. It’s a bit like preparing for an exam—study one chapter at a time to avoid feeling overwhelmed. This reduces mistakes, ensures clarity, and prevents misunderstandings along the way.

4. Reflect/Self-Critique

Picture someone reviewing their work and noting what went well and what didn’t. Reflexion agents do just that. They reflect on feedback from their tasks, store these reflections in a memory system, and use them to make better decisions in the future. It’s a process of continuous improvement that feels very human.

Frameworks: Building Smarter AI Agents

To bring these ideas to life, various tools and frameworks help developers create and manage AI agents. These tools range from simple orchestrators to advanced systems that can handle complex workflows. Here’s how they help:

Basic Tool Usage and Perception

For straightforward tasks that don’t require deep planning, frameworks like LangChain, LlamaIndex, Prompt Flow, and Semantic Kernel are widely used. These tools make it easier to design AI agents that interact with environments or manage simple tasks effectively.

Advanced Planning and Execution

When tasks demand more intelligence and autonomy, tools like AutoGen lead the way. AutoGen, released in 2023, helped drive the rise of multi-agent systems, enabling AI to work collaboratively like a team of experts. Other tools, like OpenAI’s Assistants API, LangChain Agents, and LlamaIndex Agents, also provide powerful solutions for building smarter, more dynamic AI agents.

Practical Application: Bringing It All Together

To truly understand the magic, it’s essential to see these concepts in action. By using one of these frameworks and a unified memory system, you can create a simple yet effective multi-agent system. This setup allows agents to think, plan, and adapt as they work, just like humans navigating real-life challenges.

A Final Thought

AI agents are much more than lines of code—they’re evolving systems that mirror the way we reason, plan, and learn from experience. By implementing these thoughtful methodologies, we’re not just building smarter machines; we’re bringing a touch of humanity to technology, making it more relatable and impactful for everyone.

Rethinking AI Agent Memory Systems: A Human Perspective

From 2022 to 2024, developers experimenting with AI-enhanced applications often relied on separate database management systems to handle different data workflows. For instance, an in-memory database might be used for caching, a relational database for operational data like logs and conversation history, and a vector database for managing embeddings. While this approach worked for specific tasks, it also created a tangled web of standalone databases that made things complicated—and slowed AI agents down.

Imagine a situation where an AI agent needs to act swiftly, pulling from various types of data. But instead of having a seamless memory system, it's stuck juggling between incompatible databases. That’s frustrating for developers and limits what the AI can achieve.

Why the Current Approach Falls Short

Many commonly used databases aren’t built for the speed and flexibility AI systems demand. Here’s why:

1. In-Memory Databases: Fast but Fragile

Think of in-memory databases as sprinters—they’re incredibly fast but can’t handle long-distance runs. For AI agents, which often need to store vast amounts of data over time, this lack of persistence becomes a problem.

2. Relational Databases: Stuck in the Old Ways

Using a relational database for agent memory is like trying to fit a square peg into a round hole. Relational databases are not designed for the fluid, ever-changing nature of AI data. Schema changes often require manual intervention and can cause downtime. Imagine having to pause your AI system every time you need to update its memory structure: inefficient and impractical.

3. Pure Vector Databases: Limited by Design

Vector databases are great for specific tasks like semantic search, but they struggle with real-time updates and large-scale transactional operations. They often come with limitations like:

Unreliable reads and writes: Imagine trusting something that doesn’t always deliver.
Low availability: Even a few hours of downtime can be a major setback.
Limited security and scalability: This leaves developers feeling exposed and restricted.

Building a Better Memory System

A robust memory system for AI agents needs to feel like a well-organized library—everything in its place and easy to find when needed. It should seamlessly store and retrieve all kinds of data, whether it’s code syntax, tabular data, or meaningful insights.

Why Vector Search Alone Isn’t Enough

While vector search is useful for general information retrieval, it often misses the mark for specific tasks. For example:

Writing code: A vector search might fail to retrieve crucial details like syntax trees or API signatures. Without these, the AI can’t generate accurate or coherent code.
Handling tabular data: It may overlook schemas, foreign keys, or stored procedures that are essential for analyzing structured data.

Moving Toward Unified Memory Systems

Relying on separate in-memory, relational, and vector databases might work for basic prototypes, but it’s far from ideal for advanced AI agents. Such setups are prone to performance bottlenecks, complexity, and inefficiency.

Key Characteristics of a Strong AI Memory System

To truly empower AI agents, a memory system must:

Handle varied data types seamlessly.
Be fast, scalable, and reliable.
Offer robust security and multitenancy.

By rethinking how we design memory systems, we can unlock the full potential of AI agents. Instead of being limited by outdated tools, they can become smarter, faster, and better equipped to handle the complexities of real-world tasks.

This isn’t just about technology—it’s about creating systems that feel intuitive and human-like, systems that work with us, not against us. And isn’t that the whole point of AI?

How Multimodal AI Agent Memory Systems Work and Why They Matter

Imagine a world where AI systems can "think," "learn," and "remember" just like humans—but in their own unique way. For this to happen, memory systems play a crucial role. These systems aren’t just about storing information; they’re about creating meaningful collections of data that help AI agents perform a variety of tasks across different domains.

Think about it this way: memory can be organized based on the structure of the data, like documents, tables, or code. Or it can focus on the content and meaning of the data, such as concepts, relationships, or step-by-step processes. But memory systems aren’t only vital for AI—they’re equally important for the humans interacting with them.

The Human Touch in AI Memory Systems

Let’s dive into how humans and AI work together. Picture a scenario where a human supervises an AI agent’s planning or execution in real time. Maybe the AI is handling a complex task, like creating a project plan or analyzing a dataset. During this process, the human might step in to adjust the AI’s reasoning, offer suggestions, or even rewrite portions of its output.

These interactions often happen in natural or programming languages—ways that make sense to humans. Meanwhile, the AI “thinks” and “remembers” using embeddings—a mathematical way of storing and understanding data. This fundamental difference highlights the need for memory systems that keep everything consistent, no matter the type of data.

Operational Memory: The AI’s Short- and Long-Term Brain

Operational memory is like the short- and long-term memory we humans rely on every day. For an AI, this might include remembering what a user said in a conversation, keeping track of preferences, analyzing sensory inputs, recording decisions, or managing facts it’s learned. Imagine how frustrating it would be if an AI kept repeating itself or forgot what task it was working on!

A robust memory system prevents these problems. It enables the AI to stay coherent, no matter how many tasks it’s juggling. For instance, if an AI is switching between helping you draft an email and then solving a math problem, it can still remember the context of both tasks without getting confused. In some advanced cases, the AI might explore multiple possibilities (or “branch plans”) and decide the best course of action based on what it has learned.

Shared Memory: Collaboration Without Losing Identity

Now, picture multiple AI agents working together on a single problem. Maybe one agent is analyzing data while another is creating a report. They need shared memory to exchange information smoothly and coordinate their actions. This shared memory acts like a communal workspace where all agents can collaborate effectively.

At the same time, each agent needs to maintain its unique identity and personality. For example, one agent might specialize in creative writing, while another focuses on coding. Their individual memory systems ensure they retain their distinct abilities and work styles, even while sharing relevant information with others.
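
One way to picture the split between shared and individual memory is the small Python sketch below. The class names, fields, and sample values are all hypothetical; in the setup this article describes, both kinds of memory would live in a database rather than in Python objects.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SharedMemory:
    """Communal workspace that every agent can read and write."""
    facts: dict[str, str] = field(default_factory=dict)

@dataclass
class Agent:
    name: str
    specialty: str                                     # part of the agent's own identity
    shared: SharedMemory                               # the same object for all agents
    private: list[str] = field(default_factory=list)   # per-agent notes

    def remember(self, note: str) -> None:
        self.private.append(note)

    def publish(self, key: str, value: str) -> None:
        self.shared.facts[key] = value

    def read(self, key: str) -> Optional[str]:
        return self.shared.facts.get(key)

workspace = SharedMemory()
analyst = Agent("analyst", "data analysis", workspace)
writer = Agent("writer", "report writing", workspace)

analyst.remember("used median instead of mean")   # stays private to the analyst
analyst.publish("top_destination", "Lisbon")      # made-up value, visible to everyone
summary_input = writer.read("top_destination")
```

The writer can use what the analyst published without ever seeing the analyst's private notes, which is the separation the paragraph above describes.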

Why This Matters

The beauty of AI memory systems lies in their ability to bridge the gap between human understanding and machine intelligence. By enabling smooth interactions, maintaining coherence, and supporting collaboration, these systems ensure that AI agents can perform tasks with precision and adaptability.

And for us humans? These systems make it easier to guide, supervise, and trust the AI we rely on. After all, a well-functioning memory isn’t just critical for AI—it’s essential for building stronger human-AI partnerships.

Building a Strong and Efficient AI Agent Memory System

Creating a reliable memory system for AI agents requires scalability, speed, and simplicity. Early-stage applications may rely on combining different databases like in-memory, relational, and vector systems. While this might work initially, it often leads to unnecessary complexity and performance issues, making it unsuitable for advanced AI systems.

Instead of juggling multiple databases, Azure Cosmos DB offers a unified solution. This database service helped OpenAI’s ChatGPT scale while maintaining high reliability and low maintenance. Azure Cosmos DB is powered by an atom-record-sequence engine, making it the first globally distributed database that integrates NoSQL, relational, and vector capabilities. It provides an ideal foundation for AI agents to perform efficiently at scale.

Speed

Imagine an AI agent that responds instantly to user queries or makes split-second decisions. Azure Cosmos DB delivers single-digit millisecond latency, enabling real-time data processing for tasks like caching, transactions, and operational workloads.

For example, its use of the DiskANN algorithm ensures fast and precise vector searches with minimal memory usage. This speed is essential for AI agents to think, reason, and act quickly without delays, making interactions seamless and intuitive.
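
To show what a vector search actually computes, here is a brute-force sketch in Python. It is not DiskANN: DiskANN is an approximate nearest-neighbor index, while this example does the exact comparison that such an index approximates. The three-dimensional vectors and item names are made up for illustration.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], items: dict[str, list[float]], k: int = 2):
    """Exact (brute-force) nearest neighbours by cosine similarity."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings"; real ones have hundreds or thousands of dimensions.
embeddings = {
    "relaxing spa cruise": [0.9, 0.1, 0.0],
    "family adventure cruise": [0.1, 0.9, 0.2],
    "arctic expedition": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], embeddings, k=2)
```

Brute force compares the query against every stored vector, which is fine for three items and too slow for millions. That gap is why an index such as DiskANN matters at scale.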

Scale

As AI agents grow and handle more complex tasks, their memory systems must expand effortlessly. Azure Cosmos DB is built for global distribution and horizontal scalability, supporting multiple regions and tenants.

This design supports high availability. At the 99.999% availability offered by Azure Cosmos DB for NoSQL, that works out to roughly five minutes of downtime per year, compared to the 9+ hours typical for pure vector databases. That difference matters for mission-critical AI workloads. Additionally, flexible models like Reserved Capacity or Serverless help reduce costs, making scaling both practical and economical.
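
The downtime figures follow from simple arithmetic on availability percentages. Here is a small sketch, assuming a 365-day year. The 99.999% input is the availability quoted for Azure Cosmos DB for NoSQL in this article. The 99.9% input is an assumption, chosen because it lands near the 9-hour figure; it is not a number taken from a specific vendor.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 (ignores leap years)

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

five_nines = downtime_minutes_per_year(99.999)   # about 5.3 minutes
three_nines = downtime_minutes_per_year(99.9)    # about 525.6 minutes, roughly 8.8 hours
```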

Simplicity

Managing data for AI agents often involves juggling multiple formats, relationships, and schemas. Azure Cosmos DB simplifies this by integrating all database functionalities into a single platform.

Its vector database capabilities allow for storing, indexing, and querying embeddings alongside natural language or code-based data. This feature ensures consistency and performance while making data handling easier.

For example, AI agents often work with data like user preferences, chat history, or newly learned facts. Azure Cosmos DB automatically indexes all this data without requiring manual schema management. This simplifies workflows and lets developers focus on building smarter AI agents instead of spending time on database administration.

Advanced Features

Azure Cosmos DB isn’t just fast and scalable—it’s also packed with advanced features. The change feed allows AI agents to monitor and respond to real-time data changes, keeping them updated and responsive to new information.

Its built-in support for multi-region writes (historically called multi-master) ensures continuous operation, even during regional failures. This resilience is essential for maintaining high availability in critical applications.

Furthermore, the five consistency levels (strong, bounded staleness, session, consistent prefix, and eventual) allow you to customize the database's behavior to suit different workloads.

Choosing the Right API

Azure Cosmos DB offers two APIs for building AI agent memory systems:

Azure Cosmos DB for NoSQL: Provides 99.999% availability and supports advanced vector search algorithms like IVF, HNSW, and DiskANN.
vCore-based Azure Cosmos DB for MongoDB: Offers 99.995% availability and supports IVF and HNSW, with DiskANN coming soon.

Why Azure Cosmos DB?

Building AI agents can be challenging, but with Azure Cosmos DB, it feels like you have a dependable partner. Its speed ensures AI agents can think fast, its scalability ensures they can grow without limitations, and its simplicity ensures developers can focus on innovation.

By choosing Azure Cosmos DB, you give your AI agents a reliable memory system that can handle anything—from real-time decisions to massive data workloads—helping them become smarter, faster, and more efficient.

Bringing AI to Travel Assistance: A Realistic Implementation

Imagine booking a cruise vacation and having your queries answered instantly by an AI agent that feels as intuitive as talking to a friend. This section dives into how such an AI-powered travel assistant can be implemented, making it not just a concept but a helpful reality for travelers.

A Smarter Travel Companion

Chatbots have been around for years, but modern AI agents are pushing boundaries. They don’t just mimic conversations; they take action. Tasks that once required complex coding are now handled through natural language processing.

In this example, the AI travel agent is built using the LangChain Agent framework. This framework enables the agent to plan, use tools, and understand user needs efficiently. What makes it even more powerful is its memory system, which relies on Azure Cosmos DB. This technology ensures the system can respond quickly and scale seamlessly while keeping the process simple for both developers and users.

How It Works

The AI travel agent operates behind the scenes with a Python FastAPI backend, while users interact through a sleek, easy-to-use React JavaScript interface. This combination ensures a smooth experience, whether you’re asking about cruise schedules or booking your next adventure.

Prerequisites: Getting Started with the Setup

To recreate this AI travel assistant, here’s what you’ll need:

Azure Subscription: Don’t have one? No worries—you can try Azure Cosmos DB free for 30 days without needing a credit card.
OpenAI API Account: Alternatively, an Azure OpenAI Service account works too.
Azure Cosmos DB vCore Cluster for MongoDB: Follow a quickstart guide to set one up.
Development Tools: Use an integrated environment like Visual Studio Code.
Python 3.11.4: Make sure you’ve installed this version in your environment.

Downloading the Project

The project code and datasets are available in a GitHub repository. Here’s what you’ll find inside:

loader: Python code to load sample travel documents and vector embeddings into Azure Cosmos DB.
api: A Python FastAPI project to host the AI travel agent.
web: React code for building the web interface.

Loading Travel Data into Azure Cosmos DB

The travel assistant starts by populating Azure Cosmos DB with information. Here’s how to set it up:

Set Up Your Environment: Navigate to the loader directory and create a Python virtual environment:
python -m venv venv
Activate and Install Dependencies: Activate the virtual environment (the command shown is for Windows; on macOS or Linux, use source venv/bin/activate) and install the required packages:
venv\Scripts\activate
python -m pip install -r requirements.txt
Configure Your Environment Variables: Create a .env file in the loader directory and add the following details:
OPENAI_API_KEY="<your OpenAI key>"
MONGO_CONNECTION_STRING="mongodb+srv:<your Azure Cosmos DB connection string>"
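
How the project reads the .env file isn't shown here. A common approach is a library such as python-dotenv, which copies the file's entries into the process environment. Either way, a quick check that both settings are present can save a confusing failure later. The sketch below is a hypothetical helper, not part of the project; it only assumes the two variable names listed above.

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "MONGO_CONNECTION_STRING")

def missing_settings(env=os.environ) -> list[str]:
    """Return the names of required variables that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example with an explicit mapping instead of the real environment:
example_env = {"OPENAI_API_KEY": "sk-placeholder"}
missing = missing_settings(example_env)
```

Calling missing_settings() with no arguments checks the real environment, so a loader script could stop early with a clear message if the returned list is not empty.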

A Human Perspective

Imagine the convenience: no more waiting on hold for customer service or navigating confusing websites. This AI travel agent simplifies the process, offering personalized, real-time support that feels almost magical. Its implementation isn’t just about technology; it’s about making travel planning stress-free and enjoyable for real people.

With a few tools and some guided steps, this technology can redefine how we explore the world.

The Python file main.py acts as the backbone of a travel booking system powered by Azure Cosmos DB. This file handles the loading of essential travel data, such as information about ships and destinations, directly from a GitHub repository. Beyond just organizing data, it also crafts personalized travel itinerary packages, making it easier for travelers to book unique experiences through an AI-powered agent.

The CosmosDBLoader tool plays a critical role here. It manages the creation of collections, vector embeddings, and indexes in the Azure Cosmos DB instance, ensuring the data is not just stored but also efficiently searchable.
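
The source of CosmosDBLoader isn't reproduced in this article, so the following is only a guess at its general shape, written so it runs without a database. SketchLoader, InMemoryCollection, fake_embed, and the description field are all hypothetical. The one real-world assumption is that a collection object offers an insert_many method, as PyMongo collections do.

```python
class InMemoryCollection:
    """Test double exposing the one PyMongo-style method used below."""
    def __init__(self):
        self.docs = []

    def insert_many(self, docs):
        self.docs.extend(docs)

class SketchLoader:
    """Hypothetical stand-in for CosmosDBLoader; not the project's code."""
    def __init__(self, database, embed):
        self.database = database   # mapping: collection name -> collection
        self.embed = embed         # callable: text -> list of floats

    def load_data(self, docs, collection_name):
        self.database[collection_name].insert_many(docs)

    def load_vectors(self, docs, collection_name, text_field="description"):
        # Store each document together with an embedding of one text field.
        enriched = [{**doc, "embedding": self.embed(doc[text_field])} for doc in docs]
        self.database[collection_name].insert_many(enriched)
        return self.database[collection_name]

def fake_embed(text):
    # Placeholder embedding; real code would call an embedding model.
    return [float(len(text)), float(text.count(" "))]

db = {"ships": InMemoryCollection(), "destinations": InMemoryCollection()}
loader = SketchLoader(db, fake_embed)
loader.load_data([{"name": "Example Port"}], "destinations")
ships = loader.load_vectors([{"name": "Example Ship", "description": "quiet and calm"}], "ships")
```

The point of the sketch is the division of labor: plain documents go straight into a collection, while documents meant for vector search are stored alongside an embedding of their text.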

Imagine this: you're planning a cruise vacation. The system takes raw information—like ship details and breathtaking destinations—and transforms it into ready-to-book packages. Every step, from crafting itineraries to enabling quick searches for specific ships, is powered by carefully written Python code.

Here's a breakdown of main.py:

from cosmosdbloader import CosmosDBLoader
from itinerarybuilder import ItineraryBuilder
import json

# Initialize CosmosDBLoader for the travel database
cosmosdb_loader = CosmosDBLoader(DB_Name='travel')

# Load ship data from a JSON file
with open('documents/ships.json') as file:
    ship_json = json.load(file)

# Load destination data from a JSON file
with open('documents/destinations.json') as file:
    destinations_json = json.load(file)

# Build itineraries using the ItineraryBuilder
builder = ItineraryBuilder(ship_json['ships'], destinations_json['destinations'])

# Create five unique travel packages
itinerary = builder.build(5)

# Save the itineraries to Cosmos DB
cosmosdb_loader.load_data(itinerary, 'itinerary')

# Save destination data to Cosmos DB
cosmosdb_loader.load_data(destinations_json['destinations'], 'destinations')

# Save ship data to Cosmos DB and create a vector store
collection = cosmosdb_loader.load_vectors(ship_json['ships'], 'ships')

# Add a text search index to make ship names easily searchable
collection.create_index([('name', 'text')])

Once the code is ready, you can load the documents and vectors and create the indexes by executing this command:

python main.py

Output of main.py:

When the script runs, you'll see the following messages, confirming each step is completed:

--build itinerary--
--load itinerary--
--load destinations--
--load vectors ships--

Why is this process so significant?

This script simplifies a complex task—organizing and optimizing travel data. The emotional connection here lies in the experience it builds for travelers. Think about how it turns abstract data into meaningful itineraries for people seeking adventures. It’s not just about code; it’s about creating moments where technology meets human aspiration, helping travelers explore the world seamlessly.

Building an AI Travel Agent with Python FastAPI: A Step-by-Step Guide

Imagine planning your dream trip with the help of an AI travel agent that understands your needs perfectly. This smart system is built using Python's FastAPI framework, which powers the backend, connecting it seamlessly with the user interface. Its intelligence comes from processing requests and aligning them with a robust data layer, including vectors and documents stored in Azure Cosmos DB.

This guide takes you through creating such an AI travel agent while making the process simple and exciting. Let’s break it down step by step.

What Makes the AI Travel Agent Work?

The AI travel agent relies on several key components:

FastAPI for Backend: This framework is the backbone, managing the agent's communication with the front-end interface.
Data Layer: Azure Cosmos DB stores documents and vectors, which the API references to process intelligent prompts.
Python Functions: These tools form the core of the agent's operations, enabling it to interact efficiently with the data and user requests.

The project structure is thoughtfully designed to ensure smooth functioning:

Data Models: Pydantic models define how data is structured.
Web Layer: Handles routing and communication with the agent.
Service Layer: Implements business logic and connects with tools like LangChain and the data layer.
Data Layer: Handles interactions with Azure Cosmos DB for storing and searching information.

Step 1: Setting Up the Environment

Before diving into coding, let’s prepare the environment. For this project, we used Python version 3.11.4 to ensure compatibility.

Create a Virtual Environment: Navigate to the api directory and run:
python -m venv venv
Activate the Environment and Install Dependencies: Activate the virtual environment (the command shown is for Windows; on macOS or Linux, use source venv/bin/activate) and install the required packages using the requirements.txt file:
venv\Scripts\activate
python -m pip install -r requirements.txt
Set Up Environment Variables: Create a .env file in the api directory to store sensitive information:
OPENAI_API_KEY="<your OpenAI API key>"
MONGO_CONNECTION_STRING="mongodb+srv:<your Azure Cosmos DB connection string>"

Step 2: Running the Server

Once the setup is complete, you’re ready to bring your AI travel agent to life.

Start the Server: Run the following command from the api directory:
python app.py
Access the API Documentation: The FastAPI server runs on http://127.0.0.1:8000 by default. To explore the Swagger documentation, open your browser and visit:
http://127.0.0.1:8000/docs

This interactive interface makes it easy to test the API endpoints and understand how the agent processes requests.

Why It Matters

Imagine the excitement of building an AI travel agent that simplifies trip planning for users. The seamless integration of Python FastAPI, Azure Cosmos DB, and LangChain tools creates a powerful system capable of delivering personalized travel recommendations. It’s not just about coding; it’s about creating something that adds value to people’s lives.

With this guide, you now have the knowledge to start building your own AI travel agent. The journey may involve challenges, but the satisfaction of seeing your creation in action is unmatched.

Using Sessions for AI Agent Memory: Enhancing Conversations

Imagine chatting with an AI travel agent that remembers your preferences, like a helpful friend who recalls your favorite spots or past travel questions. That’s where memory comes into play for AI agents using large language models (LLMs). It’s not just a technical feature—it’s the key to making conversations feel natural and personalized.

To achieve this, the AI agent needs to track the chat history for each ongoing session. This is handled using a session ID, which ensures the agent only accesses messages from the current conversation. Behind the scenes, this information is stored in an Azure Cosmos DB instance. To support this, the API exposes a Get Session method (get_session in the code) that currently serves as a placeholder for session management. Think of it as the glue that holds your chat history together.

Here’s how it works:

Step 1: Simulating a Session

The agent generates a unique session ID for every new interaction. This ID acts like a digital bookmark, tracking the conversation's flow. In a real-world setup, this session ID would be saved in Azure Cosmos DB or possibly in the browser’s localStorage by the React front end for easy access. For now, the method simply returns a session ID for testing purposes.

Here’s a snippet from the web/session.py file in Python:

@router.get("/")
def get_session():
    return {'session_id': str(uuid.uuid4().hex)}

When you run this method, it generates a session ID, like this one: "session_id": "0505a645526f4d68a3603ef01efaab19"
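
To see what such an ID looks like without starting the API, here is a small standalone sketch using only the standard library. The helper name new_session_id is illustrative.

```python
import uuid


def new_session_id():
    """Return a random 32-character hexadecimal session ID."""
    # uuid4().hex is already a string, so no extra str() call is needed.
    return uuid.uuid4().hex


session_id = new_session_id()
print(session_id)       # a different value on every run
print(len(session_id))  # 32
```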

Step 2: Starting a Conversation

With your session ID in hand, you’re ready to chat with the AI travel agent. Let’s say you start with a simple request: "I want to take a relaxing vacation."

The agent uses this input and the session ID to craft a response. By calling the agent_chat method, the AI dives into its memory and provides personalized recommendations. For instance, it might suggest:

Tranquil Breeze Cruise
Fantasy Seas Adventure Cruise

These options come from a similarity search in the system’s database, which ranks the most relevant travel ideas based on your request. The similarity scores for these options might look like this:

0.839 for Tranquil Breeze Cruise
0.808 for Fantasy Seas Adventure Cruise

Step 3: Fine-Tuning the Results

If you find the suggestions off-target, you can tweak the system by adjusting the similarity search parameters. For example, you can modify the filter value to something like score >= 0.78 to refine the results.
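
The effect of such a threshold can be sketched in a few lines. This is an illustration only: it assumes cosine similarity as the metric (the article does not say which one the database index uses), and the function names are hypothetical. The scores are the ones quoted above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_score(scored_results, min_score):
    """Keep (name, score) pairs at or above min_score, best match first."""
    kept = [(name, score) for name, score in scored_results if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Scores quoted in the walkthrough above.
results = [
    ("Fantasy Seas Adventure Cruise", 0.808),
    ("Tranquil Breeze Cruise", 0.839),
]

print(filter_by_score(results, 0.78))  # both cruises pass
print(filter_by_score(results, 0.82))  # only Tranquil Breeze Cruise passes
```

Raising the threshold returns fewer but closer matches; lowering it does the opposite.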

Step 4: Building Conversation History

The first time you chat with the agent, a collection called history is created in Azure Cosmos DB; later messages are added to that collection rather than creating a new one. This collection stores all your past messages from that session. The next time you interact, the agent can refer back to this history, making the experience feel even more seamless and personalized. Over time, the AI becomes like a travel companion who truly understands your preferences.
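
The idea of keeping messages separate per session can be sketched with an in-memory stand-in. This is not the Cosmos DB-backed store the project uses; the class InMemoryHistory and its methods are hypothetical and exist only to illustrate the behavior.

```python
from collections import defaultdict


class InMemoryHistory:
    """Hypothetical stand-in for a per-session message store."""

    def __init__(self):
        # Maps session_id -> list of (role, text) tuples in arrival order.
        self._messages = defaultdict(list)

    def add(self, session_id, role, text):
        self._messages[session_id].append((role, text))

    def get(self, session_id):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._messages.get(session_id, []))


history = InMemoryHistory()
history.add("session-a", "human", "I want to take a relaxing vacation.")
history.add("session-a", "ai", "Tranquil Breeze Cruise might suit you.")
history.add("session-b", "human", "Show me adventure cruises.")

print(len(history.get("session-a")))  # 2
print(len(history.get("session-b")))  # 1
```

Because lookups are keyed by session ID, one conversation never sees another conversation's messages.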

Why This Matters

Imagine planning a trip with an agent that remembers not just what you ask but also how you ask. It feels like a real connection—a blend of technology and human-like interaction. This memory feature transforms the AI from a simple tool into a thoughtful assistant, making every conversation more meaningful and effective.

So go ahead, test it out, and let your journey with the AI travel agent begin!

A Walkthrough of the AI Travel Agent

Imagine you're building a smart AI travel assistant that feels like a helpful friend. Integrating this AI into a system involves different layers working together seamlessly, each playing a crucial role in making the assistant functional and efficient. Let’s break it down step by step.

Starting with the Basics

When you connect the AI agent to the system's API, everything begins with the web search components. These act like the starting point, sending out requests to fetch the required data. Next, the search service takes over, filtering and processing the data. Finally, the data components handle the actual information exchange.

In this setup, we use MongoDB for data searches, which connects to Azure Cosmos DB. Think of this connection as a bridge that lets different parts of the system talk to each other smoothly. This setup isn’t rigid—it’s flexible enough to switch data sources or add more advanced tools when needed.

The Role of the Service Layer

At the heart of the system lies the service layer, the brain of the operation. This layer holds all the critical business logic and serves as the home for the code that powers the AI travel agent.

For example, in our scenario, the service layer uses LangChain to integrate user inputs, manage conversation memory, and connect with the database. This layer ensures that when a user asks something—like "What’s the best cruise deal?"—the AI not only understands but also remembers the context of the conversation.

To make everything run smoothly, the service layer uses a singleton pattern in a file called init.py. This setup ensures that all essential parts, like tools and prompts, are initialized only once, keeping the system efficient and organized.

How the Code Works

Here’s a sneak peek into what happens in service/init.py—the file that brings everything to life:

Environment Setup: The program first loads variables from a .env file, like database connection details.
Creating the Chat Agent: It initializes the AI agent using the GPT-3.5 model. Specific parameters, like “temperature,” control how creative or precise the responses are.
Adding Tools: The agent is equipped with tools like a vacation lookup, itinerary finder, and cruise booking feature. These make the AI capable of handling real-world travel tasks.
Customizing Prompts: To guide the AI, a tailored prompt is used. For instance:
- "You are a helpful and friendly travel assistant for a cruise company. Answer travel questions with only relevant details. To book a cruise, make sure to capture the person’s name."

The responses are formatted in HTML, making them visually appealing when displayed on the website.

Keeping Conversations Alive

A standout feature is the ability to maintain a conversation history. Imagine asking about cruises today and continuing the chat next week—your AI assistant remembers everything! This is made possible by MongoDBChatMessageHistory, which stores chat data in Azure Cosmos DB.

The method LLM_init() ties it all together. It sets up the chat agent, tools, and memory, creating a seamless experience for users who can interact with the assistant as if talking to a real person.

Why the Prompt Matters

Initially, the AI's instructions were straightforward: "You are a helpful and friendly travel assistant for a cruise company." But through testing, it became clear that adding more specific guidance improved the results. For example:

"Answer travel questions to the best of your ability, providing only relevant information. To book a cruise, capturing the person’s name is essential."

These adjustments made the AI more consistent, focused, and visually appealing when delivering answers.

Final Thoughts

Building an AI travel agent isn’t just about coding—it’s about creating an experience. Every layer, from the service logic to the memory system, works together to make the assistant feel intuitive and human-like. Whether it’s answering questions, recommending itineraries, or booking a dream cruise, this AI is designed to make travel planning simple, engaging, and stress-free.

By adding thoughtful design and emotional intelligence, we transform a technical system into something that truly connects with users.

Agent Tools and Their Functionality: A Simplified Guide

Imagine you're building an AI agent that acts like a virtual assistant. For it to function well, it needs tools—just like a chef needs utensils to cook. These tools allow the agent to interact with the world and perform specific tasks. Let’s break down how you can create these tools and make them work seamlessly.

Creating Tools for an Agent

When you design an agent, you need to provide it with tools to handle different tasks. These tools are essentially functions equipped with specific purposes. The @tool decorator is the easiest way to create a custom tool.

By default, the tool's name matches the function’s name, but you can change it by giving a custom name in the decorator.
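Each tool requires a docstring, a short description of what the tool does. This description is essential: it becomes the tool’s description, which the language model reads when deciding which tool to call, and it also helps human readers understand the tool’s purpose.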
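
To make the two points above concrete, here is a toy decorator that registers a function under its own name (or a custom one) and insists on a docstring. It is a simplified illustration of the idea, not LangChain's implementation; simple_tool and TOOL_REGISTRY are hypothetical names.

```python
TOOL_REGISTRY = {}


def simple_tool(func=None, *, name=None):
    """Register a function as a tool; usable as @simple_tool or @simple_tool(name=...)."""

    def register(f):
        if not f.__doc__:
            raise ValueError(f"Tool '{f.__name__}' needs a docstring.")
        tool_name = name or f.__name__
        TOOL_REGISTRY[tool_name] = {"func": f, "description": f.__doc__.strip()}
        return f

    # Called as @simple_tool (func is given) or @simple_tool(name="...") (func is None).
    return register(func) if func is not None else register


@simple_tool
def vacation_lookup(query: str) -> str:
    """Find information on vacations and trips."""
    return f"Results for: {query}"


@simple_tool(name="cruise_itineraries")
def itinerary_lookup(ship_name: str) -> str:
    """Retrieve itineraries by ship name."""
    return f"Itineraries for: {ship_name}"


print(sorted(TOOL_REGISTRY))  # ['cruise_itineraries', 'vacation_lookup']
```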

Here’s an example from a file named TravelAgentTools.py:

Code Example: TravelAgentTools

from langchain_core.tools import tool
from data.mongodb import travel
from model.travel import Ship

@tool
def vacation_lookup(input: str) -> str:
    """Find information on vacations and trips."""
    ships: list[Ship] = travel.similarity_search(input)
    content = ""
    for ship in ships:
        # Join outside the f-string: backslashes inside f-string expressions
        # are a syntax error before Python 3.12.
        amenities = "\n-".join(ship.amenities)
        content += f" Cruise ship {ship.name}, description: {ship.description}, with amenities: {amenities} "
    return content

@tool
def itinerary_lookup(ship_name: str) -> str:
    """Retrieve cruise packages, itineraries, and destinations by ship name."""
    itineraries = travel.itinerary_search(ship_name)
    results = ""
    for i in itineraries:
        rooms = "\n-".join(i.Rooms)
        schedule = "\n-".join(i.Schedule)
        results += f" Cruise Package {i.Name}, room prices: {rooms}, schedule: {schedule}"
    return results

@tool
def book_cruise(package_name: str, passenger_name: str, room: str) -> str:
    """Book a cruise package using the package name, passenger's name, and room."""
    # "John Doe" is presumably treated as a placeholder the model may supply
    # when the user has not given a real name.
    if passenger_name == "John Doe":
        return "To book a cruise, I need your name."
    if not room:
        return "Please specify the room you'd like to book."
    # Demo only: nothing is stored and the reference number is hard-coded.
    return "Cruise has been booked! Your reference number is 343242."

Explanation of Tools

vacation_lookup: This tool searches for vacation-related information by scanning a database for similar matches.
itinerary_lookup: It retrieves cruise details, such as schedules and package prices, based on the ship name.
book_cruise: This tool finalizes bookings by confirming the passenger's name, room, and package.

It’s important to include clear instructions (like prompting users to provide their name or room details) to avoid incomplete inputs.

How AI Agents Work

At their core, AI agents rely on language models to decide what actions to take. These actions are executed step by step to achieve the desired outcome.
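
The decide-act-observe cycle can be sketched without any model at all. In the toy loop below, fake_model is a hard-coded stand-in for the language model and every name is hypothetical; a real agent framework handles this loop for you.

```python
def fake_model(user_input, observations):
    """Stand-in for an LLM: returns the next action as (tool_name, argument)."""
    if not observations:
        return ("vacation_lookup", user_input)
    # Once a tool result is available, finish with an answer built from it.
    return ("finish", f"Based on what I found: {observations[-1]}")


def vacation_lookup(query):
    return f"2 cruises match '{query}'"


TOOLS = {"vacation_lookup": vacation_lookup}


def run_agent(user_input, max_steps=5):
    """Ask the model for an action, run it, feed the result back, repeat."""
    observations = []
    for _ in range(max_steps):
        action, argument = fake_model(user_input, observations)
        if action == "finish":
            return argument
        observations.append(TOOLS[action](argument))
    return "Stopped: step limit reached."


print(run_agent("relaxing vacation"))
# Based on what I found: 2 cruises match 'relaxing vacation'
```

The step limit is a simple safeguard so that a model that never chooses to finish cannot loop forever.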

Here’s an example from TravelAgent.py:

Code Example: TravelAgent

from .init import agent_with_chat_history
from model.prompt import PromptResponse
import time
from dotenv import load_dotenv

load_dotenv(override=False)

def agent_chat(input: str, session_id: str) -> PromptResponse:
    """Handles user input and returns the agent's response."""
    start_time = time.time()
    results = agent_with_chat_history.invoke(
        {"input": input},
        config={"configurable": {"session_id": session_id}},
    )
    response_time = time.time() - start_time
    return PromptResponse(text=results["output"], ResponseSeconds=response_time)

This function takes user input and a session ID to maintain conversation history. It processes the input, calls the agent, and returns a response along with the time taken to generate it.
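
The timing part of that function can be tried on its own. In the sketch below, PromptResponse is an assumed shape based on the two fields used above (the real class lives in model/prompt.py and may differ), and timed_reply is a hypothetical helper.

```python
import time
from dataclasses import dataclass


@dataclass
class PromptResponse:
    """Assumed shape; the project's own model may be defined differently."""
    text: str
    ResponseSeconds: float


def timed_reply(generate, user_input):
    """Call generate(user_input) and report how long it took."""
    start = time.perf_counter()  # monotonic clock, suited to measuring durations
    text = generate(user_input)
    elapsed = time.perf_counter() - start
    return PromptResponse(text=text, ResponseSeconds=elapsed)


reply = timed_reply(lambda s: s.upper(), "hello")
print(reply.text)                  # HELLO
print(reply.ResponseSeconds >= 0)  # True
```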

Integrating the AI Agent with a React Interface

To make this agent more user-friendly, you can integrate it into a web interface using React. With React, you can create a conversational assistant for travel-related queries and bookings.

Setting Up the React Environment

Install Node.js: Ensure Node.js is installed on your system.
Install Dependencies: Run the following command in your project’s web directory:
npm ci
Create an .env File: Add environment variables in a file named .env:
REACT_APP_API_HOST=http://127.0.0.1:8000
Start the React Application: Run the following command to launch the interface:
npm start

Once the setup is complete, your React application will provide users with a seamless way to interact with the AI agent. They can look up vacations, check cruise schedules, and book trips—all with just a few clicks or through a chat-like interface.

Why This Matters

Building tools and integrating them into a friendly interface makes AI accessible and functional. By combining backend intelligence with a sleek UI, you can revolutionize how people interact with technology—turning complex processes like travel planning into simple, enjoyable experiences.

Walkthrough of the React Web Interface

Imagine stepping into a simple yet functional web application designed to let you interact seamlessly with an AI travel assistant. This project, hosted on a GitHub repository, uses React to bring its interface to life. At its core, two main components – TravelAgent.js and ChatLayout.js – handle the interaction with the AI agent, while the Main.js file acts as the gateway, greeting users and tying everything together.

Here’s a closer look at how it all works:

Main Component: The Heart of the Application

Think of the Main component as the welcoming host of a cozy café. It’s the first thing users see, offering a friendly layout with easy navigation. It organizes the app’s flow, ensuring everything is in place – from the branding elements like logos and links to the sections where the travel agent AI works its magic.

The Main component doesn’t just display content; it sets the tone, like a host inviting guests to explore. It creates placeholders for key elements such as:

Navigation links (e.g., "Ships" and "Destinations"), thoughtfully styled for visibility.
The Travel Agent Section, where the AI shines.
A Footer, gently reminding users that this is a demo app.

Code Highlights

Let’s dive into the Main.js file, breaking it down in a way that feels like exploring a recipe for something delightful:

import React, { Component } from 'react'
import { Stack, Link, Paper } from '@mui/material'
import TravelAgent from './TripPlanning/TravelAgent'
import './Main.css'

class Main extends Component {
  render() {
    return (
      <div className="Main">
        {/* Header Section */}
        <div className="Main-Header">
          <Stack direction="row" spacing={5}>
            <img src="/mainlogo.png" alt="Logo" height={'120px'} />
            <Link href="#" sx={{ color: 'white', fontWeight: 'bold', fontSize: 18 }} underline="hover">
              Ships
            </Link>
            <Link href="#" sx={{ color: 'white', fontWeight: 'bold', fontSize: 18 }} underline="hover">
              Destinations
            </Link>
          </Stack>
        </div>

        {/* Body Section */}
        <div className="Main-Body">
          <div className="Main-Content">
            <Paper elevation={3} sx={{ p: 1 }}>
              <Stack direction="row" justifyContent="space-evenly" alignItems="center" spacing={2}>
                <Link href="#">
                  <img src={require('./images/destinations.png')} width={'400px'} alt="Destinations" />
                </Link>
                <TravelAgent />
                <Link href="#">
                  <img src={require('./images/ships.png')} width={'400px'} alt="Ships" />
                </Link>
              </Stack>
            </Paper>
          </div>
        </div>

        {/* Footer Section */}
        <div className="Main-Footer">
          <b>Disclaimer: Sample Application</b>
          <br />
          Please note that this sample application is provided for demonstration purposes only and should not be used in production environments without proper validation and testing.
        </div>
      </div>
    )
  }
}

export default Main

Breaking Down the Experience

Header: The top section feels like the opening credits of a movie. The logo stands proudly, and links like "Ships" and "Destinations" invite users to explore further.
Body: Here lies the main attraction – a sleek, interactive section showcasing the travel agent. Images on either side create a balanced, visually engaging interface.
Footer: A small yet thoughtful note at the bottom serves as a polite nudge, reminding users this is a demo – like a friendly shopkeeper telling you it’s just a sample product.

Why It Matters

This React-based interface isn’t just functional; it’s designed with a human touch. Every element, from the layout to the disclaimer, shows consideration for the user. It’s a space where technology meets thoughtful design, ensuring users feel comfortable while interacting with the AI agent.

By blending straightforward coding principles with a user-centric approach, this app exemplifies how technology can feel welcoming, almost like having a conversation with a helpful friend.

Travel Agent Component: Your Personal Vacation Planner

Imagine having a friendly travel expert at your fingertips, ready to help you plan the perfect getaway. That’s exactly what the Travel Agent component does! It’s not just about inputting details and getting responses – it feels like you’re chatting with someone who genuinely cares about making your trip special.

This component acts as a bridge between you and the AI-powered travel assistant. It listens to your requests, forwards them to the backend (via FastAPI), and then delivers thoughtful responses. Everything it collects and communicates is organized neatly, thanks to the ChatLayout component that displays the conversation like a smooth-flowing dialogue.

Let’s break it down step-by-step:

What the Travel Agent Does

Captures Your Ideas: Whether you want to "explore snowy mountains" or "find the best beach for sunsets," the Travel Agent component listens carefully and records your input.
Chats with the Backend AI: It sends your requests to the backend FastAPI service, ensuring the AI understands your needs.
Displays the Conversation: Like a friendly chat, the responses are beautifully displayed in a chat-like interface, making the entire process feel natural and engaging.

Code Insights: How It All Works

Here’s how the magic happens behind the scenes. This code doesn’t just handle inputs and outputs – it’s designed to create a smooth, human-like interaction:

import React, { useState, useEffect } from 'react'
import { Button, Box, Dialog, DialogContent, Link, Stack, TextField } from '@mui/material'
import SendIcon from '@mui/icons-material/Send'
import ChatLayout from './ChatLayout'
import './TravelAgent.css'

export default function TravelAgent() {
  const [open, setOpen] = useState(false)
  const [session, setSession] = useState('')
  const [chatPrompt, setChatPrompt] = useState('I want to take a relaxing vacation.')
  const [message, setMessage] = useState([
    { message: 'Hello, how can I assist you today?', direction: 'left', bg: '#E7FAEC' },
  ])

  const handlePrompt = (prompt) => {
    setChatPrompt('')
    setMessage((prevMessages) => [
      ...prevMessages,
      { message: prompt, direction: 'right', bg: '#E7F4FA' },
    ])
    fetch(process.env.REACT_APP_API_HOST + '/agent/agent_chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ input: prompt, session_id: session }),
    })
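      .then((response) => response.json())
      .then((res) => {
        setMessage((prevMessages) => [
          ...prevMessages,
          { message: res.text, direction: 'left', bg: '#E7FAEC' },
        ])
      })
      // Surface network or parsing failures in the chat instead of failing silently.
      .catch(() => {
        setMessage((prevMessages) => [
          ...prevMessages,
          { message: 'Sorry, something went wrong. Please try again.', direction: 'left', bg: '#E7FAEC' },
        ])
      })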
  }

  const handleSession = () => {
    fetch(process.env.REACT_APP_API_HOST + '/session/')
      .then((response) => response.json())
      .then((res) => {
        setSession(res.session_id)
      })
  }

  const handleClickOpen = () => setOpen(true)

  const handleClose = () => setOpen(false)

  useEffect(() => {
    if (!session) handleSession()
  }, [session])

  return (
    <Box>
      <Dialog onClose={handleClose} open={open} maxWidth="md" fullWidth>
        <DialogContent>
          <Stack>
            <Box sx={{ height: '500px' }}>
              <div className="AgentArea">
                <ChatLayout messages={message} />
              </div>
            </Box>
            <Stack direction="row" spacing={0}>
              <TextField
                sx={{ width: '80%' }}
                variant="outlined"
                label="Message"
                helperText="Chat with AI Travel Agent"
                value={chatPrompt}
                onChange={(e) => setChatPrompt(e.target.value)}
              />
              <Button
                variant="contained"
                endIcon={<SendIcon />}
                sx={{ mb: 3, ml: 3, mt: 1 }}
                onClick={() => handlePrompt(chatPrompt)}
              >
                Submit
              </Button>
            </Stack>
          </Stack>
        </DialogContent>
      </Dialog>
      <Link href="#" onClick={handleClickOpen}>
        <img src={require('../images/planvoyage.png')} width={'400px'} alt="Plan Your Voyage" />
      </Link>
    </Box>
  )
}

Breaking Down the Experience

A Warm Greeting: The conversation starts with a friendly message: "Hello, how can I assist you today?" It feels personal, like talking to a real travel planner.
Your Ideas, Instantly Understood: When you type something like "I want to take a relaxing vacation," the AI doesn’t just understand your words – it seems to get the feelings behind them.
Smooth Conversations: Messages flow back and forth in a visually pleasant way. The Travel Agent component makes sure the AI’s responses are displayed clearly, almost as if you’re chatting with a person.
A Single Click Away: Opening the assistant is as simple as clicking a button. This intuitive design ensures you don’t feel overwhelmed or lost.

Why It Feels Human

The Travel Agent component is more than just a piece of code. It’s a thoughtfully designed feature that prioritizes user comfort and engagement. From its friendly greetings to its intuitive interface, it creates a sense of companionship – like having a caring travel expert right on your screen.

Whether you’re dreaming of a peaceful escape or an adventurous journey, this tool makes planning feel effortless and enjoyable. It’s not just functional; it’s welcoming, relatable, and genuinely helpful – exactly what you’d expect from a great travel assistant!

Chat Layout: The Heartbeat of Conversations

The chat layout is more than just an arrangement of messages—it’s the stage where your interaction with the AI truly comes to life. It carefully processes each message, ensuring everything is displayed in an organized, visually appealing, and meaningful way. Whether you're sending a quick prompt or reading a thoughtful response, the layout creates a seamless and engaging experience that feels personal and real.

Imagine this: the user’s prompts appear on the right in a calming blue, while the AI’s thoughtful replies are neatly displayed on the left in a soothing green. This thoughtful design isn’t just functional—it feels warm and inviting, like a real back-and-forth conversation.

How the Chat Layout Works

The Chat Layout component isn’t just a technical tool—it’s like a silent organizer that makes sure every message looks and feels just right. Here's how it works:

Processes Messages: Each message is a JSON object with message, direction, and bg fields, and the layout formats it according to those fields. This keeps the conversation clear, consistent, and easy to follow.
Displays Messages Visually: The user’s messages are aligned on the right, while the AI’s responses are on the left. Colors are used intentionally (blue for user inputs, green for AI replies) to keep the conversation visually distinct and organized.
Handles HTML Formatting: If the AI generates HTML-formatted responses, this component ensures they’re displayed correctly, making even complex replies easy to read.

Code Breakdown: Bringing Conversations to Life

Here’s the code that makes this magic happen:

import React from 'react'
import { Box, Stack } from '@mui/material'
import parse from 'html-react-parser'
import './ChatLayout.css'

export default function ChatLayout({ messages }) {
  return (
    <Stack direction="column" spacing={1}>
      {messages.map((obj, i) => (
        // The index works as a key here because messages are only ever appended.
        <div className="bubbleContainer" key={i}>
          <Box
            className="bubble"
            sx={{ float: obj.direction, fontSize: '10pt', background: obj.bg }}
          >
            {/* parse() renders the HTML as-is; sanitize untrusted content before passing it in. */}
            <div>{parse(obj.message)}</div>
          </Box>
        </div>
      ))}
    </Stack>
  )
}

How It Feels to Use

When you use the Chat Layout, it doesn’t feel like you’re interacting with a robotic system. Instead, it feels human—like a carefully crafted design meant to make you feel comfortable and understood.

User Prompts: When you type something like, "Find me a cozy mountain cabin," it appears on the right in blue, reflecting your role in steering the conversation.
AI Responses: The AI replies in green on the left, offering suggestions like, "Here’s a list of peaceful cabins that match your criteria." The clear formatting makes it easy to follow and feels like a natural exchange.

Why This Matters

A well-organized chat layout isn’t just about aesthetics—it’s about creating a connection. The thoughtful alignment and color choices make you feel seen and heard, while the seamless display of messages ensures nothing gets lost in translation.

Pro Tip for Production

If you’re planning to take your AI agent to the next level, consider implementing semantic caching. This technique can:

Boost query performance by as much as 80%,
Cut down on API call costs, and
Make your AI interactions even faster and more efficient.

For a deep dive into semantic caching, check out the Stochastic Coder blog.
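
To give a feel for the technique, here is a toy semantic cache. It is a sketch under stated assumptions: hand-made 2-D vectors stand in for real embeddings, cosine similarity is the assumed metric, the 0.95 threshold is arbitrary, and the class SemanticCache is hypothetical rather than taken from any library.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy cache keyed by embedding vectors rather than exact strings."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # assumed cut-off; tune for real data
        self._entries = []          # list of (embedding, response) pairs

    def put(self, embedding, response):
        self._entries.append((embedding, response))

    def get(self, embedding):
        """Return the cached response for the closest entry, or None on a miss."""
        best_score, best_response = 0.0, None
        for stored, response in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache()
# Hand-made 2-D vectors stand in for real embeddings.
cache.put([1.0, 0.0], "Try the Tranquil Breeze Cruise.")

print(cache.get([0.99, 0.05]))  # near-duplicate question: cache hit
print(cache.get([0.0, 1.0]))    # unrelated question: None
```

On a hit, the application can return the stored answer and skip the model call entirely, which is where the speed and cost savings come from.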

The Chat Layout is more than just a technical component—it’s the soul of your interaction, ensuring every conversation feels smooth, personal, and enjoyable. Whether you’re asking for travel tips or getting detailed itineraries, this component transforms the experience into something truly memorable.