Building Applications with AI Agents Designing and Implementing Multiagent Systems Michael Albada
Building Applications with AI Agents

ISBN: 978-1-098-17650-1
US $79.99 CAN $99.99
DATA

Generative AI has revolutionized how organizations tackle problems, accelerating the journey from concept to prototype to solution. As the models become increasingly capable, we have witnessed a new design pattern emerge: AI agents. By combining tools, knowledge, memory, and learning with advanced foundation models, we can now sequence multiple model inferences together to solve ambiguous and difficult problems. From coding agents to research agents to analyst agents and more, we’ve already seen agents accelerate teams and organizations. While these agents enhance efficiency, they often require extensive planning, drafting, and revising to complete complex tasks, and deploying them remains a challenge for many organizations, especially as technology and research develop rapidly.

This book is your indispensable guide through this intricate and fast-moving landscape. Author Michael Albada provides a practical and research-based approach to designing and implementing single- and multiagent systems. The book simplifies the complexities and equips you with the tools to move from concept to solution efficiently.

• Understand the distinct features of foundation model-enabled AI agents
• Discover the core components and design principles of AI agents
• Explore design trade-offs and implement effective multiagent systems
• Design and deploy tailored AI solutions, enhancing efficiency and innovation in your field

Michael Albada is a seasoned machine learning engineer with expertise in deploying large-scale solutions for major tech firms including Uber, ServiceNow, and Microsoft. He holds degrees from Stanford University, the University of Cambridge, and Georgia Tech, specializing in machine learning.

“The best single-volume introduction to building AI agent systems—you can read hundreds of papers or this one book.”
Arun Rao, Ex-Meta GenAI group, adjunct professor at UCLA
Praise for Building Applications with AI Agents

Finally, a book about really scaling AI into the human workforce. Michael does a great job leveraging his expertise at scalable organizations like Uber and Microsoft to teach any technical leader in a small and medium business how to really create scalable agentic solutions for their transformation.
—Birju Shah, professor of product management and AI at Kellogg School of Management, Northwestern University, former head of Uber AI product team

A sharp, practical guide, Building Applications with AI Agents equips leaders to move from generative AI hype to real-world systems. It distills complex concepts into actionable strategies, bridging vision and execution for organizations seeking measurable efficiency and competitive edge.
—Amanda Cheng, partner of Founders Bay

As a clinician working at the intersection of medicine and technology, I found this to be an essential read for anyone building AI agents—clear, practical, and rich with insight into tools, orchestration, and design patterns relevant to healthcare use cases like intake, triage, and workflow integration.
—Carrie Ho, MD, assistant professor, hematologist/oncologist, UCSF
This is the book I wish every team had before deploying agents, a clear, rigorous approach to architecture, safety, and measurement that accelerates delivery and reduces risk.
—Brad Sarsfield, senior director, Microsoft Security AI Research & Development

The best single-volume introduction to building AI agent systems—you can read hundreds of papers or this one book.
—Arun Rao, ex-Meta GenAI group, adjunct professor at UCLA
Michael Albada Building Applications with AI Agents Designing and Implementing Multiagent Systems
978-1-098-17650-1
[LSI]

Building Applications with AI Agents
by Michael Albada

Copyright © 2025 Advance AI LLC. All rights reserved.

Published by O’Reilly Media, Inc., 141 Stony Circle, Suite 195, Santa Rosa, CA 95401.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (https://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Nicole Butterfield
Development Editor: Shira Evans
Production Editor: Ashley Stussy
Copyeditor: nSight, Inc.
Proofreader: Piper Content Partners
Indexer: nSight, Inc.
Cover Designer: Karen Montgomery
Cover Illustrator: José Marzan Jr.
Interior Designer: David Futato
Interior Illustrator: Kate Dullea

September 2025: First Edition

Revision History for the First Edition
2025-09-16: First Release

See https://oreilly.com/catalog/errata.csp?isbn=9781098176501 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Building Applications with AI Agents, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
Table of Contents

Preface

1. Introduction to Agents
    Defining AI Agents
    The Pretraining Revolution
    Types of Agents
    Model Selection
    From Synchronous to Asynchronous Operations
    Practical Applications and Use Cases
    Workflows and Agents
    Principles for Building Effective Agentic Systems
    Organizing for Success in Building Agentic Systems
    Agentic Frameworks
    LangGraph
    AutoGen
    CrewAI
    OpenAI Agents Software Development Kit (SDK)
    Conclusion

2. Designing Agent Systems
    Our First Agent System
    Core Components of Agent Systems
    Model Selection
    Tools
    Designing Capabilities for Specific Tasks
    Tool Integration and Modularity
    Memory
    Short-Term Memory
    Long-Term Memory
    Memory Management and Retrieval
    Orchestration
    Design Trade-Offs
    Performance: Speed/Accuracy Trade-Offs
    Scalability: Engineering Scalability for Agent Systems
    Reliability: Ensuring Robust and Consistent Agent Behavior
    Costs: Balancing Performance and Expense
    Architecture Design Patterns
    Single-Agent Architectures
    Multiagent Architectures: Collaboration, Parallelism, and Coordination
    Best Practices
    Iterative Design
    Evaluation Strategy
    Real-World Testing
    Conclusion

3. User Experience Design for Agentic Systems
    Interaction Modalities
    Text-Based
    Graphical Interfaces
    Speech and Voice Interfaces
    Video-Based Interfaces
    Combining Modalities for Seamless Experiences
    The Autonomy Slider
    Synchronous Versus Asynchronous Agent Experiences
    Design Principles for Synchronous Experiences
    Design Principles for Asynchronous Experiences
    Finding the Balance Between Proactive and Intrusive Agent Behavior
    Context Retention and Continuity
    Maintaining State Across Interactions
    Personalization and Adaptability
    Communicating Agent Capabilities
    Communicating Confidence and Uncertainty
    Asking for Guidance and Input from Users
    Failing Gracefully
    Trust in Interaction Design
    Conclusion

4. Tool Use
    LangChain Fundamentals
    Local Tools
    API-Based Tools
    Plug-In Tools
    Model Context Protocol
    Stateful Tools
    Automated Tool Development
    Foundation Models as Tool Makers
    Real-Time Code Generation
    Tool Use Configuration
    Conclusion

5. Orchestration
    Agent Types
    Reflex Agents
    ReAct Agents
    Planner-Executor Agents
    Query-Decomposition Agents
    Reflection Agents
    Deep Research Agents
    Tool Selection
    Standard Tool Selection
    Semantic Tool Selection
    Hierarchical Tool Selection
    Tool Execution
    Tool Topologies
    Single Tool Execution
    Parallel Tool Execution
    Chains
    Graphs
    Context Engineering
    Conclusion

6. Knowledge and Memory
    Foundational Approaches to Memory
    Managing Context Windows
    Traditional Full-Text Search
    Semantic Memory and Vector Stores
    Introduction to Semantic Search
    Implementing Semantic Memory with Vector Stores
    Retrieval-Augmented Generation
    Semantic Experience Memory
    GraphRAG
    Using Knowledge Graphs
    Building Knowledge Graphs
    Promise and Peril of Dynamic Knowledge Graphs
    Note-Taking
    Conclusion

7. Learning in Agentic Systems
    Nonparametric Learning
    Nonparametric Exemplar Learning
    Reflexion
    Experiential Learning
    Parametric Learning: Fine-Tuning
    Fine-Tuning Large Foundation Models
    The Promise of Small Models
    Supervised Fine-Tuning
    Direct Preference Optimization
    Reinforcement Learning with Verifiable Rewards
    Conclusion

8. From One Agent to Many
    How Many Agents Do I Need?
    Single-Agent Scenarios
    Multiagent Scenarios
    Swarms
    Principles for Adding Agents
    Multiagent Coordination
    Democratic Coordination
    Manager Coordination
    Hierarchical Coordination
    Actor-Critic Approaches
    Automated Design of Agent Systems
    Communication Techniques
    Local Versus Distributed Communication
    Agent-to-Agent Protocol
    Message Brokers and Event Buses
    Actor Frameworks: Ray, Orleans, and Akka
    Orchestration and Workflow Engines
    Managing State and Persistence
    Conclusion

9. Validation and Measurement
    Measuring Agentic Systems
    Measurement Is the Keystone
    Integrating Evaluation into the Development Lifecycle
    Creating and Scaling Evaluation Sets
    Component Evaluation
    Evaluating Tools
    Evaluating Planning
    Evaluating Memory
    Evaluating Learning
    Holistic Evaluation
    Performance in End-to-End Scenarios
    Consistency
    Coherence
    Hallucination
    Handling Unexpected Inputs
    Preparing for Deployment
    Conclusion

10. Monitoring in Production
    Monitoring Is How You Learn
    Monitoring Stacks
    Grafana with OpenTelemetry, Loki, and Tempo
    ELK Stack (Elasticsearch, Logstash/Fluentd, Kibana)
    Arize Phoenix
    SigNoz
    Langfuse
    Choosing the Right Stack
    OTel Instrumentation
    Visualization and Alerting
    Monitoring Patterns
    Shadow Mode
    Canary Deployments
    Regression Trace Collection
    Self-Healing Agents
    User Feedback as an Observability Signal
    Distribution Shifts
    Metric Ownership and Cross-Functional Governance
    Conclusion

11. Improvement Loops
    Feedback Pipelines
    Automated Issue Detection and Root Cause Analysis
    Human-in-the-Loop Review
    Prompt and Tool Refinement
    Aggregating and Prioritizing Improvements
    Experimentation
    Shadow Deployments
    A/B Testing
    Bayesian Bandits
    Continuous Learning
    In-Context Learning
    Offline Retraining
    Conclusion

12. Protecting Agentic Systems
    The Unique Risks of Agentic Systems
    Emerging Threat Vectors
    Securing Foundation Models
    Defensive Techniques
    Red Teaming
    Threat Modeling with MAESTRO
    Protecting Data in Agentic Systems
    Data Privacy and Encryption
    Data Provenance and Integrity
    Handling Sensitive Data
    Securing Agents
    Safeguards
    Protections from External Threats
    Protections from Internal Failures
    Conclusion

13. Human-Agent Collaboration
    Roles and Autonomy
    The Changing Role of Humans in Agent Systems
    Aligning Stakeholders and Driving Adoption
    Scaling Collaboration
    Agent Scope and Organizational Roles
    Shared Memory and Context Boundaries
    Trust, Governance, and Compliance
    The Lifecycle of Trust
    Accountability Frameworks
    Escalation Design and Oversight
    Privacy and Regulatory Compliance
    Conclusion: The Future of Human-Agent Teams

Glossary

Index
Preface

When I first started connecting language models, tools, orchestration, and memory together into what we now call an agent, I was surprised by how capable this design pattern was, and by just how much confusion there was about the topic. During my time building agents and sharing my findings on incident investigation, threat hunting, vulnerability detection, and more, I found that this latest design pattern enabled us to solve whole new classes of problems, but also came with many practical hurdles to making agents reliable for real-world applications.

Engineers, scientists, product managers, and leadership all wanted to know more. “How do I get my agent to work?” “I can get my agent to work some of the time, but how do I get it to work most or all of the time?” “How do I choose a model for my use case?” “How do I design good tools for my agent?” “What kind of memory do I need?” “Should I use RAG?” “Should I build a single-agent or multiagent system?” “What architecture should I use?” “Do I need to fine-tune?” “How do I enable agents to learn from experience and improve over time?”

While there are many blog posts and research papers that focus on specific aspects of designing agent systems, I realized there was a lack of accessible, holistic, trustworthy guides to the topic as a whole. I couldn’t find the book that I wanted to share with my colleagues, so I set out to write it.

Through in-depth discussions, I’ve helped teams navigate the complexities of AI agents, considering their unique goals, constraints, and environments. AI agent systems are intricate, blending autonomy, decision making, and interaction in ways that traditional software doesn’t. They’re data-driven, adaptive, and involve multiple components like perception, reasoning, action, and learning, all while interfacing with users, tools, and other agents.
Complicating matters, the foundation models that power these agents are probabilistic and stochastic by nature, making evaluation and testing more challenging.

This book takes a comprehensive approach to building applications with AI agents. It covers the entire lifecycle, from conceptualization to deployment and maintenance, illustrated with real-world case studies, supported by references, and reviewed by practitioners in the field. Sections on advanced topics (such as agent architectures, tool integration, memory systems, orchestration, multiagent coordination, measurement, monitoring, security, and ethical considerations) are further refined by expert input.

Writing this book has been a journey of discovery for me as well. The initial drafts sparked conversations that challenged my views and introduced new ideas. I hope this process continues as you read it, bringing your own insights. Feel free to share any feedback you might have for this book via Twitter (X), LinkedIn, my personal website, or any other channels that you can find.

What This Book Is About

This book provides a practical framework for building robust applications using AI agents. It addresses key challenges and offers solutions to questions such as:

• What defines an AI agent, and when should I use one? How do agents differ from traditional machine learning (ML) systems?
• How do I design agent architectures for specific use cases, including scenario selection, and core components like tools, memory, planning, and orchestration?
• What are effective strategies for agent planning, reasoning, execution, tool selection, and topologies like chains, trees, and graphs?
• How can I enable agents to learn from experience through nonparametric methods, fine-tuning, and transfer learning?
• How do I scale from single-agent to multiagent systems, including coordination patterns like democratic, hierarchical, or actor-critic approaches?
• How do I evaluate and improve agent performance with metrics, testing, and production monitoring?
• What tools and frameworks are best for development, deployment, and securing agents against risks?
• How do I ensure agents are safe, ethical, and scalable, with considerations for user experience (UX), trust, bias, fairness, and regulatory compliance?

The content draws from established engineering principles and emerging practices in AI agents, with case studies (such as customer support, personal assistants, legal, advertising, and code review agents) and discussions on trade-offs to help you tailor solutions to your needs.
What This Book Is Not

This book isn’t an introduction to AI or ML basics. It assumes familiarity with concepts like neural networks, natural language processing, and basic programming in languages like Python. If you’re new to these, pointers to resources are provided, but the focus is on applied agent building.

It’s also not a step-by-step tutorial for specific tools, as technologies evolve rapidly. Instead, it offers guidance on evaluating and selecting tools, with pseudocode and examples to illustrate concepts. For hands-on implementation, online tutorials and documentation are recommended, including those for frameworks like LangChain and AutoGen.

Who This Book Is For

This book is for engineers, developers, and technical leaders aiming to build AI agent-based applications. It’s geared toward roles like AI engineers, software developers, ML engineers, data scientists, and product managers with a technical bent. You might relate to scenarios like the following:

• You’re tasked with building an autonomous system for decision support or interactive services.
• You have a working agent prototype and you want to harden it and get it ready for production.
• Your team struggles with agent reliability (handling failures, adapting to dynamic environments, or orchestrating complex tasks) and you want systematic approaches including orchestration, memory, and learning from experience.
• You’re integrating agents into existing workflows and seek best practices for scalability, multiagent coordination, UX design, measurement, validation, monitoring, and security.

You can also benefit if you’re a tool builder identifying gaps in the agent ecosystem, a researcher exploring applications, or a job seeker preparing for AI agent roles.

Navigating This Book

The chapters follow the lifecycle of building an AI agent application, organized into three main sections.

The first three chapters cover core concepts, design principles, and essential components:
• Chapter 1 introduces agents, their promise, use cases, how they compare to traditional ML, and recent advancements.
• Chapter 2 provides an overview of designing agent systems, including scenario selection, core components (model selection, tools, memory, planning), design trade-offs, architecture patterns (single-agent, multiagent, modular), and best practices.
• Chapter 3 focuses on UX design, covering interaction modalities (text, graphical, speech, video), synchronous versus asynchronous experiences, context retention, communicating capabilities, trust, and key UX principles.

The next five chapters focus on creating, orchestrating, and scaling agents:

• Chapter 4 dives into tools, including design (local, API-based, plug-in, hierarchies) and automated tool development (code generation, imitation learning, tool learning from rewards).
• Chapter 5 covers orchestration, with fundamentals (parameterization, tool selection, execution), tool selection methods (generative, semantic, hierarchical, machine-learned), tool topologies (decomposition, single/parallel/sequential execution, chains, trees, graphs), and planning strategies (incremental execution, zero-shot, few-shot, ReAct).
• Chapter 6 explores memory, including foundational approaches (context windows, keyword-based), semantic memory and vector stores (semantic search, RAG, experience memory), GraphRAG (knowledge graphs), and working memory (whiteboards, note-taking).
• Chapter 7 addresses learning from experience, with nonparametric learning (experiences as examples, exploration/exploitation, reflection), parametric learning (fine-tuning large/small models), and transfer learning.
• Chapter 8 discusses scaling from one agent to many, including when to use multiagent systems, coordination (democratic, manager, hierarchical, actor-critic, automated design), and frameworks such as LangChain.

The final five chapters address validation, monitoring, security, improvement, and human-agent integration:

• Chapter 9 covers measurement and validation, with key objectives (accuracy, robustness, efficiency, etc.), evaluation sets, unit tests (tools, planning, memory, learning), integration tests (end-to-end, consistency, hallucinations), limitations, and deployment preparation.
• Chapter 10 focuses on production monitoring, including causes of failures, agent metrics (system health, automated/human evaluation, feedback), distribution shifts, and monitoring at scale (analytics, alerting, logging).
• Chapter 11 explores improvement loops, with feedback pipelines (issue detection, human review, refinement, prioritization), experimentation (shadow deployments, A/B testing, adaptive, gating), and continuous learning (in-context, offline retraining, online reinforcement).
• Chapter 12 addresses protecting agent systems, covering unique risks, securing LLMs (model selection, defenses, red teaming, fine-tuning), data protection (privacy, provenance), securing agents (safeguards, external/internal protections), and governance/compliance.
• Chapter 13 discusses humans and agents, with ethical principles (oversight, transparency, fairness, explainability, privacy), building trust/oversight, addressing bias, and accountability/regulatory considerations.

Feel free to skip sections you’re familiar with; the book is modular by design.

Note: I often use “we” to refer to you (the reader) and me, fostering a collaborative learning vibe.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold
    Shows commands or other text that should be typed literally by the user.

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://oreil.ly/building-applications-with-ai-agents-supp.

If you have a technical question or a problem using the code examples, please email support@oreilly.com.
This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Building Applications with AI Agents by Michael Albada (O’Reilly). Copyright 2025 Advance AI LLC, 978-1-098-17650-1.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
141 Stony Circle, Suite 195
Santa Rosa, CA 95401
800-889-8969 (in the United States or Canada)
707-827-7019 (international or local)
707-829-0104 (fax)
support@oreilly.com
https://oreilly.com/about/contact.html