Uploader: 高宏飞 (Shared on 2025-11-28)

Author: Sam Bhagwat

Rapid advances in large language models (LLMs) have made new kinds of AI applications, known as agents, possible. Written by a veteran of web development, Principles of Building AI Agents focuses on the substance without hype or buzzwords. This book walks through:

• The key building blocks of agents: providers, models, prompts, tools, memory
• How to break down complex tasks with agentic workflows
• Giving agents access to knowledge bases with RAG (retrieval-augmented generation)

Publish Year: 2024
Language: English
Pages: 133
File Format: PDF
File Size: 23.8 MB
Text Preview (First 20 pages)

Principles of Building AI Agents, 2nd Edition
Sam Bhagwat, Cofounder & CEO, Mastra.ai

"Understanding frontier tech is essential for building the future. Sam has done it once with Gatsby, and now again with Mastra." — Paul Klein, CEO of Browserbase

Rapid advances in large language models (LLMs) have made new kinds of AI applications, known as agents, possible. Written by a veteran of web development, Principles of Building AI Agents focuses on the substance without hype or buzzwords. This book walks through:

• The key building blocks of agents: providers, models, prompts, tools, memory
• How to break down complex tasks with agentic workflows
• Giving agents access to knowledge bases with RAG (retrieval-augmented generation)

"If you're trying to build agents or assistants into your product, you need to read 'Principles' ASAP." — Peter Kazanjy, Author of Founding Sales and CEO of Atrium

Sam Bhagwat is the founder of Mastra, an open-source JavaScript agent framework, and previously the co-founder of Gatsby, the popular React framework.

Get a free digital copy for your e-reader: mastra.ai/book
PRINCIPLES OF BUILDING AI AGENTS SAM BHAGWAT
CONTENTS

Foreword (Sam Bhagwat)
Introduction

PART I: PROMPTING A LARGE LANGUAGE MODEL (LLM)

1. A BRIEF HISTORY OF LLMS
2. CHOOSING A PROVIDER AND MODEL
   • Hosted vs open-source
   • Model size: accuracy vs cost/latency
   • Context window size
   • Reasoning models
   • Providers and models (May 2025)
3. WRITING GREAT PROMPTS
   • Give the LLM more examples
   • A "seed crystal" approach
   • Use the system prompt
   • Weird formatting tricks
   • Example: a great prompt

PART II: BUILDING AN AGENT

4. AGENTS 101
   • Levels of Autonomy
   • Code Example
5. MODEL ROUTING AND STRUCTURED OUTPUT
   • Structured output
6. TOOL CALLING
   • Designing your tools: the most important step
   • Real-world example: Alana's book recommendation agent
7. AGENT MEMORY
   • Working memory
   • Hierarchical memory
   • Memory processors: `TokenLimiter`, `ToolCallFilter`
8. DYNAMIC AGENTS
   • What are Dynamic Agents?
   • Example: Creating a Dynamic Agent
   • Agent middleware
9. AGENT MIDDLEWARE
   • Guardrails
   • Agent authentication and authorization

PART III: TOOLS & MCP

10. POPULAR THIRD-PARTY TOOLS
    • Web scraping & computer use
    • Third-party integrations
11. MODEL CONTEXT PROTOCOL (MCP): CONNECTING AGENTS AND TOOLS
    • What is MCP
    • MCP Primitives
    • The MCP Ecosystem
    • When to use MCP
    • Building an MCP Server and Client
    • What's next for MCP
    • Conclusion

PART IV: GRAPH-BASED WORKFLOWS

12. WORKFLOWS 101
13. BRANCHING, CHAINING, MERGING, CONDITIONS
    • Branching
    • Chaining
    • Merging
    • Conditions
    • Best Practices and Notes
14. SUSPEND AND RESUME
15. STREAMING UPDATES
    • How to stream from within functions
    • Why streaming matters
    • How to Build This
16. OBSERVABILITY AND TRACING
    • Observability
    • Tracing
    • Evals
    • Final notes on observability and tracing

PART V: RETRIEVAL-AUGMENTED GENERATION (RAG)

17. RAG 101
18. CHOOSING A VECTOR DATABASE
19. SETTING UP YOUR RAG PIPELINE
    • Chunking
    • Embedding
    • Upsert
    • Indexing
    • Querying
    • Reranking
    • Code Example
20. ALTERNATIVES TO RAG
    • Agentic RAG
    • Reasoning-Augmented Generation (ReAG)
    • Full Context Loading
    • Conclusion

PART VI: MULTI-AGENT SYSTEMS

21. MULTI-AGENT 101
22. AGENT SUPERVISOR
23. CONTROL FLOW
24. WORKFLOWS AS TOOLS
25. COMBINING THE PATTERNS
26. MULTI-AGENT STANDARDS
    • How A2A works
    • A2A vs. MCP

PART VII: EVALS

27. EVALS 101
28. TEXTUAL EVALS
    • Accuracy and reliability
    • Understanding context
    • Output
    • Code Example
29. OTHER EVALS
    • Classification or Labeling Evals
    • Agent Tool Usage Evals
    • Prompt Engineering Evals
    • A/B testing
    • Human data review

PART VIII: DEVELOPMENT & DEPLOYMENT

30. LOCAL DEVELOPMENT
    • Building an agentic web frontend
    • Building an agent backend
31. DEPLOYMENT
    • Deployment challenges
    • Using a managed platform

PART IX: EVERYTHING ELSE

32. MULTIMODAL
    • Image Generation
    • Use Cases
    • Voice
    • Video
33. CODE GENERATION
34. WHAT'S NEXT
FOREWORD
Sam Bhagwat

2nd edition

Two months is a short time to write a new edition of a book, but life moves fast in AI. This edition has new content on MCP, image gen, voice, A2A, web browsing and computer use, workflow streaming, code generation, agentic RAG, and deployment.

AI engineering continues to get hotter and hotter. Mastra's weekly downloads have doubled each of the last two months. At a typical SF AI evening meetup, I give away a hundred copies of this book. Then two days ago, a popular developer newsletter tweeted about this book and 3,500 people (!) downloaded a digital copy (available for free at mastra.ai/book if you are reading a paper copy).
So yes, 2025 is truly the year of agents.

Thanks for reading, and happy building!

Sam Bhagwat
San Francisco, CA
May 2025

1st edition

For the last three months, I've been holed up in an apartment in San Francisco's Dogpatch district with my cofounders, Shane Thomas and Abhi Aiyer. We're building an open-source JavaScript framework called Mastra to help people build their own AI agents and assistants.

We've come to the right spot. We're in the Winter 2025 batch of Y Combinator, the most popular startup incubator in the world (colloquially, YC W25). Over half of the batch is building some sort of "vertical agent": an AI application generating CAD diagrams for aerospace engineers, Excel financials for private equity, or a customer support agent for iOS apps.

These three months have come at some personal sacrifice. Shane has traveled from South Dakota with his girlfriend Elizabeth, their three-year-old daughter and newborn son. I usually have 50-50 custody of my seven-year-old son and five-year-old daughter, but for these three months I'm down to every-other-weekend. Abhi's up from LA, where he bleeds Lakers purple and gold.

Our backstory is that Shane, Abhi and I met while building a popular open-source JavaScript website framework called Gatsby. I was the co-founder, and Shane and Abhi were key engineers.

While OpenAI and Anthropic's models are widely available, the secrets of building effective AI applications are hidden in niche Twitter/X accounts, in-person SF meetups, and founder group chats. But AI engineering is just a new domain, like data engineering a few years ago, or DevOps before that. It's not impossibly complex. An engineer with a framework like Mastra should be able to get up to speed in a day or two. With the right tools, it's as easy to build an agent as it is to build a website.

This book is intentionally a short read, even with the code examples and diagrams we've included. It should fit in your back pocket, or slide into your purse. You should be able to use the code examples and get something simple working in a day or two.

Sam Bhagwat
San Francisco, CA
March 2025
INTRODUCTION

We've structured this book into a few different sections.

Prompting a Large Language Model (LLM) provides some background on what LLMs are, how to choose one, and how to talk to them.

Building an Agent introduces a key building block of AI development. Agents are a layer on top of LLMs: they can execute code, store and access memory, and communicate with other agents. Chatbots are typically powered by agents.

Graph-based Workflows have emerged as a useful technique for building with LLMs when agents don't deliver predictable enough output.

Retrieval-Augmented Generation (RAG) covers a common pattern of LLM-driven search. RAG helps you search through large corpuses of (typically proprietary) information in order to send the relevant bits to any particular LLM call.

Multi-agent systems cover the coordination aspects of bringing agents into production. The problems often involve a significant amount of organizational design!

Testing with Evals is important in checking whether your application is delivering sufficient quality to users.

Local dev and serverless deployment are the two places where your code needs to work. You need to be able to iterate quickly on your machine, then get code live on the Internet.

Note that we don't talk about traditional machine learning (ML) topics like reinforcement learning, training models, and fine-tuning. Today most AI applications only need to use LLMs, rather than build them.
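The "agents as a layer on top of LLMs" idea can be sketched in a few lines of TypeScript. This is a minimal illustration, not the book's or Mastra's actual API: `callModel` is a stub standing in for a real provider call, and the tool-decision format it returns is invented for this example. The loop itself (call model, execute requested tool, feed the result back, repeat) is the core pattern.

```typescript
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Tools the agent may execute on the model's behalf.
const tools: Record<string, (arg: string) => string> = {
  add: (arg) => {
    const [a, b] = arg.split(",").map(Number);
    return String(a + b);
  },
};

// Stub model: a real provider would return a tool-call decision as
// structured output. Here we fake one so the loop is runnable offline.
function callModel(history: Message[]): { tool?: string; arg?: string; text?: string } {
  const last = history[history.length - 1];
  if (last.role === "user" && last.content.startsWith("add ")) {
    return { tool: "add", arg: last.content.slice(4) };
  }
  if (last.role === "tool") {
    return { text: `The answer is ${last.content}.` };
  }
  return { text: "Hello!" };
}

// The agent: keeps memory (the message history), calls the model,
// executes tools it requests, and stops when the model answers in text.
function runAgent(userInput: string): string {
  const history: Message[] = [{ role: "user", content: userInput }];
  for (let step = 0; step < 5; step++) {
    const out = callModel(history);
    if (out.tool && tools[out.tool]) {
      history.push({ role: "tool", content: tools[out.tool](out.arg ?? "") });
    } else {
      return out.text ?? "";
    }
  }
  return "step limit reached";
}

console.log(runAgent("add 2,3")); // The answer is 5.
```

The step limit is worth noting: because the model decides when to stop, production agent loops cap iterations to avoid runaway tool-calling.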
PART I PROMPTING A LARGE LANGUAGE MODEL (LLM)
1. A BRIEF HISTORY OF LLMS

AI has been a perennial on-the-horizon technology for over forty years. There have been notable advances over the 2000s and 2010s: chess engines, speech recognition, self-driving cars.

The bulk of the progress on "generative AI" has come since 2017, when eight researchers from Google wrote a paper called "Attention Is All You Need". It described an architecture for generating text where a "large language model" (LLM) was given a set of "tokens" (words and punctuation) and was focused on predicting the next "token".

The next big step forward happened in November 2022. A chat interface called ChatGPT, produced by a well-funded startup called OpenAI, went viral overnight.
Today, there are several different providers of LLMs, which provide both consumer chat interfaces and developer APIs:

• OpenAI. Founded in 2015 by eight people including AI researcher Ilya Sutskever, software engineer Greg Brockman, Sam Altman (the head of YC), and Elon Musk.
• Anthropic (Claude). Founded in 2020 by Dario Amodei and a group of former OpenAI researchers. Produces models popular for code writing, as well as API-driven tasks.
• Google (Gemini). The core LLM is produced by the DeepMind team acquired by Google in 2014.
• Meta (Llama). The Facebook parent company produces the Llama class of open-source models. Considered the leading US open-source AI group.
• Others include Mistral (an open-source French company) and DeepSeek (an open-source Chinese company).
2. CHOOSING A PROVIDER AND MODEL

One of the first choices you'll need to make when building an AI application is which model to build on. Here are some considerations:

Hosted vs open-source

The first piece of advice we usually give people when building AI applications is to start with a hosted provider like OpenAI, Anthropic, or Google Gemini. Even if you think you will need to use open-source, prototype with cloud APIs, or you'll be debugging infra issues instead of actually iterating on your code.

One way to do this without rewriting a lot of code is to use a model routing library (more on that later).
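The model-routing idea is simply to hide every provider behind one interface, so that switching from a hosted API to another provider (or to an open-source model) is a configuration change rather than a rewrite. A rough sketch, with stubbed providers and illustrative model names; real code would call the providers' SDKs or use a routing library:

```typescript
interface ChatModel {
  complete(prompt: string): Promise<string>;
}

// Stub providers: replace the bodies with real SDK calls.
function openai(model: string): ChatModel {
  return { complete: async (p) => `[openai/${model}] reply to: ${p}` };
}
function anthropic(model: string): ChatModel {
  return { complete: async (p) => `[anthropic/${model}] reply to: ${p}` };
}

// One place to route. The rest of the app only ever sees ChatModel,
// so "provider/model" strings are the single point of change.
function getModel(name: string): ChatModel {
  const [provider, model] = name.split("/");
  switch (provider) {
    case "openai": return openai(model);
    case "anthropic": return anthropic(model);
    default: throw new Error(`unknown provider: ${provider}`);
  }
}

// Swapping models is now a one-string edit.
const llm = getModel("openai/gpt-4o");
llm.complete("Hello").then(console.log);
```

This is the same shape that routing libraries expose: a uniform `complete`-style call parameterized by a provider/model identifier.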