John Berryman & Albert Ziegler
Prompt Engineering for LLMs
The Art and Science of Building Large Language Model–Based Applications
ISBN: 978-1-098-15615-2   US $79.99   CAN $99.99

John Berryman is founder of Arcturus Labs, specializing in LLM-based application development. He was an early GitHub Copilot engineer, working on chat and code completions. John is also a search expert and author of Relevant Search (Manning).

Albert Ziegler is Head of AI for the AI cybersecurity firm XBOW. As founding engineer for GitHub Copilot, the first successful industry-scale LLM product, he designed its model interaction and prompt engineering systems.

Large language models (LLMs) are revolutionizing the world, promising to automate tasks and solve complex problems. A new generation of software applications is using these models as building blocks to unlock new potential in almost every domain, but reliably accessing these capabilities requires new skills. This book will teach you the art and science of prompt engineering—the key to unlocking the true potential of LLMs. Industry experts John Berryman and Albert Ziegler share how to communicate effectively with AI, transforming your ideas into a language model–friendly format. By learning both the philosophical foundation and practical techniques, you’ll be equipped with the knowledge and confidence to build the next generation of LLM-powered applications.

• Understand LLM architecture and learn how to best interact with it
• Design a complete prompt-crafting strategy for an application
• Gather, triage, and present context elements to make an efficient prompt
• Master specific prompt-crafting techniques like few-shot learning, chain-of-thought prompting, and RAG

“Albert and John are behind one of the most successful commercial generative AI products in history—GitHub Copilot—which makes them great people to learn from. Their writing makes the topic of prompt engineering accessible to everyone.”
Hamel Husain, independent AI researcher and consultant
978-1-098-15615-2

Prompt Engineering for LLMs
by John Berryman and Albert Ziegler

Copyright © 2025 Johnathan Berryman and Albert Ziegler. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Nicole Butterfield
Development Editor: Sara Hunter
Production Editor: Katherine Tozer
Copyeditor: Doug McNair
Proofreader: Stephanie English
Indexer: Judith McConville
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

November 2024: First Edition

Revision History for the First Edition
2024-11-04: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098156152 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Prompt Engineering for LLMs, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
Table of Contents

Preface

Part I. Foundations

1. Introduction to Prompt Engineering
   LLMs Are Magic
   Language Models: How Did We Get Here?
   Early Language Models
   GPT Enters the Scene
   Prompt Engineering
   Conclusion

2. Understanding LLMs
   What Are LLMs?
   Completing a Document
   Human Thought Versus LLM Processing
   Hallucinations
   How LLMs See the World
   Difference 1: LLMs Use Deterministic Tokenizers
   Difference 2: LLMs Can’t Slow Down and Examine Letters
   Difference 3: LLMs See Text Differently
   Counting Tokens
   One Token at a Time
   Auto-Regressive Models
   Patterns and Repetitions
   Temperature and Probabilities
   The Transformer Architecture
   Conclusion

3. Moving to Chat
   Reinforcement Learning from Human Feedback
   The Process of Building an RLHF Model
   Keeping LLMs Honest
   Avoiding Idiosyncratic Behavior
   RLHF Packs a Lot of Bang for the Buck
   Beware of the Alignment Tax
   Moving from Instruct to Chat
   Instruct Models
   Chat Models
   The Changing API
   Chat Completion API
   Comparing Chat with Completion
   Moving Beyond Chat to Tools
   Prompt Engineering as Playwriting
   Conclusion

4. Designing LLM Applications
   The Anatomy of the Loop
   The User’s Problem
   Converting the User’s Problem to the Model Domain
   Using the LLM to Complete the Prompt
   Transforming Back to User Domain
   Zooming In to the Feedforward Pass
   Building the Basic Feedforward Pass
   Exploring the Complexity of the Loop
   Evaluating LLM Application Quality
   Offline Evaluation
   Online Evaluation
   Conclusion

Part II. Core Techniques

5. Prompt Content
   Sources of Content
   Static Content
   Clarifying Your Question
   Few-Shot Prompting
   Dynamic Content
   Finding Dynamic Context
   Retrieval-Augmented Generation
   Summarization
   Conclusion

6. Assembling the Prompt
   Anatomy of the Ideal Prompt
   What Kind of Document?
   The Advice Conversation
   The Analytic Report
   The Structured Document
   Formatting Snippets
   More on Inertness
   Formatting Few-Shot Examples
   Elastic Snippets
   Relationships Among Prompt Elements
   Position
   Importance
   Dependency
   Putting It All Together
   Conclusion

7. Taming the Model
   Anatomy of the Ideal Completion
   The Preamble
   Recognizable Start and End
   Postscript
   Beyond the Text: Logprobs
   How Good Is the Completion?
   LLMs for Classification
   Critical Points in the Prompt
   Choosing the Model
   Conclusion

Part III. An Expert of the Craft

8. Conversational Agency
   Tool Usage
   LLMs Trained for Tool Usage
   Guidelines for Tool Definitions
   Reasoning
   Chain of Thought
   ReAct: Iterative Reasoning and Action
   Beyond ReAct
   Context for Task-Based Interactions
   Sources for Context
   Selecting and Organizing Context
   Building a Conversational Agent
   Managing Conversations
   User Experience
   Conclusion

9. LLM Workflows
   Would a Conversational Agent Suffice?
   Basic LLM Workflows
   Tasks
   Assembling the Workflow
   Example Workflow: Shopify Plug-in Marketing
   Advanced LLM Workflows
   Allowing an LLM Agent to Drive the Workflow
   Stateful Task Agents
   Roles and Delegation
   Conclusion

10. Evaluating LLM Applications
   What Are We Even Testing?
   Offline Evaluation
   Example Suites
   Finding Samples
   Evaluating Solutions
   SOMA Assessment
   Online Evaluation
   A/B Testing
   Metrics
   Conclusion

11. Looking Ahead
   Multimodality
   User Experience and User Interface
   Intelligence
   Conclusion

Index
Preface

Since OpenAI introduced GPT-2 in early 2019, large language models (LLMs) have rapidly changed our world. In 2019, if you, as a coder, had a technical question, you would search the internet for an answer. More often than not, there would be no answer, leaving only the option to post on some question-and-answer (Q&A) forum in the possibly vain hope that someone might respond. But today, instead of breaking your flow, you just ask an LLM assistant for direct commentary on the code you’re working on. Moreover, you can even engage in a pairing session where the assistant writes the code to your specifications. This is just in the field of software engineering, and similar tectonic shifts are beginning to be felt in almost any field that you can name.

This revolution is taking place because the LLM is a truly revolutionary technology that makes it possible to achieve in software what formerly could be done only through human interaction. LLMs can generate content, answer questions, extract tabular data from natural language text, summarize text, classify documents, translate, and (in principle) do just about anything that you can do with text—except that LLMs will do it many orders of magnitude faster and never stop for a break. For entrepreneurs, this opens endless doors of opportunity in every field imaginable. But before you can take advantage of these opportunities, you have to be prepared. This book serves as a guide to help you understand LLMs, interact with them through prompt engineering, and build applications that will bring value to your users, your company, or yourself.

Who Is This Book For?

This book is written for application engineers. If you build software products that customers use, then this book is for you. If you build internal applications or data-processing workflows, then this book is also for you. We are being so inclusive because we believe that the use of LLMs will soon become
ubiquitous. Even if your day-to-day work doesn’t involve prompt engineering or LLM workflow design, your codebase will be filled with uses of LLMs, and you’ll need to understand how to interact with them just to get your job done.

However, a subset of application engineers will be the dedicated LLM wranglers—these are the prompt engineers. It’s their job to convert problems into a packet of information that the LLM can understand—which we call the prompt—and then convert the LLM completions back into results that bring value to those who use the application. If this is your current role—or if you want this to be your role—then this book is especially for you.

LLMs are very approachable—you speak with them in natural language. So, for this book, you won’t be expected to know everything about machine learning. But you do need a good grasp of basic engineering principles—you need to know how to program and how to use an API. Another prerequisite for this book is the ability to empathize, because unlike with any technology before, you need to understand how LLMs “think” so that you can guide them to generate the content you need. This book will show you how.

What You Will Learn

The goal of this book is to equip you with all the theory, techniques, tips, and tricks you need to master prompt engineering and build successful LLM applications. In Part I of the book, we convey a foundational understanding of LLMs, their inner workings, and their functionality as text completion engines. We cover the extension of LLMs to their new role as chat engines, and we present a high-level approach to LLM application development. In Part II, we introduce the core techniques for prompt engineering—how to source context information, rank its importance for the task at hand, pack the prompt (without overloading it), and organize everything into a template that will result in high-quality completions that elicit the answer you need.
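That round trip—problem in, prompt to the model, completion back, result out—can be sketched in a few lines of Python. This is an illustrative sketch only: `build_prompt`, `parse_completion`, and `call_llm` are hypothetical names, and `call_llm` is a stand-in for a real model API, faked here with a trivial rule so the example runs on its own.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call. A real implementation would
    send the prompt to a model endpoint; here we fake a completion
    so the sketch is self-contained."""
    # Pretend the model continues the document with a sentiment label.
    return " positive" if "love" in prompt else " negative"

def build_prompt(review: str) -> str:
    # Convert the user's problem into a document for the model to complete.
    return (
        "Label the sentiment of the review as positive or negative.\n"
        f"Review: {review}\n"
        "Sentiment:"
    )

def parse_completion(completion: str) -> str:
    # Convert the raw completion back into a result the application can use.
    return completion.strip().lower()

def classify(review: str) -> str:
    return parse_completion(call_llm(build_prompt(review)))
```

With the fake model, `classify("I love this phone")` yields `"positive"`. Swapping `call_llm` for a real API call is the only change needed to make this a working application, which is why the surrounding prompt construction and completion parsing are where the engineering effort goes.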
In Part III, we move to more advanced techniques. We assemble loops, pipelines, and workflows of LLM inference to create conversational agency and LLM-driven workflows, and we then explain techniques for evaluating LLMs. Throughout this book, we highlight one principle that underlies all others: At their core, LLMs are just text completion engines that mimic the text they see during their training.
If you process that statement deeply, then you’ll arrive at the same conclusions that we share throughout this book: when you want an LLM to behave a certain way, you have to shape the prompt to resemble patterns seen in training data—use clear language, rely upon existing patterns rather than creating new ones, and don’t drown the LLM in superfluous content. Once you master prompt engineering, you can build upon these skills by creating conversational agents and workflows—the dominant paradigms for LLM applications.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.

This element signifies a tip or suggestion.

This element signifies a general note.

This element indicates a warning or caution.
O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-827-7019 (international or local)
707-829-0104 (fax)
support@oreilly.com
https://oreilly.com/about/contact.html

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/PromptEngForLLMs.

For news and information about our books and courses, visit https://oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media.

Watch us on YouTube: https://youtube.com/oreillymedia.

Acknowledgments

Thank you to our technical reviewers, Leonie Monigatti, Benjamin Muskalla, David Foster, and Balaji Dhamodharan; our technical editor, Sara Verdi; and our development editor, Sara Hunter.
From John

To Kumiko—my immeasurable love and thanks. I swore I’d never write a book again, but I did, and you supported me patiently through this foolishness once more. To Meg and Bo—Papa’s done with work for today! Let’s go play.

From Albert

To Annika, Fiona, and Loki—may your prompts never falter!
CHAPTER 1
Introduction to Prompt Engineering

ChatGPT was released in late November of 2022. By January of the following year, the application had accumulated an estimated 100 million monthly users, making ChatGPT the fastest-growing consumer application ever. (In comparison, TikTok took 9 months to reach 100 million users, and Instagram took 2.5 years.) And as you can surely attest, esteemed reader, this public acclaim is well deserved!

LLMs—like the one that backs ChatGPT—are revolutionizing the way we work. Rather than running to Google to find answers via a traditional web search, you can simply ask an LLM to talk about a topic. Rather than reading Stack Overflow or rummaging through blog posts to answer technical questions, you can ask an LLM to write you a personalized tutorial on your exact problem space and then follow it up with a set of questions and answers (a Q&A) about the topic. Rather than following the traditional steps to build a programming library, you can boost your progress by pairing with an LLM-based assistant to build the scaffolding and autocomplete your code as you write it!

And to you, future reader, will you use LLMs in ways that we, your humble authors from the year 2024, cannot fathom? If the current trends continue, you’ll likely have conversations with LLMs many times during the course of a typical day—in the voice of the IT support assistant when your cable goes out, in a friendly conversation with the corner ATM, and, yes, even with a frustratingly realistic robo dialer. There will be other interactions as well. LLMs will curate your news for you, summarizing the headline stories that you’re most likely to be interested in and removing (or perhaps adding) biased commentary. You’ll use LLMs to assist in your communications by writing and summarizing emails, and office and home assistants will even reach out into the real world and interact on your behalf.
In a single day, your personal AI assistant might at one point act as a travel agent, helping you make travel plans, book flights, and reserve hotels; and then at another point, act as a shopping assistant, helping you find and purchase items you need.
Why are LLMs so amazing? It’s because they are magic! As futurist Arthur C. Clarke famously stated, “Any sufficiently advanced technology is indistinguishable from magic.” We think a machine that you can have a conversation with certainly qualifies as magic, but it’s the goal of this book to dispel this magic. We will demonstrate that no matter how uncanny, intuitive, and humanlike LLMs sometimes seem to be, at the core, LLMs are simply models that predict the next word in a block of text—that’s it and nothing more! As such, LLMs are merely tools for helping users to accomplish some task, and the way that you interact with these tools is by crafting the prompt—the block of text—that they are to complete. This is what we call prompt engineering. Through this book, we will build up a practical framework for prompt engineering and ultimately for building LLM applications, which will be a magical experience for your users.

This chapter sets the background for the journey you are about to take into prompt engineering. But first, let us tell you about how we, your authors, discovered the magic for ourselves.

LLMs Are Magic

Both authors of this book were early research developers for the GitHub Copilot code completion product. Albert was on the founding team, and John appeared on the scene as Albert was moving on to other distant-horizon LLM research projects. Albert first discovered the magic halfway through 2020. He puts it as follows:

Every half year or so, during our ideation meetings in the ML-on-code group, someone would bring up the matter of code synthesis. And the answer was always the same: it will be amazing, one day, but that day won’t come for another five years at least. It was our cold fusion. This was true until the first day I laid hands on an early prototype of the LLM that would become OpenAI Codex. Then I saw that the future was now: cold fusion had finally arrived.
It was immediately clear that this model was wholly different from the sorry stabs at code synthesis we had known before. This model wouldn’t just have a chance of predicting the next word—it could generate whole statements and whole functions from just the docstring. Functions that worked!

Before we decided what we could build with this model (spoiler: it would eventually become GitHub’s Copilot code completion product), we wanted to quantify how good the model really was. So, we crowdsourced a bunch of GitHub engineers and had them come up with self-contained coding tasks. Some of the tasks were comparatively easy—but these were hardcore coders, and many of their tasks were also pretty involved. A good number of the tasks were the kind a junior developer would turn to Google for, but some would push even a senior developer to Stack Overflow. Yet, if we gave the model a few tries, it could solve most of them.
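The claim that an LLM is simply a next-word predictor can be made concrete with a toy. The sketch below is emphatically not a real LLM—it is a bigram counter over a tiny made-up corpus—but it completes text by the same loop in miniature: pick a likely next word, append it, and repeat.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: str) -> dict:
    """Count, for each word, which words follow it in the corpus."""
    words = corpus.split()
    following = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        following[a][b] += 1
    return following

def complete(model: dict, prompt: str, n_words: int = 3) -> str:
    """Greedily extend the prompt one word at a time: the same
    append-and-repeat loop a real LLM runs, with a vastly simpler
    next-word model."""
    words = prompt.split()
    for _ in range(n_words):
        counts = model.get(words[-1])
        if not counts:
            break  # the toy model has never seen this word
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)

corpus = "the cat sat on the mat and the cat slept on the mat"
model = train_bigrams(corpus)
print(complete(model, "the cat", 3))
```

A real LLM replaces the bigram table with a transformer that conditions on the entire prompt, but the generation loop—predict, append, repeat—is exactly this one, which is why the prompt you supply so thoroughly determines what comes back.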