Uploader: 高宏飞
Shared on 2026-03-24
Author: Jim Dowling

Get up to speed on a new unified approach to building machine learning (ML) systems with a feature store. Using this practical book, data scientists and ML engineers will learn in detail how to develop and operate batch, real-time, and agentic ML systems. Author Jim Dowling introduces fundamental principles and practices for developing, testing, and operating ML and AI systems at scale. You'll see how any AI system can be decomposed into independent feature, training, and inference pipelines connected by a shared data layer. Through example ML systems, you'll tackle the hardest part of ML systems: the data. You'll learn how to transform data into features and embeddings, and how to design a data model for AI.

• Develop batch ML systems at any scale
• Develop real-time ML systems by shifting feature computation left or right
• Develop agentic ML systems that use LLMs, tools, and retrieval-augmented generation
• Understand and apply MLOps principles when developing and operating ML systems

The book is arranged into six logical parts, each consisting of a group of chapters. Part I (Chapters 1-3) introduces the feature-training-inference (FTI) architecture and concludes with a case study. Part II (Chapters 4 and 5) introduces feature stores for ML and a real-time credit card fraud example that is covered throughout the book. Part III (Chapters 6-9) is about data transformations for AI systems using frameworks such as Pandas, Polars, Apache Spark, Apache Flink, and Feldera.

Tags
No tags
ISBN: 1098165241
Publisher: O'Reilly Media
Publish Year: 2026
Language: English
Pages: 509
File Format: PDF
File Size: 13.6 MB
Text Preview (First 20 pages)

Jim Dowling
Building Machine Learning Systems with a Feature Store
Batch, Real-Time, and LLM Systems
ISBN: 978-1-098-16524-6

"This book shows how modern feature engineering is really done. A must-read for anyone serious about building efficient, real-world ML systems."
—Ritchie Vink, inventor of Polars, CEO and founder of Polars Inc.

"In this crazy industry of ours, Jim's the closest thing we have to a world-class expert. Read this book if you want a detailed, practical, reusable manual on how to get a good-quality running AI system."
—Niall Murphy, O'Reilly author, cofounder and CEO at Stanza

"A lot of practical tips on how to navigate production ML deployments."
—Hannes Mühleisen, cocreator of DuckDB, CEO of DuckDB Labs

"A great service to ML practitioners with best practices, a clear step-by-step guide."
—Erik Bernhardsson, CEO of Modal

Jim Dowling is CEO and a cofounder of Hopsworks. He taught the first course in deep learning in Sweden while at KTH Royal Institute of Technology, where he led research on the award-winning HopsFS filesystem and the development of Hopsworks. He also cofounded PyData Stockholm and organizes the annual Feature Store Summit.
Praise for Building Machine Learning Systems with a Feature Store

I witnessed the rise of feature stores at Uber, where ML-powered products operated on batch and real-time data. Jim Dowling helped define the category, and this book gives every engineer a practical playbook for shipping production-grade ML systems that matter.
—Vinoth Chandar, CEO and founder of Onehouse Inc.

This book shows how modern feature engineering is really done: with scalable, expressive tools at its core. It bridges the gap between research and production by demonstrating how DataFrame engines, feature stores, and ML pipelines can work together seamlessly. A must-read for anyone serious about building efficient, real-world ML systems.
—Ritchie Vink, inventor of Polars, CEO and founder of Polars Inc.

Nobody before has captured the essentials of building AI apps using modern data streaming systems like Flink. Jim's book shows the way! Using only widely available open source technologies, this book provides the right blueprints for the job.
—Paris Carbone, ACM-awarded computer scientist and Apache Flink committer

It's easy to get lost in quality-metrics land and forget about the crucial systems aspect of ML. Jim does a great job explaining those aspects and gives a lot of practical tips on how to survive a long deployment.
—Hannes Mühleisen, cocreator of DuckDB

Building machine learning systems in production has historically involved a lot of black magic and undocumented learnings. Jim Dowling is doing a great service to ML practitioners by sharing best practices and putting together a clear step-by-step guide.
—Erik Bernhardsson, CEO of Modal

In this crazy industry of ours, Jim's the closest thing we have to a world-class expert. Read this book if you want a detailed, practical, reusable manual on how to get a good-quality running system—as an SRE, I especially appreciate his attention to observability and debugging. The detailed case studies are crunchy icing on a filling cake.
—Niall Murphy, O'Reilly author, cofounder and CEO at Stanza

It's really excellent, the sort of material that isn't taught anywhere.
—Liam Brannigan, data science educator

A must-read for AI/ML practitioners looking to match use cases to the right ML platforms and tools. The book strikes the right balance of breadth, depth, and historical context through comprehensive projects covering real-world ML architectures.
—Lalith Suresh, CEO of Feldera
Jim Dowling
Building Machine Learning Systems with a Feature Store
Batch, Real-Time, and LLM Systems
978-1-098-16524-6 [LSI]

Building Machine Learning Systems with a Feature Store
by Jim Dowling

Copyright © 2026 O'Reilly Media, Inc. All rights reserved. Printed in the United States of America.

Published by O'Reilly Media, Inc., 141 Stony Circle, Suite 195, Santa Rosa, CA 95401.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Nicole Butterfield
Development Editor: Gary O'Brien
Production Editor: Clare Laylock
Copyeditor: nSight, Inc.
Proofreader: Doug McNair
Indexer: WordCo Indexing Services, Inc.
Cover Designer: Susan Brown
Cover Illustrator: José Marzan Jr.
Interior Designer: David Futato
Interior Illustrator: Kate Dullea

November 2025: First Edition

Revision History for the First Edition
2025-11-06: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098165239 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Building Machine Learning Systems with a Feature Store, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

The views expressed in this work are those of the author and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk.

If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

This work is part of a collaboration between O'Reilly and Hopsworks. See our statement of editorial independence.
Table of Contents

Preface  xiii

Part I. The FTI Pipeline Architecture for Machine Learning Systems

1. Building Machine Learning Systems  3
   The Anatomy of a Machine Learning System  4
   Types of Machine Learning  5
   Data Sources  7
   Mutable Data  8
   A Brief History of Machine Learning Systems  10
   MLOps and LLMOps  15
   A Unified Architecture for AI Systems: Feature, Training, and Inference Pipelines  18
   Classes of AI Systems with a Feature Store  21
   ML Frameworks and ML Infrastructure Used in This Book  22
   Summary  23

2. Machine Learning Pipelines  25
   Building ML Systems with ML Pipelines  26
   Minimal Viable Prediction Service  26
   Writing Modular Code for ML Pipelines  30
   A Taxonomy for Data Transformations in ML Pipelines  33
   Feature Types and Model-Dependent Transformations  34
   Reusable Features with Model-Independent Transformations  36
   Real-Time Features with On-Demand Transformations  36
   The ML Transformation Taxonomy and ML Pipelines  37
   Feature Pipelines  39
   Training Pipelines  41
   Inference Pipelines  42
   Titanic Survival as an ML System Built with ML Pipelines  44
   Summary  47

3. Your Friendly Neighborhood Air Quality Forecasting Service  49
   AI System Overview  50
   Air Quality Data  52
   Exploratory Dataset Analysis  54
   Air Quality Data  54
   Weather Data  56
   Creating and Backfilling Feature Groups  57
   Feature Pipeline  58
   Training Pipeline  59
   Batch Inference Pipeline  62
   Running the Pipelines  64
   Scheduling the Pipelines as a GitHub Action  65
   Building the Dashboard as a GitHub Page  67
   Function Calling with LLMs  67
   Summary and Exercises  71

Part II. Feature Stores

4. Feature Stores  75
   A Feature Store for Fraud Prediction  76
   Brief History of Feature Stores  77
   The Anatomy of a Feature Store  78
   When Do You Need a Feature Store?  80
   For Context and History in Real-Time ML Systems  80
   For Time-Series Data  80
   For Improved Collaboration with the FTI Pipeline Architecture  82
   For Governance of ML Systems  83
   For Discovery and Reuse of AI Assets  83
   For Elimination of Offline-Online Feature Skew  84
   For Centralizing Your Data for AI in a Single Platform  84
   Feature Groups  86
   Feature Groups Store Untransformed Feature Data  88
   Feature Definitions and Feature Groups  89
   Writing to Feature Groups  89
   Data Models for Feature Groups  92
   Dimension Modeling with a Credit Card Data Mart  94
   Real-Time Credit Card Fraud Detection ML System  98
   Feature Store Data Model for Inference  102
   Online Inference  103
   Batch Inference  104
   Reading Feature Data with a Feature View  104
   Point-in-Time Correct Training Data with Feature Views  106
   Online Inference with a Feature View  108
   Summary and Exercises  109

5. Hopsworks Feature Store  111
   Hopsworks Projects  111
   Storing Files in a Project  112
   Access Control Within Projects  113
   Access Control at the Cluster Level Using Projects  113
   Feature Groups  116
   Versioning  119
   Online Store  125
   Offline Store (Lakehouse Tables)  129
   Change Data Capture for Feature Groups  130
   Feature Views  131
   Feature Selection  131
   Model-Dependent Transformations  133
   Creating Feature Views  134
   Training Data as Either DataFrames or Files  135
   Batch Inference Data  137
   Online Inference Data  138
   Faster Queries for Feature Data  139
   Summary and Exercises  141

Part III. Data Transformations

6. Model-Independent Transformations  145
   Source Code Organization  146
   Feature Pipelines  148
   Data Transformations for DataFrames  151
   Row Size–Preserving Transformations  153
   Row and Column Size–Reducing Transformations  154
   Row/Column Size–Increasing Transformations  157
   Join Transformations  158
   DAG of Feature Functions  158
   Lazy DataFrames  160
   Vectorized Compute, Multicore, and Arrow  160
   Data Types  165
   Credit Card Fraud Features  168
   Composition of Transformations  170
   Summary and Exercises  172

7. Model-Dependent and On-Demand Transformations  173
   Feature Transformations  173
   Encoding Categorical Variables  174
   Distributions of Numerical Variables  175
   Transforming Numerical Variables  178
   Storing Transformed Feature Data in a Feature Group  180
   Model-Specific Transformations  180
   Outlier Handling Methods  181
   Imputing Missing Values  181
   Data Cleaning as Model-Based Transformations  184
   Target-/Label-Dependent Transformations  185
   Expensive Features Are Computed When Needed  185
   Tokenizers and Chat Templates for LLMs  185
   Transformations in Scikit-Learn Pipelines  186
   Transformations in Feature Views  189
   On-Demand Transformations  193
   PyTorch Transformations  194
   Using pytest  197
   Unit Tests  197
   A Testing Methodology  201
   Summary and Exercises  202

8. Batch Feature Pipelines  205
   Batch Feature Pipelines  206
   Feature Pipeline Data Sources  207
   Batch Data Sources  207
   Streaming Data Sources  210
   Unstructured Data in Object Stores and Filesystems  211
   API and SaaS Sources  212
   Synthetic Credit Card Data with LLMs  213
   A Logical Model for the Data Mart and the LLM  213
   LLM Prompts to Generate the Synthetic Data  215
   Backfilling and Incremental Updates  216
   Polling and CDC for Incremental Data  217
   Backfill and Incremental Processing in One Program  218
   Job Orchestrators  219
   Modal  220
   Hopsworks Jobs  221
   Workflow Orchestrators  223
   Airflow  224
   Cloud Provider Workflow Orchestrators  225
   Data Contracts  225
   Data Validation with Great Expectations in Hopsworks  226
   Summary and Exercises  229

9. Streaming and Real-Time Features  231
   Interactive AI-Enabled Systems Need Real-Time Features  232
   Event-Streaming Platforms  233
   Shift Left or Shift Right?  234
   Shift-Right Architectures  236
   Shift-Left Architectures  238
   Writing Streaming Feature Pipelines  242
   Dataflow Programming  243
   Stateless and Stateful Data Transformations  244
   Apache Flink  246
   Feldera  247
   Windowed Aggregations  248
   Rolling Aggregations  251
   Time Window Aggregations  253
   Choosing the Best Window Type for Aggregations  256
   Rolling Aggregations with Incremental Views  256
   Credit Card Fraud Streaming Features  258
   ASOF Joins and Composition of Transformations  260
   Lagged Features and Feature Pipelines in Feldera  262
   Summary and Exercises  264

Part IV. Training Models

10. Training Pipelines  267
    Unstructured Data and Labels in Feature Groups  267
    Self-Supervised and Unsupervised Learning  268
    Supervised Learning Requires a Label  269
    Root and Label Feature Groups  272
    Feature Selection  273
    Training Data  277
    Splitting Training Data  279
    Reproducible Training Data  280
    Model Training  281
    Model Architecture  283
    Checkpoints to Recover from Failures  287
    Hyperparameter Tuning with Ray Tune  288
    Distributed Training with Ray  290
    Parameter-Efficient Fine-Tuning of LLMs  292
    Credit Card Fraud Model with XGBoost  295
    Identifying Bottlenecks in Distributed Training  296
    Model Evaluation and Model Validation  299
    Model Performance for Classification and Regression  300
    Model Interpretability  300
    Model Bias Tests  301
    Model File Formats and the Model Registry  302
    Model Cards  303
    Summary and Exercises  305

Part V. Inference and Agents

11. Inference Pipelines  309
    Batch Inference Pipelines  309
    Batch Predictions for a Time Range  310
    Batch Predictions for Entities  312
    Scaling Batch Inference with PySpark  314
    Data Modeling for Batch Inference  315
    Batch Inference for Neural Networks  317
    Batch Inference for LLMs  318
    Online Inference Pipelines  320
    Ensure Offline-Online Consistency for Libraries  320
    Model Deployments with FastAPI  322
    LLM Deployments  323
    Deployment API for Models and Feature Views  324
    Model-Serving Frameworks with KServe  328
    Performance and Failure Handling  331
    Mixed-Mode UDFs  331
    Native UDFs and Log-and-Wait  332
    Handling Failures in Online Inference Pipelines  333
    Model Deployment SLOs  334
    Inference with Embedded Models  335
    Embedded AI-Enabled Applications  336
    Stream-Processing AI-Enabled Applications  337
    UIs for AI-Enabled Applications in Python  338
    Summary and Exercises  339

12. Agents and LLM Workflows  341
    From LLMs to Agents  342
    Prompt Management  345
    Prompt Engineering  348
    Context Window  350
    Agents and Workflows with LlamaIndex  352
    Retrieval-Augmented Generation  356
    Retrieval with a Document Store  358
    Retrieval with a Feature Store  359
    Retrieval with a Graph Database  360
    Tools and Function-Calling LLMs  361
    Model Context Protocol  364
    Agent-to-Agent (A2A) Protocol  368
    From LLM Workflows to Agents  370
    Planning  373
    Security Challenges  374
    Domain-Specific (Intermediate) Representations  375
    A Development Process for Agents  375
    Agent Deployments in Hopsworks  377
    Summary and Exercises  378

Part VI. MLOps and LLMOps

13. Testing AI Systems  381
    Offline Testing  381
    From Dev to Prod  383
    Automatic Containerization and Jobs  385
    Environments and Jobs in Hopsworks  386
    Modal Jobs  389
    CI/CD Tests for AI Systems  390
    Feature Pipeline Tests  391
    Training Pipeline Tests for Model Performance and Bias  394
    Testing Model Deployments  395
    A/B Tests for Batch Inference  397
    Evals for Agents  397
    Governance  402
    Schematized Tags  402
    Lineage  405
    Versioning  406
    Audit Logs  408
    Summary and Exercises  409

14. Observability and Monitoring AI Systems  411
    Logging and Metrics for ML Models  412
    Logging for Batch and Online Models  412
    Metrics for Online Models  417
    Metrics for Batch Models  420
    Monitoring Features and Models  422
    Data Ingestion Drift  429
    Univariate Feature Drift  430
    Multivariate Feature Drift  431
    Monitoring Vector Embeddings  432
    Model Monitoring with NannyML  432
    When to Retrain or Redesign a Model  435
    Logging and Metrics for Agents  436
    From Logs to Traces with Agents  437
    Error Analysis  438
    Guardrails  443
    Online A/B Testing  444
    Jailbreaking and Prompt Injection  444
    LLM Metrics  445
    Summary and Exercises  446

15. TikTok's Personalized Recommender: The World's Most Valuable AI System  447
    Introduction to Recommenders  447
    A TikTok Recommender with the Retrieval-and-Ranking Architecture  449
    Real-Time Personalized Recommender  454
    Feature Pipelines  456
    Training Pipelines  458
    Online Inference Pipeline  462
    Agentic Search for Videos  465
    The Dirty Dozen of Fallacies of MLOps  467
    The Ethical Responsibilities of AI Builders  472
    Summary  472

Index  473
Preface

AI is a wide and deep field. If you've never trained a model, it can feel like you need a PhD just to begin. If you have trained a model, building a machine learning (ML) system can feel like you need to first become both a data engineer and a Kubernetes or cloud expert.

You may already have some experience in ML or AI. Maybe you trained a model on a static dataset. Or you may have learned about large language models (LLMs) through crafting a prompt such that you successfully accomplished a task. But to create real value from AI, you need to move from static datasets and static prompts to dynamic data and context engineering. When you train a model, you need a system that will make many predictions with it, not just predictions on the static dataset you downloaded. When you AI-enable an application, you don't have to hardwire the same responses for all users. You can personalize the AI by providing fresh and relevant context information at request time.

ML and AI systems create the most value when they work with dynamic data. Pipelines are key to this. You need pipelines to transform the dynamic data from your data sources into a format that can be used for anything from training your model, to making predictions, to providing context information for your LLM. In this book, we will define ML systems as sequences of pipelines. They transform data progressively from data sources until it is used as input to a model for training or inference (making predictions). Pipelines enable us to lift the level of abstraction when describing an ML or AI system. What is the pipeline's input and output? Does it create feature data from your data sources? Does it train a model from your feature data? Does it output predictions using the model you trained? Pipelines help us decompose our ML or AI system into modular components. We will see how the feature store, a data management platform for AI, enables the composition of pipelines into working ML or AI systems.
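The feature-training-inference decomposition described above can be sketched in a few lines of Python. This is a hypothetical, minimal illustration: the function names, the toy data, and the trivial majority-class "model" are mine, not from the book, and the shared data layer is modeled here as in-memory lists where the book uses a feature store.

```python
# Hypothetical sketch of the feature-training-inference (FTI) decomposition.
# All names and the toy "model" are illustrative, not taken from the book's code.

def feature_pipeline(raw_rows):
    """Transform raw source data into reusable feature rows."""
    return [{"amount": r["amount"], "is_large": r["amount"] > 100} for r in raw_rows]

def training_pipeline(features, labels):
    """Train a toy model from feature data: a majority-class predictor."""
    positives = sum(labels)
    return {"predict_true": positives * 2 > len(labels)}

def inference_pipeline(model, features):
    """Use the trained model to make a prediction per feature row."""
    return [model["predict_true"] for _ in features]

# A shared data layer (in the book, a feature store) connects the three pipelines:
raw = [{"amount": 250}, {"amount": 40}, {"amount": 900}]
feats = feature_pipeline(raw)
model = training_pipeline(feats, labels=[1, 0, 1])
preds = inference_pipeline(model, feats)
```

Each pipeline has a clear input and output, so each can be developed, tested, and scheduled independently, which is the modularity argument the preface makes.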
You will also see that the journey to building pipelines for AI systems is similar to the journey to building pipelines for ML systems. Context engineering for agents follows many of the same principles as feature engineering for classical ML models.

This book is useful because it can help you build different types of ML and AI systems from scratch. A real-world ML system rarely processes a ready-made dataset and optimizes a clear metric. Instead, it often implements a messy process of identifying the right "prediction problem" to solve for available data sources; managing incremental, never-ending data flows; sometimes training or fine-tuning a model; and building a user interface so that stakeholders can get value from your model.

Your ML system should also be well engineered, not a house of cards. It needs to be tested before it goes into production and monitored once in production. And you should follow software engineering best practices in automated testing and deployment. This book can help you attain the skills of a staff data scientist or lead ML engineer.

This book teaches you the skills needed to build three important classes of ML or AI systems:

• Batch ML systems that make predictions on a schedule
• Real-time ML systems that run 24/7 and make (personalized) predictions in response to requests
• Agentic AI systems that work autonomously to solve a goal using LLMs and relevant context data

Why Did I Write This Book?

This book is the coursebook I would like to have had for ID2223, "Scalable Machine Learning and Deep Learning", a course I developed and taught at KTH Royal Institute of Technology in Stockholm. KTH is the alma mater of the founders of important AI companies like Spotify, Lovable, Databricks, Modal, and Feldera (all of which are referenced in this book). My course was, to the best of my knowledge, the first university course that taught students to build complete and novel ML systems as part of their coursework.
It was the result of my own nontraditional academic route of going wide (not just deep). I have published at top-tier conferences in the most important disciplines for building ML systems: AI (ICML, AAMAS), systems (USENIX, ACM Middleware), programming languages (ECOOP), and databases (SIGMOD, PVLDB). Building ML systems requires you to go wider, to leave your comfort zone. Hopefully, you will learn something new about data engineering, model training, agents, or MLOps for building ML systems.
By the end of my course, the students had built their own ML or AI system (after two to three weeks of work, in groups of two). Their ML system specification answered the following questions:

• What unique data source (or sources) generates new data at some cadence?
• What is the prediction problem you will solve with ML or AI using the data source(s)?
• What is the UI (interactive or dashboard) for stakeholder(s) to generate value from your ML system?
• How will you ensure the correctness and monitor the performance of your system?

Here are some examples of ML and AI systems built by students:

• A water height prediction system that uses public measurements of water height along with weather forecast data
• A system that predicts electricity demand using historical and projected demand data, as well as weather forecast data
• A system that predicts public transport arrival times using historical data, weather forecast data, and real-time context data
• A system that lets users ask questions about the course through a UI, by indexing the course's PDFs with retrieval-augmented generation (RAG) pipelines and an LLM

Hopefully, after reading this book, you will be similarly inspired to build your own ML and AI systems.

Target Readers of This Book

This book is for data scientists, data engineers, software engineers, and software architects who love to build things and are interested in building ML or AI systems. If you are a data scientist and are tired of the constant refrain of productionizing your models, but are not yet a Docker and Terraform expert, then this book is for you. If you are a data engineer and wonder what all the fuss is about AI, then this book is for you. ML engineers will also enjoy the exercises, which will enable them to refine their ML system design and pipeline-building skills, as well as their offline and online testing. You should have some experience in Python and SQL to get the most out of the exercises.
If any of the following describe you, you'll find this book valuable:

• A data scientist who wants to be able to build ML systems, not just train models
• A data engineer who wants to learn about data modeling for AI as well as batch and real-time feature engineering
• An AI engineer who wants to build agents that are fed with relevant context using pipelines
• An ML engineer who wants to build scalable, reliable, and maintainable ML systems
• A developer who wants to build ML systems, whether for a portfolio or for fun

What This Book Is Not

This book is not a traditional MLOps book that starts with experiment tracking and how to package and deploy software with containers and infrastructure as code. We do not discuss Docker, Terraform, or AWS CloudFormation; we don't need them, as we assume support for automatic containerization of pipelines. We also don't cover experiment tracking, due to our focus on ML systems over model training, the rise of AutoML (and the corresponding drop in the importance of hyperparameter tuning), and the fact that a model registry is all you need to store model evaluation results and support model governance.

Outline of the Book

The book is arranged into six logical parts, with each part consisting of a group of chapters. Each chapter stands in its own right and has exercises to help deepen your understanding of the concepts and technologies introduced.

Part I introduces the feature, training, and inference (FTI) architecture and concludes with a case study. In Chapter 1, we describe the anatomy of an ML system, provide a whirlwind history of ML system architectures and MLOps, and introduce a unified architecture for building ML systems: FTI pipelines, connected by the feature store and model registry. Chapter 2 introduces the three main classes of ML pipeline: feature pipelines, training pipelines, and batch/online/agentic inference pipelines.
It also introduces a development process for building AI systems and a taxonomy that helps you understand which class of data transformation should be performed in which FTI pipeline. In Chapter 3, you'll build your first ML system. You'll identify an air quality sensor near where you live and build an air quality forecasting system using ML, along with a dashboard. You will also query it with natural language using an LLM.

Part II introduces feature stores for ML and a real-time credit card fraud example that will be covered throughout the book. In Chapter 4, we provide an overview of the
main characteristics of a feature store, including the problems it solves by storing feature data for training and inference in feature groups, querying feature data using feature views, preventing offline/online skew through supporting the taxonomy of data transformations, and data modeling. In Chapter 5, we introduce the Hopsworks feature store, its multitenant project security model, and its APIs for reading and writing feature groups and feature views from ML pipelines, as well as for running ML pipelines as jobs.

Part III is about data transformations for AI systems using frameworks such as Pandas, Polars, Apache Spark, Apache Flink, and Feldera. Chapter 6 describes data transformations for feature pipelines, including data validation with Great Expectations. Chapter 7 describes feature transformations for training and inference pipelines, including real-time transformations. Chapter 8 describes how to design and schedule batch feature pipelines. Chapter 9 describes how to design and operate streaming feature pipelines, including windowed aggregations and rolling aggregations.

Part IV is about training models. In Chapter 10, we start by describing how to build training datasets from a feature store and how to train a decision tree from time-series data. We then look at training models with unstructured data, including fine-tuning LLMs with low-rank adaptation (LoRA) and training PyTorch models with Ray. We also outline the scalability challenges in distributed training.

Part V is about making predictions in batch, real-time, and agentic AI systems. In Chapter 11, we look at batch inference and how to scale it with PySpark. We also look at real-time inference and deployment APIs. We look at model serving using KServe, both with and without graphics processing units (GPUs), including vLLM for serving LLMs. In Chapter 12, we introduce agents and LLM workflows.
We look at LlamaIndex, RAG, and protocols for using tools (like the Model Context Protocol [MCP]) and other agents (like Agent-to-Agent [A2A]). We also compare the agentic workflow with LLM workflows and introduce a development process for agents. Part VI is about MLOps. In Chapter 13, we cover offline tests for AI systems, from unit tests for features (to enforce their contract), to ML pipeline integration tests, to blue/green tests for deployments, to evals for agents. We also cover governance and automatic containerization for ML pipelines. In Chapter 14, we cover observability for AI systems, built on logging/traces and metrics for models and agents. We look at how feature monitoring and model monitoring are built from logs, as well as evals from agent traces. We look at how metrics help models meet service-level objectives through autoscaling. We conclude the book in Chapter 15 with a case study on how to build a personalized video recommender system, similar to TikTok’s, and the dirty dozen fallacies of MLOps. The book is deliberately light on references compared with the academic articles I usually write. I hope the book will still guide you to deeper sources of information on the topics covered and give credit to all the technologies and ideas it builds on. Preface | xvii