Machine Learning for High-Risk Applications
Approaches to Responsible AI
Patrick Hall, James Curtis & Parul Pandey
Foreword by Agus Sudjianto, PhD
DATA SCIENCE

“The authors have done an excellent job providing an overview of regulatory aspects, risk management, interpretability, and many other topics while providing practical advice and code examples.”
—Christoph Molnar, Author of Interpretable Machine Learning

“This book stands out for its uniquely tactical approach to addressing system risks in ML. By taking a nuanced approach to de-risking ML, this book offers readers a valuable resource for successfully deploying ML systems in a responsible and sustainable manner.”
—Liz Grennan, Associate Partner and Global Co-Lead for Digital Trust, McKinsey & Company

Machine Learning for High-Risk Applications

The past decade has witnessed the broad adoption of artificial intelligence and machine learning (AI/ML) technologies. However, a lack of oversight in their widespread implementation has resulted in some incidents and harmful outcomes that could have been avoided with proper risk management. Before we can realize AI/ML’s true benefit, practitioners must understand how to mitigate its risks. This book describes approaches to responsible AI—a holistic framework for improving AI/ML technology, business processes, and cultural competencies that builds on best practices in risk management, cybersecurity, data privacy, and applied social science.

Authors Patrick Hall, James Curtis, and Parul Pandey created this guide for data scientists who want to improve real-world AI/ML system outcomes for organizations, consumers, and the public.

• Learn technical approaches for responsible AI across explainability, model validation and debugging, bias management, data privacy, and ML security
• Learn how to create a successful and impactful AI risk management practice
• Get a basic guide to existing standards, laws, and assessments for adopting AI technologies, including the new NIST AI Risk Management Framework
• Engage with interactive resources on GitHub and Colab

Patrick Hall is a principal scientist at BNH.AI and a visiting faculty member at GWU. James Curtis is a quantitative researcher at Solea Energy. Parul Pandey is a principal data scientist at H2O.ai.

Twitter: @oreillymedia | linkedin.com/company/oreilly-media | youtube.com/oreillymedia

US $79.99 | CAN $99.99
ISBN: 978-1-098-10243-2
Praise for Machine Learning for High-Risk Applications

Machine Learning for High-Risk Applications is a practical, opinionated, and timely book. Readers of all stripes will find rich insights into this fraught subject, whether you’re a data scientist interested in better understanding your models, or a manager responsible for ensuring compliance with existing standards, or an executive trying to improve your organization’s risk controls.
—Agus Sudjianto, PhD, EVP, Head of Corporate Model Risk, Wells Fargo

Don’t miss out on this must-read! Packed with a winning combination of cutting-edge theory and real-world expertise, this book is a game-changer for anyone grappling with the complexities of AI interpretability, explainability, and security. With expert guidance on managing bias and much more, it’s the ultimate guide to mastering the buzzword bonanza of the AI world. Don’t let the competition get ahead—get your hands on this indispensable resource today!
—Mateusz Dymczyk, Software Engineer, Machine Learning, Meta

The book is a comprehensive and timely guide for anyone working on machine learning when the stakes are high. The authors have done an excellent job providing an overview of regulatory aspects, risk management, interpretability, and many other topics while providing practical advice and code examples. Highly recommended for anyone who prefers diligence over disaster when deploying machine learning models.
—Christoph Molnar, Author of Interpretable Machine Learning
Machine learning applications need to account for fairness, accountability, transparency, and ethics in every industry to be successful. Machine Learning for High-Risk Applications lays the foundation for such topics and gives valuable insights that can be utilized for various use cases. I highly recommend this book for any machine learning practitioner.
—Navdeep Gill, Engineering Manager, H2O.ai

Responsible AI—explained simply.
—Hariom Tatsat, Coauthor of Machine Learning & Data Science Blueprints for Finance

Machine Learning for High-Risk Applications is a highly needed book responding to the growing demand for in-depth analysis of predictive models. The book is very practical and gives explicit advice on how to look at different aspects, such as model debugging, bias, transparency, and explainability analysis. The authors share their huge experience in analyzing different classes of models, for both tabular and image data. I recommend this book to anyone wishing to work responsibly with complex models, not only in high-risk applications.
—Przemysław Biecek, Professor at the Warsaw University of Technology

A refreshingly thoughtful and practical guide to responsible use of machine learning. This book has the potential to prevent AI accidents and harms before they happen.
—Harsh Singhal, Senior AI Solution Director, Financial Services, C3.ai

This book stands out for its uniquely tactical approach to addressing system risks in ML. The authors emphasize the critical importance of addressing potential harms as necessary to the delivery of desired outcomes—noted as key to the very success of ML. Especially helpful is the focus on ensuring that the right roles are in the room when making decisions about ML. By taking a nuanced approach to derisking ML, this book offers readers a valuable resource for successfully deploying ML systems in a responsible and sustainable manner.
—Liz Grennan, Associate Partner and Global Co-Lead for Digital Trust, McKinsey & Company

This book is a comprehensive review of both social and technical approaches to high-risk AI applications and provides practitioners with useful techniques to bridge their day-to-day work with core concepts in Responsible AI.
—Triveni Gandhi, PhD, Responsible AI Lead, Dataiku
Unlocking the full potential of machine learning and AI goes beyond mere accuracy of models. This book delves into the critical yet often overlooked aspects of explainable, bias-free, and robust models. In addition, it offers invaluable insights into the cultural and organizational best practices for organizations to ensure the success of their AI initiatives. With technology advancing at an unprecedented pace and regulations struggling to keep up, this timely and comprehensive guide serves as an indispensable resource for practitioners.
—Ben Steiner, Columbia University

Machine learning models are very complex in nature and their development is fraught with pitfalls. Mistakes in this field can cost reputations and millions or even billions of dollars. This book contains must-have knowledge for any machine learning practitioner who wants to design, develop, and deploy robust machine learning models that avoid failing like so many other ML endeavors over the past years.
—Szilard Pafka, PhD, Chief Scientist, Epoch

Saying this book is timely is an understatement. People who build machine learning models need a text like this to help them consider all the possible biases and repercussions that arise from the models they create. The best part is that Patrick, James, and Parul do a wonderful job in making this book readable and digestible. This book is needed on any machine learning practitioner’s bookshelf.
—Aric LaBarr, PhD, Associate Professor of Analytics

This is an extremely timely book. Practitioners of data science and AI need to seriously consider the real-world impact and consequences of models. The book motivates and helps them to do so. It not only provides solid technical information, but weaves a cohesive tapestry with legislation, security, governance, and ethical threads. Highly recommended as reference material.
—Jorge Silva, PhD, Director of AI/Machine Learning Server, SAS

With the ever-growing applications of AI affecting every facet of our lives, it is important to ensure that AI applications, especially the ones that are safety critical, are developed responsibly. Patrick Hall and team have done a fantastic job in articulating the key aspects and issues in developing safety-critical applications in this book in a pragmatic way. I highly recommend this book, especially if you are involved in building AI applications that are high stakes, critical, and need to be developed and tested systematically and responsibly!
—Sri Krishnamurthy, QuantUniversity
If you’re looking for direction from a trusted advisor as you venture into the use of AI in your organization, this book is a great place to start. The authors write from a position of both knowledge and experience, providing just the right mix of baseline education in technology and common pitfalls, coverage of regulatory and societal issues, relevant and relatable case studies, and practical guidance throughout.
—Brett Wujek, PhD, Head of AI Product Management, SAS
Machine Learning for High-Risk Applications
Approaches to Responsible AI
Patrick Hall, James Curtis, and Parul Pandey
Foreword by Agus Sudjianto, PhD

Beijing • Boston • Farnham • Sebastopol • Tokyo
Machine Learning for High-Risk Applications
by Patrick Hall, James Curtis, and Parul Pandey

Copyright © 2023 Patrick Hall, James Curtis, and Parul Pandey. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisition Editors: Rebecca Novack and Nicole Butterfield
Development Editor: Michele Cronin
Production Editor: Gregory Hyman
Copyeditor: Liz Wheeler
Proofreader: Kim Cofer
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

April 2023: First Edition

Revision History for the First Edition
2023-04-17: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098102432 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Machine Learning for High-Risk Applications, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

This work is part of a collaboration between O’Reilly and Dataiku. See our statement of editorial independence.

978-1-098-10243-2
[LSI]
Table of Contents

Foreword
Preface

Part I. Theories and Practical Applications of AI Risk Management

1. Contemporary Machine Learning Risk Management
    A Snapshot of the Legal and Regulatory Landscape
    The Proposed EU AI Act
    US Federal Laws and Regulations
    State and Municipal Laws
    Basic Product Liability
    Federal Trade Commission Enforcement
    Authoritative Best Practices
    AI Incidents
    Cultural Competencies for Machine Learning Risk Management
    Organizational Accountability
    Culture of Effective Challenge
    Diverse and Experienced Teams
    Drinking Our Own Champagne
    Moving Fast and Breaking Things
    Organizational Processes for Machine Learning Risk Management
    Forecasting Failure Modes
    Model Risk Management Processes
    Beyond Model Risk Management
    Case Study: The Rise and Fall of Zillow’s iBuying
    Fallout
    Lessons Learned
    Resources

2. Interpretable and Explainable Machine Learning
    Important Ideas for Interpretability and Explainability
    Explainable Models
    Additive Models
    Decision Trees
    An Ecosystem of Explainable Machine Learning Models
    Post Hoc Explanation
    Feature Attribution and Importance
    Surrogate Models
    Plots of Model Performance
    Cluster Profiling
    Stubborn Difficulties of Post Hoc Explanation in Practice
    Pairing Explainable Models and Post Hoc Explanation
    Case Study: Graded by Algorithm
    Resources

3. Debugging Machine Learning Systems for Safety and Performance
    Training
    Reproducibility
    Data Quality
    Model Specification for Real-World Outcomes
    Model Debugging
    Software Testing
    Traditional Model Assessment
    Common Machine Learning Bugs
    Residual Analysis
    Sensitivity Analysis
    Benchmark Models
    Remediation: Fixing Bugs
    Deployment
    Domain Safety
    Model Monitoring
    Case Study: Death by Autonomous Vehicle
    Fallout
    An Unprepared Legal System
    Lessons Learned
    Resources

4. Managing Bias in Machine Learning
    ISO and NIST Definitions for Bias
    Systemic Bias
    Statistical Bias
    Human Biases and Data Science Culture
    Legal Notions of ML Bias in the United States
    Who Tends to Experience Bias from ML Systems
    Harms That People Experience
    Testing for Bias
    Testing Data
    Traditional Approaches: Testing for Equivalent Outcomes
    A New Mindset: Testing for Equivalent Performance Quality
    On the Horizon: Tests for the Broader ML Ecosystem
    Summary Test Plan
    Mitigating Bias
    Technical Factors in Mitigating Bias
    The Scientific Method and Experimental Design
    Bias Mitigation Approaches
    Human Factors in Mitigating Bias
    Case Study: The Bias Bug Bounty
    Resources

5. Security for Machine Learning
    Security Basics
    The Adversarial Mindset
    CIA Triad
    Best Practices for Data Scientists
    Machine Learning Attacks
    Integrity Attacks: Manipulated Machine Learning Outputs
    Confidentiality Attacks: Extracted Information
    General ML Security Concerns
    Countermeasures
    Model Debugging for Security
    Model Monitoring for Security
    Privacy-Enhancing Technologies
    Robust Machine Learning
    General Countermeasures
    Case Study: Real-World Evasion Attacks
    Evasion Attacks
    Lessons Learned
    Resources

Part II. Putting AI Risk Management into Action

6. Explainable Boosting Machines and Explaining XGBoost
    Concept Refresher: Machine Learning Transparency
    Additivity Versus Interactions
    Steps Toward Causality with Constraints
    Partial Dependence and Individual Conditional Expectation
    Shapley Values
    Model Documentation
    The GAM Family of Explainable Models
    Elastic Net–Penalized GLM with Alpha and Lambda Search
    Generalized Additive Models
    GA2M and Explainable Boosting Machines
    XGBoost with Constraints and Post Hoc Explanation
    Constrained and Unconstrained XGBoost
    Explaining Model Behavior with Partial Dependence and ICE
    Decision Tree Surrogate Models as an Explanation Technique
    Shapley Value Explanations
    Problems with Shapley Values
    Better-Informed Model Selection
    Resources

7. Explaining a PyTorch Image Classifier
    Explaining Chest X-Ray Classification
    Concept Refresher: Explainable Models and Post Hoc Explanation Techniques
    Explainable Models Overview
    Occlusion Methods
    Gradient-Based Methods
    Explainable AI for Model Debugging
    Explainable Models
    ProtoPNet and Variants
    Other Explainable Deep Learning Models
    Training and Explaining a PyTorch Image Classifier
    Training Data
    Addressing the Dataset Imbalance Problem
    Data Augmentation and Image Cropping
    Model Training
    Evaluation and Metrics
    Generating Post Hoc Explanations Using Captum
    Evaluating Model Explanations
    The Robustness of Post Hoc Explanations
    Conclusion
    Resources

8. Selecting and Debugging XGBoost Models
    Concept Refresher: Debugging ML
    Model Selection
    Sensitivity Analysis
    Residual Analysis
    Remediation
    Selecting a Better XGBoost Model
    Sensitivity Analysis for XGBoost
    Stress Testing XGBoost
    Stress Testing Methodology
    Altering Data to Simulate Recession Conditions
    Adversarial Example Search
    Residual Analysis for XGBoost
    Analysis and Visualizations of Residuals
    Segmented Error Analysis
    Modeling Residuals
    Remediating the Selected Model
    Overemphasis of PAY_0
    Miscellaneous Bugs
    Conclusion
    Resources

9. Debugging a PyTorch Image Classifier
    Concept Refresher: Debugging Deep Learning
    Debugging a PyTorch Image Classifier
    Data Quality and Leaks
    Software Testing for Deep Learning
    Sensitivity Analysis for Deep Learning
    Remediation
    Sensitivity Fixes
    Conclusion
    Resources

10. Testing and Remediating Bias with XGBoost
    Concept Refresher: Managing ML Bias
    Model Training
    Evaluating Models for Bias
    Testing Approaches for Groups
    Individual Fairness
    Proxy Bias
    Remediating Bias
    Preprocessing
    In-processing
    Postprocessing
    Model Selection
    Conclusion
    Resources

11. Red-Teaming XGBoost
    Concept Refresher
    CIA Triad
    Attacks
    Countermeasures
    Model Training
    Attacks for Red-Teaming
    Model Extraction Attacks
    Adversarial Example Attacks
    Membership Attacks
    Data Poisoning
    Backdoors
    Conclusion
    Resources

Part III. Conclusion

12. How to Succeed in High-Risk Machine Learning
    Who Is in the Room?
    Science Versus Engineering
    The Data-Scientific Method
    The Scientific Method
    Evaluation of Published Results and Claims
    Apply External Standards
    Commonsense Risk Mitigation
    Conclusion
    Resources

Index
Foreword

Renowned statistician George Box once famously stated, “All models are wrong, but some are useful.” Acknowledgment of this fact forms the foundation of effective risk management. In a world where machine learning increasingly automates important decisions about our lives, the consequences of model failures can be catastrophic. It’s critical to take deliberate steps to mitigate risk and avoid unintended harm.

Following the 2008 financial crisis, regulators and financial institutions recognized the importance of managing model risk in ensuring the safety of banks, refining the practice of model risk management (MRM). As AI and machine learning gain widespread adoption, MRM principles are being applied to manage their risk. The National Institute of Standards and Technology’s AI Risk Management Framework serves as an example of this evolution. Proper governance and control of the entire process, from senior management oversight to policy and procedures, including organizational structure and incentives, are crucial to promoting a culture of model risk management.

In Machine Learning for High-Risk Applications, Hall, Curtis, and Pandey have presented a framework for applying machine learning to high-stakes decision making. They provide compelling evidence through documented cases of model failures and emerging regulations that highlight the importance of strong governance and culture. Unfortunately, these principles are still rarely implemented outside of regulated industries, such as banks.

The book covers important topics ranging across model transparency, governance, security, bias management, and more. Performance testing alone is not enough in machine learning, where very different models can have the same performance due to model multiplicity. Models must also be explainable, secure, and fair. This is the first book that emphasizes inherently interpretable models and their recent development and application, particularly in cases where models impact individuals, such as in consumer finance. In these scenarios, where explainability standards and regulations are particularly stringent, the explainable AI (XAI) post hoc explainability approach often faces significant challenges.

Developing reliable and safe machine learning systems also requires a rigorous evaluation of model weaknesses. This book presents two thorough examples alongside a methodology for model debugging, including identifying model flaws through error or residual slicing, evaluating model robustness under input corruption, assessing the reliability or uncertainty of model outputs, and testing model resilience under distribution shift through stress testing. These are crucial topics for developing and deploying machine learning in high-risk settings.

Machine learning models have the potential to disproportionately harm historically marginalized groups, and to deliver this harm rapidly and at scale through automation. Biased model decisions have detrimental impacts on protected groups, perpetuating social and economic disparities. In this book, the reader will learn how to approach the issue of model fairness through a sociotechnical lens. The authors also detail a thorough study of the effects of model debiasing techniques, and give practical advice on the application of these techniques within different regulated verticals.

Machine Learning for High-Risk Applications is a practical, opinionated, and timely book. Readers of all stripes will find rich insights into this fraught subject, whether you’re a data scientist interested in better understanding your models, or a manager responsible for ensuring compliance with existing standards, or an executive trying to improve your organization’s risk controls.

—Agus Sudjianto, PhD
EVP, Head of Corporate Model Risk, Wells Fargo
Preface

Today, machine learning (ML) is the most commercially viable subdiscipline of artificial intelligence (AI). ML systems are used to make high-risk decisions in employment, bail, parole, lending, security, and in many other high-impact applications throughout the world’s economies and governments. In a corporate setting, ML systems are used in all parts of an organization—from consumer-facing products, to employee assessments, to back-office automation, and more.

Indeed, the past decade has brought with it even wider adoption of ML technologies. But it has also proven that ML presents risks to its operators, consumers, and even the general public. Like all technologies, ML can fail—whether by unintentional misuse or intentional abuse. As of 2023, there have been thousands of public reports of algorithmic discrimination, data privacy violations, training data security breaches, and other harmful incidents. Such risks must be mitigated before organizations, and the public, can realize the true benefits of this exciting technology.

Addressing ML’s risks requires action from practitioners. While nascent standards, to which this book aims to adhere, have begun to take shape, the practice of ML still lacks broadly accepted professional licensing or best practices. That means it’s largely up to individual practitioners to hold themselves accountable for the good and bad outcomes of their technology when it’s deployed into the world. Machine Learning for High-Risk Applications will arm practitioners with a solid understanding of model risk management processes and new ways to use common Python tools for training explainable models and debugging them for reliability, safety, bias management, security, and privacy issues.

We adapt a definition of AI from Stuart Russell and Peter Norvig’s book, Artificial Intelligence: A Modern Approach: The designing and building of intelligent systems that receive signals from the environment and take actions that affect that environment (2020). For ML, we use the common definition attributed—perhaps apocryphally—to Arthur Samuel: [A] field of study that gives computers the ability to learn without being explicitly programmed (circa 1960).

Who Should Read This Book

This is a mostly technical book for early-to-middle career ML engineers and data scientists who want to learn about the responsible use of ML or ML risk management. The code examples are written in Python. That said, this book probably isn’t for every data scientist and engineer out there coding in Python.

This book is for you if you want to learn some model governance basics and update your workflow to accommodate basic risk controls. This book is for you if your work needs to comply with certain nondiscrimination, transparency, privacy, or security standards. (Although we can’t guarantee compliance or provide legal advice!) This book is for you if you want to train explainable models, and learn to edit and debug them. Finally, this book is for you if you’re concerned that your work in ML may be leading to unintended consequences relating to sociological biases, data privacy violations, security vulnerabilities, or other known problems caused by automated decision making writ large—and you want to do something about it.

Of course, this book may be of interest to others. If you’re coming to ML from a field like physics, econometrics, or psychometrics, this book can help you learn how to blend newer ML techniques with established domain expertise and notions of validity or causality. This book may give regulators or policy professionals some insights into the current state of ML technologies that may be used in an effort to comply with laws, regulations, or standards. Technical risk executives or risk managers may find this book helpful in providing an updated overview of newer ML approaches suited for high-stakes applications. And expert data scientists or ML engineers may find this book educational too, but they may also find it challenges many established data science practices.

What Readers Will Learn

Readers of this book will be exposed to both traditional model risk management and how to blend it with computer security best practices like incident response, bug bounties, and red-teaming, to apply battle-tested risk controls to ML workflows and systems. This book will introduce a number of older and newer explainable models, and explanation techniques that make ML systems even more transparent. Once we’ve set up a solid foundation of highly transparent models, we’ll dig into testing models for safety and reliability. That’s a lot easier when we can see how our model works! We’ll go way beyond quality measurements in holdout data to explore how to apply well-known diagnostic techniques like residual analysis, sensitivity analysis, and benchmarking to new types of ML models. We’ll then progress to structuring models for bias management, testing for bias, and remediating bias from an organizational and technical perspective. Finally, we’ll discuss security for ML pipelines and APIs.

The Draft European Union AI Act categorizes the following ML applications as high risk: biometric identification; management of critical infrastructure; education; employment; essential services, both public (e.g., public assistance) and private (e.g., credit lending); law enforcement; immigration and border control; criminal justice; and the democratic process. These are the types of ML use cases we have in mind when we refer to high-risk applications, and that’s why we’ve chosen to focus the code examples in this book on computer vision and tree-based models for tabular data.

Readers should also be aware that in this first edition we focus on more established ML methods for estimation and decision making. We do not address unsupervised learning, search, recommendation systems, reinforcement learning, and generative AI in great depth. There are several reasons for this:

• These systems are not the most common commercial production systems, yet.
• Before moving on to more sophisticated unsupervised, recommendation, and reinforcement learning or generative approaches, it is imperative that we master the fundamentals. This first edition of the book focuses on the basics that will enable readers to take on more sophisticated projects later.
• Risk management for these systems is not as well understood as it is for the types of supervised models we concentrate on in this book. To be direct—as we often are in the remainder of the book—using models for which failure modes, mitigants, and controls are not well known can increase risk.

We do hope to return to these topics in the future and we acknowledge they are affecting billions of people today—positively and negatively. We also note that with a little creativity and elbow grease many of the techniques, risk mitigants, and risk management frameworks in this book can and should be applied to unsupervised models, search, recommendation, and generative AI.

Cutting-edge generative AI systems, like ChatGPT and GitHub Copilot, are an exciting way ML is impacting our lives. These systems appear to have addressed some of the bias issues that plagued earlier generations of similar systems. However, they still pose risks when working in high-stakes applications. If we’re working with them and have concerns, we should consider the following simple guardrails:

Don’t copy and paste from or into the user interface.
Not using generated content directly and not pasting our own content directly into the interface can limit intellectual property and data privacy risks.

Check all generated content.
These systems continue to generate wrong, offensive, or otherwise problematic content.

Avoid automation complacency.
Generally, these systems are better suited to content generation than to decision support. We should be careful not to let them unintentionally make decisions for us.

Alignment with the NIST AI Risk Management Framework

In an attempt to follow our own advice, and to make the book even more practical for those working on high-risk applications, we will highlight where the proposed approaches in the book align to the nascent National Institute of Standards and Technology (NIST) AI Risk Management Framework (RMF). Application of external standards is a well-known risk management tactic, and NIST has an incredible track record for authoritative technical guidance. The AI RMF has many components, but two of the most central are the characteristics for trustworthiness in AI and the core RMF guidance. The characteristics for trustworthiness establish the basic principles of AI risk management, while the core RMF guidance provides advice for the implementation of risk controls. We will use vocabulary relating to NIST’s characteristics for AI trustworthiness throughout the book: validity, reliability, safety, security, resiliency, transparency, accountability, explainability, interpretability, bias management, and enhanced privacy. At the beginning of each chapter in Part I, we’ll also use a callout box to break down how and where the content aligns to specific aspects of the core NIST AI RMF map, measure, manage, and govern functions. We hope alignment to the NIST AI RMF improves the usability of the book, making it a more effective AI risk management tool.