Katy Warr
Strengthening Deep Neural Networks
Making AI Less Susceptible to Adversarial Trickery
Strengthening Deep Neural Networks
by Katy Warr

Copyright © 2019 Katy Warr. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Jonathan Hassell
Development Editor: Michele Cronin
Production Editor: Deborah Baker
Copy Editor: Sonia Saruba
Proofreader: Rachel Head
Indexer: WordCo Indexing Services
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

July 2019: First Edition

Revision History for the First Edition
2019-07-02: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492044956 for release details.

ISBN: 978-1-492-04495-6 [GP]

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Strengthening Deep Neural Networks, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
Table of Contents

Preface

Part I. An Introduction to Fooling AI

1. Introduction
    A Shallow Introduction to Deep Learning
    A Very Brief History of Deep Learning
    AI “Optical Illusions”: A Surprising Revelation
    What Is “Adversarial Input”?
    Adversarial Perturbation
    Unnatural Adversarial Input
    Adversarial Patches
    Adversarial Examples in the Physical World
    The Broader Field of “Adversarial Machine Learning”
    Implications of Adversarial Input

2. Attack Motivations
    Circumventing Web Filters
    Online Reputation and Brand Management
    Camouflage from Surveillance
    Personal Privacy Online
    Autonomous Vehicle Confusion
    Voice Controlled Devices

3. Deep Neural Network (DNN) Fundamentals
    Machine Learning
    A Conceptual Introduction to Deep Learning
    DNN Models as Mathematical Functions
    DNN Inputs and Outputs
    DNN Internals and Feed-Forward Processing
    How a DNN Learns
    Creating a Simple Image Classifier

4. DNN Processing for Image, Audio, and Video
    Image
    Digital Representation of Images
    DNNs for Image Processing
    Introducing CNNs
    Audio
    Digital Representation of Audio
    DNNs for Audio Processing
    Introducing RNNs
    Speech Processing
    Video
    Digital Representation of Video
    DNNs for Video Processing
    Adversarial Considerations
    Image Classification Using ResNet50

Part II. Generating Adversarial Input

5. The Principles of Adversarial Input
    The Input Space
    Generalizations from Training Data
    Experimenting with Out-of-Distribution Data
    What’s the DNN Thinking?
    Perturbation Attack: Minimum Change, Maximum Impact
    Adversarial Patch: Maximum Distraction
    Measuring Detectability
    A Mathematical Approach to Measuring Perturbation
    Considering Human Perception
    Summary

6. Methods for Generating Adversarial Perturbation
    White Box Methods
    Searching the Input Space
    Exploiting Model Linearity
    Adversarial Saliency
    Increasing Adversarial Confidence
    Variations on White Box Approaches
    Limited Black Box Methods
    Score-Based Black Box Methods
    Summary

Part III. Understanding the Real-World Threat

7. Attack Patterns for Real-World Systems
    Attack Patterns
    Direct Attack
    Replica Attack
    Transfer Attack
    Universal Transfer Attack
    Reusable Patches and Reusable Perturbation
    Bringing It Together: Hybrid Approaches and Trade-offs

8. Physical-World Attacks
    Adversarial Objects
    Object Fabrication and Camera Capabilities
    Viewing Angles and Environment
    Adversarial Sound
    Audio Reproduction and Microphone Capabilities
    Audio Positioning and Environment
    The Feasibility of Physical-World Adversarial Examples

Part IV. Defense

9. Evaluating Model Robustness to Adversarial Inputs
    Adversarial Goals, Capabilities, Constraints, and Knowledge
    Goals
    Capabilities, Knowledge, and Access
    Model Evaluation
    Empirically Derived Robustness Metrics
    Theoretically Derived Robustness Metrics
    Summary

10. Defending Against Adversarial Inputs
    Improving the Model
    Gradient Masking
    Adversarial Training
    Out-of-Distribution Confidence Training
    Randomized Dropout Uncertainty Measurements
    Data Preprocessing
    Preprocessing in the Broader Processing Chain
    Intelligently Removing Adversarial Content
    Concealing the Target
    Building Strong Defenses Against Adversarial Input
    Open Projects
    Taking a Holistic View

11. Future Trends: Toward Robust AI
    Increasing Robustness Through Outline Recognition
    Multisensory Input
    Object Composition and Hierarchy
    Finally…

A. Mathematics Terminology Reference

Index
Preface

Artificial intelligence (AI) is prevalent in our lives. Every day, machines make sense of complex data: surveillance systems perform facial recognition, digital assistants comprehend spoken language, and autonomous vehicles and robots are able to navigate the messy and unconstrained physical world. AI not only competes with human capabilities in areas such as image, audio, and text processing, but often exceeds human accuracy and speed.

While we celebrate advancements in AI, deep neural networks (DNNs)—the algorithms intrinsic to much of AI—have recently been proven to be at risk from attack through seemingly benign inputs. It is possible to fool DNNs by making subtle alterations to input data that often either remain undetected or are overlooked if presented to a human. For example, alterations to images that are so small as to remain unnoticed by humans can cause DNNs to misinterpret the image content. As many AI systems take their input from external sources—voice recognition devices or social media uploads, for example—this ability to be tricked by adversarial input opens a new, often intriguing, security threat. This book is about this threat, what it tells us about DNNs, and how we can subsequently make AI more resilient to attack.

By considering real-world scenarios where AI is exploited in our daily lives to process image, audio, and video data, this book considers the motivations, feasibility, and risks posed by adversarial input. It provides both intuitive and mathematical explanations for the topic and explores how intelligent systems can be made more robust against adversarial input.

Understanding how to fool AI also provides us with insights into the often opaque deep learning algorithms, and discrepancies between how these algorithms and the human brain process sensory input. This book considers these differences and how artificial learning may move closer to its biological equivalent in the future.
Who Should Read This Book

The target audiences of this book are:

• Data scientists developing DNNs. You will gain greater understanding of how to create DNNs that are more robust against adversarial input.

• Solution and security architects incorporating deep learning into operational pipelines that take image, audio, or video data from untrusted sources. After reading this book, you will understand the risks of adversarial input to your organization’s information assurance and potential risk mitigation strategies.

• Anyone interested in the differences between artificial and biological perception. If you fall into this category, this book will provide you with an introduction to deep learning and explanations as to why algorithms that appear to accurately mimic human perception can get it very wrong. You’ll also get an insight into where and how AI is being used in our society and how artificial learning may become better at mimicking biological intelligence in the future.

This book is written to be accessible to people from all knowledge backgrounds, while retaining the detail that some readers may be interested in. The content spans AI, human perception of audio and image, and information assurance. It is deliberately cross-disciplinary to capture different perspectives of this fascinating and fast-developing field.

To read this book, you don’t need prior knowledge of DNNs; everything you need to know is in an introductory chapter on DNNs (Chapter 3). Conversely, if you are a data scientist already familiar with deep learning methods, you may wish to skip that chapter.

The explanations are presented to be accessible to both mathematicians and nonmathematicians. Optional mathematics is included for those who are interested in seeing the formulae that underpin some of the ideas behind deep learning and adversarial input. Just in case you have forgotten your high school mathematics and require a refresher, key notations are included in the appendix.

The code samples are also optional and provided for those software engineers or data scientists who like to put theoretical knowledge into practice. The code is written in Python, using Jupyter notebooks. Code snippets that are important to the narrative are included in the book, but all the code is located in an associated GitHub repository. Full details on how to run the code are also included in the repository.

This is not a book about security surrounding the broader topic of machine learning; its focus is specifically DNN technologies for image and audio processing, and the mechanisms by which they may be fooled without misleading humans.
How This Book Is Organized

This book is split into four parts:

Part I, An Introduction to Fooling AI
This group of chapters provides an introduction to adversarial input and attack motivations and explains the fundamental concepts of deep learning for processing image and audio data:

• Chapter 1 begins by introducing adversarial AI and the broader topic of deep learning.
• Chapter 2 considers potential motivations behind the generation of adversarial image, audio, and video.
• Chapter 3 provides a short introduction to DNNs. Readers with an understanding of deep learning concepts may choose to skip this chapter.
• Chapter 4 then provides a high-level overview of DNNs used in image, audio, and video processing to provide a foundation for understanding the concepts in the remainder of this book.

Part II, Generating Adversarial Input
Following the introductory chapters of Part I, these chapters explain adversarial input and how it is created in detail:

• Chapter 5 provides a conceptual explanation of the ideas that underpin adversarial input.
• Chapter 6 then goes into greater depth, explaining computational methods for generating adversarial input.

Part III, Understanding the Real-World Threat
Building on the methods introduced in Part II, this part considers how an adversary might launch an attack in the real world, and the challenges that they might face:

• Chapter 7 considers real attacks and the challenges that an adversary faces when using the methods defined in Part II against real-world systems.
• Chapter 8 explores the specific threat of adversarial objects or adversarial sounds that are developed and created in the physical world.

Part IV, Defense
Building on Part III, this part moves the discussion to building resilience against adversarial input:
• Chapter 9 considers how the robustness of neural networks can be evaluated, both empirically and theoretically.
• Chapter 10 explores the most recent thinking in the area of how to strengthen DNN algorithms against adversarial input. It then takes a more holistic view and considers defensive measures that can be introduced to the broader processing chain of which the neural network technology is a part.
• Finally, Chapter 11 looks at future directions and how DNNs are likely to evolve in forthcoming years.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold
Shows commands or other text that should be typed literally by the user.

Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.

This element signifies a tip or suggestion.

This element signifies a general note.

This element indicates a warning or caution.
Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/katywarr/strengthening-dnns.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Strengthening Deep Neural Networks by Katy Warr (O’Reilly). Copyright 2019 Katy Warr, 978-1-492-04495-6.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

The Mathematics in This Book

This book is intended for both mathematicians and nonmathematicians. If you are unfamiliar with (or have forgotten) mathematical notations, Appendix A contains a summary of the main mathematical symbols used in this book.

O’Reilly Online Learning

For almost 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed. Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.
How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/Strengthening_DNNs.

To comment or ask technical questions about this book, send email to bookquestions@oreilly.com.

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

I am very grateful to the O’Reilly team for giving me the opportunity to write this book and providing excellent support throughout. Thank you especially to my editor, Michele Cronin, for her help and encouragement, and to the production team of Deborah Baker, Rebecca Demarest, and Sonia Saruba. Thanks also to Nick Adams from the tools team for working out some of the more tricky LaTeX math formatting.

Thank you to my reviewers: Nikhil Buduma, Pin-Yu Chen, Dominic Monn, and Yacin Nadji. Your comments were all extremely helpful. Thank you also Dominic for checking over the code and providing useful suggestions for improvement.

Several of my work colleagues at Roke Manor Research provided insightful feedback that provoked interesting discussions on deep learning, cybersecurity, and mathematics. Thank you to Alex Collins, Robert Hancock, Darren Richardson, and Mark West.

Much of this book is based on recent research and I am grateful to all the researchers who kindly granted me permission to use images from their work.

Thank you to my children for being so supportive: Eleanor for her continual encouragement, and Dylan for patiently explaining some of the math presented in the research papers (and for accepting that “maths” might be spelled with a letter missing in this US publication).

Finally, thank you to my husband George for the many cups of tea and for reviewing the early drafts when the words were in completely the wrong order. Sorry I didn’t include your jokes.
PART I. An Introduction to Fooling AI

This section provides an introduction to deep neural networks (DNNs), exploring how these can, and why they might be, tricked by adversarial input.

To begin, Chapter 1 takes a look at the concept of adversarial input and a little history. We’ll peek at some of the fascinating research that has provided insights into DNNs and how they can be fooled. Chapter 2 then goes on to explore the potential impact of adversarial input, examining real-world motivations for fooling the AI that is the foundation of systems such as social media sites, voice controlled devices, and autonomous vehicles.

The final chapters in this section give an introduction to DNNs for image, audio, and video, for those of you who are unfamiliar with this area or would like a refresher. They will provide the necessary foundation for understanding the concepts in the remainder of the book. Chapter 3 explains the basic principles of machine and deep learning. Chapter 4 explains typical ways in which these principles are extended and applied to understand image, audio, and video. Both of these chapters finish with code examples that will be revisited later in the book when we examine how adversarial input is created and defended against.

At the end of this section, you will have an understanding of adversarial examples, the motivations for creating them, and the systems at risk of attack. Part II will then examine how adversarial input is created to trick image and audio DNNs.
CHAPTER 1
Introduction

This book is concerned with deep neural networks (DNNs), the deep learning algorithms that underpin many aspects of artificial intelligence (AI). AI covers the broad discipline of creating intelligent machines that mimic human intelligence capabilities such as the processing and interpretation of images, audio, and language; learning from and interacting with unpredictable physical and digital environments; and reasoning about abstract ideas and concepts. While AI also exploits other methods such as the broader field of machine learning (ML) and traditionally programmed algorithms, the ability of deep learning to imitate human capabilities places DNNs central to this discipline. DNNs can mimic, and often exceed, human capability in many tasks, such as image processing, speech recognition, and text comprehension. However, this book is not about how accurate or fast DNNs are; it’s about how they can be fooled and what can be done to strengthen them against such trickery.

This introduction will begin with a brief explanation of DNNs, including some history and when it first became apparent that they might not always return the answer that we expect. This introductory chapter then goes on to explain what comprises adversarial input and its potential implications in a society where AI is becoming increasingly prevalent.

A Shallow Introduction to Deep Learning

A DNN is a type of machine learning algorithm. In contrast to traditional software programs, these algorithms do not expose the rules that govern their behavior in explicitly programmed steps, but learn their behavior from example (training) data. The learned algorithm is often referred to as a model because it provides a model of the characteristics of the training data used to generate it.
DNNs are a subset of a broader set of algorithms termed artificial neural networks (ANNs). The ideas behind ANNs date back to the 1940s and 1950s, when researchers first speculated that human intelligence and learning could be artificially simulated through algorithms (loosely) based on neuroscience. Because of this background, ANNs are sometimes explained at a high level in terms of neurobiological constructs, such as neurons and the axons and synapses that connect them.

The architecture (or structure) of an ANN is typically layered, ingesting data into a first layer of artificial “neurons” that cause connecting artificial “synapses” to fire and trigger the next layer, and so on until the final neuron layer produces a result. Figure 1-1 is an extreme simplification of the highly advanced artificial neural processing performed by Deep Thought in The Hitchhiker’s Guide to the Galaxy, by Douglas Adams (1979). It takes in data and returns the meaning of life.¹

¹ Determining the kind of input data that this DNN would need to perform the required task is left as an exercise for the reader.

Figure 1-1. A simplified depiction of a possible DNN implementation of Deep Thought, the computer tasked with establishing the meaning of life

A DNN learns its behavior—essentially the circumstances under and extent to which the synapses and neurons should fire—by examples. Examples are presented in the form of training data, and the network’s behavior is adjusted until it behaves in the way that is required.

The training step to create a DNN is classified as “deep” learning because, in contrast to the simple ANNs, DNN models comprise multiple layers of neurons between the layer that receives the input and the layer that produces output. They are used when the data or problem is too complex for simple ANNs or more traditional ML approaches.
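To make the layered picture concrete, the following is a minimal sketch of such a network in Python. It uses the Keras API purely for illustration; the layer sizes, the synthetic data, and the training settings are assumptions made for the sake of a small runnable example, not the book’s own code, which builds a real image classifier in Chapter 3.

```python
# A minimal sketch of a layered ("deep") feed-forward network.
# Illustrative only: the data is synthetic and the layer sizes are arbitrary.
import numpy as np
from tensorflow import keras

# Synthetic training data: 1,000 examples, each with 20 input features,
# labeled with one of 3 classes.
x_train = np.random.rand(1000, 20)
y_train = np.random.randint(0, 3, size=1000)

# "Deep" simply means more than one layer of neurons sits between the
# layer that receives the input and the layer that produces the output.
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(20,)),  # first hidden layer
    keras.layers.Dense(32, activation="relu"),                      # second hidden layer
    keras.layers.Dense(3, activation="softmax"),                    # output layer: one score per class
])

# Training adjusts the connection weights (the artificial "synapses") until
# the network's outputs match the example labels as closely as possible.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=32)

# The learned model can then be queried with new, unseen inputs.
print(model.predict(x_train[:1]))
```

On random data the network cannot learn anything meaningful, of course; the point is only to show the shape of the code: define the layers, then fit the model to example data rather than programming its rules explicitly.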