Author: Philip A. Dursey

AI is the new attack surface – are your intelligent systems secure? In an era where self-driving cars, trading algorithms, and AI chatbots run critical operations, Red Teaming AI: Attacking & Defending Intelligent Systems shows you how to stay one step ahead of emerging threats. This comprehensive guide (1100+ pages) teaches you to think like an attacker so you can fortify your AI models before disaster strikes. Written by Philip A. Dursey – a 3× AI founder, former CISO, and current HYPERGAME CEO with nearly 20 years working on adversarial systems and security – this book distills hard-won expertise into practical strategies you can apply immediately.

Publisher: AI Security LLC
Publish Year: 2025
Language: English
File Format: PDF
File Size: 10.0 MB
Text Preview (First 20 pages)

RED TEAMING AI
RED TEAMING AI
ATTACKING & DEFENDING INTELLIGENT SYSTEMS
PHILIP A. DURSEY
AI SECURITY LLC
Copyright © 2025 by Philip A. Dursey. All rights reserved.

No part of this book may be reproduced in any form or by any electronic or mechanical means, including information storage and retrieval systems, without written permission from the author, except for the use of brief quotations in a book review.
For my Family
CONTENTS

Legal Disclaimer xvii

PART ONE: FOUNDATIONS

1. INTRODUCTION TO AI SECURITY RISKS 3
   Demystifying AI/ML for Security Professionals: A Red Teamer's View 6
   The Expanding AI Attack Surface: A Systems Thinking Perspective 8
   Why Traditional Security Paradigms Fall Short: Opening the Door for AI Red Teams 11
   Overview of AI Vulnerability Categories: The Red Team Kill Graph 13
   The Dual-Use Nature of AI: Attacker and Defender 15
   Real-World Implications & Examples: Why AI Red Teaming Matters 18
   References 19
   Summary 21
   Exercises (Red Team Focus) 22

2. DEFINING AI RED TEAMING 23
   What is AI Red Teaming? 25
   Distinguishing AI Red Teaming from Related Fields 28
   The AI Red Teaming Engagement Lifecycle 31
   Navigating Ethical and Legal Considerations 35
   The Evolving Landscape 37
   References 38
   Summary 39
   Exercises (Red Team Focus) 41

3. THE AI RED TEAMING MINDSET AND METHODOLOGY 43
   Thinking Like an AI Adversary 45
   Threat Modeling for AI Systems 51
   Developing a Structured AI Red Teaming Methodology 61
   Applying Frameworks 71
   Broader Context and Perspectives 73
   References 74
   Summary 76
   Exercises 77

PART TWO: ATTACK TOOLS & TECHNIQUES – UNDERSTANDING HOW AI SYSTEMS BREAK

4. DATA POISONING ATTACKS 81
   The Critical Role of Data Integrity 84
   Types of Data Poisoning Attacks 85
   Common Poisoning Techniques 90
   Attacker Mindset: Choosing the Right Technique 104
   Heightened Risks: Online and Federated Learning 109
   Detection and Mitigation Strategies 112
   References 117
   Summary 118
   Exercises 119

5. EVASION ATTACKS AT INFERENCE TIME 121
   Understanding Adversarial Examples 123
   Generating Adversarial Examples: The Attacker's Toolkit 125
   Defending Against Evasion Attacks 143
   References 146
   Summary 148
   Exercises 149

6. MODEL EXTRACTION AND STEALING 150
   Why Steal a Model? The Attacker's Motivation 151
   What Does It Mean to Steal a Model? 153
   How Do Model Extraction Attacks Work? 155
   The Red Teamer's Perspective 172
   Defenses Against Model Extraction 180
   References 193
   Summary 195
   Exercises 197

7. MEMBERSHIP INFERENCE ATTACKS 198
   Real-World Example: ChatGPT Incident 199
   What is Membership Inference? 199
   Why Does Membership Inference Matter? The Privacy Implications 200
   How Membership Inference Attacks Work: Leaking Information 203
   Attack Techniques 204
   Defensive Strategies Against Membership Inference 215
   References 217
   Summary 219
   Exercises 220

8. PROMPT INJECTION AND LLM MANIPULATION 222
   The Unique LLM Attack Surface 223
   Direct vs. Indirect Prompt Injection 224
   Prompt Manipulation Techniques 230
   The Human Element and Social Engineering 238
   Exploiting Plugins, Tools, and Function Calling 239
   Defensive Considerations and Mitigation Strategies 244
   References 251
   Summary 254
   Exercises 255

9. ATTACKING & DEFENDING AI INFRASTRUCTURE 257
   Attacking the MLOps Lifecycle Components 259
   Exploiting Frameworks and Libraries 269
   Securing Cloud and Container Environments 275
   GPU-Specific Attacks and Defenses in AI Infrastructure 277
   Securing the Data Architecture Infrastructure 283
   API Security for AI Systems 287
   Software Supply Chain Security for AI 289
   References 290
   Summary 293
   Exercises 294

10. PRIVACY ATTACKS BEYOND MEMBERSHIP INFERENCE 297
    Understanding Advanced Privacy Attack Vectors 298
    Attribute Inference: Inferring Hidden Secrets of Individuals 301
    Model Inversion: Reconstructing Representative Training Data 305
    Property Inference: Uncovering Global Dataset Secrets 312
    Linkage Attacks: Re-Identifying Individuals Across Datasets 315
    Impact of Privacy Attacks 319
    Federated Learning: Distributed Training, Distributed Risks? 321
    Defenses Against Advanced Privacy Attacks 326
    Ethical and Regulatory Considerations 334
    References 336
    Summary 338
    Exercises 339

11. SOCIAL ENGINEERING AND HUMAN FACTORS IN AI SECURITY 341
    AI-Enhanced Social Engineering 343
    AI-Driven Deception and Social Engineering: The Cognitive Battlefield 348
    The Rise of Deepfakes and Voice Cloning 351
    Disinformation and Influence Operations 352
    Exploiting User Trust in AI Systems 357
    Targeting the Human Element in the AI Pipeline 359
    Challenges in Detection and Mitigation 360
    Defenses and Mitigation Strategies 361
    Ethical Considerations and Responsible AI Use 367
    Future Trends and Evolving Threats 368
    References 369
    Summary 372
    Exercises 373

PART THREE: AI RED TEAMING IN ACTION – FROM THEORY TO PRACTICE

12. RECONNAISSANCE FOR AI SYSTEMS 379
    Identifying AI Components 380
    Passive vs. Active Reconnaissance 382
    Fingerprinting Models and Frameworks 383
    Discovering APIs, Endpoints, and Data Flows 389
    Understanding Data Flow 393
    Open Source Intelligence (OSINT) for AI 396
    Synthesizing Reconnaissance Findings 400
    References 401
    Summary 402
    Exercises 403

13. ESSENTIAL TOOLS FOR THE AI RED TEAMER 410
    Setting Up Your AI Red Teaming Lab 411
    Key Libraries for Adversarial Machine Learning 418
    Tools for Prompt Injection and LLM Assessment 423
    Leveraging Standard Penetration Testing Tools 425
    Advanced Simulation, Emulation, and Deception Platforms 429
    The Power of Custom Scripting 430
    References 433
    Summary 437
    Exercises 438

14. RED TEAMING LARGE LANGUAGE MODELS (LLMS) 444
    Hands-on Prompt Injection Testing 446
    Testing for Data Leakage 459
    Assessing Safety Filters and Alignment 462
    Exploiting Plugins, Tools, and Functions 465
    Denial of Service (DoS) Attacks 470
    Reporting LLM Red Team Findings 472
    Case Study: Red Teaming "HelpBot 5000" 475
    References 477
    Summary 480
    Exercises 482

15. RED TEAMING COMPUTER VISION (CV) SYSTEMS 486
    Adversarial Examples in the Image Domain 487
    Attacking Object Detection and Segmentation 499
    Facial Recognition Vulnerabilities 501
    Physical Adversarial Attacks 504
    Ethical Considerations in CV Red Teaming 509
    Case Study: Red Teaming a Smart Surveillance Camera System 510
    References 514
    Summary 517
    Exercises 518

16. RED TEAMING SPEECH AND AUDIO SYSTEMS 520
    Adversarial Audio Attacks 521
    Attacking Speech-to-Text (ASR) Systems 526
    Voice Assistant Security 527
    War Stories: Audio Attacks in Practice 531
    Practical Tools for Adversarial Audio Testing 537
    Future Trends and Research Directions 539
    References 540
    Summary 542
    Exercises 543

17. RED TEAMING OTHER AI DOMAINS 544
    Attacking Recommender Systems 546
    Evading Anomaly Detection Systems 560
    Exploiting Reinforcement Learning (RL) Systems 571
    Attacking Tabular Data Models 592
    Cross-Domain Attack Considerations 607
    References 610
    Summary 613
    Exercises 614

18. ADVANCED TECHNIQUES AND BYPASSES 616
    Bypassing Defenses 617
    Multi-Stage Attacks and Vulnerability Chaining 624
    Exploiting Interpretability Tools 627
    Attacking Watermarking 630
    Emerging Techniques and Future Trends 635
    Advanced Defense Paradigms: Active Defense, Hypergames, and Reflexive Control 637
    Contextualizing Advanced Attacks with Frameworks 639
    References 641
    Summary 645
    Exercises 646

19. EFFECTIVE REPORTING AND COMMUNICATION 647
    Structuring Your Findings for Clarity and Impact 648
    Quantifying and Communicating Risk 652
    Visualizing Attacks and Impact 655
    Communicating Effectively to Different Stakeholders 659
    Presenting Findings and Gathering Feedback 662
    Operational Security (OPSEC) for Reporting and Handling Sensitive Findings 663
    Driving Action: Remediation Tracking and Follow-up 666
    Responsible Disclosure 669
    References 671
    Summary 673
    Exercises 674

PART FOUR: BUILDING RESILIENT AI SYSTEMS

20. REMEDIATION STRATEGIES AND DEFENSES 679
    Defense-in-Depth for AI Systems: A Systems Thinking Approach 681
    Threat-Informed Defense: Prioritizing Based on Adversary Behavior 684
    Robust Training Practices 687
    Input Validation and Sanitization 688
    Output Filtering and Monitoring 696
    Model Hardening Techniques 698
    Active Defense: Generative Deception and Agentic Responses 701
    Organizational Aspects of Remediation 702
    Continuous Monitoring, Incident Response, and Remediation Operations: Enabling Resilience 704
    References 710
    Summary 712
    Exercises 714

21. INTEGRATING AI RED TEAMING INTO THE DEVELOPMENT LIFECYCLE 716
    Shifting Left: The Imperative for Early AI Security Testing 717
    Introducing the Secure AI Development Lifecycle (SAIDL) 721
    Continuous and Automated AI Red Teaming 733
    Fostering Effective Collaboration Models 738
    Addressing Insider Threats in the AI Lifecycle 742
    Leveraging Bug Bounty Programs for AI Systems 750
    References 754
    Summary 757
    Exercises 759

PART FIVE: STRATEGY, FORESIGHT, AND RESPONSIBILITY

22. BUILDING AND MATURING AN AI RED TEAM CAPABILITY 763
    Defining the AI Red Team's Scope, Mandate, and Goals: The Foundation of Authority 765
    Structuring the Team: Assembling the Elite AI Adversarial Unit 770
    Developing Processes and Playbooks: Operationalizing the Capability 778
    Measuring Success: Metrics, KPIs, and Demonstrating Impactful ROI 786
    Budgeting and Justifying ROI: Securing Resources for Strategic Assurance 792
    Leveling Up: AI Red Teaming Meets Cyber Wargaming 796
    The Future is Automated (and Autonomous?): AI for AI Red Teaming 800
    Staying Current: The Unrelenting Mandate for Continuous Learning and Adaptation 805
    Summary: Forging a Strategic AI Assurance Capability 809
    References 811
    Exercises 813

23. EMERGING THREATS AND FUTURE ATTACK VECTORS 817
    AI vs. AI: The Automation of Attack and Defense 819
    The Quantum Shadow: Potential Impacts on AI Security 827
    Federated Learning: Distributed Risks 829
    Beyond LLMs: Security of Other Generative AI Models 831
    Securing AI in the Physical World: Robotics and Automation 833
    Future Research Directions 837
    Long-Term and Systemic Risks 842
    The Specter of Artificial General Intelligence (AGI) 844
    References 846
    Summary 848

24. NAVIGATING THE AI RISK LANDSCAPE: REGULATION, ETHICS, AND SOCIETAL IMPACT 850
    The Shifting Regulatory Terrain: Compliance vs. Demonstrated Security 852
    US Policy & Strategic Directions: Evaluating Impact Beyond Intent 858
    The Geo-Strategic Context: Market Agility vs. State Control in the US-China Rivalry 861
    The AI-Cyber Warfare and Exploitation Dynamic 863
    State Responses: Cyber Privateering and Dismantling Adversarial AI 865
    AI in the Cyber Intelligence Contest: Autonomous Defense and Hypergames 870
    Visualizing the AI Risk Landscape 872
    Bias, Fairness, and Transparency as Security Concerns 874
    Ethics in Offensive AI Research: Practicing Safe Science 879
    Societal Impact and the Broader Threat Landscape 881
    Open Source AI: Decentralization, Innovation, and Security Challenges 884
    What This Means for Red Teamers: Embracing Adaptive Realities & High Agency 885
    References 888
    Summary 895
    Exercises 897

25. THE ROAD AHEAD 899
    Synthesizing the Core Principles 900
    Thinking Strategically: Advanced Adversarial Models 903
    The Evolving Threat Landscape and Defensive Posture 905
    A Call to Action: Building Cyber Defense at the Speed of AI 906

Appendix A: Glossary of AI and Security Terms 911
Appendix B: Chapter Bibliography 963
Appendix C: AI Red Teaming Tool Compendium 1025
About the Author 1043
LEGAL DISCLAIMER

The information presented throughout this book is intended strictly for educational and informational purposes. It is not a substitute for professional advice and should not be construed as legal, financial, technical, or ethical guidance. While this work explores techniques and methodologies related to security testing and AI red teaming, including adversarial tactics and system probing, such knowledge carries inherent risks and responsibilities. The reader assumes full responsibility for the consequences of any actions taken based on the content of this book.

The authors and publisher strongly caution against the unauthorized use of any tools, strategies, or procedures described herein. Always seek and obtain explicit, written authorization before conducting any security assessments, red teaming operations, or related activities on systems you do not own or have direct permission to evaluate. Engaging in such activities without proper consent may violate laws, contractual obligations, or ethical norms, and could result in civil or criminal liability.

References, citations, or links to specific tools, technologies, organizations, or individuals are provided solely for illustrative or informational purposes. Inclusion of any such reference does not imply endorsement, recommendation, or affiliation. Readers should independently verify any cited resources before applying them in practice.

Neither the authors, contributors, editors, nor the publisher shall be held liable for any loss, injury, damage, or legal consequence arising from the use or misuse of the information in this book. Readers are advised to consult with qualified legal counsel, cybersecurity professionals, and other relevant experts before implementing any of the concepts discussed.
PART ONE
FOUNDATIONS

Welcome to the front lines of a new security paradigm. The rapid proliferation of Artificial Intelligence (AI) presents not just transformative opportunities, but also a landscape fraught with novel and complex security challenges. Traditional defenses often prove inadequate against threats that target the very intelligence and learning capabilities of these systems. Understanding how to secure AI is no longer a niche concern—it's an imperative for anyone involved in building, deploying, or managing these powerful technologies.

Part I: Foundations lays the critical groundwork for navigating this evolving domain. We begin by confronting the 'why': Why do AI systems demand a fundamentally different approach to security? This Part establishes the essential concepts and perspectives needed before you can effectively identify and mitigate AI-specific vulnerabilities. We'll move from recognizing the unique threat landscape to understanding the specialized discipline designed to address it.

As we explore the unique security risks inherent in AI systems (Chapter 1), the structured approach of AI Red Teaming (Chapter 2), and the crucial adversarial mindset and methodology (Chapter 3), it's important to grasp the paradigm shift required. We are moving beyond conventional cybersecurity to a world where data, algorithms, and emergent behaviors become primary attack surfaces. Understanding this shift is key to appreciating the depth and nature of AI vulnerabilities.

By the end of this Part, you'll have a robust conceptual framework – a clear understanding of why AI security is distinct, what constitutes a dedicated adversarial assessment, and how to begin cultivating the mindset necessary to protect these intelligent systems. Our journey starts with an exploration of the unique security risks that AI introduces, setting the stage for a new way of thinking about security in an artificially intelligent world.