Building an Anonymization Pipeline: Creating Safe Data
Author: Luk Arbuckle, Khaled El Emam

How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time.

• Create anonymization solutions diverse enough to cover a spectrum of use cases
• Match your solutions to the data you use, the people you share it with, and your analysis goals
• Build anonymization pipelines around various data collection models to cover different business needs
• Generate an anonymized version of original data or use an analytics platform to generate anonymized outputs
• Examine the ethical issues around the use of anonymized data

ISBN: 1492053430
Publisher: O'Reilly Media
Publish Year: 2020
Language: English
Pages: 166
File Format: PDF
File Size: 14.3 MB
Text Preview (First 20 pages)

Building an Anonymization Pipeline: Creating Safe Data
Luk Arbuckle and Khaled El Emam
Boston · Farnham · Sebastopol · Tokyo · Beijing
Building an Anonymization Pipeline by Luk Arbuckle and Khaled El Emam
Copyright © 2020 K Sharp Technology, Inc., and Luk Arbuckle. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Jonathan Hassell
Development Editor: Melissa Potter
Production Editor: Christopher Faucher
Copyeditor: Sonia Saruba
Proofreader: Charles Roumeliotis
Indexer: Angela Howard
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

April 2020: First Edition

Revision History for the First Edition
2020-04-10: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492053439 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Building an Anonymization Pipeline, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher's views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

ISBN: 978-1-492-05343-9 [LSI]
Table of Contents

Preface

1. Introduction
   Identifiability
   Getting to Terms
   Laws and Regulations
   States of Data
   Anonymization as Data Protection
   Approval or Consent
   Purpose Specification
   Re-identification Attacks
   Anonymization in Practice
   Final Thoughts

2. Identifiability Spectrum
   Legal Landscape
   Disclosure Risk
   Types of Disclosure
   Dimensions of Data Privacy
   Re-identification Science
   Defined Population
   Direction of Matching
   Structure of Data
   Overall Identifiability
   Final Thoughts

3. A Practical Risk-Management Framework
   Five Safes of Anonymization
   Safe Projects
   Safe People
   Safe Settings
   Safe Data
   Safe Outputs
   Five Safes in Practice
   Final Thoughts

4. Identified Data
   Requirements Gathering
   Use Cases
   Data Flows
   Data and Data Subjects
   From Primary to Secondary Use
   Dealing with Direct Identifiers
   Dealing with Indirect Identifiers
   From Identified to Anonymized
   Mixing Identified with Anonymized
   Applying Anonymized to Identified
   Final Thoughts

5. Pseudonymized Data
   Data Protection and Legal Authority
   Pseudonymized Services
   Legal Authority
   Legitimate Interests
   A First Step to Anonymization
   Revisiting Primary to Secondary Use
   Analytics Platforms
   Synthetic Data
   Biometric Identifiers
   Final Thoughts

6. Anonymized Data
   Identifiability Spectrum Revisited
   Making the Connection
   Anonymized at Source
   Additional Sources of Data
   Pooling Anonymized Data
   Pros/Cons of Collecting at Source
   Methods of Collecting at Source
   Safe Pooling
   Access to the Stored Data
   Feeding Source Anonymization
   Final Thoughts

7. Safe Use
   Foundations of Trust
   Trust in Algorithms
   Techniques of AIML
   Technical Challenges
   Algorithms Failing on Trust
   Principles of Responsible AIML
   Governance and Oversight
   Privacy Ethics
   Data Monitoring
   Final Thoughts

Index
Preface

A few years ago we partnered with O'Reilly to write a book of case studies and methods for anonymizing health data, walking readers through practical methods to produce anonymized data sets in a variety of contexts.[1] Since that time, interest in anonymization, sometimes also called de-identification, has increased due to the growth and use of data, evolving and stricter privacy laws, and expectations of trust by privacy regulators, by private industry, and by citizens from whom data is being collected and processed.

[1] Khaled El Emam and Luk Arbuckle, Anonymizing Health Data: Case Studies and Methods to Get You Started (Sebastopol, CA: O'Reilly, 2014), http://oreil.ly/anonymizing-health-data.

Why We Wrote This Book

The sharing of data for the purposes of data analysis and research can have many benefits. At the same time, concerns and controversies about data ownership and data privacy elicit significant debate. O'Reilly's "Data Newsletter" on January 2, 2019, recognized that tools for secure and privacy-preserving analytics are a trend on the O'Reilly radar. Thus an idea was born: write a book that provides strategic opportunities to leverage the spectrum of identifiability to disassociate the personal from data in a variety of contexts to enhance privacy while providing useful data.

The result is this book, in which we explore end-to-end solutions to reduce the identifiability of data. We draw on various data collection models and use cases that are enabled by real business needs, have been learned from working in some of the most demanding data environments, and are based on practical approaches that have stood the test of time.
The central question we are consistently asked is how to utilize data in a way that protects individual privacy, but still ensures the data is of sufficient granularity that analytics will be useful and meaningful. By incorporating anonymization methods to reduce identifiability, organizations can establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. We will describe different technologies that reduce identifiability by generalizing, suppressing, or randomizing data, to produce outputs of data or statistics. We will also describe how these technologies fit within the broader theme of "risk-based" methods to drive the degree of data transformations needed based on the context of data sharing.

The purpose of a risk-based approach is to replace an otherwise subjective gut check with a more guided decision-making approach that is scalable and proportionate, resulting in solutions that ensure data is useful while being sufficiently protected. Statistical estimators are used to provide objective support, with greater emphasis placed on empirical evidence to drive decision making.
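The book itself contains no code, but as a rough sketch of the generalization, suppression, and randomization just described, the following Python example applies all three to a toy record set: exact ages become ten-year bands, postal codes shared by too few records are masked, and a numeric value is perturbed with bounded noise. The field names, thresholds, and noise scale are hypothetical choices for illustration, not the book's method.

```python
import random
from collections import Counter

# Toy records with hypothetical quasi-identifiers (not from the book).
records = [
    {"age": 34, "postal": "K1A 0B1", "income": 72000},
    {"age": 36, "postal": "K1A 0B1", "income": 65000},
    {"age": 81, "postal": "X0A 0H0", "income": 58000},
]

def generalize_age(age, width=10):
    """Generalization: replace an exact age with a ten-year band."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def suppress_postal(postal, counts, threshold=2):
    """Suppression: mask postal codes appearing fewer than `threshold` times."""
    return "***" if counts[postal] < threshold else postal

def randomize_income(income, scale=5000):
    """Randomization: perturb a numeric value with bounded uniform noise."""
    return income + random.randint(-scale, scale)

postal_counts = Counter(r["postal"] for r in records)
safer = [
    {
        "age": generalize_age(r["age"]),
        "postal": suppress_postal(r["postal"], postal_counts),
        "income": randomize_income(r["income"]),
    }
    for r in records
]
print(safer)
```

In a risk-based approach, the band width, suppression threshold, and noise scale would be tuned against a measured level of identifiability for the intended sharing context, rather than fixed up front.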
We have a combined three decades of experience in data privacy, from academic research and authorship to training courses, seminars, and presentations, as well as leading highly skilled teams of researchers, data scientists, and practitioners. We've learned a great deal, and we continue to learn a great deal, about how to put privacy technology into practice. We want to share that knowledge to help drive best practice forward, demonstrating that it is possible to achieve the "win-win" of data privacy that has been championed by the likes of former privacy commissioner Dr. Ann Cavoukian in her highly influential concept of Privacy by Design.[2] There are many privacy advocates who believe that we can and should treat privacy as a societal good that is encouraged and enforced, and that there are practical ways we can achieve this while meeting the wants and needs of our modern society.

[2] Ann Cavoukian, "Privacy by Design: The 7 Foundational Principles," Information and Privacy Commissioner of Ontario (January 2011), https://oreil.ly/eSQRA.

This is, however, a book of strategy, not a book of theory. Consider this book your advisor on how to plan for and use the full spectrum of anonymization tools and processes. The book will guide you in using data for purposes other than those originally intended, helping to ensure not only that the data is richer but also that its use is legal and defensible. We will work through different scenarios based on three distinct classes of identifiability of the data involved, and provide details to understand some of the strategic considerations that organizations are struggling with.

Our aim is to help match privacy considerations to technical solutions. This book is generic, however, touching on a variety of topics relevant to anonymization. Legal interpretations are contextual, and we urge you to consult with your legal and privacy team! Materials presented in this book are for informational purposes only, and not for the purpose of providing legal advice. Okay, now that we've given our disclaimer, we can breathe easy.

Who This Book Was Written For

When conceptualizing this book, we divided the audience into two groups: those who need strategic support (our primary audience) and those who need to understand strategic decisions (our secondary audience). Whether in government or industry, it is a functional need to deliver on the promise of data. We assume that our audience is ready to do great things, beyond compliance with data privacy and data protection laws. And we assume that they are looking for data access models to enable the safe and responsible use of data.

Primary audience (concerned with crafting a vision and ensuring the successful execution of that vision):

• Executive teams concerned with how to make the most of data, e.g., to improve efficiencies, derive new insights, and bring new products to market, all in an effort to make their services broader and better while enhancing the privacy of data subjects. They are more likely to skim this book to nail down their vision and how anonymization fits within it.

• Data architects and data engineers who need to match their problems to privacy solutions, thereby enabling secure and privacy-preserving analytics. They are more likely to home in on specific details and considerations to help support strategic decisions and figure out the specifics they need for their use cases.

Secondary audience (concerned with understanding the vision and how it will be executed):

• Data analysts and data scientists who want to understand decisions made regarding the access they have to data. As a detail-oriented group, they may have more questions than we can cover in one book! From our experience this may lead to interest in understanding privacy more broadly (certainly a good thing).

• Privacy professionals who wish to support the analytic function of an organization. They live and breathe privacy, and unless they have a technical background, they may actually want to dig into specific sections and considerations. That way they can figure out how they can support use cases with their strong knowledge and understanding of privacy.
A core challenge with writing a book of strategy about the safe and responsible use of data is striking the right balance in terms of language and scope. This book will cover privacy, data science, and data processing. Although we attempt to introduce the reader to some basic concepts in all of these areas, we recognize that it may be challenging for some readers. We hope that the book will serve as an important reference, and encourage readers to learn more where they feel it is needed.

How This Book Is Organized

We'll provide a conceptual basis for understanding anonymization, starting with an understanding of identifiability, that is, providing a reasonable estimate of clustering based on identifying features in data and the likelihood of an attack. We will do this in two chapters, starting with the idea of an identifiability spectrum to understand identifiability in data in Chapter 2, and then a governance framework that explains the context of data sharing to understand threats in Chapter 3. Identifiability will be assessed in terms of both data and context, since they are intimately linked. Our identifiability spectrum will therefore evolve from the concept of data identifiability into one that encompasses both data and context.

From this conceptual basis of identifiability, we will then look at data processing steps to create different pipelines. We'll start with identified data and concepts from privacy engineering in Chapter 4, that is, how to design a system with privacy in mind, building in protections and, in particular, reducing identifiability for those novel uses of data that fall outside of the original purposes of data collection. We will also touch on the subject of having both identified and anonymized data within the same data holdings.

Once we've established the requirements related to identified data, we will consider another class of data for which direct identification has been removed, which we explained above as being pseudonymized. This is the first step to reducing identifiability, by removing names and addresses of the people in the data. In Chapter 5, we start to explicitly work toward anonymizing data. We first look at how pseudonymization fits as data protection, and introduce a first step toward anonymization. We also consider analytics technologies that can sit on top of pseudonymized data, and what that means in terms of anonymization.
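As a minimal sketch of that first step (the book does not prescribe this particular mechanism), direct identifiers such as names are often replaced with keyed pseudonyms, so the same person consistently maps to the same token without the token revealing the name. The key below is a placeholder and would be managed as a secret in practice.

```python
import hmac
import hashlib

# Placeholder key for illustration; in practice this would live in a key vault.
SECRET_KEY = b"replace-with-a-managed-secret"

def pseudonymize(direct_identifier: str) -> str:
    """Replace a direct identifier with a stable keyed pseudonym (HMAC-SHA256).

    Unlike a plain hash, the keyed construction stops anyone without the key
    from enumerating likely names and matching them to tokens.
    """
    digest = hmac.new(SECRET_KEY, direct_identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

print(pseudonymize("Jane Doe"))  # the same input always yields the same token
print(pseudonymize("Jane Doe"))  # ...so records can still be linked by person
```

Keep in mind that, as the chapter descriptions above emphasize, data protected this way is pseudonymized rather than anonymized: indirect identifiers remain, and under most privacy laws the data is still personal.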
Our final data pipeline is focused entirely on anonymization in Chapter 6 (so entirely about secondary uses of data). We start with the more traditional approach of pushing the anonymization at source to a recipient. But then we turn things around, considering the anonymized data as being pulled by the recipient. This way of thinking provides an interesting opportunity to leverage anonymization from a different set of requirements, and opens up a way to build data lakes. We will do this by building on concepts introduced in other chapters, to come up with novel approaches to building a pipeline.

We finish the book in Chapter 7 with a discussion of the safe use of data, including the topics of accountability and ethics. The practical use of "deep learning" and related methods in artificial intelligence and machine learning (AIML) has introduced new concerns to the world of data privacy. Many frameworks and guiding principles have been suggested to manage these concerns, and we wish to summarize and provide some practical considerations in the context of building anonymization pipelines.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Marginal icons (not reproduced in this extraction) signify a tip or suggestion, a general note, or a warning or caution.
O'Reilly Online Learning

For more than 40 years, O'Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed. Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O'Reilly's online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O'Reilly and 200+ other publishers. For more information, visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O'Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/building-anonymization-pipeline.

Email bookquestions@oreilly.com to comment or ask technical questions about this book.

For news and more information about our books and courses, see our website at http://oreilly.com.

Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments

This book would not be possible without the support of the many experts at Privacy Analytics, who work day in and day out in advisory, data and software delivery, and implementation. It's one thing to theorize solutions; it's quite another to work with organizations, large and small, to bring privacy practices and solutions to market and at scale. It's through working with clients that real-world solutions are born and grow up.

We must gush about our technical reviewers! They took the time to read the entirety of the first draft of this book and provided valuable feedback. Their varied backgrounds provided critical insights. Their feedback to the manuscript allowed us to directly address areas in need of further development. While the views and opinions expressed in this book are our own, we hope that we successfully incorporated their feedback into the final version of this book. In alphabetical order, we wish to thank: Bryan Cline, an expert in standards and risk management; Jordan Collins, an expert in real-world anonymization; Leroy Ruggerio, an expert in business technology; and Malcolm Townsend, an expert in data protection technology.

We would also like to thank Felix Ritchie for having created and promoted the adoption of the Five Safes, which served as inspiration to us! An entire chapter is dedicated to the Five Safes, and we have been fortunate to work with Felix since we drafted our first version of that chapter. We appreciated the help of Pierre Chetelat with final edits, which also served as an opportunity for him to learn about the legal and technical landscape in which we work.

Finally, we must thank O'Reilly for giving us the opportunity to write another book about anonymization in practice. And also Melissa Potter, our development editor at O'Reilly, who supported us in the writing and editing of this book. We may not have visibility behind the curtain at O'Reilly, but we also thank their team of diligent copy editors, graphic artists, technical support, and everyone else who brings books to market.
Chapter 1. Introduction

Data is recognized as an important driver of innovation in economic and research activities, and is used to improve services and derive new insights. Services are delivered more efficiently, at a lower cost, and with increased usability, based on an analysis of relevant data regarding how a service is provided and used. Insights improve outcomes in many facets of our lives, reducing the likelihood of fatal accidents (in travel, work, or leisure), getting us better returns from financial investments, or improving health-related outcomes by allowing us to understand disease progression and environmental influences, to name but a few examples. Sharing and using data responsibly is at the core of all these data-driven activities.

The focus of this book is on implementing and deploying solutions to reduce identifiability within a data pipeline, and it's therefore important to establish context around the technologies and data flows that will be used in production. Example applications include everything from structured data collection to Internet of Things (IoT) and device data (smart cities, telco, medical). In addition to the advantages and limitations of particular technologies, decision makers need to understand where these technologies apply within a deployed data pipeline so that they can best manage the spectrum of identifiability. Identifiability is more than just a black-and-white concept, as we will see when we explore a range of data transformations and disclosure contexts.

Before we delve into the concepts that will drive the selection of solutions and how they're deployed, we need to appreciate some concepts of privacy and data protection. These will help frame the scope of this book, and in particular the scope of reducing identifiability. While this is a book about anonymization, we divide the book up by different categories of identifiability that have been established by privacy and data protection laws and regulations. We will also demonstrate how to support proper anonymization through the concepts of these laws and regulations, and provide examples of where things went wrong because proper anonymization was not employed. Anonymization should, in practice, involve more than just removing people's names from data.
Identifiability

Best practice recognizes that data falls on a spectrum of identifiability,[1] and that this spectrum can be leveraged to create various pipelines to anonymization. This spectrum is managed through technology-enabled processes, including security and privacy controls, but more specifically through data transformations and monitoring. We will explain how to objectively compare data sharing options for various data collection use cases to help the reader better understand how to match their problems to privacy solutions, thereby enabling secure and privacy-preserving analytics. There is a range of permutations in how to reduce identifiability, including where and when to provide useful data while meaningfully protecting privacy in light of broader benefits and needs.

[1] For an excellent summary of the identifiability spectrum applied across a range of controls, see Kelsey Finch, "A Visual Guide to Practical De-Identification," Future of Privacy Forum, April 25, 2016, https://oreil.ly/siE1D.

While technology is an important enabler of anonymization, technology is not the end of the story. Accounting for risks in an anonymization process is critical to achieving the right level of data transformations and resulting data utility, which influences the analytic outcomes. Accordingly, to maintain usable outcomes, an organization must have efficient methods of measuring, monitoring, and assuring the controls associated with each disclosure context. Planning and documenting are also critical for any regulated area, as auditors and investigators need to review implementations to ensure the right balance is met when managing risks.

And, ultimately, anonymization can be a catalyst for responsibly using data, as it is privacy enhancing. There is a security component to responsibly using data that comes from limiting the ability to identify individuals, as well as an ethical component that comes from deriving insights that are broader than single individuals. Conceptually, we can think of this as using "statistics" (that is, numerical pieces of information) rather than single individuals, and using those statistics to leverage insights into broader populations and application areas to increase reach and impact.
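To make the idea of measuring identifiability concrete, one common proxy (a simplification for illustration, not the authors' exact method) is the size of the group of records that share the same combination of identifying features: the chance of correctly matching a record to a person is at most one over that group size. A small Python sketch, using hypothetical quasi-identifiers:

```python
from collections import Counter

# Hypothetical quasi-identifier tuples: (age band, postal prefix, sex).
quasi_identifiers = [
    ("30-39", "K1A", "F"),
    ("30-39", "K1A", "F"),
    ("30-39", "K1A", "M"),
    ("80-89", "X0A", "M"),
]

group_sizes = Counter(quasi_identifiers)

# Each record's matching probability is bounded by 1 / (its group size);
# the maximum over all records is a conservative measure for the data set.
per_record = [1 / group_sizes[qi] for qi in quasi_identifiers]
print(max(per_record))                    # 1.0 -> at least one unique record
print(sum(per_record) / len(per_record))  # average across records
```

A risk-based pipeline compares measures like these against a threshold appropriate to the disclosure context, and applies further transformations until the threshold is met.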
Let's discuss some of the other terms you'll need to know next.

Getting to Terms

Before we can dig in and describe anonymization in any more detail, there are some terms it would be best to introduce at the outset, for those not familiar with the privacy landscape. We will describe a variety of privacy considerations and data flows in this book based on potential data pipelines, and we will simply describe this as data sharing. Whether the data is released, as in a copy of the data is provided to another party, or access is granted to an external user of a repository or system internal to an organization, it's all sharing to us! Sometimes the term disclosure is also used for sharing data, and in a very broad sense. In an attempt to keep things simple, we will make no distinction between these terms.

We will use the terms data custodian to refer to the entity (meaning person or company) sharing data, and data recipient to refer to the entity receiving data. For internal data sharing scenarios, the data custodian is the organization as an entity, and the data recipient is a functional unit within that organization. The organization maintains oversight on the data sharing to the functional unit, and ensures that the functional unit is treated as a separate unit so it can be assessed and treated as a legitimate data recipient. We will discuss this scenario in more detail later in the book.

In this book we have chosen to use the term identifiability, which pairs well with privacy laws and regulations that describe identifiable information, rather than speak of "re-identification risk." Although our measures are probabilistic, nonexperts sometimes find this approach to be both daunting and discouraging due to the focus on "risk." We hope that this change in language will set a more reasonable tone, and put the focus on more important aspects of building data pipelines that reduce identifiability and provide reasonable assurance that data is nonidentifiable.

We would struggle to describe anonymization, and privacy in general, without explaining that personal data is information about an identifiable individual. You may also come across the terms personal information (as it's referred to in Canada), personally identifying information (used in the US), or protected health information (identifiable health information defined for specific US health organizations). Personal data is probably the broadest of these terms (and due to EU privacy regulations, also of high impact globally), and since our focus is on data for analytics, we will use this term throughout this book. In legal documentation, the term used will depend on which law applies. For example, personally identifying information mixed with protected health information would simply be called protected health information.
When personal data is discussed, an identifiable individual is often referred to as a data subject. The data subject is not necessarily the "thing under study" (that is, the "unit of analysis," a term commonly used in scientific research to mean the person or thing under study). If data is collected about births, the thing under study may be the actual births, the infants, or the mothers. That is, the statistical analysis can focus on any one of these, and changing the thing under study can change how data is organized and how the statistical tools are used. For example, an analysis of mothers could be hierarchical, with infants at a different structural level. We will describe simple data structures with regard to statistical analysis in the next chapter.

For the purposes of this book, and most privacy laws and regulations, any individual represented in the data is considered a data subject. The thing under study could be households, where the adult guardians represent the individuals that are of primary interest to the study. Although the number of children a person has (as parent or guardian) is personal, children are also data subjects in their own right. That being said, laws and regulations vary, and there are exceptions. Information about professional activities may be confidential but not necessarily private. We will ignore these exceptions and instead focus on all individuals in the data as data subjects whose identity we endeavor to protect.

Laws and Regulations

Many of the terms that can help us understand anonymization are to be found in privacy laws and regulations.[2] Data protection, or privacy laws and regulations (which we will simply call laws and regulations, or privacy laws and regulations), and subsequent legal precedents, define what is meant by personal data. This isn't a book about law, and there are many laws and regulations to consider (including national, regional, sectorial, even cultural or tribal norms, depending on the country). However, there are two that are notable for our purposes, as they have influenced the field of anonymization in terms of how it is defined and its reach:

[2] Generally speaking, laws are written by a legislative assembly to codify rules, and regulations are written by administrative agencies and departments to put these rules into practice. Both are enforceable.

Health Insurance Portability and Accountability Act (HIPAA)
    Specific to US health data (and a subset at that),[3] HIPAA includes a Privacy Rule that provides the most descriptive definition of anonymization (called de-identification in the act). Known as Expert Determination, this approach requires someone familiar with generally accepted statistical or scientific principles and

[3] HIPAA applies to health care providers, health care clearinghouses, and health plans (collectively known as covered entities), as well as their business associates. Health data that does not fall into these categories is not covered.
The above is a preview of the first 20 pages.