📄 Page
1
Data Governance: The Definitive Guide
People, Processes, and Tools to Operationalize Data Trustworthiness
Evren Eryurek, Uri Gilad, Valliappa Lakshmanan, Anita Kibunguchy-Grant & Jessi Ashdown
📄 Page
3
Praise for Data Governance: The Definitive Guide

We live in a digital world. Whether we realize it or not, we are living in the throes of one of the biggest economic and social revolutions since the Industrial Revolution. It’s about transforming traditional business processes—many of them previously nondigital or manual—into processes that will fundamentally change how we live, how we operate our businesses, and how we deliver value to our customers. Data defines, informs, and predicts—it is used to penetrate new markets, control costs, drive revenues, manage risk, and help us discover the world around us. But to realize these benefits, data must be properly managed and stewarded. Data Governance: The Definitive Guide walks you through the many facets of data management and data governance—people, process and tools, data ownership, data quality, data protection, privacy and security—and does it in a way that is practical and easy to follow. A must-read for the data professional!
—John Bottega, president of the EDM Council

Enterprises are increasingly evolving as insight-driven businesses, putting pressure on data to satisfy new use cases and business ecosystems. Add to this business complexity, market disruption, and demand for speed, and data governance is front and center to make data trusted, secure, and relevant. This is not your grandfather’s slow and bureaucratic data governance either. This book shares the secrets into how modern data governance ensures data is the cornerstone to your business resilience, elasticity, speed, and growth opportunity and not an afterthought.
—Michele Goetz, vice president/principal analyst–business insights at Forrester
📄 Page
4
Data governance has evolved from a discipline focused on cost and compliance to one that propels organizations to grow and innovate. Today’s data governance solutions can benefit from technological advances that establish a continuous, autonomous, and virtuous cycle. This in turn becomes an ecosystem—a community in which data is used for good, and doing the right thing is also the easy thing. Executives looking to use data as an asset and deliver positive business outcomes need to rethink governance’s role and adopt the modern and transformative approach Data Governance: The Definitive Guide provides.
—Jim Cushman, CPO of Collibra
📄 Page
5
Evren Eryurek, Uri Gilad, Valliappa Lakshmanan, Anita Kibunguchy-Grant, and Jessi Ashdown
Data Governance: The Definitive Guide
People, Processes, and Tools to Operationalize Data Trustworthiness
Boston Farnham Sebastopol Tokyo Beijing
📄 Page
6
978-1-492-06349-0 [LSI]

Data Governance: The Definitive Guide
by Evren Eryurek, Uri Gilad, Valliappa Lakshmanan, Anita Kibunguchy-Grant, and Jessi Ashdown

Copyright © 2021 Uri Gilad, Jessi Ashdown, Valliappa Lakshmanan, Evren Eryurek, and Anita Kibunguchy-Grant. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Jessica Haberman
Development Editor: Gary O’Brien
Production Editor: Kate Galloway
Copyeditor: Piper Editorial Consulting, LLC
Proofreader: Arthur Johnson
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

March 2021: First Edition

Revision History for the First Edition
2021-03-08: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492063490 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Data Governance: The Definitive Guide, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk.
If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
📄 Page
7
Table of Contents

Preface xi

1. What Is Data Governance? 1
  What Data Governance Involves 2
  Holistic Approach to Data Governance 4
  Enhancing Trust in Data 5
  Classification and Access Control 6
  Data Governance Versus Data Enablement and Data Security 8
  Why Data Governance Is Becoming More Important 9
  The Size of Data Is Growing 9
  The Number of People Working and/or Viewing the Data Has Grown Exponentially 10
  Methods of Data Collection Have Advanced 10
  More Kinds of Data (Including More Sensitive Data) Are Now Being Collected 13
  The Use Cases for Data Have Expanded 14
  New Regulations and Laws Around the Treatment of Data 16
  Ethical Concerns Around the Use of Data 16
  Examples of Data Governance in Action 17
  Managing Discoverability, Security, and Accountability 18
  Improving Data Quality 19
  The Business Value of Data Governance 23
  Fostering Innovation 23
  The Tension Between Data Governance and Democratizing Data Analysis 24
  Manage Risk (Theft, Misuse, Data Corruption) 25
  Regulatory Compliance 26
  Considerations for Organizations as They Think About Data Governance 28
  Why Data Governance Is Easier in the Public Cloud 30
📄 Page
8
  Location 31
  Reduced Surface Area 32
  Ephemeral Compute 32
  Serverless and Powerful 32
  Labeled Resources 33
  Security in a Hybrid World 34
  Summary 34

2. Ingredients of Data Governance: Tools 37
  The Enterprise Dictionary 37
  Data Classes 38
  Data Classes and Policies 40
  Data Classification and Organization 43
  Data Cataloging and Metadata Management 44
  Data Assessment and Profiling 45
  Data Quality 46
  Lineage Tracking 46
  Key Management and Encryption 47
  Data Retention and Data Deletion 50
  Workflow Management for Data Acquisition 52
  IAM—Identity and Access Management 52
  User Authorization and Access Management 54
  Summary 55

3. Ingredients of Data Governance: People and Processes 57
  The People: Roles, Responsibilities, and Hats 57
  User Hats Defined 58
  Data Enrichment and Its Importance 65
  The Process: Diverse Companies, Diverse Needs and Approaches to Data Governance 65
  Legacy 66
  Cloud Native/Digital Only 67
  Retail 67
  Highly Regulated 69
  Small Companies 71
  Large Companies 72
  People and Process Together: Considerations, Issues, and Some Successful Strategies 73
  Considerations and Issues 74
  Processes and Strategies with Varying Success 77
  Summary 84
📄 Page
9
4. Data Governance over a Data Life Cycle 85
  What Is a Data Life Cycle? 85
  Phases of a Data Life Cycle 86
  Data Creation 87
  Data Processing 88
  Data Storage 88
  Data Usage 88
  Data Archiving 89
  Data Destruction 89
  Data Life Cycle Management 90
  Data Management Plan 90
  Applying Governance over the Data Life Cycle 92
  Data Governance Framework 92
  Data Governance in Practice 94
  Example of How Data Moves Through a Data Platform 97
  Operationalizing Data Governance 100
  What Is a Data Governance Policy? 101
  Importance of a Data Governance Policy 102
  Developing a Data Governance Policy 103
  Data Governance Policy Structure 103
  Roles and Responsibilities 105
  Step-by-Step Guidance 106
  Considerations for Governance Across a Data Life Cycle 108
  Summary 111

5. Improving Data Quality 113
  What Is Data Quality? 113
  Why Is Data Quality Important? 114
  Data Quality in Big Data Analytics 116
  Data Quality in AI/ML Models 117
  Why Is Data Quality a Part of a Data Governance Program? 121
  Techniques for Data Quality 121
  Scorecard 123
  Prioritization 123
  Annotation 123
  Profiling 124
  Summary 131

6. Governance of Data in Flight 133
  Data Transformations 133
  Lineage 134
  Why Lineage Is Useful 135
📄 Page
10
  How to Collect Lineage 135
  Types of Lineage 136
  The Fourth Dimension 138
  How to Govern Data in Flight 139
  Policy Management, Simulation, Monitoring, Change Management 141
  Audit, Compliance 141
  Summary 142

7. Data Protection 143
  Planning Protection 143
  Lineage and Quality 144
  Level of Protection 145
  Classification 146
  Data Protection in the Cloud 146
  Multi-Tenancy 147
  Security Surface 147
  Virtual Machine Security 148
  Physical Security 149
  Network Security 151
  Security in Transit 151
  Data Exfiltration 153
  Virtual Private Cloud Service Controls (VPC-SC) 155
  Secure Code 156
  Zero-Trust Model 157
  Identity and Access Management 158
  Authentication 158
  Authorization 159
  Policies 160
  Data Loss Prevention 161
  Encryption 162
  Differential Privacy 164
  Access Transparency 165
  Keeping Data Protection Agile 165
  Security Health Analytics 165
  Data Lineage 166
  Event Threat Detection 167
  Data Protection Best Practices 167
  Separated Network Designs 168
  Physical Security 168
  Portable Device Encryption and Policy 170
  Data Deletion Process 170
  Summary 173
📄 Page
11
8. Monitoring 175
  What Is Monitoring? 175
  Why Perform Monitoring? 176
  What Should You Monitor? 179
  Data Quality Monitoring 179
  Data Lineage Monitoring 180
  Compliance Monitoring 182
  Program Performance Monitoring 183
  Security Monitoring 185
  What Is a Monitoring System? 187
  Analysis in Real Time 187
  System Alerts 187
  Notifications 187
  Reporting/Analytics 188
  Graphic Visualization 188
  Customization 188
  Monitoring Criteria 189
  Important Reminders for Monitoring 190
  Summary 191

9. Building a Culture of Data Privacy and Security 193
  Data Culture: What It Is and Why It’s Important 193
  Starting at the Top—Benefits of Data Governance to the Business 194
  Analytics and the Bottom Line 195
  Company Persona and Perception 195
  Intention, Training, and Communications 196
  A Data Culture Needs to Be Intentional 197
  Training: Who Needs to Know What 197
  Beyond Data Literacy 200
  Motivation and Its Cascading Effects 200
  Maintaining Agility 201
  Requirements, Regulations, and Compliance 202
  The Importance of Data Structure 202
  Scaling the Governance Process Up and Down 203
  Interplay with Legal and Security 203
  Staying on Top of Regulations 204
  Communication 204
  Interplay in Action 204
  Agility Is Still Key 205
  Incident Handling 205
  When “Everyone” Is Responsible, No One Is Responsible 205
  Importance of Transparency 206
📄 Page
12
  What It Means to Be Transparent 206
  Building Internal Trust 206
  Building External Trust 207
  Setting an Example 208
  Summary 208

A. Google’s Internal Data Governance 209

B. Additional Resources 217

Index 219
📄 Page
13
Preface

In recent years, the ease of moving to the cloud has motivated and energized a fast-growing community of data consumers to collect, capture, store, and analyze data for insights and decision making. For a number of reasons, as adoption of cloud computing continues to grow, information management stakeholders have questions about the potential risks involved in managing their data in the cloud. Evren faced such questions for the first time when he worked in healthcare and had to put in place the processes and technologies to govern data. Now at Google Cloud, Uri and Lak also answer these questions nearly every week and dispense advice on getting value from data, breaking down data silos, preserving anonymity, protecting sensitive information, and improving the trustworthiness of data.

We noticed that GDPR was what precipitated a sea change in customers’ behavior. Some customers even deleted their data, thinking it was the right thing to do. That reaction, more than any other, prompted us to write this book capturing the advice we have provided over the years to Google Cloud customers. If data is the new currency, we do not want enterprises to be scared of it. If the data is locked away or is not trustworthy, it is of no value.

We all pride ourselves on helping Google Cloud customers get value for their technical expenditures. Data is a huge investment, and we felt obligated to provide our customers with the best way to get value from it.

Customers’ questions usually involve one of three risk factors:

Securing the data
Storing data in a public cloud infrastructure might concern large enterprises that typically deploy their systems on-premises and expect tight security. With a significant number of security threats and breaches in the news, organizations are concerned that they might be the next victim. These factors contribute to risk management concerns for protecting against unauthorized access to or exposure of sensitive data, ranging from personally identifiable information (PII) to corporate confidential information, trade secrets, or intellectual property.
📄 Page
14
Regulations and compliance
There is a growing set of regulations, including the California Consumer Privacy Act (CCPA), the European Union’s General Data Protection Regulation (GDPR), and industry-specific standards such as global Legal Entity Identifier (LEI) numbers in the financial industry and ACORD data standards in the insurance industry. Compliance teams responsible for adhering to these regulations and standards may have concerns about oversight and control of data stored in the cloud.

Visibility and control
Data management professionals and data consumers sometimes lack visibility into their own data landscape: which data assets are available, where those assets are located and how and if they can be used, and who has access to the data and whether they should have access to it. This uncertainty limits their ability to further leverage their own data to improve productivity or drive business value.

These risk factors clearly highlight the need for increased data assessment, cataloging of metadata, access control management, data quality, and information security as core data governance competencies that the cloud provider should not only provide but also continuously upgrade in a transparent way. In essence, addressing these risks without abandoning the benefits provided by cloud computing has elevated the importance of not only understanding data governance in the cloud, but also knowing what is important. Good data governance can inspire customer trust and lead to vast improvements in customer experience.

Why Your Business Needs Data Governance in the Cloud

As your business generates more data and moves it into the cloud, the dynamics of data management change in a number of fundamental ways. Organizations should take note of the following:

Risk management
There are concerns about potential exposure of sensitive information to unauthorized individuals or systems, security breaches, or known personnel accessing data under the wrong circumstances. Organizations are looking to minimize this risk, so additional forms of protection (such as encryption) to obfuscate the data object’s embedded information are required to safeguard the data should a system breach occur. In addition, other tools are required in order to support access management, identify sensitive data assets, and create a policy around their protection.

Data proliferation
The speed at which businesses create, update, and stream their data assets has increased, and while cloud-based platforms are capable of handling increased
📄 Page
15
data velocity, volume, and variety, it is important to introduce controls and mechanisms to rapidly validate the quality aspects of high-bandwidth data streams.

Data management
The need to adopt externally produced data sources and data streams (including paid feeds from third parties) means that you should be prepared not to trust all external data sources. You may need to introduce tools that document data lineage, classification, and metadata to help your employees (data consumers, in particular) to determine data usability based on their knowledge of how the data assets were produced.

Discovery (and data awareness)
Moving data into any kind of data lake (cloud-based or on-premises) runs the risk of losing track of which data assets have been moved, the characteristics of their content, and details about their metadata. The ability, therefore, to assess data asset content and sensitivity (no matter where the data is) becomes very important.

Privacy and compliance
Regulatory compliance demands auditable and measurable standards and procedures that ensure compliance with internal data policies as well as external government regulations. Migrating data to the cloud means that organizations need tools to enforce, monitor, and report compliance, as well as ensure that the right people and services have access and permissions to the right data.

Framework and Best Practices for Data Governance in the Cloud

Given the changing dynamics of data management, how should organizations think about data governance in the cloud, and why is it important? According to TechTarget, data governance is the overall management of the availability, usability, integrity, and security of data used in an enterprise. A sound data governance program includes a governing body or council, a defined set of procedures and a plan to execute those procedures.1

Simply put, data governance encompasses the ways that people, processes and technology can work together to enable auditable compliance with defined and agreed-upon data policies.

1 Craig Stedman and Jack Vaughan, “What Is Data Governance and Why Does It Matter?” TechTarget, December 2019. This article was updated in February 2020; the current version no longer includes this quote.
📄 Page
16
Data Governance Framework

Enterprises need to think about data governance comprehensively, from data intake and ingestion to cataloging, persistence, retention, storage management, sharing, archiving, backup, recovery, loss prevention, disposition, and removal and deletion:

Data discovery and assessment
Cloud-based environments often offer an economical option for creating and managing data lakes, but the risk remains for ungoverned migration of data assets. This risk represents a potential loss of knowledge of what data assets are in the data lake, what information is contained within each object, and where those data objects originated from. A best practice for data governance in the cloud is data discovery and assessment in order to know what data assets you have. The data discovery and assessment process is used to identify data assets within the cloud environment, and to trace and record each data asset’s origin and lineage, what transformations have been applied, and object metadata. (Often this metadata describes the demographic details, such as the name of the creator, the size of the object, the number of records if it is a structured data object, or when it was last updated.)

Data classification and organization
Properly evaluating a data asset and scanning the content of its different attributes can help categorize the data asset for subsequent organization. This process can also infer whether the object contains sensitive data and, if so, classify it in terms of the level of data sensitivity, such as personal and private data, confidential data, or intellectual property. To implement data governance in the cloud, you’ll need to profile and classify sensitive data to determine which governance policies and procedures apply to the data.

Data cataloging and metadata management
Once your data assets are assessed and classified, it is crucial that you document your learnings so that your communities of data consumers have visibility into your organization’s data landscape. You need to maintain a data catalog that contains structural metadata, data object metadata, and the assessment of levels of sensitivity in relation to the governance directives (such as compliance with one or more data privacy regulations). The data catalog not only allows data consumers to view this information but can also serve as part of a reverse index for search and discovery, both by phrase and (given the right ontologies) by concept. It is also important to understand the format of structured and semi-structured data objects and allow your systems to handle these data types differently, as necessary.
📄 Page
17
Data quality management
Different data consumers may have different data quality requirements, so it’s important to provide a means of documenting data quality expectations as well as techniques and tools for supporting the data validation and monitoring process. Data quality management processes include creating controls for validation, enabling quality monitoring and reporting, supporting the triage process for assessing the level of incident severity, enabling root cause analysis and recommendation of remedies to data issues, and data incident tracking. The right processes for data quality management will provide measurably trustworthy data for analysis.

Data access management
There are two aspects of governance for data access. The first aspect is the provisioning of access to available assets. It’s important to provide data services that allow data consumers to access their data, and fortunately, most cloud platforms provide methods for developing data services. The second aspect is prevention of improper or unauthorized access. It’s important to define identities, groups, and roles and assign access rights to establish a level of managed access. This best practice involves managing access services as well as interoperating with the cloud provider’s identity and access management (IAM) services by defining roles, specifying access rights, and managing and allocating access keys to ensure that only authorized and authenticated individuals and systems are able to access data assets according to defined rules.

Auditing
Organizations must be able to assess their systems to make sure that they are working as designed. Monitoring, auditing, and tracking (who did what and when and with what information) helps security teams gather data, identify threats, and act on those threats before they result in business damage or loss. It’s important to perform regular audits to check the effectiveness of controls in order to quickly mitigate threats and evaluate overall security health.

Data protection
Despite the efforts of information technology security groups to establish perimeter security as a way to prevent unauthorized individuals from accessing data, perimeter security is not and never has been sufficient for protecting sensitive data. While you might be successful in preventing someone from breaking into your system, you are not protected from an insider security breach or even from exfiltration (data theft). It’s important to institute additional methods of data protection—including encryption at rest, encryption in transit, data masking, and permanent deletion—to ensure that exposed data cannot be read.
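
As a rough illustration of two of the practices listed in this framework, classifying data by scanning its content and masking sensitive values, the following Python sketch checks sample values against a small set of patterns and redacts any matches. Everything named here is hypothetical and chosen only for illustration: the pattern set, the two sensitivity labels ("personal" and "public"), and the function names are assumptions, not part of any cloud provider's service. A real classifier would need far more robust detection than two regular expressions.

```python
import re

# Hypothetical detectors for illustration only; real tools use richer methods.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_column(values):
    """Scan sample values and return (sensitivity label, detected types).

    The label is "personal" if any pattern matches and "public" otherwise.
    Both labels are assumed names, not an established taxonomy.
    """
    found = set()
    for value in values:
        for name, pattern in PATTERNS.items():
            if pattern.search(str(value)):
                found.add(name)
    level = "personal" if found else "public"
    return level, sorted(found)


def mask_value(value):
    """Replace each detected sensitive substring with a fixed placeholder."""
    text = str(value)
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name.upper()} REDACTED]", text)
    return text


if __name__ == "__main__":
    sample = ["alice@example.com", "n/a", "SSN 123-45-6789"]
    print(classify_column(sample))
    print([mask_value(v) for v in sample])
```

In this sketch the classification result would decide which policy applies to a column, and masking is one possible policy action. Keeping the detectors in a single table means the classifier and the masking step cannot disagree about what counts as sensitive.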
📄 Page
18
Operationalizing Data Governance in Your Organization

Technology certainly helps support the data governance principles presented in the preceding section, but data governance goes beyond the selection and implementation of products and tools. The success of a data governance program depends on a combination of:

• People to build the business case, develop the operating model, and take on appropriate roles
• Processes that operationalize policy development, implementation, and enforcement
• Technology used to facilitate the ways that people execute those processes

The following steps are critical in planning, launching, and supporting a data governance program:

1. Build the business case. Establish the business case by identifying critical business drivers to justify the effort and investment associated with data governance. Outline perceived data risks (such as the storage of data on cloud-based platforms) and indicate how data governance helps the organization mitigate those risks.
2. Document guiding principles. Assert core principles associated with governance and oversight of enterprise data. Document those principles in a data governance charter to present to senior management.
3. Get management buy-in. Engage data governance champions and get buy-in from the key senior stakeholders. Present your business case and guiding principles to C-level management for approval.
4. Develop an operating model. Once you have management approval, define the data governance roles and responsibilities, and then describe the processes and procedures for the data governance council and data stewardship teams who will define processes for defining and implementing policies as well as reviewing and remediating identified data issues.
5. Establish a framework for accountability. Establish a framework for assigning custodianship and responsibility for critical data domains. Make sure there is visibility to the “data owners” across the data landscape. Provide a methodology to ensure that everyone is accountable for contributing to data usability.
6. Develop taxonomies and ontologies. There may be a number of governance directives associated with data classification, organization, and—in the case of sensitive information—data protection. To enable your data consumers to comply with those directives, there must be a clear definition of the categories (for organizational structure) and classifications (for assessing data sensitivity).
📄 Page
19
7. Assemble the right technology stack. Once you’ve assigned data governance roles to your staff and defined and approved your processes and procedures, you should assemble a suite of tools that facilitate ongoing validation of compliance with data policies and accurate compliance reporting.
8. Establish education and training. Raise awareness of the value of data governance by developing educational materials highlighting data governance practices and procedures, and the use of supporting technology. Plan for regular training sessions to reinforce good data governance practices.

The Business Benefits of Robust Data Governance

Data security, data protection, data accessibility and usability, data quality, and other aspects of data governance will continue to emerge and grow as critical priorities for organizations. And as more organizations migrate their data assets to the cloud, the need for auditable practices for ensuring data utility will also continue to grow. To address these directives, businesses should frame their data governance practices around three key components:

• A framework that enables people to define, agree to, and enforce data policies
• Effective processes for control, oversight, and stewardship over all data assets across on-premises systems, cloud storage, and data warehouse platforms
• The right tools and technologies for operationalizing data policy compliance

With this framework in mind, an effective data governance strategy and operating model provides a path for organizations to establish control and maintain visibility into their data assets, providing a competitive advantage over their peers. Organizations will likely reap immense benefits as they promote a data-driven culture within their organizations—specifically:

Improved decision making
Better data discovery means that users can find the data they need when they need it, which makes them more efficient. Data-driven decision making plays a huge role in improving business planning within an organization.

Better risk management
A good data governance operating model helps organizations audit their processes more easily so that they reduce the risk of fines, increase customer trust, and improve operations. Downtime can be minimized while productivity still grows.
📄 Page
20
Regulatory compliance
Increasing governmental regulation has made it even more important for organizations to establish data governance practices. With a good data governance framework, organizations can embrace the changing regulatory environment instead of simply reacting to it.

As you migrate more of your data to the cloud, data governance provides a level of protection against data misuse. At the same time, auditable compliance with defined data policies helps demonstrate to your customers that you protect their private information, alleviating their concerns about information risks.

Who Is This Book For?

The current growth in data is unprecedented and, when coupled with increased regulations and fines, has meant that organizations are forced to look into their data governance plans to make sure that they do not become the next statistic. Therefore, every organization will need to establish an understanding of the data it collects, the liability and regulation associated with that data, and who has access to it. This book is for you if you want to know what that entails, the risks to be aware of, and the considerations to keep in mind.

This book is for anyone who needs to implement the processes or technology that enables data to become trustworthy. This book covers the ways that people, processes, and technology can work together to enable auditable compliance with defined and agreed-upon data policies.

The benefits of data governance are multifaceted, ranging from legal and regulatory compliance to better risk management and the ability to drive top-line revenue and cost savings by creating new products and services. Read this book to learn how to establish control and maintain visibility into your data assets, which will provide you with a competitive advantage over your peers.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.