MongoDB: The Definitive Guide: Powerful and Scalable Data Storage, 3rd Edition
Author: Shannon Bradshaw, Eoin Brazil, Kristina Chodorow
Shannon Bradshaw, Eoin Brazil & Kristina Chodorow
MongoDB: The Definitive Guide
Powerful and Scalable Data Storage
Third Edition
Shannon Bradshaw, Eoin Brazil, and Kristina Chodorow
MongoDB: The Definitive Guide
Powerful and Scalable Data Storage
THIRD EDITION
Boston Farnham Sebastopol Tokyo Beijing
978-1-491-95446-1
[LSI]

MongoDB: The Definitive Guide
by Shannon Bradshaw, Eoin Brazil, and Kristina Chodorow

Copyright © 2020 Shannon Bradshaw and Eoin Brazil. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Nicole Taché
Production Editor: Kristen Brown
Copyeditor: Rachel Head
Proofreader: Christina Edwards
Indexer: Judith McConville
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

September 2010: First Edition
May 2013: Second Edition
December 2019: Third Edition

Revision History for the Third Edition
2019-12-09: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491954461 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. MongoDB: The Definitive Guide, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
This book is dedicated to our families for the time, space, and support they provided to make our work on this book possible and for their love. For Anna, Sigourney, Graham, and Beckett. —Shannon And for Gemma, Clodagh, and Bronagh. —Eoin
Table of Contents

Preface

Part I. Introduction to MongoDB

1. Introduction
    Ease of Use
    Designed to Scale
    Rich with Features…
    …Without Sacrificing Speed
    The Philosophy

2. Getting Started
    Documents
    Collections
    Dynamic Schemas
    Naming
    Databases
    Getting and Starting MongoDB
    Introduction to the MongoDB Shell
    Running the Shell
    A MongoDB Client
    Basic Operations with the Shell
    Data Types
    Basic Data Types
    Dates
    Arrays
    Embedded Documents
    _id and ObjectIds
    Using the MongoDB Shell
    Tips for Using the Shell
    Running Scripts with the Shell
    Creating a .mongorc.js
    Customizing Your Prompt
    Editing Complex Variables
    Inconvenient Collection Names

3. Creating, Updating, and Deleting Documents
    Inserting Documents
    insertMany
    Insert Validation
    insert
    Removing Documents
    drop
    Updating Documents
    Document Replacement
    Using Update Operators
    Upserts
    Updating Multiple Documents
    Returning Updated Documents

4. Querying
    Introduction to find
    Specifying Which Keys to Return
    Limitations
    Query Criteria
    Query Conditionals
    OR Queries
    $not
    Type-Specific Queries
    null
    Regular Expressions
    Querying Arrays
    Querying on Embedded Documents
    $where Queries
    Cursors
    Limits, Skips, and Sorts
    Avoiding Large Skips
    Immortal Cursors

Part II. Designing Your Application

5. Indexes
    Introduction to Indexes
    Creating an Index
    Introduction to Compound Indexes
    How MongoDB Selects an Index
    Using Compound Indexes
    How $ Operators Use Indexes
    Indexing Objects and Arrays
    Index Cardinality
    explain Output
    When Not to Index
    Types of Indexes
    Unique Indexes
    Partial Indexes
    Index Administration
    Identifying Indexes
    Changing Indexes

6. Special Index and Collection Types
    Geospatial Indexes
    Types of Geospatial Queries
    Using Geospatial Indexes
    Compound Geospatial Indexes
    2d Indexes
    Indexes for Full Text Search
    Creating a Text Index
    Text Search
    Optimizing Full-Text Search
    Searching in Other Languages
    Capped Collections
    Creating Capped Collections
    Tailable Cursors
    Time-To-Live Indexes
    Storing Files with GridFS
    Getting Started with GridFS: mongofiles
    Working with GridFS from the MongoDB Drivers
    Under the Hood

7. Introduction to the Aggregation Framework
    Pipelines, Stages, and Tunables
    Getting Started with Stages: Familiar Operations
    Expressions
    $project
    $unwind
    Array Expressions
    Accumulators
    Using Accumulators in Project Stages
    Introduction to Grouping
    The _id Field in Group Stages
    Group Versus Project
    Writing Aggregation Pipeline Results to a Collection

8. Transactions
    Introduction to Transactions
    A Definition of ACID
    How to Use Transactions
    Tuning Transaction Limits for Your Application
    Timing and Oplog Size Limits

9. Application Design
    Schema Design Considerations
    Schema Design Patterns
    Normalization Versus Denormalization
    Examples of Data Representations
    Cardinality
    Friends, Followers, and Other Inconveniences
    Optimizations for Data Manipulation
    Removing Old Data
    Planning Out Databases and Collections
    Managing Consistency
    Migrating Schemas
    Managing Schemas
    When Not to Use MongoDB
Part III. Replication

10. Setting Up a Replica Set
    Introduction to Replication
    Setting Up a Replica Set, Part 1
    Networking Considerations
    Security Considerations
    Setting Up a Replica Set, Part 2
    Observing Replication
    Changing Your Replica Set Configuration
    How to Design a Set
    How Elections Work
    Member Configuration Options
    Priority
    Hidden Members
    Election Arbiters
    Building Indexes

11. Components of a Replica Set
    Syncing
    Initial Sync
    Replication
    Handling Staleness
    Heartbeats
    Member States
    Elections
    Rollbacks
    When Rollbacks Fail

12. Connecting to a Replica Set from Your Application
    Client-to-Replica Set Connection Behavior
    Waiting for Replication on Writes
    Other Options for “w”
    Custom Replication Guarantees
    Guaranteeing One Server per Data Center
    Guaranteeing a Majority of Nonhidden Members
    Creating Other Guarantees
    Sending Reads to Secondaries
    Consistency Considerations
    Load Considerations
    Reasons to Read from Secondaries

13. Administration
    Starting Members in Standalone Mode
    Replica Set Configuration
    Creating a Replica Set
    Changing Set Members
    Creating Larger Sets
    Forcing Reconfiguration
    Manipulating Member State
    Turning Primaries into Secondaries
    Preventing Elections
    Monitoring Replication
    Getting the Status
    Visualizing the Replication Graph
    Replication Loops
    Disabling Chaining
    Calculating Lag
    Resizing the Oplog
    Building Indexes
    Replication on a Budget

Part IV. Sharding

14. Introduction to Sharding
    What Is Sharding?
    Understanding the Components of a Cluster
    Sharding on a Single-Machine Cluster

15. Configuring Sharding
    When to Shard
    Starting the Servers
    Config Servers
    The mongos Processes
    Adding a Shard from a Replica Set
    Adding Capacity
    Sharding Data
    How MongoDB Tracks Cluster Data
    Chunk Ranges
    Splitting Chunks
    The Balancer
    Collations
    Change Streams

16. Choosing a Shard Key
    Taking Stock of Your Usage
    Picturing Distributions
    Ascending Shard Keys
    Randomly Distributed Shard Keys
    Location-Based Shard Keys
    Shard Key Strategies
    Hashed Shard Key
    Hashed Shard Keys for GridFS
    The Firehose Strategy
    Multi-Hotspot
    Shard Key Rules and Guidelines
    Shard Key Limitations
    Shard Key Cardinality
    Controlling Data Distribution
    Using a Cluster for Multiple Databases and Collections
    Manual Sharding

17. Sharding Administration
    Seeing the Current State
    Getting a Summary with sh.status()
    Seeing Configuration Information
    Tracking Network Connections
    Getting Connection Statistics
    Limiting the Number of Connections
    Server Administration
    Adding Servers
    Changing Servers in a Shard
    Removing a Shard
    Balancing Data
    The Balancer
    Changing Chunk Size
    Moving Chunks
    Jumbo Chunks
    Refreshing Configurations

Part V. Application Administration

18. Seeing What Your Application Is Doing
    Seeing the Current Operations
    Finding Problematic Operations
    Killing Operations
    False Positives
    Preventing Phantom Operations
    Using the System Profiler
    Calculating Sizes
    Documents
    Collections
    Databases
    Using mongotop and mongostat

19. An Introduction to MongoDB Security
    MongoDB Authentication and Authorization
    Authentication Mechanisms
    Authorization
    Using x.509 Certificates to Authenticate Both Members and Clients
    A Tutorial on MongoDB Authentication and Transport Layer Encryption
    Establish a CA
    Generate and Sign Member Certificates
    Generate and Sign Client Certificates
    Bring Up the Replica Set Without Authentication and Authorization Enabled
    Create the Admin User
    Restart the Replica Set with Authentication and Authorization Enabled

20. Durability
    Durability at the Member Level Through Journaling
    Durability at the Cluster Level Using Write Concern
    The w and wtimeout Options for writeConcern
    The j (Journaling) Option for writeConcern
    Durability at a Cluster Level Using Read Concern
    Durability of Transactions Using a Write Concern
    What MongoDB Does Not Guarantee
    Checking for Corruption

Part VI. Server Administration

21. Setting Up MongoDB in Production
    Starting from the Command Line
    File-Based Configuration
    Stopping MongoDB
    Security
    Data Encryption
    SSL Connections
    Logging

22. Monitoring MongoDB
    Monitoring Memory Usage
    Introduction to Computer Memory
    Tracking Memory Usage
    Tracking Page Faults
    I/O Wait
    Calculating the Working Set
    Some Working Set Examples
    Tracking Performance
    Tracking Free Space
    Monitoring Replication

23. Making Backups
    Backup Methods
    Backing Up a Server
    Filesystem Snapshot
    Copying Data Files
    Using mongodump
    Specific Considerations for Replica Sets
    Specific Considerations for Sharded Clusters
    Backing Up and Restoring an Entire Cluster
    Backing Up and Restoring a Single Shard

24. Deploying MongoDB
    Designing the System
    Choosing a Storage Medium
    Recommended RAID Configurations
    CPU
    Operating System
    Swap Space
    Filesystem
    Virtualization
    Memory Overcommitting
    Mystery Memory
    Handling Network Disk I/O Issues
    Using Non-Networked Disks
    Configuring System Settings
    Turning Off NUMA
    Setting Readahead
    Disabling Transparent Huge Pages (THP)
    Choosing a Disk Scheduling Algorithm
    Disabling Access Time Tracking
    Modifying Limits
    Configuring Your Network
    System Housekeeping
    Synchronizing Clocks
    The OOM Killer
    Turn Off Periodic Tasks

A. Installing MongoDB

B. MongoDB Internals

Index
Preface

How This Book Is Organized

This book is split up into six sections, covering development, administration, and deployment information.

Getting Started with MongoDB

In Chapter 1 we provide background on MongoDB: why it was created, the goals it is trying to accomplish, and why you might choose to use it for a project. We go into more detail in Chapter 2, which provides an introduction to the core concepts and vocabulary of MongoDB. Chapter 2 also provides a first look at working with MongoDB, getting you started with the database and the shell.

The next two chapters cover the basic material that developers need to know to work with MongoDB. In Chapter 3, we describe how to perform those basic write operations, including how to do them with different levels of safety and speed. Chapter 4 explains how to find documents and create complex queries. This chapter also covers how to iterate through results and gives options for limiting, skipping, and sorting results.

Developing with MongoDB

Chapter 5 covers what indexing is and how to index your MongoDB collections. Chapter 6 explains how to use several special types of indexes and collections. Chapter 7 covers a number of techniques for aggregating data with MongoDB, including counting, finding distinct values, grouping documents, the aggregation framework, and writing these results to a collection. Chapter 8 introduces transactions: what they are, how best to use them for your application, and how to tune. Finally, this section finishes with a chapter on designing your application: Chapter 9 goes over tips for writing an application that works well with MongoDB.
Replication

The replication section starts with Chapter 10, which gives you a quick way to set up a replica set locally and covers many of the available configuration options. Chapter 11 then covers the various concepts related to replication. Chapter 12 shows how replication interacts with your application and Chapter 13 covers the administrative aspects of running a replica set.

Sharding

The sharding section starts in Chapter 14 with a quick local setup. Chapter 15 then gives an overview of the components of the cluster and how to set them up. Chapter 16 has advice on choosing a shard key for a variety of applications. Finally, Chapter 17 covers administering a sharded cluster.

Application Administration

The next three chapters cover many aspects of MongoDB administration from the perspective of your application. Chapter 18 discusses how to introspect what MongoDB is doing. Chapter 19 covers security in MongoDB and how to configure authentication as well as authorization for your deployment. Chapter 20 explains how MongoDB stores data durably.

Server Administration

The final section is focused on server administration. Chapter 21 covers common options when starting and stopping MongoDB. Chapter 22 discusses what to look for and how to read stats when monitoring. Chapter 23 describes how to take and restore backups for each type of deployment. Finally, Chapter 24 discusses a number of system settings to keep in mind when deploying MongoDB.

Appendixes

Appendix A explains MongoDB’s versioning scheme and how to install it on Windows, OS X, and Linux. Appendix B details how MongoDB works internally: its storage engine, data format, and wire protocol.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, collection names, database names, filenames, and file extensions.
Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, command-line utilities, environment variables, statements, and keywords.
Constant width bold
    Shows commands or other text that should be typed literally by the user.
Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/mongodb-the-definitive-guide-3e/mongodb-the-definitive-guide-3e.

If you have a technical question or a problem using the code examples, please send email to bookquestions@oreilly.com.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of
example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “MongoDB: The Definitive Guide, Third Edition by Shannon Bradshaw, Eoin Brazil, and Kristina Chodorow (O’Reilly). Copyright 2020 Shannon Bradshaw and Eoin Brazil, 978-1-491-95446-1.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/mongoDB_TDG_3e.

Email bookquestions@oreilly.com to comment or ask technical questions about this book.

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly