📄 Page
1
Jim Blandy and Jason Orendorff Programming Rust Fast, Safe Systems Development
📄 Page
2
Programming Rust
by Jim Blandy and Jason Orendorff
Copyright © 2018 Jim Blandy, Jason Orendorff
Printed in the United States of America

December 2017: First Edition

Revision History for the First Edition
2017-11-20: First Release

http://oreilly.com/catalog/errata.csp?isbn=9781491927212

978-1-491-92728-1 [M]
📄 Page
3
Contents

Preface xv

1. Why Rust? 1
Type Safety 3

2. A Tour of Rust 7
Downloading and Installing Rust 7
A Simple Function 10
Writing and Running Unit Tests 11
Handling Command-Line Arguments 12
A Simple Web Server 17
Concurrency 23
What the Mandelbrot Set Actually Is 24
Parsing Pair Command-Line Arguments 28
Mapping from Pixels to Complex Numbers 31
Plotting the Set 32
Writing Image Files 33
A Concurrent Mandelbrot Program 35
Running the Mandelbrot Plotter 40
Safety Is Invisible 41

3. Basic Types 43
Machine Types 46
Integer Types 47
Floating-Point Types 50
The bool Type 51
Characters 52
Tuples 54
📄 Page
4
Pointer Types 55
References 56
Boxes 56
Raw Pointers 57
Arrays, Vectors, and Slices 57
Arrays 58
Vectors 59
Building Vectors Element by Element 62
Slices 62
String Types 64
String Literals 64
Byte Strings 65
Strings in Memory 65
String 67
Using Strings 68
Other String-Like Types 68
Beyond the Basics 69

4. Ownership 71
Ownership 73
Moves 77
More Operations That Move 82
Moves and Control Flow 84
Moves and Indexed Content 84
Copy Types: The Exception to Moves 86
Rc and Arc: Shared Ownership 90

5. References 93
References as Values 97
Rust References Versus C++ References 97
Assigning References 98
References to References 99
Comparing References 99
References Are Never Null 100
Borrowing References to Arbitrary Expressions 100
References to Slices and Trait Objects 101
Reference Safety 101
Borrowing a Local Variable 101
Receiving References as Parameters 105
Passing References as Arguments 107
Returning References 107
Structs Containing References 109
📄 Page
5
Distinct Lifetime Parameters 111
Omitting Lifetime Parameters 112
Sharing Versus Mutation 114
Taking Arms Against a Sea of Objects 121

6. Expressions 123
An Expression Language 123
Blocks and Semicolons 124
Declarations 126
if and match 127
if let 129
Loops 130
return Expressions 132
Why Rust Has loop 133
Function and Method Calls 134
Fields and Elements 135
Reference Operators 137
Arithmetic, Bitwise, Comparison, and Logical Operators 137
Assignment 138
Type Casts 139
Closures 140
Precedence and Associativity 140
Onward 142

7. Error Handling 145
Panic 145
Unwinding 146
Aborting 147
Result 148
Catching Errors 148
Result Type Aliases 150
Printing Errors 150
Propagating Errors 152
Working with Multiple Error Types 153
Dealing with Errors That “Can’t Happen” 155
Ignoring Errors 156
Handling Errors in main() 156
Declaring a Custom Error Type 157
Why Results? 158

8. Crates and Modules 161
Crates 161
📄 Page
6
Build Profiles 164
Modules 165
Modules in Separate Files 166
Paths and Imports 167
The Standard Prelude 169
Items, the Building Blocks of Rust 170
Turning a Program into a Library 172
The src/bin Directory 174
Attributes 175
Tests and Documentation 178
Integration Tests 180
Documentation 181
Doc-Tests 182
Specifying Dependencies 185
Versions 186
Cargo.lock 187
Publishing Crates to crates.io 188
Workspaces 190
More Nice Things 191

9. Structs 193
Named-Field Structs 193
Tuple-Like Structs 196
Unit-Like Structs 197
Struct Layout 197
Defining Methods with impl 198
Generic Structs 202
Structs with Lifetime Parameters 203
Deriving Common Traits for Struct Types 204
Interior Mutability 205

10. Enums and Patterns 211
Enums 212
Enums with Data 214
Enums in Memory 215
Rich Data Structures Using Enums 216
Generic Enums 218
Patterns 221
Literals, Variables, and Wildcards in Patterns 223
Tuple and Struct Patterns 225
Reference Patterns 226
Matching Multiple Possibilities 229
📄 Page
7
Pattern Guards 229
@ patterns 230
Where Patterns Are Allowed 230
Populating a Binary Tree 232
The Big Picture 233

11. Traits and Generics 235
Using Traits 237
Trait Objects 238
Trait Object Layout 239
Generic Functions 240
Which to Use 243
Defining and Implementing Traits 245
Default Methods 246
Traits and Other People’s Types 247
Self in Traits 249
Subtraits 250
Static Methods 251
Fully Qualified Method Calls 252
Traits That Define Relationships Between Types 253
Associated Types (or How Iterators Work) 254
Generic Traits (or How Operator Overloading Works) 257
Buddy Traits (or How rand::random() Works) 258
Reverse-Engineering Bounds 260
Conclusion 263

12. Operator Overloading 265
Arithmetic and Bitwise Operators 266
Unary Operators 268
Binary Operators 269
Compound Assignment Operators 270
Equality Tests 272
Ordered Comparisons 275
Index and IndexMut 277
Other Operators 280

13. Utility Traits 281
Drop 282
Sized 285
Clone 287
Copy 289
Deref and DerefMut 289
📄 Page
8
Default
AsRef and AsMut 294
Borrow and BorrowMut 296
From and Into 297
ToOwned 300
Borrow and ToOwned at Work: The Humble Cow 300

14. Closures 303
Capturing Variables 305
Closures That Borrow 306
Closures That Steal 306
Function and Closure Types 308
Closure Performance 310
Closures and Safety 311
Closures That Kill 312
FnOnce 312
FnMut 314
Callbacks 316
Using Closures Effectively 319

15. Iterators 321
The Iterator and IntoIterator Traits 322
Creating Iterators 324
iter and iter_mut Methods 324
IntoIterator Implementations 325
drain Methods 327
Other Iterator Sources 328
Iterator Adapters 330
map and filter 330
filter_map and flat_map 332
scan 335
take and take_while 335
skip and skip_while 336
peekable 337
fuse 338
Reversible Iterators and rev 339
inspect 340
chain 341
enumerate 341
zip 342
by_ref 342
cloned 344
📄 Page
9
cycle 344
Consuming Iterators 345
Simple Accumulation: count, sum, product 345
max, min 346
max_by, min_by 346
max_by_key, min_by_key 347
Comparing Item Sequences 347
any and all 348
position, rposition, and ExactSizeIterator 348
fold 349
nth 350
last 350
find 351
Building Collections: collect and FromIterator 351
The Extend Trait 353
partition 353
Implementing Your Own Iterators 354

16. Collections 359
Overview 360
Vec<T> 361
Accessing Elements 362
Iteration 364
Growing and Shrinking Vectors 364
Joining 367
Splitting 368
Swapping 370
Sorting and Searching 370
Comparing Slices 372
Random Elements 373
Rust Rules Out Invalidation Errors 373
VecDeque<T> 374
LinkedList<T> 376
BinaryHeap<T> 377
HashMap<K, V> and BTreeMap<K, V> 378
Entries 381
Map Iteration 383
HashSet<T> and BTreeSet<T> 384
Set Iteration 384
When Equal Values Are Different 385
Whole-Set Operations 385
Hashing 387
📄 Page
10
Using a Custom Hashing Algorithm 388
Beyond the Standard Collections 389

17. Strings and Text 391
Some Unicode Background 392
ASCII, Latin-1, and Unicode 392
UTF-8 392
Text Directionality 394
Characters (char) 394
Classifying Characters 395
Handling Digits 395
Case Conversion for Characters 396
Conversions to and from Integers 396
String and str 397
Creating String Values 398
Simple Inspection 398
Appending and Inserting Text 399
Removing Text 401
Conventions for Searching and Iterating 401
Patterns for Searching Text 402
Searching and Replacing 403
Iterating over Text 403
Trimming 406
Case Conversion for Strings 406
Parsing Other Types from Strings 406
Converting Other Types to Strings 407
Borrowing as Other Text-Like Types 408
Accessing Text as UTF-8 409
Producing Text from UTF-8 Data 409
Putting Off Allocation 410
Strings as Generic Collections 412
Formatting Values 413
Formatting Text Values 414
Formatting Numbers 415
Formatting Other Types 417
Formatting Values for Debugging 418
Formatting Pointers for Debugging 419
Referring to Arguments by Index or Name 419
Dynamic Widths and Precisions 420
Formatting Your Own Types 421
Using the Formatting Language in Your Own Code 423
Regular Expressions 424
📄 Page
11
Basic Regex Use 425
Building Regex Values Lazily 426
Normalization 427
Normalization Forms 428
The unicode-normalization Crate 429

18. Input and Output 431
Readers and Writers 432
Readers 433
Buffered Readers 435
Reading Lines 436
Collecting Lines 439
Writers 439
Files 441
Seeking 441
Other Reader and Writer Types 442
Binary Data, Compression, and Serialization 444
Files and Directories 445
OsStr and Path 445
Path and PathBuf Methods 447
Filesystem Access Functions 449
Reading Directories 450
Platform-Specific Features 451
Networking 453

19. Concurrency 457
Fork-Join Parallelism 459
spawn and join 461
Error Handling Across Threads 463
Sharing Immutable Data Across Threads 464
Rayon 466
Revisiting the Mandelbrot Set 468
Channels 470
Sending Values 472
Receiving Values 475
Running the Pipeline 476
Channel Features and Performance 478
Thread Safety: Send and Sync 479
Piping Almost Any Iterator to a Channel 482
Beyond Pipelines 483
Shared Mutable State 484
What Is a Mutex? 484
📄 Page
12
Mutex<T> 486
mut and Mutex 488
Why Mutexes Are Not Always a Good Idea 488
Deadlock 489
Poisoned Mutexes 490
Multi-producer Channels Using Mutexes 490
Read/Write Locks (RwLock<T>) 491
Condition Variables (Condvar) 493
Atomics 494
Global Variables 496
What Hacking Concurrent Code in Rust Is Like 497

20. Macros 499
Macro Basics 500
Basics of Macro Expansion 501
Unintended Consequences 503
Repetition 505
Built-In Macros 507
Debugging Macros 508
The json! Macro 509
Fragment Types 510
Recursion in Macros 513
Using Traits with Macros 514
Scoping and Hygiene 516
Importing and Exporting Macros 519
Avoiding Syntax Errors During Matching 521
Beyond macro_rules! 522

21. Unsafe Code 525
Unsafe from What? 526
Unsafe Blocks 527
Example: An Efficient ASCII String Type 529
Unsafe Functions 531
Unsafe Block or Unsafe Function? 533
Undefined Behavior 533
Unsafe Traits 536
Raw Pointers 538
Dereferencing Raw Pointers Safely 540
Example: RefWithFlag 541
Nullable Pointers 544
Type Sizes and Alignments 544
Pointer Arithmetic 545
📄 Page
13
Moving into and out of Memory 546
Example: GapBuffer 550
Panic Safety in Unsafe Code 556
Foreign Functions: Calling C and C++ from Rust 557
Finding Common Data Representations 558
Declaring Foreign Functions and Variables 561
Using Functions from Libraries 562
A Raw Interface to libgit2 566
A Safe Interface to libgit2 572
Conclusion 583

Index 585
📄 Page
14
Preface

Rust is a language for systems programming.

This bears some explanation these days, as systems programming is unfamiliar to most working programmers. Yet it underlies everything we do.

You close your laptop. The operating system detects this, suspends all the running programs, turns off the screen, and puts the computer to sleep. Later, you open the laptop: the screen and other components are powered up again, and each program is able to pick up where it left off. We take this for granted. But systems programmers wrote a lot of code to make that happen.

Systems programming is for:

• Operating systems
• Device drivers of all kinds
• Filesystems
• Databases
• Code that runs in very cheap devices, or devices that must be extremely reliable
• Cryptography
• Media codecs (software for reading and writing audio, video, and image files)
• Media processing (for example, speech recognition or photo editing software)
• Memory management (for example, implementing a garbage collector)
• Text rendering (the conversion of text and fonts into pixels)
• Implementing higher-level programming languages (like JavaScript and Python)
• Networking
• Virtualization and software containers
📄 Page
15
• Scientific simulations
• Games

In short, systems programming is resource-constrained programming. It is programming when every byte and every CPU cycle counts.

The amount of systems code involved in supporting a basic app is staggering.

This book will not teach you systems programming. In fact, this book covers many details of memory management that might seem unnecessarily abstruse at first, if you haven’t already done some systems programming on your own. But if you are a seasoned systems programmer, you’ll find that Rust is something exceptional: a new tool that eliminates major, well-understood problems that have plagued a whole industry for decades.

Who Should Read This Book

If you’re already a systems programmer, and you’re ready for an alternative to C++, this book is for you. If you’re an experienced developer in any programming language, whether that’s C#, Java, Python, JavaScript, or something else, this book is for you too.

However, you don’t just need to learn Rust. To get the most out of the language, you also need to gain some experience with systems programming. We recommend reading this book while also implementing some systems programming side projects in Rust. Build something you’ve never built before, something that takes advantage of Rust’s speed, concurrency, and safety. The list of topics at the beginning of this preface should give you some ideas.

Why We Wrote This Book

We set out to write the book we wished we had when we started learning Rust. Our goal was to tackle the big, new concepts in Rust up front and head-on, presenting them clearly and in depth so as to minimize learning by trial and error.

Navigating This Book

The first two chapters of this book introduce Rust and provide a brief tour before we move on to the fundamental data types in Chapter 3. Chapters 4 and 5 address the core concepts of ownership and references. We recommend reading these first five chapters through in order.
Chapters 6 through 10 cover the basics of the language: expressions (Chapter 6), error handling (Chapter 7), crates and modules (Chapter 8), structs (Chapter 9), and
📄 Page
16
enums and patterns (Chapter 10). It’s all right to skim a little here, but don’t skip the chapter on error handling. Trust us.

Chapter 11 covers traits and generics, the last two big concepts you need to know. Traits are like interfaces in Java or C#. They’re also the main way Rust supports integrating your types into the language itself. Chapter 12 shows how traits support operator overloading, and Chapter 13 covers many more utility traits.

Understanding traits and generics unlocks the rest of the book. Closures and iterators, two key power tools that you won’t want to miss, are covered in Chapters 14 and 15, respectively. You can read the remaining chapters in any order, or just dip into them as needed. They cover the rest of the language: collections (Chapter 16), strings and text (Chapter 17), input and output (Chapter 18), concurrency (Chapter 19), macros (Chapter 20), and unsafe code (Chapter 21).

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold
    Shows commands or other text that should be typed literally by the user.

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

This icon signifies a tip or suggestion.

This icon signifies a general note.
📄 Page
17
This icon indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/oreillymedia/programming_rust.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Programming Rust by Jim Blandy and Jason Orendorff (O’Reilly). Copyright 2018 Jim Blandy and Jason Orendorff, 978-1-491-92728-1.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.
📄 Page
18
CHAPTER 1
Why Rust?

In certain contexts—for example the context Rust is targeting—being 10x or even 2x faster than the competition is a make-or-break thing. It decides the fate of a system in the market, as much as it would in the hardware market.
—Graydon Hoare

All computers are now parallel... Parallel programming is programming.
—Michael McCool et al., Structured Parallel Programming

TrueType parser flaw used by nation-state attacker for surveillance; all software is security-sensitive.
—Andy Wingo

Systems programming languages have come a long way in the 50 years since we started using high-level languages to write operating systems, but two problems in particular have proven difficult to crack:

• It’s difficult to write secure code. It’s especially difficult to manage memory correctly in C and C++. Users have been suffering with the consequences for decades, in the form of security holes dating back at least as far as the 1988 Morris worm.
• It’s very difficult to write multithreaded code, which is the only way to exploit the abilities of modern machines. Even experienced programmers approach threaded code with caution: concurrency can introduce broad new classes of bugs and make ordinary bugs much harder to reproduce.

Enter Rust: a safe, concurrent language with the performance of C and C++.
📄 Page
19
Rust is a new systems programming language developed by Mozilla and a community of contributors. Like C and C++, Rust gives developers fine control over the use of memory, and maintains a close relationship between the primitive operations of the language and those of the machines it runs on, helping developers anticipate their code’s costs. Rust shares the ambitions Bjarne Stroustrup articulates for C++ in his paper “Abstraction and the C++ Machine Model”:

    In general, C++ implementations obey the zero-overhead principle: What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.

To these Rust adds its own goals of memory safety and trustworthy concurrency.

The key to meeting all these promises is Rust’s novel system of ownership, moves, and borrows, checked at compile time and carefully designed to complement Rust’s flexible static type system. The ownership system establishes a clear lifetime for each value, making garbage collection unnecessary in the core language, and enabling sound but flexible interfaces for managing other sorts of resources like sockets and file handles. Moves transfer values from one owner to another, and borrowing lets code use a value temporarily without affecting its ownership. Since many programmers will have never encountered these features in this form before, we explain them in detail in Chapters 4 and 5.

These same ownership rules also form the foundation of Rust’s trustworthy concurrency model. Most languages leave the relationship between a mutex and the data it’s meant to protect to the comments; Rust can actually check at compile time that your code locks the mutex while it accesses the data. Most languages admonish you to be sure not to use a data structure yourself after you’ve given it to another thread; Rust checks that you don’t. Rust is able to prevent data races at compile time.
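The ideas in the last two paragraphs can be sketched in a few lines. This is only an illustration, not an example from the book: the function names total, consume, and parallel_count are hypothetical, and the sketch assumes nothing beyond the standard library's Arc, Mutex, and thread::spawn. It shows a borrow, a move, and a counter that can be reached only by locking the mutex that owns it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Borrows the values temporarily; the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Takes ownership of the vector; the caller can no longer use it.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

/// Increments a shared counter from several threads. The integer lives
/// inside the `Mutex`, so the only way to reach it is through `lock()`.
fn parallel_count(threads: usize, per_thread: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // `move` transfers this clone of the `Arc` into the new thread.
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *counter.lock().unwrap();
    result
}

fn main() {
    let v = vec![1, 2, 3];
    assert_eq!(total(&v), 6); // borrow: `v` is still usable afterward
    let moved = v; // move: ownership passes to `moved`
    // println!("{:?}", v); // would not compile: `v` was moved
    assert_eq!(consume(moved), 3);
    assert_eq!(parallel_count(4, 1000), 4000);
}
```

Uncommenting the println! line should make the compiler reject the program, because v no longer owns the vector at that point. Likewise, there is no way to write to the counter without first calling lock(), which is the compile-time check on the mutex-to-data relationship that the text describes.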
Rust is not really an object-oriented language, although it has some object-oriented characteristics. Rust is not a functional language, although it does tend to make the influences on a computation’s result more explicit, as functional languages do. Rust resembles C and C++ to an extent, but many idioms from those languages don’t apply, so typical Rust code does not deeply resemble C or C++ code. It’s probably best to reserve judgement about what sort of language Rust is, and see what you think once you’ve become comfortable with the language.

To get feedback on the design in a real-world setting, Mozilla has developed Servo, a new web browser engine, in Rust. Servo’s needs and Rust’s goals are well matched: a browser must perform well and handle untrusted data securely. Servo uses Rust’s safe concurrency to put the full machine to work on tasks that would be impractical to parallelize in C or C++. In fact, Servo and Rust have grown up together, with Servo using the latest new language features, and Rust evolving based on feedback from Servo’s developers.
📄 Page
20
Type Safety

Rust is a type-safe language. But what do we mean by “type safety”? Safety sounds good, but what exactly are we being kept safe from?

Here’s the definition of undefined behavior from the 1999 standard for the C programming language, known as C99:

    undefined behavior
        behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

Consider the following C program:

    int main(int argc, char **argv) {
        unsigned long a[1];
        a[3] = 0x7ffff7b36cebUL;
        return 0;
    }

According to C99, because this program accesses an element off the end of the array a, its behavior is undefined, meaning that it can do anything whatsoever. When we ran this program on Jim’s laptop, it produced the following output:

    undef: Error: .netrc file is readable by others.
    undef: Remove password or make file unreadable by others.

Then it crashed. Jim’s laptop doesn’t even have a .netrc file. If you try it yourself, it will probably do something entirely different.

The machine code the C compiler generated for this main function happens to place the array a on the stack three words before the return address, so storing 0x7ffff7b36cebUL in a[3] changes poor main’s return address to point into the midst of code in the C standard library that consults one’s .netrc file for a password. When main returns, execution resumes not in main’s caller, but at the machine code for these lines from the library:

    warnx(_("Error: .netrc file is readable by others."));
    warnx(_("Remove password or make file unreadable by others."));
    goto bad;

In allowing an array reference to affect the behavior of a subsequent return statement, the C compiler is fully standards-compliant. An undefined operation doesn’t just produce an unspecified result: it is allowed to cause the program to do anything at all.

The C99 standard grants the compiler this carte blanche to allow it to generate faster code.
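For contrast with the C program above, here is a minimal sketch of how an analogous out-of-range access behaves in safe Rust. It is an illustration under stated assumptions, not code from the book: the helper names checked_read and indexed_write are hypothetical, and std::panic::catch_unwind is used only so the sketch can observe the panic and keep running.

```rust
use std::panic;

/// Reads element `i` of a one-element array, returning `None` instead of
/// touching memory outside the array.
fn checked_read(a: &[u64; 1], i: usize) -> Option<u64> {
    a.get(i).copied()
}

/// Writes to element `i` using ordinary indexing. An out-of-range index
/// panics at run time rather than overwriting neighboring memory.
fn indexed_write(a: &mut [u64; 1], i: usize, value: u64) {
    a[i] = value;
}

fn main() {
    let mut a = [0u64; 1];
    assert_eq!(checked_read(&a, 0), Some(0));
    assert_eq!(checked_read(&a, 3), None);

    indexed_write(&mut a, 0, 7);
    assert_eq!(a[0], 7);

    // The out-of-bounds write is reported as a panic; the closure works on
    // its own copy of the array, and nothing outside that copy is modified.
    let result = panic::catch_unwind(move || {
        let mut b = a;
        indexed_write(&mut b, 3, 0x7fff_f7b3_6ceb);
    });
    assert!(result.is_err());
}
```

In both helpers the index is checked against the array's length before any memory is touched, so the program's behavior remains defined even when the index is wrong.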
Rather than making the compiler responsible for detecting and handling odd