Statistics
9
Views
0
Downloads
0
Donations
Support
Share
Uploader

高宏飞

Shared on 2026-03-28

AuthorRemzi H. Arpaci-Dusseau, Andrea C. Arpaci-Dusseau

A book about modern operating systems. Topics are broken down into three major conceptual pieces: Virtualization, Concurrency, and Persistence. Includes all major components of modern systems including scheduling, virtual memory management, disk subsystems and I/O, file systems, and even a short introduction to distributed systems.

Tags
No tags
ISBN: 1105979121
Publisher: Arpaci-Dusseau
Publish Year: 2015
Language: 英文
Pages: 643
File Format: PDF
File Size: 4.0 MB
Support Statistics
¥.00 · 0times
Text Preview (First 20 pages)
Registered users can read the full content for free

Register as a Gaohf Library member to read the complete e-book online for free and enjoy a better reading experience.

OPERATING SYSTEMS THREE EASY PIECES REMZI H. ARPACI-DUSSEAU ANDREA C. ARPACI-DUSSEAU UNIVERSITY OF WISCONSIN–MADISON
(This page has no text content)
. . c© 2014 by Arpaci-Dusseau Books, Inc. All rights reserved
(This page has no text content)
i To Vedat S. Arpaci, a lifelong inspiration c© 2014, ARPACI-DUSSEAU THREE EASY PIECES
(This page has no text content)
Preface To Everyone Welcome to this book! We hope you’ll enjoy reading it as much as we enjoyed writing it. The book is called Operating Systems: Three Easy Pieces, and the title is obviously an homage to one of the greatest sets of lecture notes ever created, by one Richard Feynman on the topic of Physics [F96]. While this book will undoubt- edly fall short of the high standard set by that famous physicist, perhaps it will be good enough for you in your quest to understand what operating systems (and more generally, systems) are all about. The three easy pieces refer to the three major thematic elements the book is organized around: virtualization, concurrency, and persistence. In discussing these concepts, we’ll end up discussing most of the important things an operating system does; hopefully, you’ll also have some fun along the way. Learning new things is fun, right? At least, it should be. Each major concept is divided into a set of chapters, most of which present a particular problem and then show how to solve it. The chapters are short, and try (as best as possible) to reference the source material where the ideas really came from. One of our goals in writing this book is to make the paths of history as clear as possible, as we think that helps a student understand what is, what was, and what will be more clearly. In this case, seeing how the sausage was made is nearly as important as understanding what the sausage is good for1 . There are a couple devices we use throughout the book which are probably worth introducing here. The first is the crux of the problem. Anytime we are trying to solve a problem, we first try to state what the most important issue is; such a crux of the problem is explicitly called out in the text, and hopefully solved via the techniques, algorithms, and ideas presented in the rest of the text. There are also numerous asides and tips throughout the text, adding a little color to the mainline presentation. Asides tend to discuss something relevant (but perhaps not essential) to the main text; tips tend to be general lessons that can be applied to systems you build. An index at the end of the book lists all of these tips and asides (as well as cruces, the odd plural of crux) for your convenience. We use one of the oldest didactic methods, the dialogue, throughout the book, as a way of presenting some of the material in a different light. These are used to introduce the major thematic concepts (in a peachy way, as we will see), as well as to review material every now and then. They are also a chance to write in a more 1Hint: eating! Or if you’re a vegetarian, running away from. iii
iv humorous style. Whether you find them useful, or humorous, well, that’s another matter entirely. At the beginning of each major section, we’ll first present an abstraction that an operating system provides, and then work in subsequent chapters on the mecha- nisms, policies, and other support needed to provide the abstraction. Abstractions are fundamental to all aspects of Computer Science, so it is perhaps no surprise that they are also essential in operating systems. Throughout the chapters, we try to use real code (not pseudocode) where pos- sible, so for virtually all examples, you should be able to type them up yourself and run them. Running real code on real systems is the best way to learn about operating systems, so we encourage you to do so when you can. In various parts of the text, we have sprinkled in a few homeworks to ensure that you are understanding what is going on. Many of these homeworks are little simulations of pieces of the operating system; you should download the home- works, and run them to quiz yourself. The homework simulators have the follow- ing feature: by giving them a different random seed, you can generate a virtually infinite set of problems; the simulators can also be told to solve the problems for you. Thus, you can test and re-test yourself until you have achieved a good level of understanding. The most important addendum to this book is a set of projects in which you learn about how real systems work by designing, implementing, and testing your own code. All projects (as well as the code examples, mentioned above) are in the C programming language [KR88]; C is a simple and powerful language that underlies most operating systems, and thus worth adding to your tool-chest of languages. Two types of projects are available (see the online appendix for ideas). The first are systems programming projects; these projects are great for those who are new to C and UNIX and want to learn how to do low-level C programming. The second type are based on a real operating system kernel developed at MIT called xv6 [CK+08]; these projects are great for students that already have some C and want to get their hands dirty inside the OS. At Wisconsin, we’ve run the course in three different ways: either all systems programming, all xv6 programming, or a mix of both. OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG
v To Educators If you are an instructor or professor who wishes to use this book, please feel free to do so. As you may have noticed, they are free and available on-line from the following web page: http://www.ostep.org You can also purchase a printed copy from lulu.com. Look for it on the web page above. The (current) proper citation for the book is as follows: Operating Systems: Three Easy Pieces Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau Arpaci-Dusseau Books, Inc. May, 2014 (Version 0.8) http://www.ostep.org The course divides fairly well across a 15-week semester, in which you can cover most of the topics within at a reasonable level of depth. Cramming the course into a 10-week quarter probably requires dropping some detail from each of the pieces. There are also a few chapters on virtual machine monitors, which we usually squeeze in sometime during the semester, either right at end of the large section on virtualization, or near the end as an aside. One slightly unusual aspect of the book is that concurrency, a topic at the front of many OS books, is pushed off herein until the student has built an understand- ing of virtualization of the CPU and of memory. In our experience in teaching this course for nearly 15 years, students have a hard time understanding how the concurrency problem arises, or why they are trying to solve it, if they don’t yet un- derstand what an address space is, what a process is, or why context switches can occur at arbitrary points in time. Once they do understand these concepts, how- ever, introducing the notion of threads and the problems that arise due to them becomes rather easy, or at least, easier. You may have noticed there are no slides that go hand-in-hand with the book. The major reason for this omission is that we believe in the most old-fashioned of teaching methods: chalk and a blackboard. Thus, when we teach the course, we come to class with a few major ideas and examples in mind and use the board to present them; handouts and live code demos sprinkled are also useful. In our experience, using too many slides encourages students to “check out” of lecture (and log into facebook.com), as they know the material is there for them to digest later; using the blackboard makes lecture a “live” viewing experience and thus (hopefully) more interactive, dynamic, and enjoyable for the students in your class. If you’d like a copy of the notes we use in preparation for class, please drop us an email. We have already shared them with many others around the world. One last request: if you use the free online chapters, please just link to them, instead of making a local copy. This helps us track usage (over 1 million chapters downloaded in the past few years!) and also ensures students get the latest and greatest version. c© 2014, ARPACI-DUSSEAU THREE EASY PIECES
vi To Students If you are a student reading this book, thank you! It is an honor for us to provide some material to help you in your pursuit of knowledge about operating systems. We both think back fondly towards some textbooks of our undergraduate days (e.g., Hennessy and Patterson [HP90], the classic book on computer architec- ture) and hope this book will become one of those positive memories for you. You may have noticed this book is free and available online. There is one major reason for this: textbooks are generally too expensive. This book, we hope, is the first of a new wave of free materials to help those in pursuit of their education, regardless of which part of the world they come from or how much they are willing to spend for a book. Failing that, it is one free book, which is better than none. We also hope, where possible, to point you to the original sources of much of the material in the book: the great papers and persons who have shaped the field of operating systems over the years. Ideas are not pulled out of the air; they come from smart and hard-working people (including numerous Turing-award winners2), and thus we should strive to celebrate those ideas and people where possible. In doing so, we hopefully can better understand the revolutions that have taken place, instead of writing texts as if those thoughts have always been present [K62]. Further, perhaps such references will encourage you to dig deeper on your own; reading the famous papers of our field is certainly one of the best ways to learn. 2The Turing Award is the highest award in Computer Science; it is like the Nobel Prize, except that you have never heard of it. OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG
vii Acknowledgments This section will contain thanks to those who helped us put the book together. The important thing for now: your name could go here! But, you have to help. So send us some feedback and help debug this book. And you could be famous! Or, at least, have your name in some book. The people who have helped so far include: Abhirami Senthilkumaran*, Adam Drescher, Adam Eggum, Ahmed Fikri*, Ajaykrishna Raghavan, Alex Wyler, Anand Mundada, B. Brahmananda Reddy (Minnesota), Bala Subrahmanyam Kambala, Benita Bose, Biswajit Mazumder (Clemson), Bobby Jack, Björn Lindberg, Bren- nan Payne, Brian Kroth, Cara Lauritzen, Charlotte Kissinger, Chien-Chung Shen (Delaware)*, Cody Hanson, Dan Soendergaard (U. Aarhus), David Hanle (Grin- nell), Deepika Muthukumar, Dorian Arnold (New Mexico), Dustin Metzler, Dustin Passofaro, Emily Jacobson, Emmett Witchel (Texas), Ernst Biersack (France), Finn Kuusisto*, Guilherme Baptista, Hamid Reza Ghasemi, Henry Abbey, Hrishikesh Amur, Huanchen Zhang*, Jake Gillberg, James Perry (U. Michigan-Dearborn)*, Jay Lim, Jerod Weinman (Grinnell), Joel Sommers (Colgate), Jonathan Perry (MIT), Jun He, Karl Wallinger, Kaushik Kannan, Kevin Liu*, Lei Tian (U. Nebraska-Lincoln), Leslie Schultz, Lihao Wang, Martha Ferris, Masashi Kishikawa (Sony), Matt Rei- choff, Matty Williams, Meng Huang, Mike Griepentrog, Ming Chen (Stonybrook), Mohammed Alali (Delaware), Murugan Kandaswamy, Natasha Eilbert, Nathan Dipiazza, Nathan Sullivan, Neeraj Badlani (N.C. State), Nelson Gomez, Nghia Huynh (Texas), Patricio Jara, Radford Smith, Ripudaman Singh, Ross Aiken, Rus- lan Kiselev, Ryland Herrick, Samer Al-Kiswany, Sandeep Ummadi (Minnesota), Satish Chebrolu (NetApp), Satyanarayana Shanmugam*, Seth Pollen, Sharad Punuganti, Shreevatsa R., Sivaraman Sivaraman*, Srinivasan Thirunarayanan*, Suriyhaprakhas Balaram Sankari, Sy Jin Cheah, Thomas Griebel, Tongxin Zheng, Tony Adkins, Torin Rudeen (Princeton), Tuo Wang, Varun Vats, Xiang Peng, Xu Di, Yue Zhuo (Texas A&M), Yufui Ren, Zef RosnBrick, Zuyu Zhang. Special thanks to those marked with an asterisk above, who have gone above and beyond in their sugges- tions for improvement. Special thanks to Professor Joe Meehean (Lynchburg) for his detailed notes on each chapter, to Professor Jerod Weinman (Grinnell) and his entire class for their incredible booklets, and to Professor Chien-Chung Shen (Delaware) for his invalu- able and detailed reading and comments about the book. All three have helped these authors immeasurably in the refinement of the materials herein. Also, many thanks to the hundreds of students who have taken 537 over the years. In particular, the Fall ’08 class who encouraged the first written form of these notes (they were sick of not having any kind of textbook to read – pushy students!), and then praised them enough for us to keep going (including one hi- larious “ZOMG! You should totally write a textbook!” comment in our course evaluations that year). A great debt of thanks is also owed to the brave few who took the xv6 project lab course, much of which is now incorporated into the main 537 course. From Spring ’09: Justin Cherniak, Patrick Deline, Matt Czech, Tony Gregerson, Michael Griepentrog, Tyler Harter, Ryan Kroiss, Eric Radzikowski, Wesley Reardan, Rajiv Vaidyanathan, and Christopher Waclawik. From Fall ’09: Nick Bearson, Aaron Brown, Alex Bird, David Capel, Keith Gould, Tom Grim, Jeffrey Hugo, Brandon Johnson, John Kjell, Boyan Li, James Loethen, Will McCardell, Ryan Szaroletta, Si- c© 2014, ARPACI-DUSSEAU THREE EASY PIECES
viii mon Tso, and Ben Yule. From Spring ’10: Patrick Blesi, Aidan Dennis-Oehling, Paras Doshi, Jake Friedman, Benjamin Frisch, Evan Hanson, Pikkili Hemanth, Michael Jeung, Alex Langenfeld, Scott Rick, Mike Treffert, Garret Staus, Brennan Wall, Hans Werner, Soo-Young Yang, and Carlos Griffin (almost). Although they do not directly help with the book, our graduate students have taught us much of what we know about systems. We talk with them regularly while they are at Wisconsin, but they do all the real work – and by telling us about what they are doing, we learn new things every week. This list includes the fol- lowing collection of current and former students with whom we published pa- pers; an asterisk marks those who received a Ph.D. under our guidance: Abhishek Rajimwale, Ao Ma, Brian Forney, Chris Dragga, Deepak Ramamurthi, Florentina Popovici*, Haryadi S. Gunawi*, James Nugent, John Bent*, Lanyue Lu, Lakshmi Bairavasundaram*, Laxman Visampalli, Leo Arulraj, Meenali Rungta, Muthian Si- vathanu*, Nathan Burnett*, Nitin Agrawal*, Sriram Subramanian*, Stephen Todd Jones*, Swaminathan Sundararaman*, Swetha Krishnan, Thanh Do, Thanumalayan S. Pillai, Timothy Denehy*, Tyler Harter, Venkat Venkataramani, Vijay Chidambaram, Vijayan Prabhakaran*, Yiying Zhang*, Yupu Zhang*, Zev Weiss. A final debt of gratitude is also owed to Aaron Brown, who first took this course many years ago (Spring ’09), then took the xv6 lab course (Fall ’09), and finally was a graduate teaching assistant for the course for two years or so (Fall ’10 through Spring ’12). His tireless work has vastly improved the state of the projects (par- ticularly those in xv6 land) and thus has helped better the learning experience for countless undergraduates and graduates here at Wisconsin. As Aaron would say (in his usual succinct manner): “Thx.” OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG
ix Final Words Yeats famously said “Education is not the filling of a pail but the lighting of a fire.” He was right but wrong at the same time3. You do have to “fill the pail” a bit, and these notes are certainly here to help with that part of your education; after all, when you go to interview at Google, and they ask you a trick question about how to use semaphores, it might be good to actually know what a semaphore is, right? But Yeats’s larger point is obviously on the mark: the real point of education is to get you interested in something, to learn something more about the subject matter on your own and not just what you have to digest to get a good grade in some class. As one of our fathers (Remzi’s dad, Vedat Arpaci) used to say, “Learn beyond the classroom”. We created these notes to spark your interest in operating systems, to read more about the topic on your own, to talk to your professor about all the exciting re- search that is going on in the field, and even to get involved with that research. It is a great field(!), full of exciting and wonderful ideas that have shaped computing history in profound and important ways. And while we understand this fire won’t light for all of you, we hope it does for many, or even a few. Because once that fire is lit, well, that is when you truly become capable of doing something great. And thus the real point of the educational process: to go forth, to study many new and fascinating topics, to learn, to mature, and most importantly, to find something that lights a fire for you. Andrea and Remzi Married couple Professors of Computer Science at the University of Wisconsin Chief Lighters of Fires, hopefully4 3If he actually said this; as with many famous quotes, the history of this gem is murky. 4If this sounds like we are admitting some past history as arsonists, you are probably missing the point. Probably. If this sounds cheesy, well, that’s because it is, but you’ll just have to forgive us for that. c© 2014, ARPACI-DUSSEAU THREE EASY PIECES
x References [CK+08] “The xv6 Operating System” Russ Cox, Frans Kaashoek, Robert Morris, Nickolai Zeldovich From: http://pdos.csail.mit.edu/6.828/2008/index.html xv6 was developed as a port of the original UNIX version 6 and represents a beautiful, clean, and simple way to understand a modern operating system. [F96] “Six Easy Pieces: Essentials Of Physics Explained By Its Most Brilliant Teacher” Richard P. Feynman Basic Books, 1996 This book reprints the six easiest chapters of Feynman’s Lectures on Physics, from 1963. If you like Physics, it is a fantastic read. [HP90] “Computer Architecture a Quantitative Approach” (1st ed.) David A. Patterson and John L. Hennessy Morgan-Kaufman, 1990 A book that encouraged each of us at our undergraduate institutions to pursue graduate studies; we later both had the pleasure of working with Patterson, who greatly shaped the foundations of our research careers. [KR88] “The C Programming Language” Brian Kernighan and Dennis Ritchie Prentice-Hall, April 1988 The C programming reference that everyone should have, by the people who invented the language. [K62] “The Structure of Scientific Revolutions” Thomas S. Kuhn University of Chicago Press, 1962 A great and famous read about the fundamentals of the scientific process. Mop-up work, anomaly, crisis, and revolution. We are mostly destined to do mop-up work, alas. OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG
Contents To Everyone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii To Educators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v To Students . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . vii Final Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x 1 A Dialogue on the Book 1 2 Introduction to Operating Systems 3 2.1 Virtualizing the CPU . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Virtualizing Memory . . . . . . . . . . . . . . . . . . . . . . 7 2.3 Concurrency . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.4 Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.5 Design Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.6 Some History . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 I Virtualization 21 3 A Dialogue on Virtualization 23 4 The Abstraction: The Process 25 4.1 The Abstraction: A Process . . . . . . . . . . . . . . . . . . . 26 4.2 Process API . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.3 Process Creation: A Little More Detail . . . . . . . . . . . . 28 4.4 Process States . . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.5 Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . 30 4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 xi
xii CONTENTS 5 Interlude: Process API 35 5.1 The fork() System Call . . . . . . . . . . . . . . . . . . . . 35 5.2 Adding wait() System Call . . . . . . . . . . . . . . . . . . 37 5.3 Finally, the exec() System Call . . . . . . . . . . . . . . . . 38 5.4 Why? Motivating the API . . . . . . . . . . . . . . . . . . . 39 5.5 Other Parts of the API . . . . . . . . . . . . . . . . . . . . . . 42 5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 6 Mechanism: Limited Direct Execution 45 6.1 Basic Technique: Limited Direct Execution . . . . . . . . . . 45 6.2 Problem #1: Restricted Operations . . . . . . . . . . . . . . . 46 6.3 Problem #2: Switching Between Processes . . . . . . . . . . 50 6.4 Worried About Concurrency? . . . . . . . . . . . . . . . . . 54 6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Homework (Measurement) . . . . . . . . . . . . . . . . . . . . . . 58 7 Scheduling: Introduction 59 7.1 Workload Assumptions . . . . . . . . . . . . . . . . . . . . . 59 7.2 Scheduling Metrics . . . . . . . . . . . . . . . . . . . . . . . 60 7.3 First In, First Out (FIFO) . . . . . . . . . . . . . . . . . . . . 60 7.4 Shortest Job First (SJF) . . . . . . . . . . . . . . . . . . . . . 62 7.5 Shortest Time-to-Completion First (STCF) . . . . . . . . . . 63 7.6 Round Robin . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 7.7 Incorporating I/O . . . . . . . . . . . . . . . . . . . . . . . . 66 7.8 No More Oracle . . . . . . . . . . . . . . . . . . . . . . . . . 68 7.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 8 Scheduling:The Multi-Level Feedback Queue 71 8.1 MLFQ: Basic Rules . . . . . . . . . . . . . . . . . . . . . . . 72 8.2 Attempt #1: How to Change Priority . . . . . . . . . . . . . 73 8.3 Attempt #2: The Priority Boost . . . . . . . . . . . . . . . . . 76 8.4 Attempt #3: Better Accounting . . . . . . . . . . . . . . . . . 77 8.5 Tuning MLFQ And Other Issues . . . . . . . . . . . . . . . . 78 8.6 MLFQ: Summary . . . . . . . . . . . . . . . . . . . . . . . . 79 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 9 Scheduling: Proportional Share 83 9.1 Basic Concept: Tickets Represent Your Share . . . . . . . . . 83 9.2 Ticket Mechanisms . . . . . . . . . . . . . . . . . . . . . . . 85 9.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 86 9.4 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 9.5 How To Assign Tickets? . . . . . . . . . . . . . . . . . . . . . 88 OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG
CONTENTS xiii 9.6 Why Not Deterministic? . . . . . . . . . . . . . . . . . . . . 88 9.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 10 Multiprocessor Scheduling (Advanced) 93 10.1 Background: Multiprocessor Architecture . . . . . . . . . . 94 10.2 Don’t Forget Synchronization . . . . . . . . . . . . . . . . . 96 10.3 One Final Issue: Cache Affinity . . . . . . . . . . . . . . . . 97 10.4 Single-Queue Scheduling . . . . . . . . . . . . . . . . . . . . 97 10.5 Multi-Queue Scheduling . . . . . . . . . . . . . . . . . . . . 99 10.6 Linux Multiprocessor Schedulers . . . . . . . . . . . . . . . 102 10.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 11 Summary Dialogue on CPU Virtualization 105 12 A Dialogue on Memory Virtualization 107 13 The Abstraction: Address Spaces 109 13.1 Early Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 109 13.2 Multiprogramming and Time Sharing . . . . . . . . . . . . . 110 13.3 The Address Space . . . . . . . . . . . . . . . . . . . . . . . 111 13.4 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 13.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 14 Interlude: Memory API 119 14.1 Types of Memory . . . . . . . . . . . . . . . . . . . . . . . . 119 14.2 The malloc() Call . . . . . . . . . . . . . . . . . . . . . . . 120 14.3 The free() Call . . . . . . . . . . . . . . . . . . . . . . . . 122 14.4 Common Errors . . . . . . . . . . . . . . . . . . . . . . . . . 122 14.5 Underlying OS Support . . . . . . . . . . . . . . . . . . . . . 125 14.6 Other Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 14.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 15 Mechanism: Address Translation 129 15.1 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 15.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 15.3 Dynamic (Hardware-based) Relocation . . . . . . . . . . . . 133 15.4 OS Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 15.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 16 Segmentation 141 c© 2014, ARPACI-DUSSEAU THREE EASY PIECES
xiv CONTENTS 16.1 Segmentation: Generalized Base/Bounds . . . . . . . . . . 141 16.2 Which Segment Are We Referring To? . . . . . . . . . . . . . 144 16.3 What About The Stack? . . . . . . . . . . . . . . . . . . . . . 145 16.4 Support for Sharing . . . . . . . . . . . . . . . . . . . . . . . 146 16.5 Fine-grained vs. Coarse-grained Segmentation . . . . . . . 147 16.6 OS Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 16.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 17 Free-Space Management 153 17.1 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 17.2 Low-level Mechanisms . . . . . . . . . . . . . . . . . . . . . 155 17.3 Basic Strategies . . . . . . . . . . . . . . . . . . . . . . . . . 163 17.4 Other Approaches . . . . . . . . . . . . . . . . . . . . . . . . 165 17.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 18 Paging: Introduction 169 18.1 Where Are Page Tables Stored? . . . . . . . . . . . . . . . . 172 18.2 What’s Actually In The Page Table? . . . . . . . . . . . . . . 173 18.3 Paging: Also Too Slow . . . . . . . . . . . . . . . . . . . . . 174 18.4 A Memory Trace . . . . . . . . . . . . . . . . . . . . . . . . . 176 18.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 19 Paging: Faster Translations (TLBs) 183 19.1 TLB Basic Algorithm . . . . . . . . . . . . . . . . . . . . . . 183 19.2 Example: Accessing An Array . . . . . . . . . . . . . . . . . 185 19.3 Who Handles The TLB Miss? . . . . . . . . . . . . . . . . . . 187 19.4 TLB Contents: What’s In There? . . . . . . . . . . . . . . . . 189 19.5 TLB Issue: Context Switches . . . . . . . . . . . . . . . . . . 190 19.6 Issue: Replacement Policy . . . . . . . . . . . . . . . . . . . 192 19.7 A Real TLB Entry . . . . . . . . . . . . . . . . . . . . . . . . 193 19.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 Homework (Measurement) . . . . . . . . . . . . . . . . . . . . . . 197 20 Paging: Smaller Tables 201 20.1 Simple Solution: Bigger Pages . . . . . . . . . . . . . . . . . 201 20.2 Hybrid Approach: Paging and Segments . . . . . . . . . . . 202 20.3 Multi-level Page Tables . . . . . . . . . . . . . . . . . . . . . 205 20.4 Inverted Page Tables . . . . . . . . . . . . . . . . . . . . . . 212 20.5 Swapping the Page Tables to Disk . . . . . . . . . . . . . . . 213 20.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG
CONTENTS xv Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 21 Beyond Physical Memory: Mechanisms 217 21.1 Swap Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 21.2 The Present Bit . . . . . . . . . . . . . . . . . . . . . . . . . . 219 21.3 The Page Fault . . . . . . . . . . . . . . . . . . . . . . . . . . 220 21.4 What If Memory Is Full? . . . . . . . . . . . . . . . . . . . . 221 21.5 Page Fault Control Flow . . . . . . . . . . . . . . . . . . . . 222 21.6 When Replacements Really Occur . . . . . . . . . . . . . . . 223 21.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 22 Beyond Physical Memory: Policies 227 22.1 Cache Management . . . . . . . . . . . . . . . . . . . . . . . 227 22.2 The Optimal Replacement Policy . . . . . . . . . . . . . . . 228 22.3 A Simple Policy: FIFO . . . . . . . . . . . . . . . . . . . . . 230 22.4 Another Simple Policy: Random . . . . . . . . . . . . . . . . 232 22.5 Using History: LRU . . . . . . . . . . . . . . . . . . . . . . . 233 22.6 Workload Examples . . . . . . . . . . . . . . . . . . . . . . . 234 22.7 Implementing Historical Algorithms . . . . . . . . . . . . . 237 22.8 Approximating LRU . . . . . . . . . . . . . . . . . . . . . . 238 22.9 Considering Dirty Pages . . . . . . . . . . . . . . . . . . . . 239 22.10 Other VM Policies . . . . . . . . . . . . . . . . . . . . . . . . 240 22.11 Thrashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 22.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244 23 The VAX/VMS Virtual Memory System 245 23.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . 245 23.2 Memory Management Hardware . . . . . . . . . . . . . . . 246 23.3 A Real Address Space . . . . . . . . . . . . . . . . . . . . . . 247 23.4 Page Replacement . . . . . . . . . . . . . . . . . . . . . . . . 249 23.5 Other Neat VM Tricks . . . . . . . . . . . . . . . . . . . . . . 250 23.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253 24 Summary Dialogue on Memory Virtualization 255 II Concurrency 259 25 A Dialogue on Concurrency 261 26 Concurrency: An Introduction 263 26.1 An Example: Thread Creation . . . . . . . . . . . . . . . . . 264 26.2 Why It Gets Worse: Shared Data . . . . . . . . . . . . . . . . 267 c© 2014, ARPACI-DUSSEAU THREE EASY PIECES
xvi CONTENTS 26.3 The Heart of the Problem: Uncontrolled Scheduling . . . . 269 26.4 The Wish For Atomicity . . . . . . . . . . . . . . . . . . . . . 271 26.5 One More Problem: Waiting For Another . . . . . . . . . . . 273 26.6 Summary: Why in OS Class? . . . . . . . . . . . . . . . . . . 273 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 27 Interlude: Thread API 279 27.1 Thread Creation . . . . . . . . . . . . . . . . . . . . . . . . . 279 27.2 Thread Completion . . . . . . . . . . . . . . . . . . . . . . . 280 27.3 Locks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283 27.4 Condition Variables . . . . . . . . . . . . . . . . . . . . . . . 285 27.5 Compiling and Running . . . . . . . . . . . . . . . . . . . . 287 27.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 28 Locks 291 28.1 Locks: The Basic Idea . . . . . . . . . . . . . . . . . . . . . . 291 28.2 Pthread Locks . . . . . . . . . . . . . . . . . . . . . . . . . . 292 28.3 Building A Lock . . . . . . . . . . . . . . . . . . . . . . . . . 293 28.4 Evaluating Locks . . . . . . . . . . . . . . . . . . . . . . . . 293 28.5 Controlling Interrupts . . . . . . . . . . . . . . . . . . . . . . 294 28.6 Test And Set (Atomic Exchange) . . . . . . . . . . . . . . . . 295 28.7 Building A Working Spin Lock . . . . . . . . . . . . . . . . . 297 28.8 Evaluating Spin Locks . . . . . . . . . . . . . . . . . . . . . 299 28.9 Compare-And-Swap . . . . . . . . . . . . . . . . . . . . . . 299 28.10 Load-Linked and Store-Conditional . . . . . . . . . . . . . . 300 28.11 Fetch-And-Add . . . . . . . . . . . . . . . . . . . . . . . . . 302 28.12 Summary: So Much Spinning . . . . . . . . . . . . . . . . . 303 28.13 A Simple Approach: Just Yield, Baby . . . . . . . . . . . . . 304 28.14 Using Queues: Sleeping Instead Of Spinning . . . . . . . . . 305 28.15 Different OS, Different Support . . . . . . . . . . . . . . . . 307 28.16 Two-Phase Locks . . . . . . . . . . . . . . . . . . . . . . . . 307 28.17 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309 29 Lock-based Concurrent Data Structures 311 29.1 Concurrent Counters . . . . . . . . . . . . . . . . . . . . . . 311 29.2 Concurrent Linked Lists . . . . . . . . . . . . . . . . . . . . 316 29.3 Concurrent Queues . . . . . . . . . . . . . . . . . . . . . . . 319 29.4 Concurrent Hash Table . . . . . . . . . . . . . . . . . . . . . 320 29.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323 30 Condition Variables 325 30.1 Definition and Routines . . . . . . . . . . . . . . . . . . . . . 326 30.2 The Producer/Consumer (Bound Buffer) Problem . . . . . . 329 OPERATING SYSTEMS [VERSION 0.80] WWW.OSTEP.ORG