Statistics
64
Views
0
Downloads
0
Donations
Uploader

高宏飞

Shared on 2025-11-30
Support
Share

AuthorDimitri Fontaine

The book that teaches SQL to developers: Learn to replace thousands of lines of code with simple queries! SQL is code Learn how to deal with SQL in your developer workflow: versioning, testing, code reviewing, deployment. Best practice and tools are covered. Practical examples Learn advanced SQL with practical examples and datasets that help you get the most of the book! Every query solves a practical use case and is given in context. Database modelling The book covers (de-)normalisation with simple practical examples to dive into this seemingly complex topic, including Caching and Indexing Strategy. SQL for developers This book is for developers, covering advanced SQL techniques for data processing. Learn how to get exactly the result set you need in your application’s code! Write efficient SQL Writing efficient SQL is easier than it looks, and begins with database modeling and writing clear code. The book teaches you how to write fast queries! Batteries included The book comes with open data sets and instructions to get them so that you can run all the SQL queries at home, edit them and learn from them. Advanced psql setup is even included!

Tags
No tags
Publisher: Self-published
Publish Year: 2019
Language: 英文
Pages: 457
File Format: PDF
File Size: 1.5 MB
Support Statistics
¥.00 · 0times
Text Preview (First 20 pages)
Registered users can read the full content for free

Register as a Gaohf Library member to read the complete e-book online for free and enjoy a better reading experience.

(This page has no text content)
Contents I Preface xi About… xiii 1 About the Book . . . . . . . . . . . . . . . . . . . . . . . . . xiii 2 About the Author . . . . . . . . . . . . . . . . . . . . . . . . xiv 3 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . xiv 4 About the organisation of the books . . . . . . . . . . . . . . . xv II Introduction 1 1 Structured Query Language 2 1.1 Some of the Code is Written in SQL . . . . . . . . . . . . . . . 3 1.2 A First Use Case . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Loading the Data Set . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 Application Code and SQL . . . . . . . . . . . . . . . . . . . 5 1.5 AWord about SQL Injection . . . . . . . . . . . . . . . . . . 9 1.6 PostgreSQL protocol: server-side prepared statements . . . . . 10 1.7 Back to Discovering SQL . . . . . . . . . . . . . . . . . . . . 12 1.8 ComputingWeekly Changes . . . . . . . . . . . . . . . . . . . 15 2 Software Architecture 18 2.1 Why PostgreSQL? . . . . . . . . . . . . . . . . . . . . . . . . 20 2.2 The PostgreSQLDocumentation . . . . . . . . . . . . . . . . 22 3 Getting Ready to read this Book 23
Contents | iii III Writing Sql Queries 25 4 Business Logic 27 4.1 Every SQL query embeds some business logic . . . . . . . . . . 27 4.2 Business Logic Applies to Use Cases . . . . . . . . . . . . . . . 29 4.3 Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.4 E ciency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.5 Stored Procedures — a Data Access API . . . . . . . . . . . . . 36 4.6 Procedural Code and Stored Procedures . . . . . . . . . . . . . 38 4.7 Where to Implement Business Logic? . . . . . . . . . . . . . . 39 5 A Small Application 41 5.1 Readme First Driven Development . . . . . . . . . . . . . . . 41 5.2 Loading the Dataset . . . . . . . . . . . . . . . . . . . . . . . 42 5.3 Chinook Database . . . . . . . . . . . . . . . . . . . . . . . . 43 5.4 Music Catalog . . . . . . . . . . . . . . . . . . . . . . . . . . 45 5.5 Albums by Artist . . . . . . . . . . . . . . . . . . . . . . . . . 46 5.6 Top-N Artists by Genre . . . . . . . . . . . . . . . . . . . . . 46 6 The SQL REPL — An Interactive Setup 52 6.1 Intro to psql . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 6.2 The psqlrc Setup . . . . . . . . . . . . . . . . . . . . . . . . . 53 6.3 Transactions and psql Behavior . . . . . . . . . . . . . . . . . 54 6.4 A Reporting Tool . . . . . . . . . . . . . . . . . . . . . . . . 56 6.5 Discovering a Schema . . . . . . . . . . . . . . . . . . . . . . 57 6.6 Interactive Query Editor . . . . . . . . . . . . . . . . . . . . . 58 7 SQL is Code 60 7.1 SQL style guidelines . . . . . . . . . . . . . . . . . . . . . . . 60 7.2 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 7.3 Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 7.4 Regression Tests . . . . . . . . . . . . . . . . . . . . . . . . . 68 7.5 A Closer Look . . . . . . . . . . . . . . . . . . . . . . . . . . 69 8 Indexing Strategy 71 8.1 Indexing for Constraints . . . . . . . . . . . . . . . . . . . . . 72 8.2 Indexing for Queries . . . . . . . . . . . . . . . . . . . . . . . 73 8.3 Cost of IndexMaintenance . . . . . . . . . . . . . . . . . . . 74 8.4 Choosing Queries to Optimize . . . . . . . . . . . . . . . . . 74
Contents | iv 8.5 PostgreSQL Index Access Methods . . . . . . . . . . . . . . . 74 8.6 Advanced Indexing . . . . . . . . . . . . . . . . . . . . . . . . 77 8.7 Adding Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . 77 9 An Interview with Yohann Gabory 81 IV SQL Toolbox 86 10 Get Some Data 88 11 Structured Query Language 89 12 Queries, DML, DDL, TCL, DCL 91 13 Select, From, Where 93 13.1 Anatomy of a Select Statement . . . . . . . . . . . . . . . . . . 93 13.2 Projection (output): Select . . . . . . . . . . . . . . . . . . . . 93 13.3 Data sources: From . . . . . . . . . . . . . . . . . . . . . . . . 100 13.4 Understanding Joins . . . . . . . . . . . . . . . . . . . . . . . 101 13.5 Restrictions: Where . . . . . . . . . . . . . . . . . . . . . . . 102 14 Order By, Limit, No Offset 105 14.1 Ordering with Order By . . . . . . . . . . . . . . . . . . . . . 105 14.2 kNNOrdering and GiST indexes . . . . . . . . . . . . . . . . 107 14.3 Top-N sorts: Limit . . . . . . . . . . . . . . . . . . . . . . . . 109 14.4 No O fset, and how to implement pagination . . . . . . . . . . 111 15 Group By, Having, With, Union All 114 15.1 Aggregates (aka Map/Reduce): Group By . . . . . . . . . . . . 114 15.2 Aggregates Without a Group By . . . . . . . . . . . . . . . . . 117 15.3 Restrict Selected Groups: Having . . . . . . . . . . . . . . . . 118 15.4 Grouping Sets . . . . . . . . . . . . . . . . . . . . . . . . . . 119 15.5 Common Table Expressions: With . . . . . . . . . . . . . . . 122 15.6 Distinct On . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 15.7 Result Sets Operations . . . . . . . . . . . . . . . . . . . . . . 127 16 Understanding Nulls 131 16.1 Three-Valued Logic . . . . . . . . . . . . . . . . . . . . . . . 131
Contents | v 16.2 Not Null Constraints . . . . . . . . . . . . . . . . . . . . . . 133 16.3 Outer Joins Introducing Nulls . . . . . . . . . . . . . . . . . . 134 16.4 Using Null in Applications . . . . . . . . . . . . . . . . . . . 135 17 Understanding Window Functions 137 17.1 Windows and Frames . . . . . . . . . . . . . . . . . . . . . . 137 17.2 Partitioning into Di ferent Frames . . . . . . . . . . . . . . . . 139 17.3 Available Window Functions . . . . . . . . . . . . . . . . . . 140 17.4 When to UseWindow Functions . . . . . . . . . . . . . . . . 142 18 Understanding Relations and Joins 143 18.1 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 18.2 SQL Join Types . . . . . . . . . . . . . . . . . . . . . . . . . . 145 19 An Interview with Markus Winand 148 V Data Types 152 20 Serialization and Deserialization 154 21 Some Relational Theory 156 21.1 Attribute Values, Data Domains and Data Types . . . . . . . . 157 21.2 Consistency and Data Type Behavior . . . . . . . . . . . . . . 158 22 PostgreSQL Data Types 162 22.1 Boolean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 22.2 Character and Text . . . . . . . . . . . . . . . . . . . . . . . . 165 22.3 Server Encoding and Client Encoding . . . . . . . . . . . . . . 169 22.4 Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 22.5 Floating Point Numbers . . . . . . . . . . . . . . . . . . . . . 174 22.6 Sequences and the Serial Pseudo Data Type . . . . . . . . . . . 174 22.7 Universally Unique Identi er: UUID . . . . . . . . . . . . . . 175 22.8 Bytea and Bitstring . . . . . . . . . . . . . . . . . . . . . . . . 177 22.9 Date/Time and Time Zones . . . . . . . . . . . . . . . . . . . 177 22.10 Time Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . 181 22.11 Date/Time Processing and Querying . . . . . . . . . . . . . . 182 22.12 Network Address Types . . . . . . . . . . . . . . . . . . . . . 187 22.13 Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Contents | vi 23 Denormalized Data Types 193 23.1 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 23.2 Composite Types . . . . . . . . . . . . . . . . . . . . . . . . . 199 23.3 XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 23.4 JSON . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 23.5 Enum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 24 PostgreSQL Extensions 206 25 An interview with Grégoire Hubert 208 VI Data Modeling 211 26 Object Relational Mapping 213 27 Tooling for Database Modeling 215 27.1 How toWrite a Database Model . . . . . . . . . . . . . . . . . 216 27.2 Generating RandomData . . . . . . . . . . . . . . . . . . . . 219 27.3 Modeling Example . . . . . . . . . . . . . . . . . . . . . . . . 221 28 Normalization 227 28.1 Data Structures and Algorithms . . . . . . . . . . . . . . . . . 227 28.2 Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . 230 28.3 Database Anomalies . . . . . . . . . . . . . . . . . . . . . . . 231 28.4 Modeling an Address Field . . . . . . . . . . . . . . . . . . . . 232 28.5 Primary Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . 234 28.6 Surrogate Keys . . . . . . . . . . . . . . . . . . . . . . . . . . 235 28.7 Foreign Keys Constraints . . . . . . . . . . . . . . . . . . . . . 237 28.8 Not Null Constraints . . . . . . . . . . . . . . . . . . . . . . 238 28.9 Check Constraints and Domains . . . . . . . . . . . . . . . . . 238 28.10 Exclusion Constraints . . . . . . . . . . . . . . . . . . . . . . 239 29 Practical Use Case: Geonames 240 29.1 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 29.2 Countries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244 29.3 Administrative Zoning . . . . . . . . . . . . . . . . . . . . . . 248 29.4 Geolocation Data . . . . . . . . . . . . . . . . . . . . . . . . . 251 29.5 Geolocation GiST Indexing . . . . . . . . . . . . . . . . . . . 254
Contents | vii 29.6 A Sampling of Countries . . . . . . . . . . . . . . . . . . . . . 256 30 Modelization Anti-Patterns 258 30.1 Entity Attribute Values . . . . . . . . . . . . . . . . . . . . . . 258 30.2 Multiple Values per Column . . . . . . . . . . . . . . . . . . . 261 30.3 UUIDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 31 Denormalization 265 31.1 Premature Optimization . . . . . . . . . . . . . . . . . . . . . 266 31.2 Functional Dependency Trade-O fs . . . . . . . . . . . . . . . 266 31.3 Denormalization with PostgreSQL . . . . . . . . . . . . . . . 267 31.4 Materialized Views . . . . . . . . . . . . . . . . . . . . . . . . 268 31.5 History Tables and Audit Trails . . . . . . . . . . . . . . . . . 270 31.6 Validity Period as a Range . . . . . . . . . . . . . . . . . . . . 272 31.7 Pre-Computed Values . . . . . . . . . . . . . . . . . . . . . . 273 31.8 Enumerated Types . . . . . . . . . . . . . . . . . . . . . . . . 273 31.9 Multiple Values per Attribute . . . . . . . . . . . . . . . . . . 274 31.10 The Spare Matrix Model . . . . . . . . . . . . . . . . . . . . . 274 31.11 Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 31.12 Other Denormalization Tools . . . . . . . . . . . . . . . . . . 276 31.13 Denormalize wih Care . . . . . . . . . . . . . . . . . . . . . . 276 32 Not Only SQL 278 32.1 Schemaless Design in PostgreSQL . . . . . . . . . . . . . . . . 279 32.2 Durability Trade-O fs . . . . . . . . . . . . . . . . . . . . . . 282 32.3 Scaling Out . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 33 An interview with Álvaro Hernández Tortosa 286 VII Data Manipulation and Concurrency Control 291 34 Another Small Application 293 35 Insert, Update, Delete 297 35.1 Insert Into . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297 35.2 Insert Into … Select . . . . . . . . . . . . . . . . . . . . . . . . 298 35.3 Update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 35.4 Inserting Some Tweets . . . . . . . . . . . . . . . . . . . . . . 303
Contents | viii 35.5 Delete . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305 35.6 Tuples and Rows . . . . . . . . . . . . . . . . . . . . . . . . . 307 35.7 Deleting All the Rows: Truncate . . . . . . . . . . . . . . . . 307 35.8 Delete but Keep a Few Rows . . . . . . . . . . . . . . . . . . . 308 36 Isolation and Locking 309 36.1 Transactions and Isolation . . . . . . . . . . . . . . . . . . . . 310 36.2 About SSI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311 36.3 Concurrent Updates and Isolation . . . . . . . . . . . . . . . . 312 36.4 Modeling for Concurrency . . . . . . . . . . . . . . . . . . . . 314 36.5 Putting Concurrency to the Test . . . . . . . . . . . . . . . . . 315 37 Computing and Caching in SQL 319 37.1 Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320 37.2 Materialized Views . . . . . . . . . . . . . . . . . . . . . . . . 321 38 Triggers 324 38.1 Transactional Event Driven Processing . . . . . . . . . . . . . 325 38.2 Trigger and Counters Anti-Pattern . . . . . . . . . . . . . . . 327 38.3 Fixing the Behavior . . . . . . . . . . . . . . . . . . . . . . . . 328 38.4 Event Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . 330 39 Listen and Notify 332 39.1 PostgreSQLNoti cations . . . . . . . . . . . . . . . . . . . . 332 39.2 PostgreSQL Event Publication System . . . . . . . . . . . . . 333 39.3 Noti cations and Cache Maintenance . . . . . . . . . . . . . . 335 39.4 Limitations of Listen and Notify . . . . . . . . . . . . . . . . 340 39.5 Listen and Notify Support in Drivers . . . . . . . . . . . . . . 340 40 Batch Update, MoMA Collection 342 40.1 Updating the Data . . . . . . . . . . . . . . . . . . . . . . . . 343 40.2 Concurrency Patterns . . . . . . . . . . . . . . . . . . . . . . 345 40.3 On Con ict Do Nothing . . . . . . . . . . . . . . . . . . . . . 346 41 An Interview with Kris Jenkins 348
Contents | ix VIII PostgreSQL Extensions 352 42 What’s a PostgreSQL Extension? 354 42.1 Inside PostgreSQL Extensions . . . . . . . . . . . . . . . . . . 356 42.2 Installing and Using PostgreSQL Extensions . . . . . . . . . . 357 42.3 Finding PostgreSQL Extensions . . . . . . . . . . . . . . . . . 358 42.4 A Primer on Authoring PostgreSQL Extensions . . . . . . . . 359 42.5 A Short List of Noteworthy Extensions . . . . . . . . . . . . . 359 43 Auditing Changes with hstore 365 43.1 Introduction to hstore . . . . . . . . . . . . . . . . . . . . . . 365 43.2 Comparing hstores . . . . . . . . . . . . . . . . . . . . . . . . 366 43.3 Auditing Changes with a Trigger . . . . . . . . . . . . . . . . 366 43.4 Testing the Audit Trigger . . . . . . . . . . . . . . . . . . . . 368 43.5 From hstore Back to a Regular Record . . . . . . . . . . . . . 370 44 Last.fm Million Song Dataset 372 45 Using Trigrams For Typos 378 45.1 The pg_trgm PostgreSQL Extension . . . . . . . . . . . . . . . 378 45.2 Trigrams, Similarity and Searches . . . . . . . . . . . . . . . . 379 45.3 Complete and Suggest Song Titles . . . . . . . . . . . . . . . . 383 45.4 Trigram Indexing . . . . . . . . . . . . . . . . . . . . . . . . . 384 46 Denormalizing Tags with intarray 386 46.1 Advanced Tag Indexing . . . . . . . . . . . . . . . . . . . . . 386 46.2 Searches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388 46.3 User-De ned Tags Made Easy . . . . . . . . . . . . . . . . . . 390 47 The Most Popular Pub Names 392 47.1 A Pub Names Database . . . . . . . . . . . . . . . . . . . . . 392 47.2 Normalizing the Data . . . . . . . . . . . . . . . . . . . . . . 394 47.3 Geolocating the Nearest Pub (k-NN search) . . . . . . . . . . . 395 47.4 Indexing kNN Search . . . . . . . . . . . . . . . . . . . . . . 396 48 How far is the nearest pub? 398 48.1 The earthdistance PostgreSQL contrib . . . . . . . . . . . . 398 48.2 Pubs and Cities . . . . . . . . . . . . . . . . . . . . . . . . . . 399 48.3 TheMost Popular Pub Names by City . . . . . . . . . . . . . 402
Contents | x 49 Geolocation with PostgreSQL 405 49.1 Geolocation Data Loading . . . . . . . . . . . . . . . . . . . . 405 49.2 Finding an IP Address in the Ranges . . . . . . . . . . . . . . 409 49.3 GeolocationMetadata . . . . . . . . . . . . . . . . . . . . . . 410 49.4 Emergency Pub . . . . . . . . . . . . . . . . . . . . . . . . . . 411 50 Counting Distinct Users with HyperLogLog 413 50.1 HyperLogLog . . . . . . . . . . . . . . . . . . . . . . . . . . 413 50.2 Installing postgresql-hll . . . . . . . . . . . . . . . . . . . . . 414 50.3 Counting Unique Tweet Visitors . . . . . . . . . . . . . . . . 415 50.4 Lossy Unique Count with HLL . . . . . . . . . . . . . . . . . 418 50.5 Getting the Visits into Unique Counts . . . . . . . . . . . . . 419 50.6 Scheduling Estimates Computations . . . . . . . . . . . . . . . 422 50.7 Combining Unique Visitors . . . . . . . . . . . . . . . . . . . 424 51 An Interview with Craig Kerstiens 425 IX Closing Thoughts 428 X Index 430
Part I Preface
| xii As a developer, The Art of PostgreSQL is the book you need to read in order to get to the next level of pro ciency. Af er all, a developer’s job encompasses more than just writing code. Our job is to produce results, and for that we have many tools at our disposal. SQL is one of them, and this book teaches you all about it. PostgreSQL is used to manage data in a centralized fashion, and SQL is used to get exactly the result set needed from the application code. An SQL result set is generally used to ll in-memory data structures so that the application can then process the data. So, let’s open this book with a quote about data structures and application code: Data dominat . If you’ve chosen the right data structur and orga- nized things well, the algorithms will almost always be self-evident. Data structur , not algorithms, are central to programming. — Rob Pike
About… About the Book This book is intended for developers working on applications that use a database server. Thebook speci cally addresses thePostgreSQLRDBMS: it actually is the world’smost advancedOpen Source database, just like it says in the tagline on the o cial website. By the end of this book you’ll know why, and you’ll agree! I wanted to write this book af er having worked withmany customers who were making use of only a fraction of what SQL and PostgreSQL are capable of deliv- ering. Inmost cases, developers Imetwithdidn’t knowwhat’s possible to achieve in SQL. As soon as they realized— or more exactly, as soon as they were shown what’s possible to achieve—, replacinghundreds of lines of application codewith a small and e cient SQL query, then in some cases they would nonetheless not know how to integrate a raw SQL query in their code base. Integrating a SQL query and thinking about SQL as code means using the same advanced tooling thatweusewhenusing other programming languages: version- ing, automated testing, code reviewing, and deployment. Really, this is more about the developer’s work ow than the SQL code itself… In this book, you will learn best practices that help with integrating SQL into your own work ow, and through the many examples provided, you’ll see all the reasons why you might be interested in doing more in SQL. Primarily, it means writing fewer lines of code. As Dijkstra said, we should count lines of code as lines spent, so by learning how to use SQL you will be able to spend less to write the same application! The practice pervaded by the reassuring illusion that programs are just devic like any others, the only difference admitted being
About… | xiv that their manufacture might require a new type of craftsmen, viz. programmers. From there it only a small step to measuring “pro- grammer productivity” in terms of “number of lin of code pro- duced per month”. Th a very costly measuring unit because it encourag the writing of insipid code, but today I am less interested in how foolish a unit it from even a pure business point of view. My point today that, if we wish to count lin of code, we should not regard them “lin produced” but “lin spent”: the current conventional wisdom so foolish to book that count on the wrong side of the ledger. On the cruelty of really teaching computing science, Edsger Wybe Dijkstra, EWD1036 About the Author Dimitri Fontaine is a PostgreSQL Major Contributor, and has been using and contributing toOpen Source Sof ware for the better part of the last twenty years. Dimitri is also the author of the pgloader data loading utility, with fully auto- mated support for database migration from MySQL to PostgreSQL, or from SQLite, or MS SQL… and more. Dimitri has taken on roles such as developer, maintainer, packager, release man- ager, sof ware architect, database architect, and database administrator at di fer- ent points in his career. In the same period of time, Dimitri also started several companies (which are still thriving) with a strong Open Source business model, and he has held management positions as well, including working at the execu- tive level in large companies. Dimitri runs a blog at http://tapoueh.org with in-depth articles showing advanced use cases for SQL and PostgreSQL. Acknowledgements First of all, I’d like to thank all the contributors to the book. I know they all had other priorities in life, yet they found enough time to contribute and help make
About… | xv this book as good as I could ever hope for, maybe even better! I’d like to give special thanks tomy friend Julien Danjouwho’s acted as amentor over the course of writing of the book. His advice about every part of the process has been of great value —maybe the one piece of advice that I most took to the heart has been “write the book you wanted to read”. I’d also like to extend my thanks to the people interviewed for this book. In order of appearance, they are Yohann Gabory from the French book “Django Avancé”, Markus Winand from http://use-the-index-luke.com and http://modern-sql.com, Grégoire Hubert author of the PHP POMM project, Álvaro Hernández Tortosa who created ToroDB, bringing MongoDB to SQL, Kris Jenkins, functional programmer and author of the YeSQL library for Clojure, and Craig Kerstiens, head of Could at Citus Data. Having insights fromSQLusers frommanydi ferent backgrounds has been valu- able in achieving one of the major goals of this book: encouraging you, valued readers, to extend your thinking to new horizons. Of course, the horizons I’m referring to include SQL. I also want to warmly thank the PostgreSQL community. If you’ve ever joined a PostgreSQL community conference, or even asked questions on themailing list, you know these people are both incredibly smart and extremely friendly. It’s no wonder that PostgreSQL is such a great product as it’s produced by an excellent group of well-meaning people who are highly skilled and deeply motivated to solve actual users problems. Finally, thank you dear reader for having picked this book to read. I hope that you’ll have a good time as you read through themany pages, and that you’ll learn a lot along the way! About the organisation of the books Each part of “The Art of PostgreSQL” can be read on its own, or you can read this book from the rst to the last page in the order of the parts and chapters therein. A great deal of thinking have been put in the ordering of the parts, so that reading “TheArt of PostgreSQL” in a linear fashion should provide the best experience.
About… | xvi The skill progression throughout the book is not linear. Each time a new SQL concept is introduced, it is presented with simple enough queries, in order to make it possible to focus on the new notion. Then, more queries are introduced to answer more interesting business questions. Complexity of the queries usually advances over the course of a given part, chap- ter af er chapter. Sometimes, when a new chapter introduces a new SQL con- cept, complexity is reset to very simple queries again. That’s because for most people, learning a new skill set does not happen in a linear way. Having this kind of di culty organisation also makes it easier to dive into a given chapter out-of- order. Here’s a quick breakdown of what each chapter contains: Part 1, Preface You’re reading it now, the preface is a presentationof the book andwhat to expect from it. Part 2, Introduction The introduction of this book intends to convince application developers such as you, dear reader, that there’s more to SQL than you might think. It begins with a very simple data set and simple enough queries, that we compare to their equivalent Python code. Thenwe expand from therewith a very important trick that’s not well known, and a pretty advanced variation of it. Part 3, Writting SQLQueries The third part of the book covers how to write a SQL query as an application developer. We answer several important questions here: • Why using SQL rather than your usual programming language? • How to integrate SQL in your application source code? • How to work at the SQL prompt, the psqlREPL? • What’s an indexing strategy and how to approach indexing?
About… | xvii A simple Python application is introduced as a practical example illustrating the di ferent answers provided. In particular, this part insists on when to use SQL to implement business logic. Part 3 concludes with an interview with YohanGabory, author of a French book that teaches how to write advanced web application with Python and Django. Part 4, SQL Toolbox The fourth part of “The Art of PostgreSQL” introduces most of the SQL con- cepts that you need to master as an application developer. It begins with the basics, because you need to build your knowledge and skill set on-top of those foundations. Advanced SQL concepts are introduced with practical examples: every query refers to a data model that’s easy to understand, and is given in the context of a “business case”, or “user story”. This part covers SQL clauses and features such as ORDER BY and k-NN sorts, the GROUP BY and HAVING clause and GROUPING SETS, along with clas- sic and advanced aggregates, and then window functions. This part also covers the infamous NULL, and what’s a relation and a join. Part 5 concludeswith an interviewwithMarkusWinand, author of “SQLPerfor- mance explained” and http://use-the-index-luke.com. Markus is a master of the SQL standard and he is awizard on using SQL to enable fast application delivery and solid run-time performances! Part 5, Data Types The f h part of this book covers the main PostgreSQL data types you can use and bene t from as an application developer. PostgreSQL is an ORDBMS: Object-Oriented Relation Database Manager. As a result, data types in Post- greSQL are not just the classics numbers, dates, and text. There’s more to it, and this part covers a lot of ground. Part 5 concludes with an interviewwith GrégoireHubert, author of the POMM project, which provides developers with unlimited access to SQL and database features while proposing a high-level API over low-level drivers.
About… | xviii Part 6, Data Modeling The sixth part of “The Art of PostgreSQL” covers the basics of relational data modeling, which is themost important skill you need tomaster as an application developer. Given a good databasemodel, every single SQLquery is easy towrite, things are kep logical, and data is kept clean. With a bad design…well my guess is that you’ve seen what happens with a not-great data model already, and inmany cases that’s the root of developers’ disklike for the SQL language. This part comes late in the book for a reason: without knowledge of some of the advancedSQLfacilities, it’s hard to anticipate that a datamodel is going tobe easy enough to work with, and developers then tend to apply early optimizations to the model to try to simplify writing the code. Well, most of those optimizations are detrimental to our ability to bene t from SQL. Part 6 concludes with an interview with Álvaro Hernández Tortosa, who built the ToroDB project, a MongoDB replica solution based on PostgreSQL! His take on relation database modeling when compared to NoSQL and document based technologies and APIs is the perfect conclusion of the database modeling part. Part 7, Data Manipulation and Concurrency Control The seventh part of this book covers DML and concurrency, the heart of any live database. DML stands for “Data Modi cation Language”: it’s the part of SQL that includes INSERT, UPDATE, and DELETE statements. Themain feature of anyRDBMS is how it dealswith concurrent access to a single data set, in both reading and writing. This part covers isolation and locking, computing and caching in SQL complete with cache invalidation techniques, and more. Part 7 concludes with an interview with Kris Jenkins, a functional programmer and open-source enthusiast. He mostly works on building systems in Elm, Haskell & Clojure, improving the world one project at a time, and he’s is the author of the YeSQL library.
About… | xix Part 8, PostgreSQL Extensions The eighthpart of “TheArt of PostgreSQL” covers a selectionof very useful Post- greSQL Extensions and their impact on simplifying application development when using PostgreSQL. We cover auditing changes with hstore, the pg_trgm extension to implement auto-suggestions and auto-correct in your application search forms, user-de ned tags and how to e ciently use them in search queries, and then we use ip4r for implementing geolocation oriented features. Finally, hyperlolog is introduced to solve a classic problem with high cardinality estimates and how to combine them. Part 8 concludes with an interview with Craig Kerstiens who heads the Cloud team at Citus Data, af er having been involved in PostgreSQL support at Heroku. Craig shares his opinion about using PostgreSQL extensions when deploying your application using a cloud-based PostgreSQL solution.
Part II Introduction
The above is a preview of the first 20 pages. Register to read the complete e-book.