PostgreSQL Mistakes and How to Avoid Them (Jimmy Angelakos) (Z-Library)

M A N N I N G Jimmy Angelakos Foreword by Gianni Ciolli

2 EPILOGUE This diagram illustrates the narrative approach of the book, moving from real-world use cases to identifying problems, exploring resolution attempts, and uncovering common mistakes. It highlights the process of understanding causes, consequences, and, ultimately, finding and implementing the correct solution, supported by practical examples and code. Workaround Explanation Consequences Mistake Implementation Resolution attempt Correct solution Narrative Use case Problem How this book works Licensed to ashwin sampath <ashwin.manning@gmail.com>

PostgreSQL Mistakes and How to Avoid Them

ii

PostgreSQL Mistakes and How to Avoid Them JIMMY ANGELAKOS FOREWORD BY GIANNI CIOLLI M A N N I N G SHELTER ISLAND

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department Manning Publications Co. 20 Baldwin Road PO Box 761 Shelter Island, NY 11964 Email: orders@manning.com ©2025 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps. Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine. The author and publisher have made every effort to ensure that the information in this book was correct at press time. The author and publisher do not assume and hereby disclaim any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause, or from any usage of the information herein. Manning Publications Co. Development editor: Katie Sposato 20 Baldwin Road Technical editor: Simon Pane PO Box 761 Review editor: Radmila Ercegovac Shelter Island, NY 11964 Production editor: Andy Marinkovich Copy editor: Alisa Larson Proofreader: Mike Beady Technical proofreader: Frank Moore Typesetter and cover designer: Marija Tudor ISBN 9781633436879 Printed in the United States of America

For Elpida Anastasiou, my beloved partner, who has brought joy to my life over the past decade. Without her constant support, this book would not have been possible. Dedicated to the memory of My father, Giannis Angelakos, who sparked my lifelong passion for computing in 1984 by bringing home a Sinclair ZX Spectrum. He was always supportive and believed in me. Simon Riggs, my mentor, colleague, and friend, who gave me my start in the professional PostgreSQL world. His trust and guidance were an invaluable gift to my growth.

brief contents 1 ■ Why PostgreSQL matters—and why talking about mistakes does too 1 2 ■ Bad SQL usage 12 3 ■ Improper data type usage 43 4 ■ Table and index mistakes 58 5 ■ Improper feature usage 75 6 ■ Performance bad practices 98 7 ■ Administration bad practices 124 8 ■ Security bad practices 142 9 ■ High availability bad practices 153 10 ■ Upgrade/migration bad practices 167 11 ■ PostgreSQL, best practices, and you: Final insights 180 appendix A ■ Frogge Emporium database 191 appendix B ■ Cheat sheet 196 vi

contents foreword xi preface xiii acknowledgments xv about this book xvii about the author xxi about the cover illustration xxii 1 Why PostgreSQL matters—and why talking about mistakes does too 1 1.1 Why learning about PostgreSQL matters 1 1.2 Why talking about PostgreSQL mistakes matters 2 1.3 What you will learn 3 1.4 Typical kinds of PostgreSQL mistakes 4 Coming with expectations from other databases 4 Misunderstanding PostgreSQL 4 ■ Misunderstanding the documentation 5 ■ Using relics from the SQL Standard 5 Not following best practices 5 1.5 How this book works 5 Mental models 6 ■ Example mistake 7 1.6 Sample database: Frogge Emporium 11vii

CONTENTSviii2 Bad SQL usage 12 2.1 Using NOT IN to exclude 12 Performance implications 15 ■ Alternative 16 2.2 Selecting ranges with BETWEEN 17 2.3 Not using CTEs 19 2.4 Using uppercase identifiers 22 2.5 Dividing INTEGERs 24 2.6 COUNTing NULL values 27 2.7 Querying indexed columns with expressions 28 2.8 Upserting NULLs in a composite unique key 30 2.9 Selecting and fetching all the data 33 2.10 Not taking advantage of checkers/linters or large language models 35 Code checkers/linters 36 ■ Large language models 39 3 Improper data type usage 43 3.1 TIMESTAMP (WITHOUT TIME ZONE) 43 3.2 TIME WITH TIME ZONE 46 3.3 CURRENT_TIME 47 3.4 CHAR(n) 47 3.5 VARCHAR(n) 49 3.6 MONEY 51 3.7 SERIAL data type 53 3.8 XML 55 4 Table and index mistakes 58 4.1 Table inheritance 58 4.2 Neglecting table partitioning 62 4.3 Partitioning by multiple keys 66 4.4 Using the wrong index type 68 5 Improper feature usage 75 5.1 Selecting SQL_ASCII as the encoding 75 5.2 CREATE RULE 81 5.3 Relational JSON 83 5.4 Putting UUIDs everywhere 86

CONTENTS ix5.5 Homemade multi-master replication 89 5.6 Homemade distributed systems 94 6 Performance bad practices 98 6.1 Default configuration in production 99 6.2 Improper memory allocation 101 6.3 Having too many connections 104 6.4 Having idle connections 108 What is MVCC? 108 ■ The problem with idle connections 108 6.5 Allowing long-running transactions 110 Idle in transaction 110 ■ Long-running queries in general 112 6.6 High transaction rate 113 XID wraparound 113 ■ Burning through lots of XIDs 114 6.7 Turning off autovacuum/autoanalyze 115 6.8 Not using EXPLAIN (ANALYZE) 117 6.9 Locking explicitly 118 6.10 Having no indexes 119 6.11 Having unused indexes 121 6.12 Removing indexes used elsewhere 121 7 Administration bad practices 124 7.1 Not tracking disk usage 124 Deleting the Write-Ahead Log 126 ■ What can eat up your disk space? 126 ■ What can you do? 127 7.2 Logging to PGDATA 127 7.3 Ignoring the logs 130 Bad configuration 130 ■ Performance issues 131 Locks 131 ■ Corruption 132 ■ Security 132 7.4 Not monitoring the database 133 7.5 No tracking of statistics over time 135 7.6 Not upgrading Postgres 137 7.7 Not upgrading your system 139 8 Security bad practices 142 8.1 Specifying psql -W or - -password 143 8.2 Setting listen_addresses = '*' 144

CONTENTSx8.3 trust-ing in pg_hba.conf 145 8.4 Database owned by a superuser 146 8.5 Setting SECURITY DEFINER carelessly 149 8.6 Choosing an insecure search path 150 9 High availability bad practices 153 9.1 Not taking backups 154 9.2 No Point-in-Time Recovery 156 9.3 Backing up manually 158 9.4 Not testing backups 160 9.5 Not having redundancy 162 9.6 Using no HA tool 164 10 Upgrade/migration bad practices 167 10.1 Not reading all release notes 167 10.2 Performing inadequate testing 171 10.3 Succumbing to encoding chaos 174 10.4 Not using proper BOOLEANs 176 10.5 Mishandling differences in data types 178 11 PostgreSQL, best practices, and you: Final insights 180 11.1 What type of user are you? 180 The dabbler 181 ■ The cautious steward 182 ■ The oblivious coder 182 ■ The freefaller 183 11.2 Be proactive: Act early 184 11.3 All right, so you inherited a bad database 185 “Historical reasons” 185 ■ What now? 185 First things first 186 11.4 Treat Postgres well, and it will treat you well 187 appendix A Frogge Emporium database 191 appendix B Cheat sheet 196 index 203

foreword This book definitely contains what its title promises: there are many examples of mis- takes, and there are also suggestions on how to avoid them. The interested reader, however, will find much more than that. Examples are written in a clear and factual way, and represent scenarios that can really happen. This is not a surprise: I have known Jimmy for many years, and we worked together on many real-world projects, helping customers address the chal- lenges that they faced. At some point, we were both part of the solutions architecture team, which focuses on combining technologies into consistent solutions. That kind of work provides a wide-angle viewpoint, driven by tangible outcomes and by how these outcomes impact the customer’s business, but it also requires attention to detail and top-level hands-on skills. While working in that role, technical skills are gradually reinforced, and you are exposed to a valuable variety of demanding production envi- ronments. While reading PostgreSQL Mistakes and How to Avoid Them, you can recognise the eye of the solutions architect, using theory to filter facts and organize them, while being always ready for a technical deep dive to whatever level of technical detail and practicality that the problem requires. This book is a recommended read for whoever needs to work with PostgreSQL in production. Job roles such as system administrator, DBA, application developer, appli- cation architect, etc., are clearly different from each other in terms of goals and skills required, but they all share the common tract that a detailed PostgreSQL knowledge does help, as, in my experience, many customers have learned over the decades. Chapter titles clearly refer to relevant topics such as SQL, data types, tables, and indexes, but also more generally to topics such as high availability, performance, administration, security, upgrades, and migrations. One can start from a chapter thatxi

FOREWORDxiithey feel more familiar, and then extend the reading to the other chapters, not neces- sarily in the order presented in the book. More general conclusions are drawn in the last chapter of the book, and to reaffirm its nature as a work tool, the book ends with a few pages (“cheat sheet”) providing a compact summary of the entire book. I am sure you will find this book very interesting and useful, as I did when I read it. I am grateful to Jimmy Angelakos for the effort he undertook in writing it, which admittedly is a little bit easier in his case, as he could use his experience as a popular speaker at PostgreSQL conferences to explain complex things clearly and pick the right selection of details. —DR. GIANNI CIOLLI, VICE PRESIDENT, PRACTICE LEAD FOR HIGH AVAILABILITY AT EDB

preface PostgreSQL (or Postgres to its friends) is a general-purpose database management sys- tem (or DBMS) that has over 25 years of development behind it and a thriving, active community. We call it “general-purpose” because it is not specialized toward a particu- lar use case. Rather the opposite: the many design choices spanning its long develop- ment history have tended to favor balanced behavior so that Postgres can offer equally good performance in many scenarios. This strategy has panned out over time, result- ing in an extremely varied user base spanning many application areas, often comple- mented by specialized extensions. According to the Stack Overflow 2024 Developer Survey (https://survey.stackover flow.co/2024), PostgreSQL is the most admired and desired database among develop- ers in general, and professional developers in particular, for the second year in a row. It took over the top popularity spot from MySQL the previous year and shows no signs of slowing down. This popularity is a testament to its growing impact and signals that Postgres is a disruptive force that continues to evolve to meet the demands of the modern database market and software development industries. I had been aware of PostgreSQL as a fully fledged DBMS with a good reputation since the late 1990s and started experimenting with it in the early 2000s. I started using it professionally around 2008 after being prodded by a friend and colleague. I’ve never looked back or used another database system since. xiii

PREFACExiv “PostgreSQL’s biggest mistake” Let’s talk about the elephant in the room—the name. After Ingres, an older database created by Michael Stonebraker, came Postgres (Post-Ingres). The addition of SQL capabilities to Postgres brought along a name change to PostgreSQL, pronounced “post-gress-cue-ell.” Many contributors and community members recognized that this name became an impediment but felt it was too late to change the project’s name. As far as everyone’s concerned, though, “Postgres” is perfectly acceptable and com- pletely equivalent, so we’re going to use both names interchangeably in this book. Oh, and it’s never “Postgre.”

acknowledgments I’d like to thank Jonathan Gennick, acquisitions editor, who approached me with the idea to transform my experiences into this book. His encouragement and insight pro- vided the foundation for this project, and his support throughout has been invaluable. Katie Sposato, development editor, deserves my heartfelt thanks for her expert guidance and thoughtful advice. Her ability to steer me in the right direction with practical tips and constructive feedback has greatly enhanced the clarity and structure of this book. I am equally indebted to Simon Pane, technical editor, whose sharp eye and tech- nical expertise ensured the accuracy and reliability of the content. His meticulous attention to detail and deep understanding of PostgreSQL have been instrumental in shaping this work. My thanks also go out to Frank Moore, technical proofreader, for his careful review and attention to detail, which further strengthened the book. Thank you to all of the reviewers, Adam Wan, Alex Elistratov, AJ Bhandal, Amit Sharma, Andres Sacco, Andrew Eleneski, Anuj Tyagi, Bassam Ismail, Bill Mitchell, Cornel Ghiban, Daniel Vásquez, Dirk Gomez, Fernando Bugni, Frank Moore, Grego- rio Piccoli, Harsh Ranjan, Henrietta Dombrovskaya, Ihsan Akin, Iyabo Sindiku, Jer- emy Chen, Jose Alberto Reyes Quevedo, Joseph Pachod, Kiran Krishnamurthy, Luca Ferrari, Marcus Geselle, Manohar Sai Jasti, Mikael Dautrey, Milorad Imbra, Nadir Doc- tor, Naga Rishyendar Panguluri, Paul Snow, Potito Coluccelli, Prabhu Patel, Rani Sharim, Regina Obe, Ruben Vandeginste, Serge Smertin, Sergio Britos Arevalo, Sim- one Sguazza, Viktoria Dolzhenko, Wes Shaddix, and William Jamir Silva. Your thoughtful feedback helped improve this book. xv

ACKNOWLEDGMENTSxvi A special thanks to SomaFM (https://somafm.com/), and specifically their Synphaera channel, for providing the perfect soundtrack necessary for concentration and deep focus. Finally, I would like to reiterate my gratitude to the open and generous Post- greSQL community and the open source world in general. Your collaborative spirit, willingness to share knowledge, and enduring support have been an inspiration throughout my journey. Thank you all for making this book a reality.

about this book In February 2023, I gave a talk titled “Don’t Do This” at the annual FOSDEM (https://fosdem.org) conference in Brussels, Belgium. The talk explored some of the common mistakes, pitfalls, and misconceptions that PostgreSQL users can face and discussed possible ways to undo them or work around them. The talk was well received, and I gave it at two further conferences. Afterward, I was left with the impression that there was much more to talk about on this subject. However, the vol- ume of material was such that I no longer felt it fit the conference talk format well. The search for a better medium for this content is what spurred the discussions that eventually led to the writing of PostgreSQL Mistakes, my first book. Spoiler alert: there are more mistakes than those listed in this book! I like to think that there is literally nothing that you cannot get wrong. But these are the most com- mon or most dangerous mistakes that you are likely to encounter. This book is a medium to explore the subject, which I feel is more suitable than a single-purpose website, disparate blog posts, answers to forum questions, or a series of conference talks. It can concentrate my thoughts on the proper way to do things in Postgres in a single place, and at the same time, it can dive deeper and go further than any of the aforementioned media. Needless to say, a book doesn’t require the time needed to sift through all related Internet posts or the perseverance to sit through hours of live or awkwardly recorded talks. The mistakes described in this book have been identified through experience with the PostgreSQL database and its software ecosystem. I have benefited from my daily involvement with Postgres in various capacities while working for two of the foremost PostgreSQL database companies, which has given me insight into how end-user sys- tems are utilized, configured, and maintained, with use cases including both goodxvii

ABOUT THIS BOOKxviiipractices to be followed and bad practices to be avoided. Through many years of deploying, observing, and fixing Postgres systems in the field, I have gained the neces- sary perspective to write this book. Of course, no one is an island, and none of this would have been possible without the mentoring, knowledge sharing, and guidance that the PostgreSQL community has so generously provided to me over the years. Who should read this book To benefit fully from this book, you should have a general knowledge of relational database system operation and a working grasp of the SQL language. This means being able to design a rudimentary relational database using SQL DDL with a basic understanding of data modeling. Consequently, you should be able to write SQL que- ries with WHERE clauses and understand the concepts behind indexing. A basic knowledge of PostgreSQL usage is also expected. You should be able to install and configure a PostgreSQL environment (just the basics; don’t worry about things like performance tuning). What also helps is familiarity with the UNIX com- mand line (specifically Linux). You will find it helpful to be able to run command-line utilities, with an understanding of piping and output redirection. How this book is organized: A road map This book is divided into 11 chapters and 2 appendices, each focusing on a critical aspect of PostgreSQL usage, pitfalls, and best practices. The chapters cover a variety of topics, from foundational concepts to advanced topics, while the appendices provide practical resources to support your learning. Here is a brief overview of each chapter and appendix:  Chapter 1: Why PostgreSQL matters—and why talking about mistakes does too—Pro- vides an overview of PostgreSQL and highlights the goals of this book, includ- ing how to identify, categorize, and learn from common PostgreSQL mistakes  Chapter 2: Bad SQL usage—Discusses common SQL pitfalls, ways to ensure the correctness of queries, and techniques for improving the performance of com- plex queries  Chapter 3: Improper data type usage—Explores the implications of choosing the wrong data type, issues with time zones and daylight savings, and data types that are best avoided  Chapter 4: Table and index mistakes—Covers advanced table features like inheri- tance, the importance of partitioning, and selecting the most effective keys and indexes for your tables  Chapter 5: Improper feature usage—Examines the challenges and risks of misusing PostgreSQL features, such as encoding choices, creating rules, and using NoSQL features inappropriately

PostgreSQL Mistakes and How to Avoid Them (Jimmy Angelakos) (Z-Library)

AI Reading Assistant

Passage locations

Tags

Text Preview (First 20 pages)

Registered users can read the full content for free

Recommended for You

AI Reading Assistant

Passage locations

Tags

Text Preview (First 20 pages)

Registered users can read the full content for free

Recommended for You

Reply to Comment

Edit Comment