21.7 pg_stat_user_indexes: Finding Unused Indexes

Right, let’s talk about cleaning up your own mess. You’ve been creating indexes like a kid with a free pass to an all-you-can-eat buffet. Some of them were brilliant ideas that save milliseconds on every query. Others… well, you were probably just guessing and hoping for the best. Every single one of those indexes, even the unused ones, is a tax you pay on every INSERT, UPDATE, and DELETE. They take up space, they slow down writes, and they make the vacuum worker hate you. It’s time to find the dead weight and cut it loose. Our best friend for this job is the pg_stat_user_indexes view.

21.6 Index Bloat: Causes, Detection, and REINDEX

Right, let’s talk about index bloat. This is the digital equivalent of your closet being full of clothes you haven’t worn since 2015. Your table’s logical size (the actual data you care about) is fine, but the physical size on disk (the index files) is horrifically inflated. It wastes storage, but more importantly, it murders performance because the query planner has to wade through all that useless cruft. It’s your database’s way of screaming for an intervention.

21.5 Concurrent Index Builds: CREATE INDEX CONCURRENTLY

Right, let’s talk about CREATE INDEX CONCURRENTLY. This is the command you use when you’d rather not bring your entire application to a grinding, screeching halt while you build an index on a production table. It’s PostgreSQL’s way of saying, “I can chew gum and walk at the same time,” and for the most part, it’s shockingly good at it. The standard CREATE INDEX command takes a table lock that blocks all writes (INSERTs, UPDATEs, DELETEs) for the entire duration of the build. On a large table, that duration can be measured in minutes or even hours. That’s a non-starter for most systems. CREATE INDEX CONCURRENTLY (let’s call it CIC because I’m tired of typing it) avoids this by taking a much weaker lock and working in multiple passes, allowing normal operations to continue almost uninterrupted.

21.4 Covering Indexes: INCLUDE Columns for Index-Only Scans

Right, let’s talk about covering indexes and the INCLUDE clause. This is one of those features that, once you understand it, you’ll wonder how you ever lived without it. It’s the difference between your database engine doing all its work in the tidy, fast, ordered world of the index, versus having to schlep all the way back to the main table (the “heap” or clustered index) to grab a few more pieces of data. That round trip is expensive, and we’re going to eliminate it.

21.3 Expression Indexes: Indexing a Function of a Column

Right, so you’ve got a table. You’re querying it. It’s slow. You slap an index on a column. It gets faster. You feel like a wizard. Life is good. But then you hit a query like this, and the magic spell fizzles: SELECT user_id, order_date FROM orders WHERE DATE_TRUNC('month', order_date) = '2023-10-01'; You check the query plan (EXPLAIN ANALYZE is your best friend, by the way—never forget that), and sure enough, it’s doing a full table scan. Your beautiful index on order_date is just sitting there, useless. Why? Because you’re not searching for order_date; you’re searching for a function of order_date. The index stores the raw values 2023-10-15 14:32:01, 2023-10-02 09:15:44, etc., but it has no idea what the DATE_TRUNC('month'...) of those values is.

21.2 Partial Indexes: Indexing Only a Subset of Rows

Right, so you’ve got a table. It’s a big table. A real chonker. And you’ve wisely decided to index a column to speed things up. But here’s the thing: do you really need to index every single row? Often, the most frequent and performance-critical queries are only interested in a specific subset of your data. Indexing every row for that is like building a library catalog that includes every single book, but your patrons only ever ask for the sci-fi paperbacks published after 1985. It’s wasteful. It bloats your index, slows down writes, and frankly, it’s a bit gauche.

21.1 CREATE INDEX: Basic Syntax and Options

Right, let’s talk about creating indexes. This is where you stop politely asking the database to find your data and start giving it a detailed map. The CREATE INDEX statement is your cartography tool. It seems simple on the surface, but the devil—and the real performance gains—are in the details. The most basic form is a thing of beauty in its simplicity. You tell it the name of the index, the table it belongs to, and the column(s) you want it to track.

— joke —

...