27.2 tsquery: Search Queries with AND, OR, NOT, and Phrase Operators

Right, so you’ve got your documents neatly packaged into tsvector. Now it’s time to actually ask for something back. That’s where tsquery comes in. Think of tsvector as the pre-processed, indexed filing cabinet, and tsquery as the set of instructions you hand to your very literal, slightly pedantic intern on how to search it.

A tsquery is a representation of the Boolean operators & (AND), | (OR), and ! (NOT), along with phrase search. It’s what you use within the @@ operator to see if it matches a tsvector.

The Basic Operators: AND, OR, and NOT

The syntax is about as straightforward as it gets. No fancy SQL AND/OR keywords here; we’re using symbols like it’s the 1970s and we’re saving every byte.

-- Find documents that contain both 'quick' AND 'brown'
SELECT 'The quick brown fox' @@ 'quick & brown' AS matches; -- true

-- Find documents that contain either 'quick' OR 'lazy'
SELECT 'The quick brown fox' @@ 'quick | lazy' AS matches; -- true

-- Find documents that contain 'quick' but NOT 'lazy'
SELECT 'The quick brown fox' @@ 'quick & !lazy' AS matches; -- true

The order of operations is what you’d expect: ! (NOT) binds first, then & (AND), then | (OR). When in doubt, or if you just enjoy writing clear code, use parentheses. Your future self, debugging a query at 2 AM, will thank you.

-- This means: ("fat" OR "rat") AND "cat"
SELECT 'a fat cat' @@ 'fat | rat & cat' AS matches; -- true

-- This means: "fat" OR ("rat" AND "cat")
SELECT 'a fat cat' @@ '(fat | rat) & cat' AS matches; -- true, same result here
SELECT 'a cat' @@ 'fat | rat & cat' AS matches; -- false! Because it's 'fat' OR ('rat' AND 'cat')

Phrase Search: The `<->` (FOLLOWED BY) Operator

This is the killer feature. A simple & will tell you if both words are somewhere in the document. But what if you need to find “quick brown” as a phrase, not just the words “quick” and “brown” scattered randomly about? Enter the “followed by” operator, <->.

-- This matches because 'quick' is immediately followed by 'brown'
SELECT 'The quick brown fox' @@ 'quick <-> brown' AS matches; -- true

-- This does NOT match. 'brown' is not followed by 'quick'.
SELECT 'The quick brown fox' @@ 'brown <-> quick' AS matches; -- false

-- You can chain these for sequences of more than two words.
SELECT 'The quick brown fox' @@ 'quick <-> brown <-> fox' AS matches; -- true

Here’s the best part: the <-> operator respects the positions stored in the tsvector. It’s not a naive string search. It’s checking the actual positional data we painstakingly created. This is why we went through all that to_tsvector trouble.

The Weird and Wonderful: Distance and Weights

The designers, in a moment of clarity, realized you might want words near each other, not necessarily adjacent. So <-> can take a number n to mean “followed by n words”. <1> is the same as <-> (adjacent). <2> means with one word in between, and so on.

-- Find 'quick' and 'fox' with exactly one word between them ('brown')
SELECT 'The quick brown fox' @@ 'quick <2> fox' AS matches; -- true

-- Find 'quick' and 'jumps' with *up to* two words between them.
-- We do this by OR'ing the possibilities. It's a bit verbose, but it works.
SELECT 'The quick brown fox jumps' @@ 'quick <2> jumps | quick <3> jumps' AS matches; -- true

A crucial point: tsquery terms can also be filtered by weight. Remember those A, B, C, and D weights from tsvector? You can target them.

-- Find the word 'quick' only if it is in the title (A weight)
SELECT to_tsvector('The quick brown fox') @@ 'quick:A' AS matches; -- false
-- Our earlier example only had the default weight (which is 'D')

This is incredibly powerful for building advanced search interfaces where a match in the title is more relevant than a match in the body.

Common Pitfalls and the Art of Debugging

Your first tsquery will probably fail. Mine always do. The number one reason? Stemming. You must use the stemmed version of the word in your query. You’re querying the tsvector, not the original text.

-- This will FAIL. 'jumps' is stemmed to 'jump' in the tsvector.
SELECT to_tsvector('The fox jumps') @@ 'jumps' AS matches; -- false

-- This will WORK. Query with the stemmed form.
SELECT to_tsvector('The fox jumps') @@ 'jump' AS matches; -- true

-- Let the `to_tsquery` function handle this for you. PLEASE.
SELECT to_tsvector('The fox jumps') @@ to_tsquery('jumps') AS matches; -- true

Always, always, always use to_tsquery() or plainto_tsquery() to build your queries. It applies the same dictionary rules as to_tsvector, ensuring you’re speaking the same language. plainto_tsquery is simpler but dumber—it basically just throws & between all the words you give it. to_tsquery is where you can use the full power of the operators we just discussed.

-- plainto_tsquery: good for simple "google-style" search boxes.
SELECT plainto_tsquery('quick brown fox'); -- 'quick' & 'brown' & 'fox'

-- to_tsquery: for when you need precision and power.
SELECT to_tsquery('quick & (brown | red) <-> fox'); -- 'quick' & ( 'brown' | 'red' ) <-> 'fox'

Mastering tsquery is about embracing its pedantic nature. Be precise, let the functions handle the stemming, and use parentheses like they’re free. Because when you get it right, you’re not just filtering text—you’re conducting a symphony of Boolean logic against a gigabyte-sized corpus, and getting answers back in milliseconds. And that never gets old.

The Basic Operators: AND, OR, and NOT

Phrase Search: The <-> (FOLLOWED BY) Operator

The Weird and Wonderful: Distance and Weights

Common Pitfalls and the Art of Debugging

Phrase Search: The `<->` (FOLLOWED BY) Operator