Alright, let’s talk about the WHERE clause. This is where you stop just fetching data and start interrogating it. Think of SELECT as your “grab everything” command. WHERE is the part where you say, “…but only the stuff that meets these conditions.” It’s the bouncer at the club of your database, deciding which rows get to come in and which get left out in the cold.

The magic of WHERE is in its operators. They’re the vocabulary you use to describe exactly what you’re looking for.

The Usual Suspects: Comparison Operators

These are the bread and butter. You know these from math class, but SQL uses a slightly different syntax for “not equal to.”

SELECT title, author, page_count FROM books
WHERE page_count > 300; -- Finds the doorstoppers

SELECT product_name, price FROM products
WHERE price <= 49.99; -- Finds the affordable stuff

SELECT first_name, last_name FROM employees
WHERE hire_date != '2023-01-01'; -- Finds everyone *not* hired on New Year's Day
-- Pro Tip: '<>' is the ISO standard for 'not equal'. Use it, and feel fancy.

BETWEEN: For the Lazy (and I Mean That as a Compliment)

Writing WHERE column >= 10 AND column <= 20 is fine. But it’s verbose. BETWEEN exists because someone rightfully decided that was too much typing. It’s inclusive—a crucial detail that bites everyone eventually. This means the range includes the boundary values you specify.

SELECT product_name, price FROM products
WHERE price BETWEEN 15 AND 25; -- This will include products priced at 15 *and* 25.

-- This is identical to:
SELECT product_name, price FROM products
WHERE price >= 15 AND price <= 25;

The Pitfall: Because it’s inclusive, BETWEEN can be a nightmare with dates if you’re not careful. BETWEEN '2023-01-01' AND '2023-01-31' will grab everything up until midnight on the 31st. But what if your datetime field has a timestamp of '2023-01-31 14:22:00'? It’s included. If you want the whole day of the 31st, you often need to use BETWEEN '2023-01-01' AND '2023-02-01' and then filter the upper bound differently. It’s a mess. BETWEEN is great for numbers, but for datetimes, tread carefully.

IN: The Party Invite List

Need to filter for multiple, arbitrary values? IN() is your best friend. It’s essentially a shorthand for a long chain of OR conditions. Don’t do this: WHERE name = 'Alice' OR name = 'Bob' OR name = 'Charlie'. Do this instead:

SELECT first_name, last_name FROM customers
WHERE first_name IN ('Alice', 'Bob', 'Charlie');

-- This is also gold for subqueries:
SELECT product_name FROM products
WHERE category_id IN (SELECT category_id FROM categories WHERE department = 'Electronics');

The Insight: IN() is not just for convenience; it’s often more performant than a string of ORs because the database optimizer can handle it more efficiently.

IS NULL: Finding the Missing Pieces

Here’s where things get weird. You can’t use = NULL. In SQL, NULL represents an unknown value. The logic is: is this unknown value equal to that unknown value? The answer is also unknown. So = NULL never returns TRUE; it always returns NULL (which, in a WHERE clause, is treated as FALSE). It’s a philosophical nightmare.

You must use IS NULL or IS NOT NULL to check for this unknown state.

SELECT order_id, shipped_date FROM orders
WHERE shipped_date IS NULL; -- Finds orders that haven't shipped yet.

SELECT first_name, last_name FROM employees
WHERE termination_date IS NOT NULL; -- Finds our dearly departed colleagues.

When you only know part of the story, LIKE is your go-to. It uses two wildcard characters:

  • %: Matches any sequence of zero or more characters. (The “anything goes” card)
  • _: Matches exactly one character. (The “fill in the blank” card)
SELECT title FROM books
WHERE title LIKE '%SQL%'; -- Finds titles with 'SQL' anywhere in them.

SELECT first_name FROM customers
WHERE first_name LIKE 'J_n'; -- Finds 'Jon', 'Jan', 'Jen', 'Jun'... you get the idea.

SELECT product_code FROM products
WHERE product_code LIKE 'ABC-202_'; -- Finds all product codes for the year 2020-2029.

The Rough Edge: LIKE can be brutally slow on large tables. If you’re searching for '%term' (a leading wildcard), you’re forcing a full table scan because the database can’t use a regular index. It has to check every single row. For serious text search, you’d look into full-text search engines. But for ad-hoc queries on smaller datasets, LIKE is indispensable. Also, remember case-sensitivity depends on your database’s collation. LIKE 'a%' might behave very differently from LIKE 'A%'.