Right, so you’ve got data. And some of that data probably exists somewhere on this big, round rock we call home. An address, a city, a set of latitude and longitude coordinates, the outline of a fire hydrant in Toledo—these aren’t just strings or numbers. They’re spatial data, and treating them as such is a superpower. That’s where PostGIS comes in. It’s not just an extension; it’s a full-blown geospatial database engine that they somehow bolted onto the side of PostgreSQL. And it’s spectacular.

Think of it this way: without PostGIS, asking “which of my coffee shops is closest to this customer?” is a trigonometry problem from hell. With PostGIS, it’s a single, elegant query. Let’s enable it. This isn’t a suggestion; it’s step one.

CREATE EXTENSION IF NOT EXISTS postgis;

And just like that, your database is now geographically aware. You’ve got new data types, hundreds of new functions, and spatial reference systems to play with. Let’s break down the bits you’ll actually use.

The Workhorses: Geometry and Geography

PostGIS gives you two main data types to store shapes, and choosing the right one is your first important decision.

  • geometry: This is the older, faster, more mathematically straightforward type. It assumes your data lives on a flat, Cartesian plane. It’s perfect for data that is inherently projected, like the floor plan of a building, a circuit board layout, or if you’re working in a local coordinate system that already accounts for the earth’s curvature. Distance is calculated in the units of your coordinate system (meters, feet, etc.).

  • geography: This is the newer, smarter type for when you’re not ignoring the fact that the Earth is an oblate spheroid (a fancy term for “slightly squashed ball”). It expects coordinates in longitude/latitude (SRID 4326, which you’ll learn about in a second). The killer feature? When you calculate the distance between two geography points, you get the true geodesic distance on the surface of the Earth in meters. No more haversine formulas in your application code. The trade-off is that it’s generally a bit more computationally expensive.

So, the rule of thumb: if your data is global or you care about real-world distances, use geography. If your data is local/projected or pure geometry, use geometry. You can’t go too far wrong starting with geography for real-world stuff.

Getting Your Data In: WKT and EWKT

You can’t just INSERT '40.7128, -74.0060' into a geography column. PostGIS needs to know what kind of shape you’re making. You use Well-Known Text (WKT) and PostGIS’s extended version, EWKT.

WKT for a point is POINT(-74.0060 40.7128) (note the lack of a comma and the longitude-first order, which is a classic “gotcha”). But EWKT is more useful because it lets you specify the SRID (Spatial Reference System Identifier) right there. The SRID tells PostGIS what coordinate system you’re using. 4326 is the SRID for the classic longitude/latitude pair on the WGS 84 ellipsoid (i.e., the one Google Maps and GPS devices use).

CREATE TABLE interesting_places (
    id serial PRIMARY KEY,
    name text NOT NULL,
    -- Let's use geography for real Earth distances
    geog geography(Point, 4326)
);

INSERT INTO interesting_places (name, geog)
VALUES
    ('Statue of Liberty', ST_GeogFromText('SRID=4326;POINT(-74.0445 40.6892)')),
    -- The ST_GeomFromText function is more common for geometry
    ('Some Local Landmark', ST_GeomFromText('SRID=2263;POINT(1000000 200000)'));

The ST_ prefix stands for “Spatial Type” and is used for all PostGIS functions. It’s a nice bit of namespacing that prevents your database from becoming a total mess.

The Magic: Actually Querying Your Data

This is why we’re here. Let’s say you have a table of coffee shops and a user’s location. Finding the five closest shops is trivial.

SELECT
    name,
    ST_Distance(
        geog,
        ST_GeogFromText('SRID=4326;POINT(-73.9857 40.7484)')
    ) AS distance_meters
FROM coffee_shops
ORDER BY geog <-> ST_GeogFromText('SRID=4326;POINT(-73.9857 40.7484)')
LIMIT 5;

See that <-> operator? That’s the “distance” operator. When you use it in the ORDER BY clause, PostGIS is smart enough to use a spatial index (which you absolutely must create, more on that next) to quickly find the nearest neighbors without having to calculate the distance to every single point in the table. It’s incredibly fast. The ST_Distance in the SELECT clause then calculates the actual distance for the final, filtered results.

The Non-Negotiable: Spatial Indexing

If you do nothing else after creating your PostGIS tables, create a spatial index. Without it, every query becomes a full-table scan, calculating distances or intersections for every single row. It will grind to a halt faster than you can say “table scan”. For a geography column, it uses a slightly different kind of index than normal.

-- For a geography column:
CREATE INDEX idx_coffee_shops_geog ON coffee_shops USING GIST (geog);

-- For a geometry column (the more common syntax):
CREATE INDEX idx_some_table_geom ON some_table USING GIST (geom);

The GIST index is the key. It allows for efficient nearest-neighbor searches (<->) and bounding box comparisons. This isn’t a “maybe later” optimization; it’s step two, right after creating the table.

Beyond Points: Lines, Polygons, and the Real World

The real world isn’t just points. You have delivery routes (LineStrings), building footprints and sales territories (Polygons), and collections of these things. PostGIS handles them all. You can find out if a point (a user) is inside a polygon (a sales region) with ST_Within, or check if two geometries intersect with ST_Intersects.

-- Find all coffee shops in the Manhattan polygon
SELECT name
FROM coffee_shops
WHERE ST_Within(
    geog::geometry, -- Cast geography to geometry for this function
    (SELECT geom FROM boroughs WHERE name = 'Manhattan')
);

-- A classic: find all subway stations within 500 meters of a coffee shop
SELECT s.name AS station_name, c.name AS coffee_shop_name
FROM subway_stations s, coffee_shops c
WHERE ST_DWithin(s.geog, c.geog, 500); -- 'DWithin' = Distance Within

Note the cast in the first query (::geometry). Some functions, especially those dealing with spatial relationships, are more developed for the geometry type. Casting is a perfectly valid way to handle this. The ST_DWithin function is a workhorse for “find things near other things” and, crucially, it can use a spatial index.

The designers made a questionable choice here by not making every function work seamlessly with both types, but the cast is a small price to pay for the flexibility. The key is to never, ever forget that index. Your database will remember, and it will make you pay.