Right, so sort by itself is fine for a quick-and-dirty alphabetical sort of a file. But let’s be honest, your data is rarely that polite. It’s usually in columns, like some kind of data spreadsheet that got lost and ended up in the terminal. This is where sort graduates from a simple tool to a data-wrangling ninja, using the -k (key) and -t (delimiter) options.

The basic idea is simple: instead of sorting the entire line, you tell sort to look at a specific part of each line, a specific column or “field.” But as with all things in the shell, the devil is in the details, and those details will bite you if you’re not careful.

The -t Delimiter: Setting the Table

First, you have to tell sort how your line is divided into fields. That’s what -t is for. By default, sort starts a new field at each transition from a non-blank character to a run of blanks (spaces or tabs), so whitespace-separated columns work without -t. Forgetting -t when the data uses some other separator is a common first mistake: if your fields are separated by a comma, colon, or pipe character, you must specify it with -t, and the separator has to be a single character.

Let’s say you have a classic CSV (comma-separated values) file, users.csv:

alice,smith,engineering,55000
bob,jones,marketing,48000
carol,williams,sales,52000
dave,brown,engineering,60000

If you just run sort users.csv, it will sort on the entire line, starting with the first name. Useful, but boring. To sort by department (the third field), you need to specify the comma as the delimiter.

sort -t ',' -k3 users.csv

This outputs:

alice,smith,engineering,55000
dave,brown,engineering,60000
bob,jones,marketing,48000
carol,williams,sales,52000

See? Both “engineering” lines are together. But why is “engineering” before “marketing”? Because we’re sorting alphabetically, and ’e’ comes before ’m’. We’ll get to numeric sorts in a bit.
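The same approach carries over to other single-character delimiters. Here is a minimal sketch, assuming hypothetical tab- and pipe-separated sample data (not the users.csv file above) and the C locale so the ordering is predictable. A literal tab is awkward to type, so it is captured in a variable first, and the pipe is quoted because the shell would otherwise treat it as a pipeline.

```shell
# Hypothetical sample data; the names and departments are made up for illustration.
tab=$(printf '\t')

# Sort tab-separated lines by the second field only.
tsv_sorted=$(printf 'carol\tsales\nalice\tengineering\nbob\tmarketing\n' |
  LC_ALL=C sort -t "$tab" -k2,2)
printf '%s\n' "$tsv_sorted"

# Sort pipe-separated lines by the second field; the quotes keep the shell
# from interpreting | as a pipeline operator.
pipe_sorted=$(printf 'carol|sales\nalice|engineering\nbob|marketing\n' |
  LC_ALL=C sort -t '|' -k2,2)
printf '%s\n' "$pipe_sorted"
```

In both cases the lines should come out ordered engineering, marketing, sales.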

The -k Key Specifier: It’s Weirder Than You Think

The -k option is where most of the magic and confusion happens. The syntax -kn means “start sorting at field n.” But here’s the first gotcha: it sorts from field n to the end of the line by default.

Let’s use a trickier file, spaces.txt, where fields are separated by spaces (the default, so we can omit -t for now).

# Let's create it on the fly with a here-document
cat << 'EOF' > spaces.txt
zebra 99 stripes
apple 1 fruit
banana 50 fruit
EOF

If you want to sort by the second field (the number), you might try sort -k2 spaces.txt. Let’s see what happens:

sort -k2 spaces.txt
apple 1 fruit
banana 50 fruit
zebra 99 stripes

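It worked! But only by accident. Without -n, sort compares keys as strings, and each key here runs from the second field to the end of the line, so it was really comparing “1 fruit”, “50 fruit”, and “99 stripes” character by character. That happens to match numeric order for this data, but it is fragile: a value of 100 would land before 50, because “1” sorts before “5”. And what if we had a line like zebra 99 is an animal? The comparison would include all those extra words. You need to be explicit.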

To specify where the sort key ends, you use -kn,m, where the sort key is from the start of field n to the end of field m.

To sort only by the second field, you should write:

sort -k2,2 spaces.txt

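This command says “use a key that starts at field 2 and ends at field 2.” Now only the second field is used for the comparison; the rest of the line matters only when two keys are equal, in which case sort falls back to comparing the whole lines as a tie-breaker. Note that the key is still compared as a string, so add n (-k2,2n) when the field holds numbers. Giving every key an explicit end is a best practice and saves you from countless hidden bugs.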
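To see the difference between a string key and a numeric key, here is a small sketch. It uses the contents of spaces.txt plus one extra line (cherry 100 fruit) that is an assumption added purely for illustration, and it pins the locale to C so the byte-by-byte ordering is predictable.

```shell
# spaces.txt contents plus one hypothetical line with a three-digit number.
data='zebra 99 stripes
apple 1 fruit
banana 50 fruit
cherry 100 fruit'

# Key limited to field 2, compared as a string: "100" sorts before "50".
as_string=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2)

# Same key, compared numerically thanks to the n modifier on the key.
as_number=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2n)

printf 'string key:\n%s\n\nnumeric key:\n%s\n' "$as_string" "$as_number"
```

The string version should print apple, cherry, banana, zebra, while the numeric version should print apple, banana, zebra, cherry.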

The Order of Operations: Global vs. Key-Specific Flags

This is the part that feels like the designers had a bet about how confusing they could make it. Flags like -n (numeric sort), -r (reverse), and -b (ignore leading blanks) can be applied globally (as standalone options, separate from any -k) or per key (as letters appended to the key specifier).

Global flags apply to every key that has no flags of its own. If you have multiple plain -k options, they all get the global behavior.

A flag attached to a key specifier (-k2,2n) only affects that specific key, and a key that carries any flags of its own stops inheriting the global ones.

This is incredibly powerful. Let’s go back to our CSV. What if we want to sort primarily by department (field 3) and then secondarily by salary (field 4) within each department? And let’s sort the salary numerically and in descending order (highest first).

sort -t ',' -k3,3 -k4,4nr users.csv

Let’s break this down:

  • -t ',': Use comma as the delimiter.
  • -k3,3: First sort key is field 3 only (department). It uses the default alphabetical sort.
  • -k4,4nr: Second sort key is field 4 only (salary). The n tells this key to be sorted numerically, and the r tells this key to be sorted in reverse (descending) order.

The output is exactly what we wanted: departments in alphabetical order, with salaries from high to low within them.

dave,brown,engineering,60000
alice,smith,engineering,55000
bob,jones,marketing,48000
carol,williams,sales,52000

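If we had used -nr globally (sort -t ',' -nr -k3,3 -k4,4), sort would not have reported an error; it would have quietly done the wrong thing. Both keys would inherit -n and -r, every department name would evaluate to 0 and tie with the others, and the lines would end up ordered by salary alone, with the departments no longer grouped. Always attach the flags to the specific key that needs them.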
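The three ways of placing the flags can be compared side by side. This is a sketch that recreates users.csv in a scratch directory (the workdir variable and the third, “mixed” command are additions for illustration) and assumes the C locale. The behavior of non-numeric text under -n described in the comments matches GNU sort; other implementations are expected to behave similarly, but that is an assumption.

```shell
# Recreate the article's users.csv in a throwaway directory.
workdir=$(mktemp -d)
cat << 'EOF' > "$workdir/users.csv"
alice,smith,engineering,55000
bob,jones,marketing,48000
carol,williams,sales,52000
dave,brown,engineering,60000
EOF

# Per-key flags: department as text, then salary numeric and descending.
per_key=$(LC_ALL=C sort -t ',' -k3,3 -k4,4nr "$workdir/users.csv")

# Global -n -r: both keys inherit them, so the department names all count
# as 0 and the result is simply salary descending.
global_nr=$(LC_ALL=C sort -t ',' -nr -k3,3 -k4,4 "$workdir/users.csv")

# Mixed: the global -r reaches -k3,3, but -k4,4n has its own flag and
# therefore does not inherit it (salary stays ascending).
mixed=$(LC_ALL=C sort -t ',' -r -k3,3 -k4,4n "$workdir/users.csv")

printf 'per-key:\n%s\n\nglobal -nr:\n%s\n\nmixed:\n%s\n' \
  "$per_key" "$global_nr" "$mixed"

rm -rf "$workdir"
```

The mixed case is the one that tends to surprise people: departments come out in reverse order (sales, marketing, engineering), yet within engineering the lower salary is listed first.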

The Leading Blanks Gotcha

One last landmine. The -b (ignore leading blanks) option is often needed when your columns are aligned with spaces. With the default separator, the blanks in front of a field count as part of that field, so a key padded with several spaces will typically sort ahead of one padded with a single space, whatever text follows. A numeric comparison with -n skips leading blanks on its own, but it’s good to be explicit. If your data has leading spaces and you’re not using -n, you’ll probably need -b, either globally or attached to the start of the key, as in -k2b,2.
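A short sketch of the problem, assuming a hypothetical two-column listing padded with spaces (the names and fruit are made up) and the C locale, where a space sorts before any letter:

```shell
# Hypothetical space-aligned data: the second column starts at the same
# screen position, so short names are followed by more padding.
aligned='al      pear
barbara apple
cy      fig'

# Without b, the padding is part of field 2, so the heavily padded keys
# ("      fig", "      pear") sort ahead of " apple".
with_blanks=$(printf '%s\n' "$aligned" | LC_ALL=C sort -k2,2)

# With b on the key start, leading blanks are skipped and the fruit names
# are compared directly.
without_blanks=$(printf '%s\n' "$aligned" | LC_ALL=C sort -k2b,2)

printf 'padding counted:\n%s\n\npadding ignored:\n%s\n' \
  "$with_blanks" "$without_blanks"
```

Only the second run should list the lines as apple, fig, pear.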

Master -k and -t, and you’ve just unlocked one of the most powerful data manipulation tools in the shell. It’s fussy, I know, but it’s fussy in a way that gives you precise control. And that’s always worth the headache.