23.1 RAID Levels: 0 (Striping), 1 (Mirroring), 5, 6, and 10
Alright, let’s talk RAID levels. Forget the marketing fluff from hardware vendors; we’re going to look at this from the perspective of someone who has to actually use and, more importantly, recover these things. RAID isn’t a backup. Let me say that again so it sinks in: RAID is not a backup. It’s a tool for uptime and performance. You back up your data to a separate system, preferably off-site. Got it? Good. Now, let’s get our hands dirty with the main levels you’ll configure with mdadm.
RAID 0: Striping (AKA Living on the Edge)
This is the performance nut’s dream and the data hoarder’s nightmare. RAID 0 takes your data, chops it into blocks, and spreads those blocks across all disks in the array. Two 1TB drives in RAID 0 give you a glorious, fast 2TB volume. The “why” is simple: it allows read and write operations to happen in parallel across multiple disks. It’s like having two checkout lanes open instead of one.
The colossal, glaring, “why would you do this” downside? There is zero redundancy. If any drive in a RAID 0 array dies, the entire array is gone. Kaput. All your data is now abstract art. You use this for scratch space, render caches, or anything you can afford to lose instantly. Never for anything important.
# Creating a RAID 0 array with two disks
sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
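If the "chops it into blocks and spreads them around" part feels abstract, here is a minimal sketch of the mapping, assuming equal-sized member disks and a plain round-robin layout. The function name stripe_location is made up for illustration and is not part of mdadm.

```shell
# Map a logical chunk number to "member-disk-index chunk-offset-on-that-disk"
# for a simple round-robin stripe over equal-sized disks.
# stripe_location is an illustrative helper, not an mdadm command.
stripe_location() {
    chunk=$1      # logical chunk number, starting at 0
    ndisks=$2     # number of member disks in the array
    echo "$(( chunk % ndisks )) $(( chunk / ndisks ))"
}

# The first six chunks of a two-disk array alternate between disk 0 and disk 1
for c in 0 1 2 3 4 5; do
    echo "chunk $c -> disk/offset: $(stripe_location "$c" 2)"
done
```

Consecutive chunks land on different disks, which is where the parallelism comes from, and also why losing one disk punches holes through every large file on the array.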
RAID 1: Mirroring (The Safe Bet)
The polar opposite of RAID 0. RAID 1 takes your data and writes an identical copy to every disk in the array. Two 1TB drives in RAID 1 give you a safe, redundant 1TB volume. The “why” is all about fault tolerance. Read performance can be slightly faster (you can read from both disks), but write performance is the same as a single disk (you have to write to all of them).
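The cost is obvious: with two disks you lose 50% of your raw capacity to redundancy, and every extra mirror you add costs another full disk. It’s simple, robust, and what you should use for a small, critical array like a boot drive. If a drive fails, you mark it failed, pull it (hot-swap bay permitting, otherwise at the next shutdown), pop a new one in, and tell mdadm to add it back. The array keeps serving data from the surviving disk the whole time.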
# Creating a RAID 1 array for your boot drives
sudo mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/nvme0n1p2 /dev/nvme1n1p2
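The drive-swap routine for a mirror can be sketched roughly like this. It is a dry run by default, so it only prints the mdadm commands. The helper names run and replace_member and the replacement device /dev/nvme2n1p2 are assumptions for illustration; substitute whatever your system actually calls the new partition.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (and run as root) to apply them for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# replace_member ARRAY FAILED_DEV NEW_DEV   (hypothetical helper name)
replace_member() {
    run mdadm --manage "$1" --fail "$2"     # mark the dying member as faulty
    run mdadm --manage "$1" --remove "$2"   # detach it from the array
    run mdadm --manage "$1" --add "$3"      # add the replacement; resync begins
}

# Device names below are placeholders
replace_member /dev/md1 /dev/nvme1n1p2 /dev/nvme2n1p2
# Afterwards, watch the resync progress with: cat /proc/mdstat
```

Printing first and running second is a deliberate choice: a typo in a device name is a lot cheaper to catch on screen than on a degraded array.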
RAID 5: Striping with Parity (The Classic Workhorse)
Here’s where things get clever. RAID 5 requires at least three disks. It stripes data across the disks like RAID 0, but it also calculates and writes “parity” information: the XOR of the data chunks in each stripe, which can be used to reconstruct any one missing chunk. This parity is rotated across all the drives rather than parked on a dedicated one. The “why” is a great balance: you get the capacity of N-1 disks (so three 1TB drives yield 2TB) plus redundancy. You can lose any one drive and keep on running.
The catch? Write performance can suffer, because a small write has to update parity too, which usually means reading the old data and old parity before writing the new ones. RAID 5 is also exposed to the “write hole,” a consistency problem when a crash or power loss interrupts a stripe update, which is a real issue we’ll cover later. Rebuilding an array after a drive failure is a massive, intensive process that puts a heavy read load on all the remaining drives. If a second drive fails during this rebuild, you’re toast. With modern multi-terabyte drives, the rebuild time and associated risk are non-trivial.
# Creating a RAID 5 array with three disks
sudo mdadm --create --verbose /dev/md5 --level=5 --raid-devices=3 /dev/sdd /dev/sde /dev/sdf
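To see why parity lets you lose a disk, here is a toy version of one stripe on a three-disk array, with single bytes standing in for whole chunks. The byte values are arbitrary, and real arrays do this over much larger chunks, but the XOR arithmetic is the same idea.

```shell
# One stripe of a three-disk RAID 5: two data chunks plus one parity chunk.
# Single bytes stand in for whole chunks; the values are arbitrary.
d0=$(( 0xA5 ))
d1=$(( 0x3C ))
parity=$(( d0 ^ d1 ))        # stored on the third disk for this stripe

# The disk holding d0 dies: rebuild it from the surviving chunk and parity
rebuilt=$(( parity ^ d1 ))

printf 'd0=0x%02X d1=0x%02X parity=0x%02X rebuilt=0x%02X\n' \
    "$d0" "$d1" "$parity" "$rebuilt"
```

The rebuilt value matches d0 because XOR-ing with d1 twice cancels out. The same trick also shows the rebuild cost: recovering one chunk means reading every other chunk in the stripe.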
RAID 6: Striping with Double Parity (RAID 5 for the Paranoid)
RAID 6 is basically RAID 5’s bigger, more cautious sibling. It requires at least four disks and uses two independent parity schemes. This means you can lose any two drives in the array and still keep running. The “why” is simple: it dramatically reduces the risk of a total failure during the long rebuild process of a large array. The cost is capacity: you get the capacity of N-2 disks (four 1TB drives yield 2TB). Write performance is even slower due to the double parity calculation, but for large, archival-style arrays, the trade-off is often worth it for the peace of mind.
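Creating a RAID 6 array follows the same pattern as the other levels. The sketch below only prints the command so you can eyeball it first. The helper raid6_create_cmd is a made-up name, and /dev/md6 and the /dev/sdk through /dev/sdn devices are placeholders, not disks from any real system.

```shell
# Print (not run) an mdadm command for a RAID 6 array.
# raid6_create_cmd is a hypothetical helper; device names are placeholders.
raid6_create_cmd() {
    array=$1; shift
    if [ "$#" -lt 4 ]; then
        echo "RAID 6 needs at least 4 devices, got $#" >&2
        return 1
    fi
    echo "mdadm --create --verbose $array --level=6 --raid-devices=$# $*"
}

raid6_create_cmd /dev/md6 /dev/sdk /dev/sdl /dev/sdm /dev/sdn
# Review the printed line, then run it with sudo to actually build the array.
```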
RAID 10: A Stripe of Mirrors (The Performance King)
RAID 10 (or 1+0) is what you use when you need both speed and redundancy. In its classic nested form it requires an even number of disks, at least four (mdadm’s native raid10 level is more flexible and will even accept an odd number of disks). You create two or more RAID 1 mirrors and then stripe data across those mirrors. The “why” is brilliant: you get RAID 0-style striping speed on top of the robust mirrors of RAID 1, with no parity to calculate (every write still has to land on both halves of a mirror). You can survive multiple drive failures, as long as they don’t knock out an entire mirror. Capacity is 50% of your raw total, just like a two-disk RAID 1.
It’s often the best choice for database servers or application VMs where I/O is critical. The designers got this one right.
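The "as long as they don’t knock out an entire mirror" rule is easy to get wrong in your head, so here is a small sketch of it. It assumes the classic pairing where disks 0 and 1 form the first mirror, disks 2 and 3 the second, and so on. That pairing is an assumption for illustration and does not model every layout md can use. The function name survives is invented for this example.

```shell
# Classic RAID 1+0 model: disks 0,1 form mirror 0; disks 2,3 form mirror 1; etc.
# This pairing is an illustrative assumption, not a model of every md layout.
# survives NDISKS FAILED...  -> exit status 0 if every mirror keeps a live disk
# (list each failed disk index once)
survives() {
    ndisks=$1; shift
    pair=0
    while [ "$pair" -lt $(( ndisks / 2 )) ]; do
        dead=0
        for f in "$@"; do
            [ $(( f / 2 )) -eq "$pair" ] && dead=$(( dead + 1 ))
        done
        [ "$dead" -ge 2 ] && return 1   # both halves of one mirror are gone
        pair=$(( pair + 1 ))
    done
    return 0
}

survives 4 0 2 && echo "disks 0 and 2 failed: array still up"
survives 4 0 1 || echo "disks 0 and 1 failed: array lost"
```

Same number of dead disks, opposite outcomes. Which two disks fail matters as much as how many.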
# Creating a RAID 10 array with four disks.
# --layout=f2 picks md's "far" layout: faster sequential reads, somewhat slower writes. Drop it to get the default near layout (n2).
sudo mdadm --create --verbose /dev/md10 --level=10 --raid-devices=4 --layout=f2 /dev/sdg /dev/sdh /dev/sdi /dev/sdj
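For a quick side-by-side, the capacity figures quoted for each level can be sketched as a tiny calculator. It assumes equal-sized disks, whole-terabyte sizes, and two copies for RAID 10. The function name usable_tb is made up for this example.

```shell
# usable_tb LEVEL NDISKS SIZE_TB -> usable capacity in TB
# Assumes equal-sized disks and whole-number sizes; illustrative helper only.
usable_tb() {
    level=$1; n=$2; size=$3
    case "$level" in
        0)  echo $(( n * size )) ;;          # every disk holds unique data
        1)  echo "$size" ;;                  # every disk is a full copy
        5)  echo $(( (n - 1) * size )) ;;    # one disk's worth of parity
        6)  echo $(( (n - 2) * size )) ;;    # two disks' worth of parity
        10) echo $(( n * size / 2 )) ;;      # two copies of everything
        *)  echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

# The examples used in this section, all with 1TB disks
for spec in "0 2" "1 2" "5 3" "6 4" "10 4"; do
    set -- $spec
    echo "RAID $1, $2 x 1TB: $(usable_tb "$1" "$2" 1)TB usable"
done
```

Note that four disks in RAID 6 and four disks in RAID 10 give the same usable space. The difference is in which failures they survive and how writes perform.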
The key takeaway? There’s no “best” RAID level. There’s only the right tool for the job. Choose based on your need for capacity, performance, and, most importantly, your tolerance for risk. Now go build something.