22.1 LVM Architecture: Physical Volumes, Volume Groups, Logical Volumes

Right, let’s get our hands dirty with LVM. You’ve probably been told it’s like “dynamic disks” or “software RAID,” but that sells it short. LVM is the closest thing you get to a storage Swiss Army knife in Linux, and understanding its architecture is the key to not accidentally stabbing yourself with it. The core idea is simple: abstract your physical storage into a flexible pool so you can carve out logical chunks (volumes) without caring about the underlying disks. It’s a layer of indirection, and as we all know, all problems in computer science can be solved by another layer of indirection. Except the problem of too many layers. But I digress.

The whole system stands on three pillars: Physical Volumes (PVs), Volume Groups (VGs), and Logical Volumes (LVs). Think of it like making a cake. The PVs are your ingredients (eggs, flour), the VG is the batter you mix them all into, and the LVs are the actual slices of cake you serve. You can’t serve a slice directly from the egg, and you can’t add flour to a slice once it’s served. This metaphor is already making me hungry and slightly concerned about data integrity. Let’s break it down properly.

Physical Volumes (PVs): The Raw Ingredients

A Physical Volume is, at its heart, a block device that has been initialized for use by LVM. This is usually a whole disk (/dev/sdb), a partition (/dev/sdb1), or even a loop device or a software RAID array. LVM doesn’t really care. You’re taking something physical and slapping a label on it that says, “Hey LVM, I’m part of your pool now.”

The key tool here is pvcreate. You use it to write an LVM label and a small metadata area onto the device. Think of it as a tiny, well-organized filing cabinet that records what this PV is (its UUID) and, once it joins a Volume Group, which parts of it are used.

# Let's assume we have a fresh, unused disk at /dev/sdb
sudo pvcreate /dev/sdb

# You can verify it worked with pvdisplay
sudo pvdisplay /dev/sdb

A common pitfall here is trying to pvcreate on a device that already has a filesystem on it. LVM will ask you, “Are you absolutely sure? Because this will obliterate any data there.” It’s not kidding. Always, always double-check your device names with lsblk before you run this command. I’ve had to restore from backup for less.
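If you’d rather not rely on nerves alone, a small guard function can do the double-checking for you. The sketch below is a hypothetical helper (check_pv_candidate is an invented name, not an LVM tool) and assumes lsblk from util-linux is installed. It only inspects the device and never calls pvcreate itself.

```shell
# Hypothetical pre-flight check to run before pvcreate.
# Assumes lsblk (util-linux) is available; it only reads, never writes.
check_pv_candidate() {
    dev="$1"

    # pvcreate needs a block device: a disk, partition, loop device, md array...
    if [ ! -b "$dev" ]; then
        echo "refusing: $dev is not a block device" >&2
        return 1
    fi

    # lsblk prints an empty FSTYPE for blank devices. Any text means a known
    # signature (filesystem, swap, existing LVM member...) on the device
    # or on one of its partitions.
    if lsblk -n -o FSTYPE "$dev" | grep -q '[^[:space:]]'; then
        echo "refusing: $dev already carries a signature" >&2
        return 2
    fi

    echo "ok: $dev looks unused"
}
```

A typical use would be check_pv_candidate /dev/sdb && sudo pvcreate /dev/sdb, so pvcreate only runs when the check passes. Treat it as a seatbelt, not a substitute for reading the lsblk output yourself.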

Volume Groups (VGs): The Storage Pool

Once you have one or more Physical Volumes, you mash them together into a Volume Group. This is your storage pool. The VG is the central management unit in LVM; it’s where the total storage capacity of all its member PVs is aggregated. The VG then chops up this pooled space into small, fixed-size chunks called Physical Extents (PEs). Every PV in the VG uses the same PE size (default 4 MiB, which is usually fine). This is why the abstraction works: the VG just sees a bunch of free PEs it can hand out, regardless of which physical disk they’re on.
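To make the extent bookkeeping concrete, here is a little arithmetic sketch. extents_needed is a made-up helper, not an LVM command. It assumes sizes are given in MiB and mirrors the fact that LVM hands out whole extents, rounding a request up to the next PE boundary.

```shell
# Hypothetical helper: how many Physical Extents does a request consume?
# Usage: extents_needed <size_in_MiB> [pe_size_in_MiB]   (PE size defaults to 4)
extents_needed() {
    size_mib="$1"
    pe_mib="${2:-4}"
    # Integer ceiling division: a partly used extent still costs a whole one
    echo $(( (size_mib + pe_mib - 1) / pe_mib ))
}

extents_needed 20480      # a 20 GiB request at the default 4 MiB PE size
extents_needed 10         # 10 MiB does not divide evenly by 4 MiB
extents_needed 20480 16   # the same 20 GiB with 16 MiB extents
```

The three calls print 5120, 3, and 1280. The middle one is the interesting case: a 10 MiB request still eats three full extents, so the volume ends up 12 MiB.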

Creating one is straightforward with vgcreate.

# Create a new Volume Group named 'my_vg' containing the PV /dev/sdb
sudo vgcreate my_vg /dev/sdb

# Now let's add another disk (/dev/sdc) to the pool to make it bigger
sudo vgextend my_vg /dev/sdc

# Show me the money!
sudo vgdisplay my_vg

Why is this brilliant? Need more space? Just vgextend with a new disk. The VG seamlessly grows. You can even retire a PV if you’re careful: first migrate its data onto the remaining disks with pvmove, then drop it from the pool with vgreduce, after which the disk is free to be pulled and reused in another system. That’s a trick you can’t pull with traditional partitions. The best practice is to give your VGs meaningful names. vg00 tells you nothing; db_ssd_vg or media_hdd_vg tells you everything.
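Shrinking a pool is a multi-step affair, so it helps to see the order spelled out. The sketch below is a hypothetical dry-run helper (retire_pv_plan is an invented name) that only prints the commands rather than running them. pvmove, vgreduce, and pvremove are the real LVM tools, and the plan assumes the remaining PVs have enough free extents to absorb the data.

```shell
# Hypothetical dry-run helper: print the steps to retire a PV from a VG.
# Usage: retire_pv_plan <vg_name> <pv_device>
retire_pv_plan() {
    vg="$1"
    pv="$2"
    # 1. Evacuate: relocate every allocated extent to other PVs in the VG
    echo "pvmove $pv"
    # 2. Detach: remove the now-empty PV from the Volume Group
    echo "vgreduce $vg $pv"
    # 3. Clean up: wipe the LVM label so the disk is no longer a PV
    echo "pvremove $pv"
}

retire_pv_plan my_vg /dev/sdb
```

Read the plan, then run each line with sudo, in order. Skipping the pvmove step is how you end up with vgreduce refusing to cooperate because the PV is still in use.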

Logical Volumes (LVs): The Illusion

Here’s the magic trick. From your pool of PEs (the VG), you now create Logical Volumes. These are the block devices you’ll actually format with a filesystem like ext4 or XFS and mount (/home, /var/lib/mysql, etc.). The LV is just a mapping—a clever list that says, “My first 1000 PEs are on /dev/sdb, the next 500 are on /dev/sdc,” and so on.
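That “clever list” can be modeled in a few lines. The toy below is not how LVM stores its metadata. It hard-codes the two segments from the example above (an assumption made purely for illustration) and uses awk to answer which disk holds a given extent of the LV. locate_extent and SEGMENTS are hypothetical names.

```shell
# Toy model of an LV's segment map: one "<extent_count> <device>" per line.
# These values are illustrative only, taken from the example in the text.
SEGMENTS='1000 /dev/sdb
500 /dev/sdc'

# Usage: locate_extent <logical_extent_index>   (0-based)
# Prints "<device> <offset_within_segment>", or fails if out of range.
locate_extent() {
    printf '%s\n' "$SEGMENTS" | awk -v le="$1" '
        {
            # "start" is the first logical extent covered by this segment
            if (le < start + $1) { print $2, le - start; found = 1; exit }
            start += $1
        }
        END { if (!found) exit 1 }
    '
}

locate_extent 0
locate_extent 999
locate_extent 1000
```

The last call prints /dev/sdc 0: logical extent 1000 is the first extent of the second segment. On a real system, sudo lvdisplay -m /dev/my_vg/my_data shows the actual segment map for a volume.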

You create them with lvcreate.

# Create a 20 GiB logical volume named 'my_data' in the 'my_vg' volume group
sudo lvcreate -L 20G -n my_data my_vg

# Now you can see a new block device: /dev/my_vg/my_data
# Format it and mount it like any other disk
sudo mkfs.ext4 /dev/my_vg/my_data
sudo mkdir -p /mnt/data
sudo mount /dev/my_vg/my_data /mnt/data

The beauty is that this LV, /dev/my_vg/my_data, can span both physical disks (LVM decides where the extents land, and a 20 GiB volume may well fit entirely on one of them), but the filesystem and the user just see one contiguous, perfectly normal block device. LVM handles the complexity. You can even create different types of LVs for performance or redundancy (like striped or mirrored LVs), which is where the real power lies, but that’s a topic for another section.

The most important thing to remember is the path. It’s /dev/<vg_name>/<lv_name>. This is a huge win for consistency. LVM finds its PVs by the UUID stored in each label, not by kernel device name, so your database volume is always there, even if the underlying physical disk controller changes and sdb becomes sdc on the next boot. The VG and LV names are constant, which makes your fstab entries much more reliable.
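As a sketch of what that buys you in /etc/fstab, the hypothetical helper below (fstab_line is an invented name) assembles an entry from the VG and LV names. The defaults mount option and the trailing 0 2 fields are common choices assumed here for a non-root data filesystem, not something LVM mandates.

```shell
# Hypothetical helper: build an fstab entry that refers to the LV by name.
# Usage: fstab_line <vg> <lv> <mountpoint> <fstype>
fstab_line() {
    vg="$1"; lv="$2"; mnt="$3"; fstype="$4"
    # Fields: device, mount point, type, options, dump flag, fsck pass
    printf '/dev/%s/%s %s %s defaults 0 2\n' "$vg" "$lv" "$mnt" "$fstype"
}

fstab_line my_vg my_data /mnt/data ext4
```

This prints /dev/my_vg/my_data /mnt/data ext4 defaults 0 2. Notice that no sdb or sdc appears anywhere in the line, which is the whole point.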

So there you have it. PVs are the bricks, the VG is the wall you build with them, and the LVs are the shelves you attach to that wall. It’s a brilliantly simple architecture that unlocks an insane amount of flexibility. Just remember: with great power comes great responsibility. Always know which physical devices a VG lives on before you yank a disk out of a running system. I may have learned that the hard way so you don’t have to.