68.7 Memory-Efficient Data Structures: array vs list
Right, let’s settle the classic array-vs-list debate once and for all. You’re probably thinking, “It’s just a bunch of data, what’s the big deal?” Oh, my sweet summer child. The choice between these two is one of the most fundamental performance decisions you’ll make, and getting it wrong is a fantastic way to turn a snappy application into a bloated, gasping mess. It all boils down to one word: contiguity. I’ll explain.
An array is a contiguous block of memory. Think of it like a row of theater seats, all bolted to the floor in one unbroken line. When you create an array of 100 integers, the runtime finds a single, uninterrupted space in memory big enough to fit all 100 integers and plops them down, one after the other. A list (like a List<T> in C# or an ArrayList in Java), under the hood, is an array that’s playing a complicated game of pretend. It uses an array as its backing store, but it has this clever—and sometimes costly—logic to resize itself when it runs out of space.
The Performance of Access: Why Arrays Are Blistering Fast
This contiguity is a superpower for access. Because all the elements are sitting right next to each other, your CPU can get at any one of them with terrifying efficiency. It’s simple math: the memory address of the fifth element is just the starting address of the array plus (4 * the size of each element). The CPU’s hardware can do this calculation in a single clock cycle. This is called constant-time access, or O(1). It doesn’t matter if the array has 10 elements or 10 million; accessing myArray[9999999] is just as fast as accessing myArray[0].
A list exposes the same indexer syntax (myList[5]), but don’t be fooled—it’s just a thin wrapper around the exact same array access. So, for pure reading, a list is just as fast as an array. The performance divergence isn’t in reading; it’s in writing and growing.
// This access is lightning fast for both array and List<T>
int[] fastArray = new int[1_000_000];
List<int> fastList = new List<int>(1_000_000);
// These two operations are virtually identical in speed.
int elementFromArray = fastArray[867530]; // Jenny, I got your number.
int elementFromList = fastList[867530];
The Cost of Growing: Why Lists Feel “Heavy”
Here’s where the designers made a choice that’s both brilliant and, if you’re not paying attention, a hidden tax. You create a new List<int>(). By default, its capacity is 0. You add the first element. What happens? The list allocates an array with a default capacity (e.g., 4 elements in .NET). You add a fifth element. The list panics—it’s full! So it does something expensive:
- It allocates a new, larger array (usually doubling the current capacity, so from 4 to 8).
- It copies every single element from the old, small array into the new, spacious one.
- It abandons the old array for the garbage collector to clean up.
This “double-and-copy” pattern is actually very smart mathematically—it averages out to constant-time, O(1), for each add operation. But in the real world, that copy operation causes a spike. If you’re adding 100,000 items one-by-one to a list that started empty, you’re triggering this copy about 18 times. Each copy is bigger than the last. The final copy will involve copying 65,536 elements to a new array of 131,072 elements. That’s a lot of work happening silently behind your back.
// This looks innocent but can cause multiple large allocations and copies.
List<int> naiveList = new List<int>();
for (int i = 0; i < 100_000; i++)
{
naiveList.Add(i); // This will cause several internal resizes. Bad.
}
// This is the professional move. You know you need ~100k items, so just say so.
List<int> smartList = new List<int>(100_000);
for (int i = 0; i < 100_000; i++)
{
smartList.Add(i); // No resizes. The internal array is big enough from the start.
}
The best practice is blindingly obvious once you see it: if you know even vaguely how many items you’ll need, initialize your list with a capacity. It’s the single easiest way to avoid a ton of unnecessary garbage and CPU cycles.
So When Should You Use Which?
This isn’t academic. Your choice has real consequences.
Use an array when:
- The size is fixed and known. A chess board is always 8x8. The days of the week are always 7. Use an array.
- You need absolute minimal memory overhead and maximum performance for reads. A
List<T>is a class, so it’s a reference type stored on the heap with its own object header. An array is just the data. For millions of small, fixed-size collections, arrays are more memory-efficient. - You’re working with low-level code, interop, or algorithms that rely on memory layout.
Use a list when:
- The size is dynamic or unknown. This is the 99% case. You’re reading records from a database, processing user input, or building a collection where the final count is a mystery. Lists exist for this reason. Use them.
- You need the flexibility of
AddandRemove. Arrays are fixed-size. Removing an element from the middle of an array is a manual, painful process of shifting every subsequent element left. A list provides methods that abstract this away (though be warned,RemoveAtis still an O(n) operation because it does that same shifting!).
The most common pitfall is using a list badly—letting it resize constantly because you didn’t provide an initial capacity. The second most common is using an array for a truly dynamic collection, forcing you to manually implement the exact same resizing logic that the list provides (but worse, because you’ll probably make mistakes). Know the trade-offs, and you’ll write code that’s not just correct, but elegantly efficient.