37.8 Benchmarking Best Practices and Avoiding Compiler Tricks

Right, let’s get our hands dirty. Benchmarking in Go is deceptively simple, which is precisely why so many people get it subtly, tragically wrong. The testing package gives you just enough rope to hang yourself with, and the compiler—oh, the clever, clever compiler—is actively looking for a reason to snip your code into oblivion. Our job is to outsmart it, to force it to show us the real performance cost, not the cost of a cleverly optimized mirage.

First, the cardinal rule: you must use the result. When the compiler can see that a call’s result is never used and that the call has no side effects (which usually means the function was small enough to be inlined), it considers the work “dead code” and gleefully eliminates it from the final binary. Your beautiful, expensive benchmark loop? Poof. Gone. Vanished into the ether.

// DON'T DO THIS. If ExpensiveFunction is inlined and has no side effects,
// the compiler may vaporize the body of this loop.
func BenchmarkBad(b *testing.B) {
  for i := 0; i < b.N; i++ {
    ExpensiveFunction() // Result not used -> may be optimized away.
  }
}

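// DO THIS. Force the result to be accounted for.
// A package-level sink is visible to other code, so stores to it must happen.
var sink SomeType

func BenchmarkGood(b *testing.B) {
  var result SomeType
  for i := 0; i < b.N; i++ {
    result = ExpensiveFunction() // Result is kept in a local for now.
  }
  sink = result // Publish the result so the compiler can't treat it as dead.
}

Notice the sink = result after the loop? That’s our second line of defense. A bare _ = result would not cut it: the blank identifier throws the value away, so the compiler remains free to conclude that result is dead and delete the work that produced it. A package-level variable is different. Other code could read it, so the final store has to happen, and the call that feeds it has to run. Keeping the local result inside the loop and writing to sink once at the end also keeps a global store out of the timed hot path. This is a convention that works with the current compiler in practice, not a language guarantee. On Go 1.24 and later, writing the loop as for b.Loop() instead of counting to b.N is the more robust option, since the testing package then keeps the call’s arguments and results alive for you.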
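To make the sink pattern concrete, here is a minimal, self-contained sketch that runs outside go test by calling testing.Benchmark directly. The fib function and the sink variable are hypothetical stand-ins for your own ExpensiveFunction and result type, and the timings it prints will differ from machine to machine.

```go
package main

import (
	"fmt"
	"testing"
)

// fib is a hypothetical stand-in for ExpensiveFunction: pure and small,
// which is exactly the kind of call the compiler may inline and drop.
func fib(n int) int {
	a, b := 0, 1
	for i := 0; i < n; i++ {
		a, b = b, a+b
	}
	return a
}

// sink is package-level, so the compiler has to assume other code may read it.
var sink int

// benchmarkFib keeps each result in a local and publishes the last one to sink.
func benchmarkFib(b *testing.B) {
	var result int
	for i := 0; i < b.N; i++ {
		result = fib(30)
	}
	sink = result
}

func main() {
	// testing.Benchmark chooses b.N for us and returns the measured result.
	res := testing.Benchmark(benchmarkFib)
	fmt.Println("fib(30):", res.String())
	fmt.Println("last result:", sink)
}
```

In a real project the same function would live in a _test.go file as BenchmarkFib and be run with go test -bench; the standalone form is only there so the sketch can be run on its own.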

The `runtime.GC()` Pitfall

Here’s a classic gotcha that feels logical but is actually a performance lie. Let’s say you’re benchmarking a function that does a lot of allocation. You think, “I should run the garbage collector before each iteration to get a consistent, clean-slate timing.” It sounds responsible. It is also completely wrong.

// DON'T DO THIS. You're benchmarking GC, not your function.
func BenchmarkWithGC(b *testing.B) {
  for i := 0; i < b.N; i++ {
    runtime.GC()              // You're now timing GC + your function
    ExpensiveAllocatingFunction()
  }
}

// The benchmarking system already handles this for you.
func BenchmarkCorrect(b *testing.B) {
  for i := 0; i < b.N; i++ {
    ExpensiveAllocatingFunction()
  }
}

By calling runtime.GC() yourself, you’re baking the cost of a full garbage collection into every single iteration of your benchmark. The testing framework is already smarter than you here; it calls runtime.GC() before each timed run of your benchmark function (once per b.N attempt, outside the timed region) so that garbage left over from earlier runs doesn’t pollute the next one. Forcing a collection inside the loop just gives you a much slower number that says more about the collector than about your code. Trust the framework.
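One rough way to see the distortion is to run both versions side by side and compare ns/op. The sketch below does that with testing.Benchmark. Here makeBuffers is a hypothetical stand-in for ExpensiveAllocatingFunction, bufSink is an illustrative name, and the exact numbers will vary with your machine and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"testing"
)

// makeBuffers is a hypothetical allocating workload: n slices of 1 KiB each.
func makeBuffers(n int) [][]byte {
	out := make([][]byte, n)
	for i := range out {
		out[i] = make([]byte, 1024)
	}
	return out
}

// bufSink keeps the allocations observable so they are not optimized away.
var bufSink [][]byte

// withForcedGC pays for a full collection on every iteration.
func withForcedGC(b *testing.B) {
	for i := 0; i < b.N; i++ {
		runtime.GC()
		bufSink = makeBuffers(64)
	}
}

// withoutForcedGC leaves collection to the runtime, as in normal operation.
func withoutForcedGC(b *testing.B) {
	for i := 0; i < b.N; i++ {
		bufSink = makeBuffers(64)
	}
}

func main() {
	forced := testing.Benchmark(withForcedGC)
	natural := testing.Benchmark(withoutForcedGC)
	fmt.Println("with runtime.GC():   ", forced.NsPerOp(), "ns/op")
	fmt.Println("without runtime.GC():", natural.NsPerOp(), "ns/op")
}
```

The second number still includes whatever garbage collection the runtime triggers on its own, which is the cost your function imposes in production anyway.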

Fighting for Loop Invariants

The compiler is lazy in the best way possible. When it can prove that a calculation inside a loop never changes (a “loop invariant”), it is free to do the work just once, or to fold it into a constant at compile time once the call has been inlined. This is fantastic for production code and a nightmare for micro-benchmarks where you want to measure the cost of that repeated operation.

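// DON'T DO THIS. The compiler may hoist or fold the constant work.
func BenchmarkInvariantHoist(b *testing.B) {
  for i := 0; i < b.N; i++ {
    // 'x' is the same for every loop iteration. If the call is inlined,
    // the compiler may compute crypt.ExpensiveConstant() ONCE, or fold it away.
    x := crypt.ExpensiveConstant()
    use(x)
  }
}

// DO THIS. Call through a package-level function variable.
// Any code in the package could reassign expensiveConstant, so the compiler
// can't assume which function runs or that its result is constant.
var expensiveConstant = crypt.ExpensiveConstant

func BenchmarkInvariantCorrect(b *testing.B) {
  for i := 0; i < b.N; i++ {
    x := expensiveConstant() // Indirect call through a package-level variable.
    use(x)
  }
}

The trick is to introduce a level of indirection the compiler can’t statically analyze. A local variable isn’t enough: if f := crypt.ExpensiveConstant is assigned once and never changed, the compiler can see straight through it and inline the call as if you had written it directly. A package-level function variable is a much better black box, because the compiler generally won’t assume what it points to at the call site, so it has to make the call on each iteration. Victory, with one caveat: the benchmark now also pays for the indirect call, a small fixed cost that matters only when the function itself is tiny. When the real function takes inputs, varying them from one iteration to the next is often the more honest fix.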
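As a minimal sketch of keeping a call opaque, the example below combines a package-level function variable with inputs that change on every iteration. The names checksum, checksumFn, and hashSink are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// checksum is a hypothetical stand-in for the function under test.
func checksum(data []byte) uint32 {
	var h uint32
	for _, c := range data {
		h = h*31 + uint32(c)
	}
	return h
}

// checksumFn is a package-level function variable. Because it could be
// reassigned elsewhere, the compiler generally treats calls through it
// as opaque rather than inlining them.
var checksumFn = checksum

// hashSink receives the accumulated result so the calls are not dead code.
var hashSink uint32

func benchmarkChecksum(b *testing.B) {
	// Prepare a few different inputs up front so consecutive iterations
	// do not repeat the same work on the same data.
	inputs := [][]byte{
		[]byte("alpha"), []byte("bravo"), []byte("charlie"), []byte("delta"),
	}
	var h uint32
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		h += checksumFn(inputs[i%len(inputs)])
	}
	hashSink = h
}

func main() {
	fmt.Println(testing.Benchmark(benchmarkChecksum))
}
```

Accumulating into h and publishing it once at the end makes every call contribute to a value the compiler must produce, without putting a global store inside the timed loop.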

Reset and Stop Timers for Complex Setup

Not all benchmarks are a single function call in a vacuum. Sometimes you have extremely expensive setup—like populating a massive data structure—that you need to do once, but you don’t want that setup time polluting your benchmark results.

var lookupSink string // Package-level sink so the lookups can't be discarded.

func BenchmarkWithExpensiveSetup(b *testing.B) {
  giantMap := make(map[int]string, 10_000_000)
  // ... expensive population logic that takes seconds ...
  b.ResetTimer() // Zeroes elapsed time and allocation counters, discarding setup cost.

  var v string
  for i := 0; i < b.N; i++ {
    // Now we benchmark just the operation we care about.
    v = giantMap[i%10_000_000]
  }
  lookupSink = v
}

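For even more control, b.StopTimer() and b.StartTimer() let you pause and resume the benchmark clock. Use this if you have to do periodic cleanup inside your loop that isn’t part of the code under test. It’s a bit clunky, and the calls themselves are not free, so wrapping them around a body that takes only a few nanoseconds can make the run crawl and the numbers noisy. For work that is large compared with that overhead, though, it’s essential for getting clean measurements on the specific hot path you’re trying to optimize.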
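A typical case for pausing the clock is an operation that destroys its own input, such as an in-place sort. The sketch below restores the unsorted data with the timer stopped so that only sort.Ints is measured. The scrambled helper is a hypothetical input generator, and the input size is an arbitrary choice meant to keep the sort much more expensive than the timer calls.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// scrambled returns a deterministic permutation of 0..n-1. It is a
// hypothetical helper; it only yields a permutation when n is not a
// multiple of the prime 7919.
func scrambled(n int) []int {
	out := make([]int, n)
	for i := range out {
		out[i] = (i * 7919) % n
	}
	return out
}

func benchmarkSort(b *testing.B) {
	src := scrambled(10_000)
	work := make([]int, len(src))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Pause: restoring the unsorted input is not the code under test.
		b.StopTimer()
		copy(work, src)
		// Resume: only the sort below is timed.
		b.StartTimer()
		sort.Ints(work)
	}
}

func main() {
	fmt.Println("sort.Ints:", testing.Benchmark(benchmarkSort))
}
```

If the copy were cheap enough to ignore, leaving it inside the timed region would be simpler and would avoid the pause and resume overhead entirely.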

The takeaway? Benchmarking is a game of intellectual honesty against a hyper-intelligent opponent: the Go compiler. Your job is to structure your code so it can’t cheat, giving you the real, ugly, glorious truth about your code’s performance. Now go run go test -bench=. -benchmem and see what you’ve really built.
