Right, let’s pull back the curtain on the race detector’s main act: happens-before tracking. Forget what you think that term means from philosophy class; here, it’s a brutally precise, logical mechanism for reconstructing order from the chaos of concurrent execution. The core problem it solves is this: when two goroutines access the same variable and at least one of them writes, how can a tool possibly know whether one access was supposed to happen before the other? The answer is, it can’t read your mind. But it can read the explicit synchronization points you did use, and it builds a partial ordering of events from them.

Think of it like a detective at a crime scene with a scattered timeline. They don’t know the exact order of every event, but if a witness says, “I heard the scream after the clock struck midnight,” that establishes a “happens-before” relationship. The clock striking midnight happened before the scream. The race detector is that detective, and your synchronization primitives—channels, mutexes, sync.WaitGroup, etc.—are its witnesses.

The Synchronization Witnesses

The Go race detector instruments your code at build time. It doesn’t just watch memory accesses; it specifically watches for these synchronization events. Every time you perform an operation like:

  • Send on a channel
  • Receive from a channel
  • Lock a mutex (sync.Mutex, sync.RWMutex)
  • Unlock a mutex
  • Wait on a sync.WaitGroup
  • Call Done on a sync.WaitGroup (the call that unblocks a waiter)
  • Start a goroutine with a go statement
  • Perform a sync/atomic load or store

…the detector logs it. These events are the bedrock it uses to construct its timeline.
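
To see one of these witnesses in isolation, here is a minimal sketch of a channel acting as the ordering point. The helper names produce and readAfterSignal are made up for illustration; the only guarantee relied on is that a send on a channel is ordered before the completion of the corresponding receive.

```go
package main

import "fmt"

// produce writes to a plain variable, then sends on a channel.
// The send is the synchronization event that publishes the write.
func produce(data *int, ready chan<- struct{}) {
	*data = 42          // plain write, no lock
	ready <- struct{}{} // send: ordered before the matching receive completes
}

// readAfterSignal (hypothetical helper) starts the producer, waits for
// the signal, and only then reads the shared variable.
func readAfterSignal() int {
	var data int
	ready := make(chan struct{})
	go produce(&data, ready)
	<-ready     // receive: everything before the send is now ordered before us
	return data // read is ordered after the write, so there is no race
}

func main() {
	fmt.Println(readAfterSignal()) // prints 42
}
```

Move the return above the receive and the chain breaks: the read is no longer ordered after the write, and a -race build should flag it.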

Building the Happens-Before Graph

For any two accesses to the same memory location, the detector asks whether a happens-before relationship connects them. Let’s make this concrete with the classic example, but we’ll do it wrong first so the detector can show off.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	var data int

	wg.Add(1)
	go func() {
		defer wg.Done()
		data++ // Access 1: Write
	}()

	fmt.Println("The value is", data) // Access 2: Read
	wg.Wait()
}

This is a textbook data race. The read and the write are not ordered. Running this with -race (for example, go run -race main.go) produces a WARNING: DATA RACE report. But why? Let’s trace the logic:

  1. The main goroutine calls wg.Add(1). This is a sync event.
  2. It launches a new goroutine. The go statement synchronizes-with the start of the goroutine it creates. So, everything in main before the go statement happens-before the first instruction in the new goroutine.
  3. The new goroutine executes data++ (a write).
  4. Meanwhile, back in main, we execute fmt.Println(...) which reads data.
  5. Finally, main calls wg.Wait(). This is another sync event.

Here’s the critical insight: There is no synchronizing event between the write in the goroutine and the read in main. The wg.Wait() in main comes after the read, and the wg.Done() in the goroutine comes after the write. Done() and Wait() do establish a relationship with each other, but it only orders the write before whatever follows Wait(). The read sits before Wait(), so no chain of edges connects it to the write in either direction. The read and write are concurrent, and the detector screams.
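
One way to picture that bookkeeping is with vector clocks. The sketch below is a toy model under stated assumptions, not the detector’s real data structures: the VC type and the racy function are invented for illustration, and the goroutine IDs 0 and 1 are arbitrary. It replays the trace above twice, once with the read before Wait() and once with it after, and asks whether the write and the read are ordered.

```go
package main

import "fmt"

// VC is a toy vector clock: one logical counter per goroutine ID.
type VC map[int]int

// copyOf returns an independent snapshot of the clock.
func (v VC) copyOf() VC {
	c := VC{}
	for k, n := range v {
		c[k] = n
	}
	return c
}

// join merges another clock into v, taking the element-wise maximum.
// This models "acquiring" what another goroutine released.
func (v VC) join(o VC) {
	for k, n := range o {
		if n > v[k] {
			v[k] = n
		}
	}
}

// happensBefore reports whether stamp a is ordered before stamp b:
// a <= b in every component, and the two differ somewhere.
func happensBefore(a, b VC) bool {
	for k, n := range a {
		if n > b[k] {
			return false
		}
	}
	for k, n := range b {
		if n > a[k] {
			return true
		}
	}
	return false // identical stamps: same event
}

// racy replays the WaitGroup example and reports whether the write and
// the read are concurrent (ordered in neither direction).
func racy(readAfterWait bool) bool {
	const mainG, worker = 0, 1
	mainClock := VC{mainG: 1}

	// go statement: the child inherits everything main has done so far.
	workerClock := mainClock.copyOf()
	workerClock[worker]++
	mainClock[mainG]++

	write := workerClock.copyOf() // data++ in the worker
	workerClock[worker]++
	released := workerClock.copyOf() // wg.Done() releases the worker's clock

	if readAfterWait {
		mainClock.join(released) // wg.Wait() acquires what Done() released
		mainClock[mainG]++
	}
	read := mainClock.copyOf() // fmt.Println(data) in main

	return !happensBefore(write, read) && !happensBefore(read, write)
}

func main() {
	fmt.Println("read before Wait races:", racy(false)) // true
	fmt.Println("read after Wait races:", racy(true))   // false
}
```

The only difference between the two runs is the join: without it, main’s clock never learns about the worker’s write, so neither stamp dominates the other.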

Now, let’s fix it. The correct way is to move the read to after the synchronization point that guarantees the write is complete.

func main() {
	var wg sync.WaitGroup
	var data int

	wg.Add(1)
	go func() {
		defer wg.Done()
		data++ // Access 1: Write
	}()

	wg.Wait()           // This WAITS for the goroutine's Done() call.
	fmt.Println(data) // Access 2: Read. Now, the write happens-before this read.
}

The detective now has its witness statement: “The wg.Done() call happened-before the wg.Wait() returned.” Therefore, the write (data++), which happened before wg.Done(), also happens-before anything that comes after wg.Wait(), like our fmt.Println read. The partial order is established. No race.

What It Doesn’t See (The Pitfalls)

This is where developers get tripped up. The race detector only knows what you tell it through synchronization primitives it recognizes. If you roll your own signalling out of plain variables, the detector is blind to it and sees only unordered accesses.

Imagine you use an atomic to signal completion. Atomics are synchronization operations themselves, so this works:

package main

import (
	"fmt"
	"sync/atomic"
)

var data int
var completed atomic.Bool // atomic.Bool requires Go 1.19 or later

func main() {
	go func() {
		data = 42
		completed.Store(true) // Atomic write as sync event
	}()

	for !completed.Load() { // Atomic read as sync event
		// wait
	}
	fmt.Println(data) // This read is now ordered after the atomic write.
}

The atomic operations are recognized, so the detector sees: the write to data happens-before the Store, which synchronizes-with the Load that observes true, which happens-before the read of data. Safe.

Now let’s get clever. You swap the atomic for a plain boolean guarded by a mutex and poll it in a loop. (Calling this “lock-free” would be wrong, since it takes a lock on every iteration, and polling is a poor design, but let’s go with it.)

package main

import (
	"fmt"
	"sync"
)

var data int
var completed bool
var mu sync.Mutex

func main() {
	go func() {
		mu.Lock()
		data = 42
		completed = true // Writing 'completed' inside the critical section
		mu.Unlock()
	}()

	for {
		mu.Lock()       // Acquiring the mutex here...
		if completed {  // ...means this read of 'completed' is safe.
			mu.Unlock()
			break
		}
		mu.Unlock()
	}
	fmt.Println(data) // But what orders the write to 'data' and this read?
}

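This code is clumsy (it polls), but it is race-free, and the race detector will not report a race on data. It is tempting to think the mutex only “protects” completed, so the read of data outside the critical section must be unordered. But mutexes don’t protect variables; they order events. The chain is:

  1. In the goroutine, the write to data happens-before the write to completed, which happens-before mu.Unlock(). That is plain program order within one goroutine.
  2. That mu.Unlock() synchronizes-with the later mu.Lock() in main, the one after which main sees completed == true.
  3. In main, that mu.Lock() happens-before the read of completed, which happens-before the read of data in fmt.Println.

Happens-before is transitive, so the write to data happens-before the read of data, even though the read sits outside the critical section. Everything a goroutine did before releasing a lock is ordered before whatever the next acquirer does afterwards, not just the accesses to the variable you had in mind.

The real pitfall is dropping the witness altogether. Make completed a plain bool, spin on it with no mutex and no atomic, and the detector has nothing to go on: the accesses to completed are themselves a race, nothing orders the write to data before the read, and the loop is not even guaranteed to terminate. When the detector flags your hand-rolled signalling, it isn’t being pedantic; it’s pointing out a genuine flaw in your reasoning. It’s your brilliant, hyper-logical friend saying, “You can’t assume that.” And when it does report a race, take it seriously.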
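
Whichever flag you pick, polling burns CPU while it waits. As a hedged alternative, here is a minimal sketch of the same completion signal expressed as a channel close. The helper name waitForResult is made up for illustration; the guarantee relied on is that closing a channel is ordered before a receive that returns because the channel is closed.

```go
package main

import "fmt"

// waitForResult (hypothetical helper) shows the "completed" signal as a
// channel close instead of a polled flag.
func waitForResult() int {
	var data int
	done := make(chan struct{})

	go func() {
		data = 42   // plain write
		close(done) // close: ordered before a receive that observes it
	}()

	<-done      // blocks without spinning until the goroutine closes the channel
	return data // ordered after the write via close -> receive
}

func main() {
	fmt.Println(waitForResult()) // prints 42
}
```

The detector recognizes the close and the receive as synchronization events, so the read of data is ordered after the write without any flag variable at all.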