What Is a Data Race vs a Race Condition in Go

The counter that lies to you

You write a web server that tracks how many times a user clicks a button. You run it on your laptop, click the button ten times, and the counter prints ten. You deploy to production, traffic spikes, and the counter prints four. The code did not change. The timing did. This is the moment concurrency stops being a feature and starts being a trap.

The bug is not in your logic. It is in the invisible choreography of the CPU. Two goroutines touched the same memory at the same time, and the result depends on which one the scheduler picked to run first. In Go, this falls into two buckets. A race condition is any bug where the outcome depends on timing. A data race is a specific violation of Go's memory model where two goroutines access the same memory concurrently, at least one access is a write, and nothing synchronizes them.

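Go's memory model expects programs to be free of data races. A program that has one loses most of the guarantees the language otherwise gives you. The compiler assumes data races do not exist, and it uses that assumption to reorder instructions and optimize memory access. A racy read of a single machine word will at least observe some value that was actually written, but races on multiword values such as interfaces, slices, strings, and maps can corrupt memory outright. Because these optimizations do not depend on parallel hardware, the bug can appear even on a single-core machine. The result is silent corruption, crashes, or counters that come up short.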

Race condition versus data race

A race condition is the broad category. It describes any situation where the program's correctness depends on the relative timing of events. A data race is a narrower, precisely defined case. It describes concurrent access to the same memory, with at least one write, and no synchronization between the accesses.

Think of a shared notebook in a busy office. A race condition happens when two people try to write on the same page at the same time. One person's notes get overwritten, or the page ends up with half-finished sentences. The outcome depends on who grabbed the pen first.

A data race is what happens if the notebook itself is broken. The paper allows two pens to write to the exact same millimeter simultaneously, and the ink mixes into an unreadable blob. The structure of the data is corrupted. In Go, maps and slices are like that broken notebook. Concurrent reads alone are fine, but as soon as one goroutine writes while another reads or writes without protection, you have a data race. For maps, the runtime makes a best-effort check and often aborts the program when it notices, but not always. For slices there is no such check.

The race detector finds data races. It does not find logical race conditions where the synchronization is technically correct but the algorithm is flawed. You can have a race condition without a data race if you use channels or mutexes correctly but still have a check-then-act bug. The race detector is a tool for memory safety, not algorithm verification.

The compiler's dangerous assumption

Here is the part that catches everyone off guard. The Go compiler optimizes your code under the assumption that data races never happen. It reorders memory operations to improve performance. It caches values in registers. It eliminates redundant loads.

If you have a data race, the compiler's reordering can expose the bug in ways that defy intuition. A variable might appear to change value without any write happening. A loop might run forever because a flag update is never visible to another goroutine. The hardware cache coherency protocol usually keeps things in sync, but the compiler is allowed to break that sync if it thinks no race exists.

This is why data races are insidious. They do not just cause lost updates. They can break the fundamental guarantees of memory visibility. The race detector instruments your code to track every memory access. It also records synchronization events such as lock operations and channel sends, and it reports any pair of conflicting accesses that those events do not order. Running with the race detector often changes the timing enough to hide or reveal bugs, but its primary job is to report violations of the memory model.
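The visibility problem described above can be sketched with a flag that one goroutine polls and another sets. This is a minimal illustration, not a recommended signaling pattern: the names waitForFlag, ready, and payload are invented for the example, and a channel would usually be the simpler choice. It relies on the memory model's rule that an atomic load which observes an atomic store is ordered after it.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// waitForFlag publishes payload to a second goroutine through an atomic flag.
// All names here are illustrative only.
func waitForFlag() int {
	var ready int32 // accessed only through sync/atomic
	var payload int // plain variable, published by the atomic store below
	done := make(chan int)

	go func() {
		// An atomic load cannot be hoisted out of the loop or cached in a
		// register, so the loop eventually observes the store.
		for atomic.LoadInt32(&ready) == 0 {
			runtime.Gosched() // yield instead of burning a core
		}
		// The load that observed 1 is ordered after the store, so the
		// earlier write to payload is visible here.
		done <- payload
	}()

	payload = 42
	atomic.StoreInt32(&ready, 1)
	return <-done
}

func main() {
	fmt.Println(waitForFlag()) // prints 42
}
```

If ready were a plain int32 read and written without sync/atomic, the loop would be a data race, and the compiler would be free to read the flag once and spin on the stale value.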

Minimal example: the lost update

Here is the simplest data race. Two goroutines increment a shared integer without protection. The code looks safe. The result is wrong.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var count int
	var wg sync.WaitGroup

	// Register each goroutine with the wait group so main waits for both.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		// Launch a goroutine that increments the shared counter.
		go func() {
			defer wg.Done()
			// count++ is three steps: read, add, write.
			// The scheduler can pause between read and write, causing a lost update.
			count++
		}()
	}

	// Block until both goroutines finish.
	wg.Wait()
	fmt.Println(count)
}

Run this code normally. You might see 2. You might see 1. The result is non-deterministic. The CPU executes count++ as a load, an add, and a store. Goroutine A loads 0. Goroutine B loads 0. Goroutine A stores 1. Goroutine B stores 1. The final value is 1. One increment vanished.

Run the code with the race detector.

go run -race main.go

The detector reports the violation immediately. The output includes WARNING: DATA RACE and shows the stack traces of the conflicting accesses. It points to the exact line where the read and write collide. The detector does not stop the program. It logs the error and continues execution. This allows you to see how far the corruption propagates.

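The race detector adds significant overhead, because it instruments every memory access. The Go documentation cites a typical cost of 5-10x more memory and 2-20x more execution time. Do not make it your default production build. Use it during development and in your test suite. It only reports races on code paths that actually execute, so it is only as good as the workload you run under it.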
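One way to repair the lost update above is to guard the increment with a sync.Mutex. The sketch below is an assumption about how you might restructure the example, not the only fix: countWithMutex is a hypothetical helper that runs n goroutines instead of two so the effect is easier to see.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex starts n goroutines that each increment a shared counter once.
// The mutex makes the read-add-write sequence a single critical section.
func countWithMutex(n int) int {
	var (
		mu    sync.Mutex
		count int
		wg    sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	// Wait returns only after every Done, so reading count here is safe.
	wg.Wait()
	return count
}

func main() {
	fmt.Println(countWithMutex(1000)) // prints 1000
}
```

Running this version with go run -race should produce no warning, because every access to count is ordered by the mutex or by the wait group.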

Realistic example: the map panic

Real Go code rarely uses bare integers for shared state. It uses maps. Maps are the most common source of data races in production. Maps in Go are not safe for concurrent use. The runtime tracks map state internally. If two goroutines modify the map structure at the same time, the internal pointers get corrupted.

Here is a web handler that counts requests by path. It looks reasonable. It crashes under load.

package main

import (
	"fmt"
	"log"
	"net/http"
)

// cache holds request counts by path.
// Maps are not safe for concurrent use in Go.
var cache = make(map[string]int)

// handler increments the count for the requested path.
// net/http runs it in a separate goroutine per request.
func handler(w http.ResponseWriter, r *http.Request) {
	// BUG: unsynchronized concurrent writes to the map are a data race.
	// When the runtime notices, it aborts the process with a fatal error.
	cache[r.URL.Path]++
	fmt.Fprintf(w, "Count: %d", cache[r.URL.Path])
}

func main() {
	// Start the server.
	// Multiple requests can hit handler simultaneously.
	http.HandleFunc("/", handler)
	fmt.Println("Server starting on :8080")
	// ListenAndServe only returns on failure, so report the error.
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Send enough requests at the same time and the runtime aborts with fatal error: concurrent map writes. This is a hard crash, not an ordinary panic. recover cannot catch it, and the process terminates. The check is the runtime's safety net. It is best-effort, so it stops some silent corruption of the map structure but does not catch every race. The race detector can report the race before the crash if you run the server with -race, but the crash is the consequence you will see in production.

The fix is synchronization. You need a mutex to protect the map. Or you need to use channels to serialize access. Or you need to use sync.Map for specific workloads. The choice depends on your access pattern.
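A minimal sketch of the mutex option follows. The pathCounter type and the hitConcurrently helper are hypothetical names introduced for illustration. To keep the example finite and runnable, it drives the handler with in-memory requests from net/http/httptest rather than a listening server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// pathCounter is a hypothetical type: a map guarded by a mutex.
type pathCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

// Inc increments the count for path and returns the new value.
// Returning the value under the same lock avoids a second, racy lookup.
func (c *pathCounter) Inc(path string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[path]++
	return c.counts[path]
}

// hitConcurrently sends n simultaneous in-memory requests for "/" to a
// handler backed by a fresh counter and returns the final count.
func hitConcurrently(n int) int {
	c := &pathCounter{counts: make(map[string]int)}
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Count: %d", c.Inc(r.URL.Path))
	})

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			req := httptest.NewRequest(http.MethodGet, "/", nil)
			handler.ServeHTTP(httptest.NewRecorder(), req)
		}()
	}
	wg.Wait()

	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts["/"]
}

func main() {
	fmt.Println(hitConcurrently(100)) // prints 100
}
```

The lock is held only for the map update, not for writing the response, which keeps the critical section short.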

Logical races versus data races

The race detector finds data races. It does not find logical races. A logical race occurs when the synchronization is correct, but the algorithm has a timing dependency.

Consider a check-then-act pattern. You check a condition, then act on it. If the condition can change between the check and the act, you have a logical race.

import "sync"

var mu sync.Mutex
var value int

func update() {
	mu.Lock()
	defer mu.Unlock()

	// Check and act inside one critical section.
	// No other goroutine can change value between the check and the write,
	// because the lock is held across both steps.
	if value == 0 {
		value = 1
	}
}

If you hold the lock across both the check and the act, the code is safe. The race detector will not complain. The logic is correct.

Now consider a broken version.

func brokenUpdate() {
	// Check under the lock, but release it before acting.
	mu.Lock()
	current := value
	mu.Unlock()

	if current == 0 {
		// Act under the lock, using a check that may already be stale.
		mu.Lock()
		value = 1
		mu.Unlock()
	}
}

This code has no data race. The race detector stays silent. The reads and writes are protected by the mutex. But the logic is racy. Another goroutine can change value between the unlock and the second lock. The check is stale. This is a logical race condition. The race detector cannot find this. You need careful reasoning or formal verification to catch logical races.
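To make the stale check visible, the sketch below forces the bad interleaving on purpose. Everything in it is an illustrative assumption: the account type, the withdraw methods, and the barrier that makes both goroutines finish their check before either acts. Real code would not have the barrier, which is exactly why the bug shows up only occasionally.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical type used only for this illustration.
type account struct {
	mu      sync.Mutex
	balance int
}

// brokenWithdraw checks and acts in two separate critical sections.
// barrier runs between them to force the bad interleaving every time.
func (a *account) brokenWithdraw(amount int, barrier func()) {
	a.mu.Lock()
	ok := a.balance >= amount
	a.mu.Unlock()

	barrier() // both goroutines finish the check before either acts

	if ok {
		a.mu.Lock()
		a.balance -= amount
		a.mu.Unlock()
	}
}

// safeWithdraw holds the lock across both the check and the act.
func (a *account) safeWithdraw(amount int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.balance >= amount {
		a.balance -= amount
	}
}

// runTwo starts two goroutines that each try to withdraw from an account
// holding 100, then returns the final balance.
func runTwo(withdraw func(a *account, barrier func())) int {
	a := &account{balance: 100}
	var checked, done sync.WaitGroup
	checked.Add(2)
	done.Add(2)
	barrier := func() {
		checked.Done()
		checked.Wait()
	}
	for i := 0; i < 2; i++ {
		go func() {
			defer done.Done()
			withdraw(a, barrier)
		}()
	}
	done.Wait()
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance
}

func main() {
	broken := runTwo(func(a *account, barrier func()) { a.brokenWithdraw(100, barrier) })
	safe := runTwo(func(a *account, _ func()) { a.safeWithdraw(100) })
	fmt.Println(broken, safe) // prints -100 0
}
```

Both versions are free of data races, so -race reports nothing for either. Only the broken one overdraws the account.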

Pitfalls and hidden traps

Data races hide in places that look safe. Here are the common traps.

The len function on a map is not safe for concurrent access. If one goroutine writes to the map and another calls len, you have a data race. The race detector catches this. The runtime's own best-effort check may not, so do not count on a crash to warn you.

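Range loops over maps are not safe either. Iterating a map while another goroutine writes to it is a data race, and the runtime may abort with fatal error: concurrent map iteration and map write. The randomized iteration order has nothing to do with it. The problem is the unsynchronized write.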

Channels synchronize, but they do not protect shared memory. If you send a pointer over a channel, the receiver and sender both hold the pointer. If they both access the pointed-to value afterward without synchronization, and at least one of them writes, you have a data race. The channel only synchronizes the transfer of the pointer, not the usage of the data. Treat the send as a handover: once the pointer is sent, the sender stops touching the value.
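The handover discipline can be sketched as follows. The job type and the process function are hypothetical. The point is only that the sender does not read or write the value between sending the pointer and receiving it back.

```go
package main

import "fmt"

// job is a hypothetical unit of work passed by pointer.
type job struct {
	id     int
	result string
}

// process hands a job to a worker goroutine and waits for it to come back.
func process(id int) string {
	jobs := make(chan *job)
	done := make(chan *job)

	go func() {
		for j := range jobs {
			// The worker owns j from the receive until it sends it back.
			j.result = fmt.Sprintf("job %d done", j.id)
			done <- j
		}
	}()

	j := &job{id: id}
	jobs <- j // ownership passes to the worker here
	// Do not touch j at this point: the worker may be writing to it.
	close(jobs)

	finished := <-done // ownership returns with the pointer
	return finished.result
}

func main() {
	fmt.Println(process(7)) // prints job 7 done
}
```

Reading j.result before receiving from done would be a data race, even though every transfer went through a channel.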

Atomic operations are safe, but they are limited. atomic.AddInt64 is atomic. It does not have a data race. But an atomic.LoadInt64 in one goroutine paired with a plain, non-atomic write in another is a data race. You must use atomic operations for every access, or use a mutex. Mixing atomic and non-atomic access to the same variable is a data race.
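A small sketch of the all-or-nothing rule, assuming a hypothetical countAtomic helper: every access to the counter, including the final read, goes through sync/atomic.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomic starts n goroutines that each add 1 to a shared counter.
// The counter is touched only through sync/atomic functions.
func countAtomic(n int) int64 {
	var count int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&count, 1)
		}()
	}
	wg.Wait()
	// Keep the read atomic too, so no access to count is ever plain.
	return atomic.LoadInt64(&count)
}

func main() {
	fmt.Println(countAtomic(1000)) // prints 1000
}
```

On Go 1.19 and later, the atomic.Int64 type offers the same operations as methods and makes it harder to mix in a plain access by accident.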

Goroutine leaks can mask races. If a goroutine blocks on a channel send and the receiver is gone, the goroutine leaks. The leak prevents cleanup. The leaked goroutine might hold a lock or reference to shared state. This can cause deadlocks or memory exhaustion. Always ensure channels have a cancellation path. Use context.Context to signal shutdown.

Convention asides

The Go community has strong conventions around synchronization. Follow them to keep your code readable and safe.

Mutexes should be named mu. This is the conventional field name for a mutex, usually declared directly above the fields it guards: type Counter struct { mu sync.Mutex; count int }. The name mu signals that this is a synchronization primitive.

Always pair Lock with Unlock. Use defer mu.Unlock() immediately after mu.Lock(). This ensures the lock is released even if the function panics. Do not hold locks across I/O calls. Locks should protect critical sections, not entire request handlers.

Use sync.RWMutex when reads are frequent and writes are rare. RWMutex allows multiple readers but only one writer. Go's implementation blocks new readers once a writer is waiting, so writers are not starved, but the extra bookkeeping makes it slower than a plain mutex when writes are common. Stick to sync.Mutex unless profiling shows a read-heavy bottleneck.
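For a read-heavy case, a sketch might look like the following. The settings type and the readMany helper are illustrative assumptions. Get takes the shared read lock, Set takes the exclusive write lock.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// settings is a hypothetical read-mostly store guarded by an RWMutex.
type settings struct {
	mu     sync.RWMutex
	values map[string]string
}

// Get takes the read lock, so many readers can run at once.
func (s *settings) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.values[key]
	return v, ok
}

// Set takes the write lock, which excludes readers and other writers.
func (s *settings) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.values[key] = value
}

// readMany runs n concurrent readers alongside one writer that stores the
// same value again, and returns how many readers saw the expected value.
func readMany(n int) int {
	s := &settings{values: make(map[string]string)}
	s.Set("mode", "fast")

	var wg sync.WaitGroup
	var hits int64

	wg.Add(1)
	go func() {
		defer wg.Done()
		s.Set("mode", "fast")
	}()

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if v, ok := s.Get("mode"); ok && v == "fast" {
				atomic.AddInt64(&hits, 1)
			}
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt64(&hits))
}

func main() {
	fmt.Println(readMany(50)) // prints 50
}
```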

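Accept interfaces, return structs. This mantra applies to synchronization too. Pass a sync.Locker interface if you need to abstract the locking mechanism. Return concrete types like *sync.Mutex only when necessary. Whichever you choose, never copy a mutex after its first use. Pass it by pointer or keep it inside a struct that is itself passed by pointer, and let go vet flag accidental copies.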

The race detector is part of the standard toolchain. It requires no extra installation. Run go test -race ./... in your CI pipeline. Treat race detector warnings as build failures. A data race is a bug. Fix it before merging.

Decision matrix

Choose the right tool for your concurrency pattern. Each option has trade-offs.

Use a sync.Mutex when multiple goroutines need exclusive access to a shared variable or map. The mutex serializes access. Only one goroutine can hold the lock at a time. This is the safest and most common solution.

Use a sync.RWMutex when reads are frequent and writes are rare. The read lock allows concurrent reads. The write lock excludes all other access. This improves throughput for read-heavy workloads but adds complexity.

Use channels when goroutines need to communicate data rather than just sharing a variable. Channels synchronize the sender and receiver. The value is transferred safely. This follows the "share memory by communicating" principle.

Use atomic operations when you need lock-free performance for simple counters or flags. Atomic operations are faster than mutexes for single variables. They are harder to reason about for complex invariants.

Use sync.Once when a function must run exactly once across all goroutines. This is ideal for initialization. The first call runs the function. Calls that arrive while it is still running block until it finishes, and later calls return without running it again.

Use the race detector (-race) when testing concurrent code to catch data races early. Run it on every test suite. It finds bugs that are hard to reproduce manually.
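Of the options above, sync.Once is the easiest to check in a few lines. The sketch below uses a hypothetical initOnce helper that lets n goroutines compete to run the same initialization and reports how many times it actually ran.

```go
package main

import (
	"fmt"
	"sync"
)

// initOnce lets n goroutines call once.Do with the same function and
// returns how many times that function executed.
func initOnce(n int) int {
	var once sync.Once
	var wg sync.WaitGroup
	runs := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Only one caller executes the function; the others wait
			// for it to finish and then return.
			once.Do(func() { runs++ })
		}()
	}
	wg.Wait()
	return runs
}

func main() {
	fmt.Println(initOnce(10)) // prints 1
}
```

The write to runs needs no extra lock here, because it happens only inside the single call that once.Do allows and is read only after wg.Wait.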

Where to go next

The race detector finds the invisible bugs. Run it. Synchronization is not optional. It is the foundation of correct concurrent code. Trust the tool. Fix the races.

A race condition is a bug where your code breaks because two parts of the program run at the same time and step on each other's toes. A data race is one specific, mechanically detectable kind of this bug: two goroutines access the same piece of memory at once, at least one of them writes, and nothing synchronizes them. Think of it like two people trying to write on the same whiteboard at the exact same time; the result is messy and unpredictable.