The counter that refuses to cooperate
You are building a lightweight HTTP server that tracks active requests. Every time a request arrives, a goroutine increments a counter. Every time it finishes, another goroutine decrements it. You run the server, send ten requests, and the counter only reaches three. The numbers are dropping on the floor.
The problem is not your logic. It is timing. Two goroutines read the same value, add one to it in their own registers, and write it back. The second write overwrites the first. You have a race condition. The obvious fix is a sync.Mutex. It works, but it forces goroutines to wait in line, acquire a lock, do the math, and release the lock. For a single integer that changes thousands of times a second, that queue is unnecessary overhead.
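For reference, the mutex version might look like the sketch below. The inFlight type and the simulate helper are hypothetical names made up for this illustration; they stand in for whatever your server uses to track requests.

```go
package main

import (
	"fmt"
	"sync"
)

// inFlight is a hypothetical request tracker guarded by a mutex.
type inFlight struct {
	mu sync.Mutex
	n  int64
}

// add applies delta while holding the lock and returns the new count.
func (c *inFlight) add(delta int64) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n += delta
	return c.n
}

// simulate starts one goroutine per request, lets each one register
// itself, and reports how many registrations were counted.
func simulate(requests int) int64 {
	var c inFlight
	var wg sync.WaitGroup
	for i := 0; i < requests; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.add(1)
		}()
	}
	wg.Wait()
	return c.add(0) // read the total under the same lock
}

func main() {
	fmt.Println("counted:", simulate(10))
}
```

Every increment survives because only one goroutine at a time can be inside add. The cost is that every caller, including readers, has to take the lock.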
Go gives you a faster path. The sync/atomic package exposes CPU instructions that guarantee a read-modify-write sequence happens as one indivisible step. No goroutine slips in between. No lock is acquired. The counter updates correctly, and the scheduler never pauses your goroutines to hand them a mutex.
What atomic actually means
Atomic comes from the Greek word for indivisible. In programming, an atomic operation is one that every other thread or goroutine observes as either not yet started or fully complete. Nothing can interrupt it halfway through, and nothing can see an intermediate state.
Think of a mechanical turnstile at a subway entrance. A mutex is like a security guard who checks each person, opens the gate, lets them through, and closes it. The guard is reliable, but people have to wait. An atomic operation is like the turnstile itself. The mechanism is designed so that only one person can pass per push, and the push happens so fast that no one else can interfere. The hardware guarantees the step completes before anything else touches that memory address.
Go's sync/atomic package wraps these hardware instructions. It works on single-word values: 32-bit integers, 64-bit integers, pointers, and booleans. It does not operate directly on slices, maps, structs, or strings. Those types span multiple words, and the CPU cannot update them in one instruction. You need a lock for those, or you put the value behind a pointer and swap the pointer. Atomics are strictly for single values that fit in one CPU register.
A minimal counter without locks
Here is the simplest way to track a value across goroutines without a mutex. The code uses the atomic.Int64 type, which Go added in version 1.19 so you no longer have to pass a pointer to a raw int64 into functions like atomic.AddInt64.
package main

import (
	"fmt"
	"sync/atomic"
)

func main() {
	// The zero value of atomic.Int64 is ready to use and starts at 0.
	var counter atomic.Int64

	// Add performs an atomic increment and returns the new value.
	newVal := counter.Add(1)
	fmt.Printf("After add: %d\n", newVal)

	// Load reads the current value without modifying it.
	// This prevents the compiler from caching the value in a register.
	current := counter.Load()
	fmt.Printf("Loaded: %d\n", current)

	// CompareAndSwap only writes if the current value matches the expected one.
	// It returns true if the swap happened, false if someone else changed it.
	swapped := counter.CompareAndSwap(1, 100)
	fmt.Printf("CAS result: %v, value: %d\n", swapped, counter.Load())
}
The Add method does not just read the value, add one, and write it back in three separate steps. It emits a single CPU instruction like LOCK XADD on x86. The LOCK prefix tells the processor to lock the memory bus or the cache line for that exact address. The addition happens in the cache, and the result is written back before any other core can touch that line.
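To see Add hold up under contention, here is a small sketch that points many goroutines at one atomic.Int64. The countWithAtomics helper and the worker count are illustrative choices, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countWithAtomics starts `workers` goroutines that each increment the
// same counter once, then returns the final value.
func countWithAtomics(workers int) int64 {
	var counter atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Add(1) // indivisible read-modify-write, no lock
		}()
	}
	wg.Wait()
	return counter.Load()
}

func main() {
	fmt.Println("final count:", countWithAtomics(1000))
}
```

The WaitGroup is only there so the function knows when all workers are done. The counter itself needs no other protection.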
Load is equally important. Without it, the compiler or CPU might keep a stale copy of the variable in a register. Load forces a fresh read from memory, guaranteeing you see the latest value written by another goroutine.
CompareAndSwap is the workhorse for lock-free algorithms. It checks the current value against an expected value. If they match, it writes the new value and returns true. If they differ, it leaves the memory untouched and returns false. You typically wrap it in a loop: read, compute the next state, attempt the swap, and retry if it fails. This pattern is called optimistic concurrency. You assume no one will interfere, and you only retry when they do.
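The read, compute, swap, retry loop is easier to follow in code. The sketch below keeps a high-water mark, a case Add cannot handle because the new value depends on a comparison. The storeMax and highWaterMark names are hypothetical helpers written for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// storeMax raises *peak to v unless an equal or larger value is already there.
func storeMax(peak *atomic.Int64, v int64) {
	for {
		old := peak.Load() // read
		if v <= old {
			return // nothing to do, someone already stored a bigger value
		}
		if peak.CompareAndSwap(old, v) { // attempt the swap
			return
		}
		// Another goroutine changed peak between Load and CompareAndSwap.
		// Loop and try again with the fresh value.
	}
}

// highWaterMark feeds every sample to storeMax from its own goroutine.
// The mark starts at 0, so this sketch assumes non-negative samples.
func highWaterMark(samples []int64) int64 {
	var peak atomic.Int64
	var wg sync.WaitGroup
	for _, s := range samples {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			storeMax(&peak, v)
		}(s)
	}
	wg.Wait()
	return peak.Load()
}

func main() {
	fmt.Println("peak:", highWaterMark([]int64{3, 42, 7, 19}))
}
```

Note that storeMax takes a pointer. Copying an atomic.Int64 by value would give each goroutine its own private counter.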
Mutexes are cheap. Atomics are cheaper. Use the right tool for the job.
Real-world: swapping state safely
Counters are straightforward. Pointers are where atomics shine in production code. You often need to swap out a configuration object, a cache instance, or a database connection pool without stopping the server. A mutex would block every read while you swap the pointer. An atomic pointer swap takes nanoseconds.
Here is how you safely rotate a configuration object across goroutines.
package main

import (
	"fmt"
	"sync/atomic"
)

type Config struct {
	Timeout int
	Retries int
}

func main() {
	// atomic.Pointer handles the unsafe pointer casting for you.
	var current atomic.Pointer[Config]

	// Store the initial configuration safely.
	initial := &Config{Timeout: 5, Retries: 3}
	current.Store(initial)

	// Simulate a goroutine reading the config concurrently.
	loaded := current.Load()
	fmt.Printf("Active config: timeout=%d, retries=%d\n", loaded.Timeout, loaded.Retries)

	// Swap to a new configuration without blocking readers.
	// Swap returns the old pointer, which you can inspect or discard.
	old := current.Swap(&Config{Timeout: 10, Retries: 5})
	fmt.Printf("Swapped from: timeout=%d\n", old.Timeout)

	// Readers automatically see the new config on their next Load.
	updated := current.Load()
	fmt.Printf("New config: timeout=%d\n", updated.Timeout)
}
The Store method writes the pointer to memory with the correct memory barriers. The Swap method atomically replaces the old pointer with the new one and returns the previous value. Readers calling Load will either see the old pointer or the new one. They will never see a half-written address or a dangling reference.
This pattern is common in hot-reload systems. You compile the new configuration, allocate a fresh struct, and swap the pointer. Old goroutines finish their work with the old config. New goroutines pick up the new one. No downtime, no locks, no complex coordination.
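A hot reload with real concurrent readers might look like the following sketch. The reload and countTornReads helpers are hypothetical, and the rule that Retries is always twice Timeout is an invariant invented purely so the readers have something to check.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Config struct {
	Timeout int
	Retries int
}

// active always points at a fully built, never-mutated Config.
var active atomic.Pointer[Config]

// reload builds a fresh Config and publishes it in one atomic store.
func reload(timeout int) {
	active.Store(&Config{Timeout: timeout, Retries: 2 * timeout})
}

// countTornReads runs readers while the caller's goroutine keeps
// reloading, and counts snapshots that break the invented invariant.
func countTornReads(readers, reloads int) int64 {
	reload(1)
	var torn atomic.Int64
	var wg sync.WaitGroup
	for r := 0; r < readers; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				cfg := active.Load() // take one snapshot, use it throughout
				if cfg.Retries != 2*cfg.Timeout {
					torn.Add(1)
				}
			}
		}()
	}
	for i := 1; i <= reloads; i++ {
		reload(i)
	}
	wg.Wait()
	return torn.Load()
}

func main() {
	fmt.Println("torn reads:", countTornReads(4, 100))
	fmt.Println("final timeout:", active.Load().Timeout)
}
```

The safety of this pattern depends on treating a published Config as read-only. If a writer modified fields on the struct that readers already hold, the pointer swap would no longer protect anything.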
Trust the hardware. Let the CPU handle the cache coherency.
Where things go sideways
Atomics are powerful, but they are not a replacement for every synchronization primitive. They have strict rules, and breaking them causes subtle bugs that the compiler will not catch.
The first trap is alignment. On 32-bit platforms such as ARM, 386, and 32-bit MIPS, a 64-bit atomic operation requires the value to start at a memory address divisible by 8, but the compiler only guarantees 4-byte alignment for a plain int64 there. If you declare a plain int64 field in the middle of a struct, it can land at an offset that is a multiple of 4 but not of 8. When you pass its address to atomic.AddInt64, the program panics at runtime with an unaligned 64-bit atomic operation message. The fix is to place the 64-bit field as the first element in the struct, or to use atomic.Int64, which is aligned automatically. The Go community convention is to avoid raw atomic variables entirely in favor of the typed wrappers, and to align manually only when raw variables are unavoidable.
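The two layouts can be compared side by side. In this sketch the struct and field names are hypothetical; on a 64-bit machine both versions behave the same, and the difference only matters on 32-bit builds.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// legacyStats uses a raw int64 with the function-style API.
// hits is deliberately the first field so that it stays 8-byte aligned
// on 32-bit platforms.
type legacyStats struct {
	hits    int64 // keep first: accessed via atomic.AddInt64
	enabled bool
}

// stats uses the typed wrapper. Field order no longer matters because
// atomic.Int64 is aligned automatically.
type stats struct {
	enabled bool
	hits    atomic.Int64
}

// record bumps both counters once and returns their values.
func record() (legacy, typed int64) {
	var old legacyStats
	var cur stats
	atomic.AddInt64(&old.hits, 1)
	cur.hits.Add(1)
	return atomic.LoadInt64(&old.hits), cur.hits.Load()
}

func main() {
	legacy, typed := record()
	fmt.Println("legacy:", legacy, "typed:", typed)
}
```

The typed version also removes the risk that someone later reads hits with a plain field access, because atomic.Int64 exposes the value only through its methods.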
The second trap is memory ordering. An atomic operation guarantees that its own read or write happens indivisibly. It says nothing about variables you access without atomics. Go's memory model makes atomic operations behave as if they were sequentially consistent: if one goroutine writes some data and then sets a flag with an atomic Store, any goroutine whose atomic Load observes that flag is also guaranteed to see the data. That guarantee holds only when both sides go through sync/atomic. If the reader checks the flag with a plain read, or the writer touches the data again after publishing the flag, you have a data race and might read stale data. The compiler will not warn you. It will simply compile, and your program will behave unpredictably under load.
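A minimal sketch of publishing plain data through an atomic flag is shown here. The mailbox type and its methods are hypothetical names for this example; the point is only the order of operations on each side.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// mailbox is a hypothetical one-shot handoff. payload is a plain field;
// ready is the atomic flag that publishes it.
type mailbox struct {
	payload string
	ready   atomic.Bool
}

// publish writes the data first, then sets the flag atomically.
func (m *mailbox) publish(msg string) {
	m.payload = msg     // ordinary write, done before the flag flips
	m.ready.Store(true) // after this, payload must not be modified
}

// consume waits for the flag with an atomic Load, then reads the data.
func (m *mailbox) consume() string {
	for !m.ready.Load() { // a plain read here would be a data race
		runtime.Gosched() // yield instead of burning the CPU
	}
	return m.payload
}

// handoff runs one consumer goroutine against one publisher.
func handoff(msg string) string {
	var m mailbox
	got := make(chan string)
	go func() { got <- m.consume() }()
	m.publish(msg)
	return <-got
}

func main() {
	fmt.Println(handoff("config v2"))
}
```

The spin loop keeps the example short. In real code a channel or sync.WaitGroup is usually a better way to wait, with the atomic flag reserved for cheap polling on a hot path.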
The third trap is scope creep. Developers sometimes try to atomically update two fields at once. You cannot. Atomics work on single words. If you need to update a counter and a timestamp together, or a balance and a transaction ID, use a sync.Mutex. Trying to fake it with two atomic calls creates a window where the state is inconsistent. The compiler offers little help here: the atomic functions accept only pointers to specific integer and pointer types, so passing a struct fails with an ordinary type mismatch error, but nothing stops you from making two separate atomic calls on two separate fields. Accept the limitation. Wrap the multi-field update in a mutex.
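For the balance and transaction ID case, a mutex-guarded version could look like this sketch. The account type, its methods, and the deposit amount are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// account keeps two fields that must always change together.
type account struct {
	mu       sync.Mutex
	balance  int64
	lastTxID int64
}

// apply updates both fields inside one critical section, so no reader
// can observe a new balance paired with an old transaction ID.
func (a *account) apply(txID, amount int64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.balance += amount
	a.lastTxID = txID
}

// snapshot returns a consistent pair of values.
func (a *account) snapshot() (balance, lastTxID int64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance, a.lastTxID
}

// run applies n deposits of 10 from n goroutines and returns the result.
func run(n int) (balance, lastTxID int64) {
	var acct account
	var wg sync.WaitGroup
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(id int64) {
			defer wg.Done()
			acct.apply(id, 10)
		}(int64(i))
	}
	wg.Wait()
	return acct.snapshot()
}

func main() {
	balance, lastTxID := run(100)
	fmt.Println("balance:", balance, "last tx:", lastTxID)
}
```

The final balance is deterministic, but which transaction ID ends up last depends on scheduling. What the mutex guarantees is that the pair you read always belongs to the same update.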
The worst atomic bug is the one that silently corrupts state without panicking. Test under contention. Use the race detector. Verify invariants.
When to reach for atomics
Concurrency tools are not interchangeable. Pick the primitive that matches your data shape and your performance needs.
Use an atomic counter when you have a single integer or boolean that many goroutines increment, decrement, or toggle, and you need maximum throughput with zero lock contention.
Use an atomic pointer swap when you need to replace a read-only configuration object, cache, or handler map without blocking concurrent readers.
Use a sync.Mutex when you must update multiple fields together, maintain complex invariants, or protect data structures like slices and maps.
Use a sync.RWMutex when reads vastly outnumber writes and you want to allow concurrent reads while blocking only during writes.
Use a channel when goroutines need to pass data, signal completion, or coordinate work flow rather than just sharing a single variable.
Use plain sequential code when you do not actually need concurrency. The simplest thing that works is usually the right thing.