Your service climbs the memory wall
Your Go service has been running for three weeks. It handles requests fine, but the memory graph on your dashboard climbs steadily and never comes back down. You restart the pod, memory drops, and the climb starts again. Eventually, the OOM killer terminates the process. You didn't write malloc and forget free. You're using Go. Where did the memory go?
The garbage collector is not a magician
Go has a garbage collector. The collector scans memory and frees anything your program can no longer reach. A memory leak happens when your program keeps a reference to data it no longer needs. The collector sees the reference and assumes the data is still important. It stays in memory. Over time, these "important" references pile up until the process runs out of RAM.
The leak isn't the allocation. The leak is the reference that won't let go.
Think of the garbage collector like a cleanup crew. They walk through the building and throw away anything no one is touching. If you leave a sticky note on a box saying "Keep this," the crew leaves the box alone. Even if you forgot why you wrote the note, the box stays. A memory leak is a sticky note you never remove.
A map that never forgets
The most common leak in Go is a data structure that grows without bound. Maps and slices are frequent culprits. If you add entries to a map but never remove them, the map holds references to every value forever.
Here's the simplest version: a global map that only ever gains entries.
package main

import (
	"fmt"
	"time"
)

// cache stores data globally. Global variables live for the entire process.
var cache = make(map[string][]byte)

func main() {
	// Simulate a long-running service adding entries.
	for i := 0; i < 10000; i++ {
		// Allocate a buffer for each request.
		data := make([]byte, 1024)
		key := fmt.Sprintf("key-%d", i)

		// Store the buffer. The map now holds a reference to data.
		// The GC cannot free data because cache[key] points to it.
		cache[key] = data

		// Sleep to simulate real work and prevent instant exit.
		time.Sleep(time.Millisecond)
	}

	// The map holds 10,000 entries. Memory usage stays high.
	fmt.Println("Entries:", len(cache))
}
The garbage collector is smart, but it's not psychic. If you hold a reference, the memory stays.
How the leak grows
When you call make([]byte, 1024), Go allocates memory on the heap. The variable data points to it. When you assign cache[key] = data, the map stores that pointer. The local variable data goes out of scope at the end of the loop iteration.
The collector runs. It sees cache[key] still points to the buffer. The buffer is reachable. The collector leaves it alone. Next iteration, another buffer. The map grows. Memory grows. The pattern repeats until the heap fills up.
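To make the reachability rule concrete, here is a minimal sketch that measures the live heap while the map holds the buffers and again after the reference is dropped. The helpers liveHeap and fill are hypothetical names introduced for this illustration, and the explicit runtime.GC call is there only to make the measurement repeatable; production code should not force collections like this.

```go
package main

import (
	"fmt"
	"runtime"
)

// cache mirrors the global map from the example above.
var cache = make(map[string][]byte)

// liveHeap (hypothetical helper) forces a collection and returns
// the bytes of heap objects that are still allocated afterwards.
func liveHeap() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// fill (hypothetical helper) stores n buffers of 1 KiB each in cache.
func fill(n int) {
	for i := 0; i < n; i++ {
		cache[fmt.Sprintf("key-%d", i)] = make([]byte, 1024)
	}
}

func main() {
	fill(10000)
	held := liveHeap() // buffers are reachable through cache, so they survive

	cache = nil // drop the only reference to the map and its buffers
	released := liveHeap()

	fmt.Printf("held: %d KiB, after dropping the map: %d KiB\n", held/1024, released/1024)
}
```

The exact numbers depend on the Go version and platform, but the second figure should be roughly 10 MiB lower than the first. Nothing about the buffers changed between the two measurements except whether the program could still reach them.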
This happens silently. The program doesn't crash immediately. It just uses more RAM. If the service runs for days or weeks, the leak accumulates until the operating system steps in.
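One way to stop this kind of growth is to cap the number of entries. The sketch below is a hypothetical boundedCache with first-in, first-out eviction, written for this article rather than taken from any library. It assumes a capacity greater than zero and a single goroutine; concurrent use would need a sync.Mutex around Put and Get.

```go
package main

import "fmt"

// boundedCache is a hypothetical fixed-capacity cache with FIFO eviction.
// It is not safe for concurrent use.
type boundedCache struct {
	keys []string // ring buffer of keys in insertion order
	next int      // slot in keys that the next new key will occupy
	data map[string][]byte
}

// newBoundedCache returns a cache holding at most capacity entries.
// capacity must be greater than zero.
func newBoundedCache(capacity int) *boundedCache {
	return &boundedCache{
		keys: make([]string, capacity),
		data: make(map[string][]byte, capacity),
	}
}

// Put stores value under key, evicting the oldest entry when the cache is full.
func (c *boundedCache) Put(key string, value []byte) {
	if _, exists := c.data[key]; exists {
		c.data[key] = value // overwrite in place; age is unchanged
		return
	}
	if len(c.data) == len(c.keys) {
		// Full: the slot about to be reused holds the oldest key.
		delete(c.data, c.keys[c.next])
	}
	c.keys[c.next] = key
	c.next = (c.next + 1) % len(c.keys)
	c.data[key] = value
}

// Get returns the value stored under key, if any.
func (c *boundedCache) Get(key string) ([]byte, bool) {
	v, ok := c.data[key]
	return v, ok
}

// Len reports the number of entries currently held.
func (c *boundedCache) Len() int { return len(c.data) }

func main() {
	c := newBoundedCache(100)
	for i := 0; i < 10000; i++ {
		c.Put(fmt.Sprintf("key-%d", i), make([]byte, 1024))
	}
	fmt.Println("Entries:", c.Len()) // prints "Entries: 100"
}
```

FIFO is the simplest policy to show here. A real cache often wants LRU or time-based expiry instead, but the principle is the same: every insert path needs a matching removal path.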
Goroutines that never wake up
A goroutine leak is a memory leak. Every goroutine has its own stack, which starts at a few kilobytes and grows on demand. If a goroutine waits on a channel that never receives a value and never closes, it stays alive. Its stack stays alive, and so does everything the stack references.
Real-world leaks often hide in concurrency. A goroutine that blocks forever keeps its stack and every value reachable from its local variables. If you spawn a goroutine per request and each one can get stuck, you leak a little memory on every request.
Here's a worker that exits correctly on cancellation or channel close. It still leaks if its caller never closes the channel and never cancels the context.
package main

import (
	"context"
	"time"
)

// worker processes tasks from a channel.
// It must exit when work is done or context is cancelled.
func worker(ctx context.Context, tasks <-chan string) {
	for {
		select {
		case <-ctx.Done():
			// Context cancelled. Stop the goroutine.
			return
		case task, ok := <-tasks:
			if !ok {
				// Channel closed. No more tasks.
				return
			}
			// Process the task.
			_ = task
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	tasks := make(chan string)

	// Start a worker goroutine.
	go worker(ctx, tasks)

	// Send one task. The worker receives it and loops back to the select.
	tasks <- "job-1"

	// From here on the worker is blocked: nothing closes tasks, and cancel
	// only runs when main returns. This process exits after the sleep, so
	// the OS reclaims everything. In a long-running server that never calls
	// cancel() or close(tasks), the goroutine would stay blocked and hold
	// its memory for the life of the process.
	time.Sleep(time.Second)
}
The worst goroutine bug is the one that never logs. A stuck goroutine sits in silence while memory drains.
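A quick way to see stuck goroutines from inside the process is runtime.NumGoroutine. The sketch below starts workers that wait on a channel nobody uses, counts them, and then releases them through cancellation. startWorkers and measure are hypothetical helpers written for this illustration.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
)

// startWorkers (hypothetical helper) launches n goroutines that wait on a
// channel nobody sends to or closes. It returns a stop function that cancels
// them and waits until every one has returned.
func startWorkers(n int) (stop func()) {
	ctx, cancel := context.WithCancel(context.Background())
	tasks := make(chan string) // deliberately idle task channel
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			select {
			case <-tasks: // never fires in this sketch
			case <-ctx.Done(): // the only way out
			}
		}()
	}
	return func() {
		cancel()
		wg.Wait()
	}
}

// measure (hypothetical helper) reports how many goroutines n blocked
// workers add to the process, then stops them.
func measure(n int) int {
	before := runtime.NumGoroutine()
	stop := startWorkers(n)
	added := runtime.NumGoroutine() - before
	stop()
	return added
}

func main() {
	fmt.Println("goroutines added while blocked:", measure(1000))
}
```

Without the ctx.Done() case, stop would have no way to reach the workers and wg.Wait would block forever. Exporting the goroutine count as a metric gives you the same signal on a dashboard: a count that only ever rises is the concurrency equivalent of the climbing memory graph.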
Finding the leak with pprof
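When the heap can no longer grow, the Go runtime aborts with fatal error: runtime: out of memory. This is a fatal error, not a panic, so recover cannot catch it. In a container, the kernel's OOM killer often gets there first and kills the process without any Go-level message. Either way, the program dies. You need to find the leak before this happens.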
Go ships with pprof. Import net/http/pprof to add debug endpoints.
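package main

import (
	"log"
	"net/http"
	// Imported for its side effect: registers the /debug/pprof/* handlers
	// on http.DefaultServeMux.
	_ "net/http/pprof"
)

func main() {
	// Start a debug server on localhost.
	// Never expose this port to the public internet.
	go func() {
		// A nil handler selects http.DefaultServeMux, where pprof registered itself.
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// Your application logic runs here.
	// pprof captures heap and goroutine profiles on demand.
	select {} // placeholder: block forever so the debug server stays up
}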
Once the server is running, connect with go tool pprof http://localhost:6060/debug/pprof/heap. The tool downloads the profile and opens an interactive shell. Type top to see which functions are retaining the most memory. The output lists functions by the amount of memory they hold. Type list functionName to see the exact source lines. Look for allocations in loops or maps that grow without bound.
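The heap profile carries four sample types: inuse_space, inuse_objects, alloc_space, and alloc_objects. The two to watch are inuse_space and alloc_space. alloc_space shows total memory allocated since the program started. inuse_space shows memory currently held, and it is what the tool displays by default; pass -sample_index=alloc_space to go tool pprof to switch. A leak shows up as inuse_space growing over time, so save heap profiles a few minutes apart and compare them with go tool pprof -base old.pb.gz new.pb.gz. If alloc_space grows but inuse_space stays flat, the GC is doing its job and you don't have a leak.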
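The difference between allocated and in-use memory can be seen without a profiler. The sketch below uses runtime.MemStats, whose TotalAlloc and HeapAlloc fields are close analogues of alloc_space and inuse_space (they are exact counters, while pprof's values are sampled estimates). The helpers snapshot and churn are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is a package-level variable so each buffer escapes to the heap.
var sink []byte

// snapshot (hypothetical helper) forces a collection and returns the
// cumulative bytes allocated and the bytes still live on the heap.
func snapshot() (total, live uint64) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.TotalAlloc, m.HeapAlloc
}

// churn (hypothetical helper) allocates n short-lived 1 KiB buffers.
// Each one replaces the last in sink, so at most one is reachable at a time.
func churn(n int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, 1024)
	}
}

func main() {
	total0, live0 := snapshot()
	churn(100000) // roughly 100 MiB allocated in total
	total1, live1 := snapshot()

	fmt.Printf("allocated since start: +%d KiB\n", (total1-total0)/1024)
	fmt.Printf("live heap: %d KiB before, %d KiB after\n", live0/1024, live1/1024)
}
```

The cumulative counter jumps by about 100 MiB while the live heap barely moves. That is the healthy pattern: heavy allocation, nothing retained. A leak looks like the opposite, with the live figure rising from one snapshot to the next.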
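For goroutine leaks, fetch http://localhost:6060/debug/pprof/goroutine?debug=1 with curl or a browser. This returns a text dump in which goroutines with identical stacks are grouped together, each group prefixed by its count. If one group's count is in the thousands and keeps growing, you have a leak. Check that group's stack trace to find the blocking channel or mutex. Use debug=2 for a full dump of every goroutine, or drop the debug parameter and point go tool pprof at the same URL to explore the profile interactively.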
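The same text dump is available in-process through the runtime/pprof package, which is handy in tests or when no HTTP endpoint is exposed. This is a minimal sketch: goroutineDump and firstLine are hypothetical helpers, and the hundred stuck goroutines exist only to give the profile something to show.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"strings"
)

// goroutineDump (hypothetical helper) returns the goroutine profile as text.
// The debug level 1 matches ?debug=1 on the HTTP endpoint.
func goroutineDump() (string, error) {
	var sb strings.Builder
	if err := pprof.Lookup("goroutine").WriteTo(&sb, 1); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// firstLine (hypothetical helper) returns the summary line of the dump.
func firstLine(dump string) string {
	line, _, _ := strings.Cut(dump, "\n")
	return line
}

func main() {
	block := make(chan struct{}) // never closed: these goroutines are stuck on purpose
	for i := 0; i < 100; i++ {
		go func() { <-block }()
	}

	dump, err := goroutineDump()
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Println(firstLine(dump)) // summary line with the total goroutine count
	fmt.Print(dump)              // full dump, one block per distinct stack
}
```

In the full dump, look for the block whose stack ends in the anonymous function receiving from block. In a real service, that is the line that tells you which channel operation the leaked goroutines are parked on.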
Trust the profiler. Your intuition about where memory goes is usually wrong.
Pitfalls and runtime errors
cgo lets Go call C functions, and C manages memory manually. If you call C.malloc and never call C.free, the Go GC never sees that memory. It sits outside the Go heap, so the Go heap profile does not include it. One tool that can catch this is LeakSanitizer. Build with CGO_LDFLAGS="-fsanitize=leak" and run the program; whether this works depends on your C compiler and platform. When it does, the sanitizer reports leaked C allocations, with the stack that allocated each one, when the process exits.
When writing functions that might leak resources, pass context.Context as the first argument. Name it ctx. This convention lets callers cancel long-running operations and stop leaks early. Functions that take a context should respect cancellation and deadlines.
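The convention looks like this in practice. In the sketch below, waitForResult and tryWithTimeout are hypothetical functions written for this illustration: the first takes ctx as its first argument and gives up when the context is done, and the second shows a caller attaching a deadline.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitForResult (hypothetical) blocks until a value arrives on results
// or ctx is done, whichever comes first.
func waitForResult(ctx context.Context, results <-chan string) (string, error) {
	select {
	case r := <-results:
		return r, nil
	case <-ctx.Done():
		// Return instead of blocking forever; ctx.Err says why.
		return "", ctx.Err()
	}
}

// tryWithTimeout (hypothetical) waits on results for at most timeout.
func tryWithTimeout(results <-chan string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release the timer, even on success
	return waitForResult(ctx, results)
}

func main() {
	results := make(chan string) // nothing ever sends here

	_, err := tryWithTimeout(results, 50*time.Millisecond)
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // prints "true"
}
```

Without the ctx.Done() case, the call in main would block forever. With it, the worst case is bounded by the deadline the caller chose.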
The compiler rejects unused imports with an error like "os" imported and not used, and undefined names with undefined: name. These errors help you keep code clean, but they won't catch logical leaks. A leak is valid Go code that behaves badly over time.
Context is plumbing. Run it through every long-lived call site.
When to use what
Use a bounded cache with eviction when you need fast lookups but memory must stay constant. Use a context with timeout when a goroutine might block indefinitely on I/O or a channel. Use pprof when memory usage grows over time and you need to identify the retaining object. Use sync.Pool when you allocate and discard many small objects in a tight loop to reduce GC pressure. Use explicit cleanup when calling C code via cgo, because the garbage collector cannot free C memory. Use plain sequential code when you don't need concurrency: the simplest thing that works is usually the right thing.
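Of the tools listed above, sync.Pool is the one most often used incorrectly, so here is a minimal sketch of the usual pattern with bytes.Buffer. bufPool and render are hypothetical names for this illustration. Note that a pool reduces allocation churn but does not bound memory, and the runtime may drop pooled objects at any collection.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool (hypothetical) hands out reusable buffers so a hot path
// does not allocate a new one on every call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render (hypothetical) formats a greeting using a pooled buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from its last user
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result is safe after Put
}

func main() {
	fmt.Println(render("gopher")) // prints "hello, gopher"
}
```

The two details that matter are the Reset before use and never keeping a reference to the buffer after Put. Holding on to a pooled object after returning it creates exactly the kind of lingering reference this article is about.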