How to Detect and Fix Memory Leaks in Go

The whiteboard that never gets erased

Your API starts at fifty megabytes of RAM. Three days later, it sits at two gigabytes. The garbage collector runs every few seconds, but the memory curve never bends back down. Eventually the kernel steps in with an OOM kill and your pod restarts. This is a memory leak. In Go, it rarely means you forgot to free memory. It means something is still holding a reference to data you thought was gone.

How Go actually tracks memory

Go uses a concurrent tracing garbage collector. It does not count references or track allocations manually. Instead, it starts at your root variables, follows every pointer it can find, marks everything reachable, and sweeps the rest. If a variable stays reachable, the collector assumes you still need it. A leak happens when your code accidentally keeps a reference alive.

Think of it like a whiteboard in a meeting room. The cleaner only erases boards that nobody has claimed. If someone leaves a single sticky note on a corner, the whole board stays up. In Go, that sticky note is a pointer, a slice backing array, a map entry, or a goroutine waiting on a channel. The garbage collector only frees what it cannot reach.

A minimal reference leak

Here is the simplest way to accidentally hold memory forever. A global map grows without bounds, and old entries are never removed.

package main

import "fmt"

var globalCache map[string][]byte

// StoreData keeps appending to a global map without eviction.
func StoreData(key string, data []byte) {
    if globalCache == nil {
        globalCache = make(map[string][]byte)
    }
    // The map grows forever. Old keys are never removed.
    // Each value is a heap-allocated slice that stays reachable.
    globalCache[key] = data
}

func main() {
    for i := 0; i < 10000; i++ {
        StoreData(fmt.Sprintf("key-%d", i), make([]byte, 1024))
    }
    fmt.Println(len(globalCache))
}

The map itself is a root variable. The map entries point to heap-allocated byte slices. As long as the map lives, the slices live. The collector runs, sees the references, and leaves everything alone. Every cycle has to mark the whole live heap, so a growing map makes each collection more expensive and steals CPU time from your handlers. Eventually the process hits its memory limit.
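The reachability rule can be observed directly. The following is a minimal sketch, not a benchmark: the names held, liveHeap, fill, and release are made up for illustration, and the exact numbers depend on your Go version and platform. It forces a collection, reads runtime.MemStats.HeapAlloc, and shows that memory behind a package-level root survives the collector until the reference is dropped.

```go
package main

import (
	"fmt"
	"runtime"
)

// held is a package-level root, playing the role of globalCache above.
// (Hypothetical name, used only in this sketch.)
var held [][]byte

// liveHeap forces a full collection and returns the bytes still allocated.
func liveHeap() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// fill stores n buffers of size bytes behind the root.
func fill(n, size int) {
	for i := 0; i < n; i++ {
		held = append(held, make([]byte, size))
	}
}

// release drops the only reference, which makes the buffers unreachable.
func release() {
	held = nil
}

func main() {
	base := liveHeap()
	fill(64, 1<<20) // about 64 MiB behind the root
	during := liveHeap()
	release()
	after := liveHeap()

	const mib = 1 << 20
	fmt.Printf("baseline:         %d MiB\n", base/mib)
	fmt.Printf("while referenced: %d MiB\n", during/mib)
	fmt.Printf("after release:    %d MiB\n", after/mib)
}
```

The middle figure stays high no matter how many times the collector runs. Only the assignment in release changes the outcome.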

What the garbage collector sees

Go's collector runs in phases. The mark phase walks the object graph from roots. The sweep phase reclaims the memory of objects that were not marked. Most of this work runs concurrently with your application, with only brief stop-the-world pauses, which is why Go feels responsive under load. It also means the collector cannot guess your intent. It only follows pointers.

When you pass a slice to a function, you pass a header containing a pointer, length, and capacity. The header is cheap. The backing array is not. If you store that slice in a long-lived structure, the entire backing array stays pinned. The same applies to strings, maps, and channels. A goroutine that blocks on a channel also holds its stack and any local variables. If that goroutine never exits, its memory never returns.

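When the operating system refuses to hand over more memory, the runtime aborts with fatal error: runtime: out of memory. This is a fatal error, not a panic, so recover cannot intercept it. In a container you often never see that message, because the kernel OOM killer terminates the process first. Go has no error that fires when the collector consumes too much CPU. The earlier symptom is GC cycles that take longer and claim a larger share of CPU, which you can watch by running with GODEBUG=gctrace=1. These are all late signals. By then, the leak has already consumed your headroom.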

A realistic service leak

Production code rarely leaks in a single function. It leaks across boundaries. Here is a background syncer that reads from a channel in its own goroutine. This version is written correctly, which makes it easier to see what a leaky variant leaves out.

package main

import (
    "context"
    "fmt"
)

// SyncWorker reads jobs until the channel closes or context cancels.
func SyncWorker(ctx context.Context, jobs <-chan []byte) {
    for {
        select {
        case <-ctx.Done():
            // Exit immediately when the parent cancels.
            return
        case data, ok := <-jobs:
            if !ok {
                // Channel closed. No more work.
                return
            }
            // Process the buffer. It stays alive until the next iteration.
            fmt.Println(len(data))
        }
    }
}

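func main() {
    // A cancellable context gives the worker an exit path besides close.
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    jobs := make(chan []byte)
    done := make(chan struct{})
    // Spawn a worker that holds a reference to the channel.
    go func() {
        defer close(done)
        SyncWorker(ctx, jobs)
    }()
    // Send one large message.
    jobs <- make([]byte, 1024*1024)
    // Close channel to signal done.
    close(jobs)
    // Wait for the worker to exit before main returns.
    <-done
}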

This example does not leak because it closes the channel and respects cancellation. A leak appears when the sender stops sending but never closes the channel, or when the worker ignores ctx.Done(). The goroutine blocks forever on <-jobs, holding its stack and any local buffers. The channel itself stays alive because the goroutine references it. The memory never returns.

Convention matters here. context.Context always goes as the first parameter, conventionally named ctx. Functions that take a context should respect cancellation and deadlines. If you drop the context check, you create a goroutine leak. The worst goroutine bug is the one that never logs.
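The leaky variant described above can be sketched next to the fix. This is an illustration under assumptions: leakyWorker, startLeaky, cancellableWorker, and stopsOnCancel are hypothetical names, and the one-second timeout is arbitrary. The sketch uses runtime.NumGoroutine to show that workers blocked on a channel nobody closes stay alive, while a worker that selects on ctx.Done() exits once the context is cancelled.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// leakyWorker has no cancellation path. If nobody sends on or closes jobs,
// it blocks forever and its stack is never reclaimed.
func leakyWorker(jobs <-chan []byte) {
	for data := range jobs {
		_ = data
	}
}

// startLeaky launches n leaky workers on a channel that is never closed
// and reports how many goroutines the call added.
func startLeaky(n int) int {
	before := runtime.NumGoroutine()
	jobs := make(chan []byte)
	for i := 0; i < n; i++ {
		go leakyWorker(jobs)
	}
	return runtime.NumGoroutine() - before
}

// cancellableWorker exits when ctx is cancelled or jobs is closed.
func cancellableWorker(ctx context.Context, jobs <-chan []byte, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case <-ctx.Done():
			return
		case _, ok := <-jobs:
			if !ok {
				return
			}
		}
	}
}

// stopsOnCancel reports whether a cancellable worker exits within one
// second of cancellation, even though its channel is never closed.
func stopsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan []byte) // never closed, as in the leak scenario
	done := make(chan struct{})
	go cancellableWorker(ctx, jobs, done)
	cancel()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("goroutines added by leaky workers:", startLeaky(5))
	fmt.Println("cancellable worker stopped:", stopsOnCancel())
}
```

In a long-running service, a goroutine count that climbs steadily between deployments is often the first visible sign of this pattern.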

Catching leaks with ASAN and pprof

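Go does not ship with a dedicated leak detector for its own heap. Two standard tools cover most of the ground: AddressSanitizer for memory that C code allocates through cgo, and heap profiling for memory the Go runtime manages.

AddressSanitizer instruments your binary at compile time. It does not track Go heap objects, so it cannot report a map entry or goroutine that stays reachable. It checks memory accesses in C code and unsafe Go code, and on recent toolchains (Go 1.25 and later, according to the release notes) it also reports C allocations that were never freed when the program exits. Build with the -asan flag, which requires cgo and a C compiler with sanitizer support.

go build -asan -o myapp main.go
./myapp

ASAN typically slows execution by about a factor of two and raises memory use noticeably, so it is best suited to local debugging and test suites. Its reports are plain text and include the stack traces of the offending allocations. You do not need extra tools to read them.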

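For running services, use heap profiles. Importing net/http/pprof registers its handlers on http.DefaultServeMux, so the endpoints appear under /debug/pprof/ on any HTTP server that uses the default mux. You still have to start that server yourself. If you cannot expose an HTTP port, use runtime/pprof.WriteHeapProfile to write a snapshot to disk. Fetch the profile with go tool pprof.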

go tool pprof http://localhost:6060/debug/pprof/heap

The interactive shell loads the snapshot. Type top to list the largest memory consumers by allocation size. Type web to generate a graph visualization if you have Graphviz installed. Type list YourFunction to see the exact lines in your code that allocated memory. The shell also supports peek to drill into specific stack frames.
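The command above assumes a service that already exposes the pprof endpoints. A minimal sketch of that setup follows, together with the runtime/pprof alternative. The helper names startPprof and heapSnapshot are hypothetical, and the address localhost:6060 is taken from the command above rather than from any requirement of the package.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
	"net/http"
	_ "net/http/pprof" // side effect: registers /debug/pprof/ on http.DefaultServeMux
	"runtime"
	"runtime/pprof"
)

// startPprof serves http.DefaultServeMux on addr in the background and
// returns the address it actually bound to.
func startPprof(addr string) (string, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return "", err
	}
	go http.Serve(ln, nil) // a nil handler means http.DefaultServeMux
	return ln.Addr().String(), nil
}

// heapSnapshot returns a heap profile in the format go tool pprof reads.
// It forces a collection first so the profile reflects current live memory.
func heapSnapshot() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// In a real service the process keeps running, so the endpoint stays
	// reachable. This demo exits right after taking one snapshot.
	if addr, err := startPprof("localhost:6060"); err != nil {
		fmt.Println("pprof endpoint unavailable:", err)
	} else {
		fmt.Println("pprof endpoints at http://" + addr + "/debug/pprof/")
	}

	snap, err := heapSnapshot()
	if err != nil {
		fmt.Println("heap snapshot failed:", err)
		return
	}
	fmt.Printf("heap snapshot: %d bytes\n", len(snap))
}
```

Binding to localhost keeps the profiling endpoints off public interfaces. If you write the snapshot bytes to a file instead, go tool pprof accepts the file path in place of the URL.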

Profiles show you where memory lives. Your code decides why.

Common traps and runtime signals

Memory profiling reveals patterns, not bugs. You still need to interpret the data. Here are the most common traps.

Unbounded caches are the fastest way to leak. A map that grows with every request will eventually dominate your heap. Add eviction policies, size limits, or TTLs. Do not assume the GC will clean up what you keep referencing.
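A size limit is the simplest of those options. The sketch below is one possible shape, not a recommendation for production: BoundedCache and its methods are hypothetical names, eviction is plain first-in-first-out rather than LRU or TTL, and the type is not safe for concurrent use without a mutex.

```go
package main

import "fmt"

// BoundedCache holds at most a fixed number of entries and evicts the
// oldest inserted key first. It is not safe for concurrent use.
type BoundedCache struct {
	keys []string // ring buffer of keys in insertion order
	next int      // slot of the oldest key once the ring is full
	data map[string][]byte
}

// NewBoundedCache returns a cache that keeps at most max entries.
// Values of max below 1 are treated as 1.
func NewBoundedCache(max int) *BoundedCache {
	if max < 1 {
		max = 1
	}
	return &BoundedCache{
		keys: make([]string, 0, max),
		data: make(map[string][]byte, max),
	}
}

// Put stores value under key, evicting the oldest entry if the cache is full.
func (c *BoundedCache) Put(key string, value []byte) {
	if _, ok := c.data[key]; ok {
		c.data[key] = value // overwrite in place; age is unchanged
		return
	}
	if len(c.keys) < cap(c.keys) {
		c.keys = append(c.keys, key)
	} else {
		delete(c.data, c.keys[c.next]) // drop the oldest entry
		c.keys[c.next] = key
		c.next = (c.next + 1) % len(c.keys)
	}
	c.data[key] = value
}

// Get returns the value for key and whether it was present.
func (c *BoundedCache) Get(key string) ([]byte, bool) {
	v, ok := c.data[key]
	return v, ok
}

// Len returns the number of entries currently held.
func (c *BoundedCache) Len() int {
	return len(c.data)
}

func main() {
	cache := NewBoundedCache(3)
	for i := 0; i < 10000; i++ {
		cache.Put(fmt.Sprintf("key-%d", i), make([]byte, 1024))
	}
	fmt.Println(cache.Len()) // prints 3
}
```

Because delete removes the map entry, the evicted byte slice loses its last reference and the collector can reclaim it on the next cycle.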

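Subslices of large buffers are another silent leak. A slice header points into its backing array, and the collector keeps the whole array alive as long as any slice refers to it, so storing a 16-byte subslice of a one-megabyte buffer in a long-lived structure pins the full megabyte. Growth makes this worse: append allocates a larger array when capacity runs out, roughly doubling it for small slices. Capping the capacity with s = s[:len(s):len(s)] stops later appends from writing into the shared array, but it does not release anything. To let the original array go, copy the bytes you need into a new slice.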
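The difference between sharing a backing array and copying out of it shows up in the capacity and in whether writes to the original are visible. This is a small sketch with hypothetical helper names (headerShared, headerCapped, headerCopied). It compares a plain subslice, a full slice expression, and a copy.

```go
package main

import "fmt"

// headerShared returns the first n bytes as a plain subslice.
// It shares buf's backing array and keeps all of it reachable.
func headerShared(buf []byte, n int) []byte {
	return buf[:n]
}

// headerCapped limits the capacity with a full slice expression.
// It still points into buf's backing array.
func headerCapped(buf []byte, n int) []byte {
	return buf[:n:n]
}

// headerCopied copies the first n bytes into a fresh array,
// so the original buffer can be collected once nothing else uses it.
func headerCopied(buf []byte, n int) []byte {
	out := make([]byte, n)
	copy(out, buf[:n])
	return out
}

func main() {
	big := make([]byte, 1<<20)
	shared := headerShared(big, 16)
	capped := headerCapped(big, 16)
	copied := headerCopied(big, 16)

	big[0] = 42 // visible only through slices that share the array

	fmt.Println(len(shared), cap(shared), shared[0]) // 16 1048576 42
	fmt.Println(len(capped), cap(capped), capped[0]) // 16 16 42
	fmt.Println(len(copied), cap(copied), copied[0]) // 16 16 0
}
```

The capped slice reports a capacity of 16 but still sees the write to big, which shows that it continues to reference the original one-megabyte array.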

Goroutine leaks happen when a goroutine waits on a channel that never gets closed. Always have a cancellation path. Pass a context, select on ctx.Done(), and close channels when the producer finishes. The compiler will not catch a missing close. The runtime will only complain with fatal error: all goroutines are asleep - deadlock! if the program blocks completely, or fatal error: runtime: out of memory if it consumes everything.

False positives in profiling are common. alloc_space shows total allocations over time. inuse_space shows memory currently held. A function might appear high in alloc_space because it creates temporary buffers that the GC reclaims immediately. Focus on inuse_space when hunting leaks. Focus on alloc_space when hunting GC pressure.
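The same distinction exists in the runtime's own counters, which makes it easy to see in isolation. The sketch below is an analogy, not a profile: runtime.MemStats.TotalAlloc is cumulative like alloc_space, and HeapAlloc reflects what is currently held like inuse_space, but neither is broken down by call site. The names sink, readStats, churn, and measureChurn are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink forces each buffer onto the heap so the allocation is counted.
var sink []byte

// readStats forces a collection, then returns cumulative bytes allocated
// and bytes currently held on the heap.
func readStats() (total, live uint64) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.TotalAlloc, m.HeapAlloc
}

// churn allocates n short-lived buffers. None stay referenced afterwards.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
	sink = nil
}

// measureChurn reports how much the cumulative and live figures moved.
func measureChurn(n, size int) (allocated uint64, liveDelta int64) {
	total0, live0 := readStats()
	churn(n, size)
	total1, live1 := readStats()
	return total1 - total0, int64(live1) - int64(live0)
}

func main() {
	allocated, liveDelta := measureChurn(128, 1<<20)
	const mib = 1 << 20
	fmt.Printf("allocated during churn: %d MiB\n", allocated/mib)
	fmt.Printf("change in live heap:    %d MiB\n", liveDelta/mib)
}
```

A function like churn would rank high under alloc_space and barely register under inuse_space. That makes it a GC pressure problem rather than a leak.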

Convention aside: _ discards a value intentionally. result, _ := someFunc() says you considered the second return value and chose to drop it. Use it sparingly with errors. Dropping errors silently is a different kind of leak.

A slow leak is still a leak. Fix the root, not the symptom.

Which tool fits your workflow

Pick the tool that matches your environment and your goal.

Use -asan when you are debugging a local binary that calls C through cgo and need to find C allocations that were never freed. Use pprof heap profiles when you are running a long-lived service and need to see live memory pressure. Use go tool pprof with inuse_space when you want to find objects currently held in memory. Use go tool pprof with alloc_space when you want to find hot allocation paths that trigger frequent GC cycles. Use runtime/pprof.WriteHeapProfile when you cannot expose an HTTP endpoint and need to capture snapshots to disk across deployment environments. Use plain sequential code when you do not need concurrency: the simplest thing that works is usually the right thing.

Where to go next

Go reclaims memory you no longer reference, so a leak almost always means your code still holds a pointer to data it no longer needs: a cache entry, a subslice, or a blocked goroutine. Take heap profiles of the running service, compare inuse_space between snapshots to see what stays, and trace the surviving allocations back to the reference that keeps them alive. Think of the collector as a housekeeper who cleans up everything except the items you are still holding in your hands.