What a goroutine leak is and how to prevent it

A goroutine leak is a goroutine that never terminates and keeps consuming resources. You prevent it by giving every goroutine a clear exit condition, such as context cancellation or channel closure.

The silent killer in your server

You deploy a new feature. It works fine for an hour. Then the memory usage starts climbing. Not fast, but steady. The CPU usage jumps in spikes. After a day, the container hits its memory limit and the OOM killer shuts it down. You check the logs. Nothing crashed. No panics. Just a silent, slow death.

The culprit is likely a goroutine that started its job and never finished. It is stuck waiting for something that will never happen, holding onto its stack and everything it references, refusing to die. This is a goroutine leak. It is one of the most common concurrency bugs in Go, and it is insidious because the program keeps running while it bleeds resources.

What a leak actually is

A goroutine leak happens when a goroutine is spawned but never exits. Go's runtime keeps a goroutine alive until its function returns. If a goroutine blocks forever, it stays forever. Every goroutine takes up stack memory. The stack starts small, at a couple of kilobytes, and grows if needed. If you leak one goroutine per request, and you get a thousand requests a second, you lose at least a couple of megabytes of stack every second, plus whatever each goroutine references. You will run out of memory quickly.

Think of a restaurant kitchen. The chef sends an order to a line cook. The line cook starts cooking. If the line cook finishes and cleans the station, the kitchen is ready for the next order. If the line cook gets distracted and stands there staring at a pot that never boils, they occupy the station. The next order cannot use that station. If this happens enough times, the kitchen runs out of stations and stops serving food. The line cook did not quit. They are just stuck. That is a leak.

Goroutines are cheap. Leaks are expensive.

Minimal example

Here is the simplest leak: a goroutine waits on a channel that never receives a value.

package main

import (
    "fmt"
    "time"
)

// leakyWorker blocks forever because nobody sends to the channel.
func leakyWorker(done <-chan struct{}) {
    // This receive blocks until a value arrives.
    // No one sends to done, so this line never returns.
    <-done
    fmt.Println("This never prints")
}

func main() {
    // Create a channel but never close it or send to it.
    done := make(chan struct{})

    // Spawn the worker. It starts and immediately blocks.
    go leakyWorker(done)

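    // main sleeps for a second while the goroutine stays blocked.
    // When main returns, the process exits and takes the goroutine with it.
    // In a long-running server, main never returns and the goroutine stays stuck.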
    time.Sleep(time.Second)
    fmt.Println("Main finished, but the goroutine is still stuck")
}

# output:
Main finished, but the goroutine is still stuck

When you run this, main creates the channel and spawns leakyWorker. The worker hits <-done and stops. The scheduler sees it is blocked and moves on. main sleeps and prints. When main returns, the process terminates, killing the goroutine. But in a server, main never returns. The server loop keeps running. The goroutine sits there, holding its stack and everything reachable from it. Over time, thousands of these accumulate. Memory usage climbs. The garbage collector does not reclaim a goroutine that is still blocked, nor anything that goroutine references. The leak is invisible to the GC.

Realistic leak in an HTTP handler

Real leaks happen in pipelines or HTTP handlers. A common pattern is spawning a goroutine to do background work. If that work depends on a channel and the other side of the channel goes away, or if the goroutine forgets to check for cancellation, you leak.

Here is a realistic leak in an HTTP handler where a background goroutine outlives the request.

package main

import (
    "context"
    "fmt"
    "net/http"
    "time"
)

// processRequest simulates slow work.
// It ignores context cancellation, creating a leak risk.
func processRequest(ctx context.Context, resultCh chan<- string) {
    // Sleep simulates work.
    // In real code, this might be a database query or API call.
    time.Sleep(5 * time.Second)

    // Send result to channel.
    // If the receiver is gone, this blocks forever.
    resultCh <- "done"
}

// handler spawns a goroutine but doesn't clean up on timeout.
func handler(w http.ResponseWriter, r *http.Request) {
    resultCh := make(chan string)

    // Start background work.
    go processRequest(r.Context(), resultCh)

    // Wait for result or timeout.
    select {
    case res := <-resultCh:
        fmt.Fprintln(w, res)
    case <-time.After(1 * time.Second):
        // Timeout. Handler returns.
        // The goroutine is still running and will leak.
        http.Error(w, "timeout", http.StatusGatewayTimeout)
    }
}

The handler spawns processRequest. The work takes five seconds. The handler gives up after one second, sends a timeout response, and returns. The goroutine is still sleeping. When it wakes up, it tries to send to resultCh. The channel is unbuffered and no one is reading it anymore. The send blocks forever. The goroutine is now a zombie. It holds the stack. It holds its arguments. It holds the channel reference. Nothing can garbage collect it. The worst goroutine bug is the one that never logs.

How to fix and prevent leaks

The fix is to give every goroutine a way to exit. The standard tool is context.Context. Pass the context to the goroutine. Check for cancellation. If the context is done, stop the work and return.

By convention, context.Context goes first in the parameter list and is named ctx. Functions that take a context should respect cancellation and deadlines. This is the convention across the standard library and the ecosystem. If you write a function that takes a context, it should either check ctx.Done() itself or pass the context to other functions that do.

Here is the corrected handler.

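import (
    "context"
    "fmt"
    "net/http"
    "time"
)

// handlerFixed respects context cancellation.
func handlerFixed(w http.ResponseWriter, r *http.Request) {
    // ctx is cancelled when the client disconnects or after one second.
    ctx, cancel := context.WithTimeout(r.Context(), 1*time.Second)
    defer cancel()

    // Buffered channel so the send doesn't block if the handler returns.
    resultCh := make(chan string, 1)

    go func() {
        // Simulated slow work that gives up as soon as ctx is cancelled.
        work := time.NewTimer(5 * time.Second)
        defer work.Stop()
        select {
        case <-ctx.Done():
            return
        case <-work.C:
        }

        // Check the context again before sending the result.
        select {
        case <-ctx.Done():
            return
        case resultCh <- "done":
            // Send succeeded.
        }
    }()

    select {
    case <-ctx.Done():
        // Timed out or the client disconnected.
        http.Error(w, "timeout", http.StatusGatewayTimeout)
    case res := <-resultCh:
        fmt.Fprintln(w, res)
    }
}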

The goroutine checks ctx.Done() before sending. If the context is cancelled, it returns immediately. The channel is buffered to one, so the send does not block if the handler has already returned. The goroutine exits cleanly. No leak.

Run gofmt on your code. The community expects it. Most editors run it on save. It enforces a consistent style so you can focus on logic, not formatting.

Pitfalls and compiler behavior

The compiler will not catch a goroutine leak. Go does not analyze runtime behavior like that. You get no error message. The program compiles fine. The leak manifests at runtime.

If you try to send on a closed channel, you get a panic: panic: send on closed channel. That is not a leak. That is a crash. A leak is silent.
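
The difference is easy to see in a small sketch. The helper trySend below is hypothetical and uses recover only to keep the demonstration running; production code would normally avoid the panic by letting only the sender close the channel.

```go
package main

import "fmt"

// trySend (a hypothetical helper) attempts a send and converts the
// runtime panic from a closed channel into an error.
func trySend(ch chan int, v int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	ch <- v
	return nil
}

func main() {
	ch := make(chan int, 1)
	close(ch)

	// Sending on a closed channel panics: loud, immediate, easy to find.
	fmt.Println(trySend(ch, 1)) // prints "recovered: send on closed channel"

	// Receiving from a closed channel never blocks; it yields the zero
	// value and ok == false.
	v, ok := <-ch
	fmt.Println(v, ok) // prints "0 false"
}
```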

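If a goroutine closure captures a loop variable in Go versions before 1.22, every iteration shares the same variable. The compiler accepts this silently; it is go vet that reports loop variable i captured by func literal. In Go 1.22+, for modules whose go.mod declares go 1.22 or later, loop variables are scoped per iteration, so this is safer. But always be careful with closures in loops.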
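
One way to stay safe regardless of Go version is to pass the loop value to the goroutine as an argument. The sketch below assumes a made-up function squares; the mutex, the sort, and the input values are only there to make the result deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// squares (a hypothetical example) computes v*v for each input in its
// own goroutine. Passing v as an argument gives every goroutine its own
// copy, which behaves the same before and after the Go 1.22 change.
func squares(inputs []int) []int {
	var (
		mu  sync.Mutex
		out []int
		wg  sync.WaitGroup
	)
	for _, v := range inputs {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			out = append(out, v*v)
			mu.Unlock()
		}(v)
	}
	wg.Wait()
	sort.Ints(out) // goroutines finish in arbitrary order
	return out
}

func main() {
	fmt.Println(squares([]int{1, 2, 3})) // prints "[1 4 9]"
}
```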

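Calling time.After in a loop is a classic resource leak, though it leaks timers rather than goroutines. Each call creates a new timer. If the loop breaks or another case wins, that timer stays alive until it fires. Before Go 1.23 the garbage collector could not reclaim it until then. From Go 1.23 on, unreferenced timers are collectable, so the cost is mostly extra allocation. Either way, reusing one timer with time.NewTimer and calling Stop when you are done avoids the problem.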

// Bad: time.After leaks if the loop breaks early.
for {
    select {
    case <-time.After(1 * time.Second):
        // Work
    }
}

// Good: NewTimer can be stopped.
timer := time.NewTimer(1 * time.Second)
defer timer.Stop()
for {
    select {
    case <-timer.C:
        // Work
        timer.Reset(1 * time.Second)
    }
}

The compiler rejects type errors, for example cannot use x (untyped int constant) as string value in argument when you pass the wrong type, and it catches syntax errors. The loop variable i captured by func literal report comes from go vet, not the compiler. Neither tool detects a goroutine that will block forever. You need profiling tools.

Use runtime/pprof to profile active goroutines. Run go tool pprof on the profile. Look for goroutines that are blocked on channel operations. If you see thousands of goroutines stuck on the same line, you have a leak. Trust pprof. If memory climbs, check the goroutine profile.
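
As a starting point, the sketch below dumps the goroutine profile from inside a program using pprof.Lookup("goroutine"). The helper goroutineDump and the three deliberately stuck goroutines are invented for the demonstration. In a real server you would more likely import net/http/pprof and fetch /debug/pprof/goroutine from the running process.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// goroutineDump (a hypothetical helper) returns the goroutine profile as
// text. With debug=1, goroutines that share a stack are grouped and each
// group is prefixed with a count.
func goroutineDump() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	stuck := make(chan struct{}) // never closed: simulates a leak
	for i := 0; i < 3; i++ {
		go func() { <-stuck }()
	}
	time.Sleep(50 * time.Millisecond) // let the goroutines block

	dump, err := goroutineDump()
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Print(dump)
}
```

In the output, a group with a large count whose stack ends in the same channel operation is the usual signature of a leak. Passing debug=2 instead prints one full stack per goroutine, including what each one is waiting on.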

Decision matrix

Concurrency is a tool, not a goal. Pick the right pattern for the job.

Use context.Context when you need to cancel a long-running task or propagate a deadline.

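Use a buffered channel when the sender must be able to hand off a bounded number of values without waiting for a receiver, such as a single result. Once the buffer is full, sends block again; a channel never drops values on its own.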

Use select with a timeout when you need to prevent a goroutine from waiting indefinitely.

Use sync.WaitGroup when you need to wait for multiple goroutines to finish before proceeding.

Use a single goroutine with a loop when one worker processes a stream of tasks sequentially.

Use plain sequential code when the task is fast and does not require concurrency.

The simplest thing that works is usually the right thing.

Where to go next

A goroutine leak is like starting a worker who never stops working because they are waiting for a signal that never arrives. This wastes your computer's memory and processing power, eventually slowing down or crashing your application. You fix it by giving the worker a clear way to stop, like a 'quit' button or a timer, so they can finish their job and leave.