How to Implement Timeout Middleware in Go

Implement Go timeout middleware by wrapping your handler with a context that enforces a deadline using context.WithTimeout.

The hanging request problem

A production server receives a request. The handler calls a downstream API that is experiencing network latency. The call blocks for forty seconds. The connection stays open. Another request arrives. Another goroutine spawns. Another downstream call hangs. Within minutes, the server has thousands of blocked goroutines holding file descriptors and memory. New connections get rejected. The service appears dead.

This happens because Go's net/http server creates a fresh goroutine for every incoming request. That goroutine lives until the handler returns. If the handler blocks indefinitely, the goroutine never exits. The server does not automatically kill slow requests. You have to tell it when to give up.

Timeout middleware solves this by attaching a deadline to the request before it reaches your logic. When the deadline passes, the context signals every part of the request chain that is listening to stop, clean up, and return. Code that honors the signal returns promptly. The goroutine exits. The server stays healthy.

How middleware wraps the request chain

Middleware in Go is just a function that takes an http.Handler and returns a new http.Handler. The http.Handler interface requires a single method: ServeHTTP(http.ResponseWriter, *http.Request). When you wrap a handler, you create a chain. The outermost wrapper runs first. It can modify the request, inspect the response, or enforce rules before passing control inward.

Think of middleware as a series of checkpoints. Each checkpoint gets the request, does its job, and hands it to the next one. The final checkpoint is your actual business logic. When the logic finishes, control flows back outward through the chain. Timeout middleware sits near the top of this chain. It sets a timer, attaches it to the request, and ensures the timer is cleaned up when the request finishes or fails.
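The checkpoint order can be observed directly. The following is a minimal sketch, not a production pattern: tag and runChain are hypothetical names introduced only for this illustration, and the request is served in-process through httptest.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tag is a hypothetical middleware that records when control enters and leaves it.
func tag(name string, trace *[]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		*trace = append(*trace, name+" in")
		next.ServeHTTP(w, r) // hand control inward
		*trace = append(*trace, name+" out")
	})
}

// runChain sends one request through outer -> inner -> handler
// and returns the order in which each checkpoint ran.
func runChain() []string {
	var trace []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
	})
	chain := tag("outer", &trace, tag("inner", &trace, final))
	chain.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return trace
}

func main() {
	fmt.Println(runChain()) // [outer in inner in handler inner out outer out]
}
```

The outermost wrapper is the first to see the request and the last to see control return, which is why a timeout wrapper placed there covers everything beneath it.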

Go carries deadlines and cancellation signals through context.Context. The context is a lightweight object that travels with the request. It holds a channel that closes when a deadline is reached or when a parent context is cancelled. Any function that receives the context can listen to that channel and stop working early.
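As a small illustration of the two ways that channel can close, here is a sketch using only the standard context package. The helper names doneReason, byDeadline, and byParent are made up for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// doneReason blocks until ctx is done and reports why it ended.
func doneReason(ctx context.Context) string {
	<-ctx.Done()
	switch {
	case errors.Is(ctx.Err(), context.DeadlineExceeded):
		return "deadline"
	case errors.Is(ctx.Err(), context.Canceled):
		return "canceled"
	}
	return "unknown"
}

// byDeadline lets the context's own timer fire.
func byDeadline() string {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return doneReason(ctx)
}

// byParent gives the child a long timeout, then cancels the parent first.
func byParent() string {
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := context.WithTimeout(parent, time.Hour)
	defer cancelChild()
	cancelParent() // cancellation propagates from parent to child
	return doneReason(child)
}

func main() {
	fmt.Println(byDeadline(), byParent()) // deadline canceled
}
```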

Context is plumbing. Run it through every long-lived call site.

The minimal timeout wrapper

Here is the smallest working timeout wrapper. It creates a deadline, attaches it to the request, and passes control to the next handler.

func TimeoutMiddleware(next http.Handler, timeout time.Duration) http.Handler {
    // Return a new handler that implements the ServeHTTP method
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Derive a new context with a hard deadline from the request's existing context
        ctx, cancel := context.WithTimeout(r.Context(), timeout)
        // Ensure the timer is stopped and resources are freed when this function returns
        defer cancel()
        // Replace the request's context with the deadline-aware version
        r = r.WithContext(ctx)
        // Hand control to the next handler in the chain
        next.ServeHTTP(w, r)
    })
}

The wrapper returns an http.HandlerFunc, which is a function type that satisfies the http.Handler interface. Inside, it calls context.WithTimeout. This function returns a new context and a cancellation function. The new context carries a timer that will automatically fire after the specified duration. The cancellation function stops the timer early if the request finishes before the deadline.

The defer cancel() call is mandatory. Without it, the context and its timer stay alive until the deadline expires, even if the handler returned long before. On a high-traffic server with generous timeouts, thousands of abandoned timers hold memory they no longer need. Deferring the cancellation guarantees cleanup regardless of how the function exits.

The r.WithContext(ctx) call creates a shallow copy of the request with the new context attached. The original request object remains unchanged. This is safe because http.Request is designed to be copied when you need to modify its context. The next handler in the chain receives the modified request and can access the deadline through r.Context().
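To see the shallow copy in action, the sketch below repeats the wrapper so it compiles on its own and then checks who can see the deadline. deadlineSeen is a hypothetical helper, and the request is served in-process through httptest instead of a real listener.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// TimeoutMiddleware repeats the wrapper from the text so this sketch is self-contained.
func TimeoutMiddleware(next http.Handler, timeout time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// deadlineSeen reports whether the inner handler saw a deadline, and
// whether the caller's original request object gained one.
func deadlineSeen() (inner, original bool) {
	h := TimeoutMiddleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, inner = r.Context().Deadline()
	}), 2*time.Second)

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	_, original = req.Context().Deadline()
	return inner, original
}

func main() {
	inner, original := deadlineSeen()
	fmt.Println(inner, original) // true false
}
```

The inner handler sees the deadline; the request object held by the caller does not, because WithContext returned a copy.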

Goroutines are cheap. Context cancellation is not optional.

What happens under the hood

When a request arrives, the HTTP server calls the outermost middleware. The middleware creates a context with a deadline. It defers the cancellation function. It swaps the request's context. It calls the next handler.

If the handler finishes quickly, it returns. The deferred cancel() runs. The timer is stopped. The middleware returns. The HTTP server sends the response and reclaims the goroutine.

If the handler takes too long, the context timer fires. The context's Done() channel closes. Any code listening to that channel receives a signal. The handler should check ctx.Err() and see context.DeadlineExceeded. It should stop its work, clean up local resources, write an error response if nothing has been sent yet, and return. The middleware then returns, and the server finishes the response as usual.

Nothing forces the handler to stop. The call to next.ServeHTTP is an ordinary synchronous function call, so the middleware cannot return until the handler does, and the server does not close the connection just because a context deadline passed. If your handler ignores the context, it keeps running, the client keeps waiting, and the goroutine and connection stay occupied until the work finishes on its own. If you need the server to answer on time even when a handler does not cooperate, the standard library's http.TimeoutHandler runs the handler in a separate goroutine and replies with 503 Service Unavailable once the deadline passes.

You must design your handlers to respect the context. The middleware only sets the deadline. The handler must listen to it.
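When a handler cannot be trusted to listen, one option is the standard library's http.TimeoutHandler, which sets the same kind of context deadline but also answers the client with a 503 on its own once the time is up. The sketch below is an illustration under assumptions: stubborn and statusWithTimeout are hypothetical names, and the durations are arbitrary.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// stubborn is a hypothetical handler that ignores its context entirely.
func stubborn(w http.ResponseWriter, r *http.Request) {
	time.Sleep(300 * time.Millisecond)
	// If the timeout already fired, this write fails with http.ErrHandlerTimeout.
	w.Write([]byte("done"))
}

// statusWithTimeout serves one request through http.TimeoutHandler
// and returns the status code the client would see.
func statusWithTimeout(limit time.Duration) int {
	h := http.TimeoutHandler(http.HandlerFunc(stubborn), limit, "request timed out")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusWithTimeout(50 * time.Millisecond)) // 503: handler too slow
	fmt.Println(statusWithTimeout(2 * time.Second))       // 200: handler finished in time
}
```

Even here the stubborn handler's goroutine keeps running until its sleep ends. The client gets a timely answer, but the wasted work only stops if the handler itself respects the context.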

Building a handler that respects the deadline

A real handler needs to check the context regularly. Here is a handler that simulates work and stops when the deadline hits.

func slowHandler(w http.ResponseWriter, r *http.Request) {
    // Extract the deadline-aware context from the request
    ctx := r.Context()
    // Simulate work in small increments to allow early cancellation
    for i := 0; i < 100; i++ {
        // Check if the deadline has passed or the client disconnected
        select {
        case <-ctx.Done():
            // Return early with a 503 status before doing more work
            http.Error(w, "request timed out", http.StatusServiceUnavailable)
            return
        default:
            // Perform a small unit of work
            time.Sleep(100 * time.Millisecond)
        }
    }
    // All work completed successfully before the deadline
    w.WriteHeader(http.StatusOK)
    w.Write([]byte("done"))
}

A select with a default case never blocks. On each iteration it checks whether ctx.Done() is already closed. If it is, the handler returns. Otherwise the default case performs the next unit of work. Cancellation is therefore noticed within one unit of work (about 100 milliseconds here), not instantly. This pattern lets you interleave work with cancellation checks without blocking forever.

When the deadline passes, ctx.Err() returns context.DeadlineExceeded. The handler writes a 503 Service Unavailable response and returns. The middleware's deferred cancel() runs. The timer stops. The goroutine exits cleanly.

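If you skip the context check and call a blocking function like time.Sleep(10 * time.Second), the handler will ignore the deadline. The middleware is itself blocked inside next.ServeHTTP, so it cannot return early: the goroutine, the connection, and the waiting client all stay tied up for the full ten seconds. The goroutine does exit when the sleep ends, but under load these stragglers pile up exactly like the hanging requests described at the start. If the blocking call never returns, the goroutine never exits, and the server eventually runs out of memory.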
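A blocking wait can usually be made cancellable by racing it against the context. The following sketch replaces a bare time.Sleep with a helper that gives up when the context ends; sleepCtx and elapsedUnderDeadline are hypothetical names, not standard library functions.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx (hypothetical helper) pauses for d, but returns early
// with ctx.Err() if the context ends first.
func sleepCtx(ctx context.Context, d time.Duration) error {
	t := time.NewTimer(d)
	defer t.Stop() // release the timer if the context wins the race
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-t.C:
		return nil
	}
}

// elapsedUnderDeadline reports how long a ten-second sleep actually
// takes when the surrounding context has the given deadline.
func elapsedUnderDeadline(deadline time.Duration) (time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), deadline)
	defer cancel()
	start := time.Now()
	err := sleepCtx(ctx, 10*time.Second)
	return time.Since(start), err
}

func main() {
	took, err := elapsedUnderDeadline(50 * time.Millisecond)
	fmt.Println(took < time.Second, err) // true context deadline exceeded
}
```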

Always pass ctx as the first parameter to functions that perform I/O. Name it ctx. Respect cancellation and deadlines. The Go community accepts this convention because it makes the cancellation path explicit and consistent across packages.
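For the downstream API call from the opening scenario, following this convention might look like the sketch below. It assumes the call is a plain HTTP GET; fetch and timedOut are hypothetical names, and the slow server is a local httptest stand-in for the real dependency.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch (hypothetical helper) performs a GET that is abandoned as soon as ctx is done.
func fetch(ctx context.Context, client *http.Client, url string) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err // wraps context.DeadlineExceeded when the deadline fired
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// timedOut calls a deliberately slow test server under a short deadline
// and reports whether the call failed with context.DeadlineExceeded.
func timedOut() bool {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(2 * time.Second):
		case <-r.Context().Done(): // client gave up; stop waiting
		}
	}))
	defer slow.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	_, err := fetch(ctx, slow.Client(), slow.URL)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(timedOut()) // true
}
```

Inside a real handler, the ctx passed to fetch would be r.Context(), so the deadline set by the middleware reaches the outbound call without extra wiring.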

Common traps and compiler feedback

Forgetting to return the correct type breaks the build. If you return the bare function literal without wrapping it in http.HandlerFunc, or if you mismatch the signature, the compiler rejects the program with an error along the lines of cannot use func literal (value of type func(w http.ResponseWriter, r *http.Request)) as http.Handler value in return statement. The exact wording varies by Go version. A plain function has no ServeHTTP method, so it does not implement the interface. The fix is to wrap the function in http.HandlerFunc(...).

If you forget to call defer cancel(), the compiler will not complain, although go vet reports it through its lostcancel check. The program will compile and run. The waste appears gradually in production, because each context and timer lingers until its deadline expires. Profiling tools will show thousands of active timers. The fix is to always pair context.WithTimeout or context.WithCancel with an immediate defer cancel().

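Handlers that write after the timeout fires behave differently depending on how the timeout is enforced. With the context-only wrapper shown earlier, the connection is still open, so a late write succeeds and the client simply receives its response long after the deadline. With http.TimeoutHandler, the 503 has already been sent and late writes fail with http.ErrHandlerTimeout. The real panic risk is a goroutine that outlives the handler: a ResponseWriter must not be used after ServeHTTP has returned, and doing so can panic. The message http: superfluous response.WriteHeader call is a log line produced by calling WriteHeader twice, not a panic. The fix is to check ctx.Err() before any I/O or response writing, and never to hand the ResponseWriter to a goroutine that may outlive the handler.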

If you pass the wrong type to a function expecting a context, the compiler complains with cannot use r (variable of type *http.Request) as context.Context value in argument. The request is not a context. It carries one. Use r.Context() to extract it.

The if err != nil { return err } pattern looks verbose. It is verbose by design. The community accepts the boilerplate because it makes the unhappy path visible. When you check ctx.Err(), you are following the same pattern. Write the check. Return early. Keep the happy path clean.

The worst goroutine bug is the one that never logs.

Choosing the right timeout strategy

Timeouts belong at different layers depending on your architecture. Pick the layer that matches your failure mode.

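Use timeout middleware when you want a uniform deadline across all routes and you want to protect the server from runaway handlers. Slow clients are a separate problem, handled by the ReadTimeout, ReadHeaderTimeout, and WriteTimeout fields on http.Server.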

Use router-level timeouts when different endpoints have different performance guarantees, such as a fast health check versus a heavy report generation endpoint.

Use database or HTTP client-level timeouts when you want to fail fast at the I/O boundary instead of killing the entire request, allowing the handler to retry or return a partial result.

Use no timeout when the operation is local, deterministic, and guaranteed to complete in microseconds, such as reading from an in-memory cache or performing a mathematical calculation.
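The router-level option can be sketched by applying a context-based wrapper per route instead of around the whole mux. Everything named here is an assumption for illustration: withTimeout is a hypothetical variant of the wrapper with its arguments reordered, and /healthz and /reports are made-up paths with arbitrary budgets.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// withTimeout (hypothetical name) attaches a per-route deadline to the request context.
func withTimeout(d time.Duration, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), d)
		defer cancel()
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// newMux registers two hypothetical routes with different time budgets.
func newMux() *http.ServeMux {
	// report writes the remaining budget so the difference is visible.
	report := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		deadline, _ := r.Context().Deadline()
		fmt.Fprint(w, time.Until(deadline).Round(time.Second))
	})
	mux := http.NewServeMux()
	mux.Handle("/healthz", withTimeout(time.Second, report))    // fast health check
	mux.Handle("/reports", withTimeout(30*time.Second, report)) // heavy report
	return mux
}

// budgetFor returns the body served for path.
func budgetFor(path string) string {
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(budgetFor("/healthz"), budgetFor("/reports")) // 1s 30s
}
```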

Middleware order matters. Place timeout middleware before logging or authentication middleware if you want the timer to start before any processing begins. Place it after authentication if you want to exclude unauthenticated requests from the timer budget. Test the order. Measure the impact.

Where to go next