How to Handle Keep-Alive Connections in Go

The invisible handshake

You are building a service that calls three different microservices. Each request takes fifty milliseconds. You need to handle one hundred requests per second. If every request opens a fresh TCP connection, negotiates TLS, sends the request, reads the response, and tears down the socket, you spend more time on handshakes than on actual data. The CPU sits idle while the kernel waits for network round trips. Closed sockets pile up in TIME_WAIT and eat ephemeral ports. The system chokes.

Go solves this by default. You do not need to configure anything to get connection reuse. The standard library handles it silently. When you make your first request, Go opens a socket. Once you have read the response and closed its body, Go does not close the socket. It puts the socket back in a pool. Your second request grabs that same socket. No new handshake. No new TLS negotiation. The connection stays warm.

How connection reuse actually works

HTTP keep-alive is not a Go invention. It is the default behavior of HTTP/1.1 and HTTP/2. The protocol assumes the underlying TCP connection will stay open after the response finishes. The client and server agree to reuse it for the next request. Go's net/http package implements this through the http.Transport type. The transport owns the connection pool. It tracks every idle socket, keys them by scheme, host, and port, and hands them out when you call client.Get or client.Do.

The pool operates on three levers. MaxIdleConns caps the total number of idle connections across all hosts. MaxIdleConnsPerHost caps how many idle connections you can keep open for a single destination. IdleConnTimeout sets how long an unused connection survives before the transport closes it. The defaults are conservative. http.DefaultTransport sets MaxIdleConns to 100 and IdleConnTimeout to 90 seconds, and leaves MaxIdleConnsPerHost at its fallback of 2 (http.DefaultMaxIdleConnsPerHost). They work for most applications. They prevent socket exhaustion on the client side and avoid holding servers hostage with stale connections.
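To see what your own Go version ships with, a small sketch can read the three levers straight off http.DefaultTransport. The helper name defaultPoolLimits is made up for this example, and it assumes nothing in your program has replaced the default transport.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// defaultPoolLimits (hypothetical helper) reports the pool settings
// configured on http.DefaultTransport.
func defaultPoolLimits() (maxIdle, maxIdlePerHost int, idleTimeout time.Duration) {
	t, ok := http.DefaultTransport.(*http.Transport)
	if !ok {
		// Some other package swapped in a custom RoundTripper.
		return 0, 0, 0
	}
	perHost := t.MaxIdleConnsPerHost
	if perHost == 0 {
		// Zero means "fall back to the package-level default".
		perHost = http.DefaultMaxIdleConnsPerHost
	}
	return t.MaxIdleConns, perHost, t.IdleConnTimeout
}

func main() {
	maxIdle, perHost, timeout := defaultPoolLimits()
	fmt.Println("MaxIdleConns:", maxIdle)
	fmt.Println("MaxIdleConnsPerHost:", perHost)
	fmt.Println("IdleConnTimeout:", timeout)
}
```

Note that a zero-value &http.Transport{} is different: there, zero for MaxIdleConns and IdleConnTimeout means no limit at all.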

Connection reuse saves more than CPU cycles. TLS handshakes require multiple round trips and heavy cryptographic work. Reusing a connection skips the handshake entirely. DNS lookups happen only when the transport dials a new connection, so a reused socket skips those too. TCP window scaling and congestion control algorithms have time to warm up. The network stack performs better when it does not constantly tear down and rebuild state.

Keep-alive is automatic. You only touch the knobs when the defaults clash with your traffic pattern.

The default client

The standard library ships with http.DefaultClient. It is a ready-to-use instance that already reuses connections. You can call it directly, or you can create your own client with a custom transport. Creating your own client gives you control over timeouts, headers, and pool limits. It also makes your configuration explicit in code reviews.

Here is the simplest client that reuses connections across multiple calls:

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Default client already pools connections behind the scenes
	client := http.DefaultClient

	// First request opens a TCP socket and negotiates TLS
	resp1, err := client.Get("https://httpbin.org/get")
	if err != nil {
		panic(err)
	}
	// Reading the body to EOF and closing it returns the socket to the pool
	io.Copy(io.Discard, resp1.Body)
	resp1.Body.Close()

	// Second request can grab the same socket from the pool
	resp2, err := client.Get("https://httpbin.org/get")
	if err != nil {
		panic(err)
	}
	io.Copy(io.Discard, resp2.Body)
	resp2.Body.Close()

	fmt.Println("Both requests were eligible to share a connection")
}

The http.Client type is safe for concurrent use. Create one instance at startup and pass it to every handler or worker. Do not create a new client with its own transport inside a request handler. A client whose Transport field is nil falls back to the shared http.DefaultTransport, but every new http.Transport brings its own pool. Build one per request and you will leak sockets and exhaust file descriptors within minutes under load.

The default client trusts the pool. Trust the pool back.

What happens under the hood

When you call client.Get, the request travels through a fixed pipeline. The client builds a request, which carries a context, and hands it to the transport. The transport checks its internal map for an idle connection matching the target scheme, host, and port. Each pooled connection has a background read loop that notices when the server closes it, so dead sockets are usually dropped before they are handed out. If a healthy idle connection exists, the transport writes the HTTP request on it. If none exists, the transport dials a new TCP socket, runs the TLS handshake, and writes the request.

After the server sends the response, the client reads the headers and hands you a *http.Response. The body is a stream. The connection stays checked out until you have read the body to EOF and called resp.Body.Close(). At that point the transport decides what to do with the socket. If the server sent Connection: keep-alive (or nothing, since keep-alive is the default), the connection goes back to the pool. The transport marks it as idle and starts the timeout timer. If the server sent Connection: close, or you closed the body with unread bytes still pending, the transport tears down the socket instead.
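One way to watch a connection come back from the pool is the net/http/httptrace package, whose GotConn hook reports whether the transport reused an idle socket. The sketch below runs against a local httptest server rather than a real backend; the helper names fetch and reuseDemo are invented for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// fetch issues one GET, drains and closes the body, and reports whether
// the transport handed out a pooled connection for it.
func fetch(client *http.Client, url string) (bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	var reused bool
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// Read to EOF so the connection is eligible for reuse.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return false, err
	}
	return reused, nil
}

// reuseDemo makes two sequential requests against a throwaway local server.
func reuseDemo() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	client := srv.Client()
	if first, err = fetch(client, srv.URL); err != nil {
		return
	}
	second, err = fetch(client, srv.URL)
	return
}

func main() {
	first, second, err := reuseDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("first request reused a connection:", first)
	fmt.Println("second request reused a connection:", second)
}
```

The first request has to dial, so it reports false. The second should report true because the first body was drained and closed before it started.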

The pool enforces its limits aggressively. If you exceed MaxIdleConns, the transport closes the oldest idle connection to make room. If a host already holds MaxIdleConnsPerHost idle connections, the transport closes the connection being returned instead of pooling it. If an idle connection sits past IdleConnTimeout, a timer closes it. This prevents your process from holding onto dead sockets or consuming memory for connections that will never be used again.

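The transport does not implement HTTP/1.1 pipelining. Each HTTP/1.x connection carries one request at a time. HTTP/2 multiplexing is handled automatically. It activates over TLS when the server advertises support, and many requests then share a single connection. You do not need to configure it for simple setups. The pool just hands out the right underlying connection type.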
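To check which protocol a response actually arrived on, inspect resp.Proto. The following sketch spins up a local TLS test server with and without HTTP/2 enabled; negotiatedProto is a hypothetical helper, and the httptest server stands in for a real backend.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// negotiatedProto starts a local TLS server, makes one request through the
// server's matching client, and returns the protocol of the response.
func negotiatedProto(enableHTTP2 bool) (string, error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	srv.EnableHTTP2 = enableHTTP2
	srv.StartTLS()
	defer srv.Close()

	resp, err := srv.Client().Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return "", err
	}
	return resp.Proto, nil
}

func main() {
	for _, h2 := range []bool{false, true} {
		proto, err := negotiatedProto(h2)
		if err != nil {
			panic(err)
		}
		fmt.Printf("EnableHTTP2=%v -> %s\n", h2, proto)
	}
}
```

Logging resp.Proto for a real backend is a cheap way to confirm whether your traffic is multiplexed or running one request per connection.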

Pool limits are safety rails. Set them once and let the transport manage the rest.

Tuning the pool for production

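Default limits work for low-traffic services. They break when you fan out to dozens of backends or handle thousands of concurrent requests. High concurrency exposes two problems. The global pool cap might be too low, causing the transport to close healthy connections prematurely. The per-host idle cap might be too low. It does not limit how many requests run in parallel, but every connection beyond the cap is closed as soon as its request finishes, so a burst of traffic to one backend keeps paying for fresh handshakes. If you need a hard ceiling on parallel connections to a host, that is a separate field, MaxConnsPerHost.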

Here is a tuned transport for a high-throughput proxy or aggregator:

package main

import (
	"net/http"
	"time"
)

func main() {
	// A bare literal starts from zero values: no proxy and no dial or TLS
	// handshake timeouts. Clone http.DefaultTransport if you want those.
	transport := &http.Transport{
		// Bump the global idle pool to survive traffic spikes
		MaxIdleConns: 400,
		// Keep more warm connections to each backend
		MaxIdleConnsPerHost: 100,
		// Drop stale sockets faster to free OS file descriptors
		IdleConnTimeout: 30 * time.Second,
		// Keep-alives stay on; false is the zero value, shown only for clarity
		DisableKeepAlives: false,
	}

	// Attach the transport to a client with a hard timeout
	client := &http.Client{
		Transport: transport,
		// Covers DNS, TCP, TLS, and full response read
		Timeout: 5 * time.Second,
	}

	_ = client
}

The Timeout field on http.Client is a hard deadline. It covers everything from DNS resolution to reading the last byte of the response body. If the deadline passes, the client cancels the request and returns an error. On HTTP/1.x the transport does not reuse that connection. It closes it, because unread response bytes may still be in flight. On HTTP/2 only the affected stream is reset. Either way, slow servers cannot starve your pool.

You can also use context.WithTimeout for per-request deadlines. The context does not override the client timeout. Whichever deadline expires first ends the request. If you pass a context with a shorter deadline, the request fails early. On HTTP/1.x a request cancelled mid-flight costs you its connection: the transport closes the socket instead of returning it to the pool.
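A per-request deadline can be sketched like this. The helper getWithDeadline, the deliberately slow local test server, and the 50 millisecond budget are all assumptions made for the example, not recommendations.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithDeadline performs a GET that must finish, body included, within d.
func getWithDeadline(client *http.Client, url string, d time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel() // release the timer as soon as the call returns

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// slowCallTimesOut calls a server that stalls far longer than the deadline
// and reports whether the error wraps context.DeadlineExceeded.
func slowCallTimesOut() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(2 * time.Second): // simulated slow backend
		case <-r.Context().Done(): // client gave up
		}
	}))
	defer srv.Close()

	err := getWithDeadline(srv.Client(), srv.URL, 50*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("deadline exceeded:", slowCallTimesOut())
}
```

Checking with errors.Is lets the caller tell a deadline apart from other failures, since the client wraps the context error in a *url.Error.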

Convention note: http.Client is designed to be created once and shared. The community treats it as a singleton per service. Pass it through constructors or dependency injection. Do not recreate it per request. The boilerplate of wiring it through your code is cheaper than debugging socket exhaustion.
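Wiring a shared client through constructors might look like the sketch below. UserService, OrderService, and wire are hypothetical names chosen for illustration; the point is only that both services hold the same *http.Client.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// UserService and OrderService are placeholder types standing in for
// whatever components of your program make outbound HTTP calls.
type UserService struct{ client *http.Client }
type OrderService struct{ client *http.Client }

// Constructors accept the client instead of building one themselves.
func NewUserService(c *http.Client) *UserService   { return &UserService{client: c} }
func NewOrderService(c *http.Client) *OrderService { return &OrderService{client: c} }

// wire builds one client at startup and injects it everywhere.
func wire() (*UserService, *OrderService) {
	shared := &http.Client{Timeout: 5 * time.Second}
	return NewUserService(shared), NewOrderService(shared)
}

func main() {
	users, orders := wire()
	fmt.Println("same client:", users.client == orders.client)
}
```

Injecting the client also makes tests easier, because a test can pass in a client pointed at a local test server.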

Tune the pool to match your traffic shape. Do not guess.

When keep-alive fights back

Connection reuse is not always helpful. Some legacy proxies terminate persistent connections abruptly. Some load balancers track connections by IP and get confused when a single client reuses sockets for different virtual hosts. Some APIs return Connection: close on every response because they want to reset state or enforce rate limits per handshake.

When the network infrastructure fights you, you can force Go to open a fresh connection for every request. Set DisableKeepAlives to true on the transport. The transport will close the socket immediately after reading the response. No pooling. No reuse.

transport := &http.Transport{
	// Force a new TCP connection for every single request
	DisableKeepAlives: true,
}

This kills performance. You pay the full TCP and TLS handshake cost on every call. You consume more CPU and memory. You increase latency. Use it only when you have measured the problem and confirmed that connection reuse is causing failures. Do not enable it as a debugging shortcut and leave it in production.

Another common pitfall is forgetting to close the response body. The compiler will not catch it. The code compiles cleanly. At runtime, the connection stays checked out of the pool. The next request cannot reuse it, so the transport dials a new socket. File descriptors and the goroutines that service each connection pile up. If you have set MaxConnsPerHost, new requests block waiting for a free slot instead. You eventually see too many open files or context deadline exceeded errors in your logs. The fix is always the same: defer resp.Body.Close() immediately after checking for errors, and read the body to EOF if you want the connection reused.
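The cost of a leaked body is easy to measure against a local test server by counting how many TCP connections it accepts. In this sketch drainAndClose and countDials are hypothetical helpers, and the server's ConnState hook is used only as a connection counter.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// drainAndClose reads any unread bytes and closes the body so the
// transport can put the connection back in the idle pool.
func drainAndClose(body io.ReadCloser) {
	io.Copy(io.Discard, body)
	body.Close()
}

// countDials sends n sequential GETs and returns how many TCP connections
// the server accepted. With closeBodies false the bodies are leaked until
// the function returns.
func countDials(n int, closeBodies bool) (int64, error) {
	var dials int64
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&dials, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	var leaked []io.Closer
	defer func() {
		for _, b := range leaked {
			b.Close() // clean up the deliberate leak before the server stops
		}
	}()

	client := srv.Client()
	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		if closeBodies {
			drainAndClose(resp.Body)
		} else {
			leaked = append(leaked, resp.Body)
		}
	}
	return atomic.LoadInt64(&dials), nil
}

func main() {
	closed, _ := countDials(5, true)
	leaked, _ := countDials(5, false)
	fmt.Println("connections with bodies closed:", closed)
	fmt.Println("connections with bodies leaked:", leaked)
}
```

With bodies drained and closed, five sequential requests should ride on a single connection. With bodies leaked, every request has to dial its own.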

Goroutine leaks also happen when you spawn a background worker that holds a connection open and never returns it. Always attach a context with a deadline to long-running HTTP calls. Cancel the context when the parent scope exits. The transport will abort the request and release the socket.

The worst connection bug is the one that never logs. Close bodies. Respect deadlines. Watch your file descriptor count.

Choosing your connection strategy

Use the default client when you are calling a handful of external APIs and want connection reuse without configuration. Use a custom transport with tuned limits when your service makes hundreds of concurrent requests to the same host and the default pool cap causes premature socket closure. Use DisableKeepAlives set to true when you are routing through a legacy proxy that drops persistent connections or when every request must resolve DNS and dial afresh, for example behind DNS-based routing. Use CloseIdleConnections on the transport when your program is shutting down and you want to drain the pool gracefully instead of leaving sockets dangling. Use a single client instance shared across all goroutines when you want predictable memory usage and want to avoid socket exhaustion.
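Draining the pool can be sketched with Client.CloseIdleConnections, which forwards to the transport. The example below uses a local test server and an httptrace hook to show that the next request after a drain has to dial again; getReused and drainDemo are names invented for this sketch.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// getReused makes one GET, drains the body, and reports whether the
// transport served it from an idle pooled connection.
func getReused(client *http.Client, url string) (bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	var reused bool
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return false, err
	}
	return reused, nil
}

// drainDemo warms the pool, confirms reuse, drains the pool, and checks
// that the following request needs a fresh connection.
func drainDemo() (beforeDrain, afterDrain bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	client := srv.Client()
	if _, err = getReused(client, srv.URL); err != nil { // warm-up dial
		return
	}
	if beforeDrain, err = getReused(client, srv.URL); err != nil {
		return
	}
	client.CloseIdleConnections() // closes idle sockets; in-flight requests are untouched
	afterDrain, err = getReused(client, srv.URL)
	return
}

func main() {
	before, after, err := drainDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("reused before drain:", before)
	fmt.Println("reused after drain:", after)
}
```

In a real service the call would sit in your shutdown path, after you have stopped issuing new requests.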

Connection pooling is infrastructure. Treat it like one.

Where to go next

Keep-alive connections let your program reuse the same network line for multiple requests instead of dialing up a new one every time. This saves time and resources, much like keeping a phone line open for a conversation rather than hanging up and redialing for every sentence. You only turn this off if you specifically need to force a fresh connection for every single request.