The zombie connection problem
You send a request to your database. The query runs. The database is busy. Your Go program sits there, holding a connection, waiting for a response that never comes. One request hangs. Then ten. Then a hundred. Your connection pool fills up with dead weight. New requests fail because there are no free connections left. The service grinds to a halt.
This is the zombie connection problem. It happens when you forget to tell Go how long to wait before giving up. Without timeouts, a single slow query or a dropped network packet can exhaust your resources. Timeouts are not optional safety features. They are the boundaries that keep your system stable under load.
Timeouts are deadlines
A timeout is a deadline. It tells the runtime how long to wait for an I/O operation before aborting. Go gives you two main mechanisms for bounding database work. The first is the set of pool settings on your *sql.DB instance. These control the lifecycle of connections: how many you can have, how long they stay idle, and when they get rotated. The second is the context package. This controls the duration of a single query or operation.
You also need to handle HTTP transport timeouts if you are talking to a database over an HTTP API, such as a database proxy or a cloud provider's REST endpoint. The net/http package gives you fine-grained control over the TCP handshake, TLS negotiation, and data transfer.
The GODEBUG environment variable does not control timeouts. It controls internal runtime flags for debugging, such as disabling HTTP/2 or changing tar path security. Using GODEBUG to manage timeouts is a dead end. Configuration belongs in code, not in environment variables reserved for runtime internals.
Configure the connection pool
The database/sql package uses a pool of connections. When you call sql.Open, you get a *sql.DB object. This object is a pool, not a single connection. It is safe for concurrent use by multiple goroutines and designed to be shared across them. The pool manages the lifecycle of connections behind the scenes.
Convention aside: sql.Open is lazy. It validates its arguments but does not necessarily connect to the database. This is by design. It lets you set up pool parameters before any network traffic happens. The first query, or an explicit Ping, triggers the actual connection.
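The laziness is easy to observe without a real database. The sketch below registers a stand-in driver whose connection attempts always fail; the driver name "downdriver", the type downDriver, and the helper openThenPing are all hypothetical and exist only for this illustration. It assumes nothing beyond the standard library.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"time"
)

// errUnreachable simulates a database server that cannot be reached.
var errUnreachable = errors.New("downdriver: server unreachable")

// downDriver is a hypothetical driver whose every connection attempt fails.
type downDriver struct{}

func (downDriver) Open(dsn string) (driver.Conn, error) {
	return nil, errUnreachable
}

func init() {
	// Register the stand-in under a made-up name.
	sql.Register("downdriver", downDriver{})
}

// openThenPing returns the error from sql.Open and the error from the
// first real connection attempt, which PingContext triggers.
func openThenPing() (openErr, pingErr error) {
	db, openErr := sql.Open("downdriver", "ignored-dsn")
	if openErr != nil {
		return openErr, nil
	}
	defer db.Close()

	// Bound the ping so a hanging server cannot stall startup.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return nil, db.PingContext(ctx)
}
```

With this stand-in, sql.Open reports no error and the failure only appears at PingContext. In a real service, a bounded PingContext at startup is one way to find out early that the database is unreachable.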
Here's the minimal pool configuration. You set the limits immediately after opening the database.
import (
	"database/sql"
	"log"
	"time"

	// Assumption: lib/pq is one driver that registers itself as "postgres";
	// substitute whichever driver you actually use.
	_ "github.com/lib/pq"
)

func newDB() *sql.DB {
	// sql.Open validates its arguments but does not connect yet
	db, err := sql.Open("postgres", "host=localhost user=app password=secret")
	if err != nil {
		log.Fatal(err)
	}
	// Cap total connections to prevent overwhelming the database
	db.SetMaxOpenConns(25)
	// Allow up to 25 idle connections to stay open for instant reuse
	db.SetMaxIdleConns(25)
	// Retire connections after 5 minutes so the pool picks up network changes
	db.SetConnMaxLifetime(5 * time.Minute)
	return db
}
SetMaxOpenConns sets the maximum number of open connections to the database. Without this cap, a traffic spike could create thousands of connections and crash the database server. The default is zero, which means unlimited. Always set a limit.
SetMaxIdleConns sets the maximum number of connections in the idle pool. Idle connections are open connections that no query is currently using; the pool keeps them around for reuse. If you set this to zero, the pool closes connections as soon as they are returned. This adds latency because every request must establish a new connection. The default is 2, which is low for a busy service. Setting this equal to MaxOpenConns is a common pattern for high-throughput services.
SetConnMaxLifetime sets the maximum amount of time a connection may be reused. Once a connection is older than this, the pool closes it instead of handing it out again. This matters for handling network changes. If a load balancer rotates IP addresses or a failover moves the database to a new host, long-lived connections can keep pointing at the old state, and a renewed TLS certificate is only seen by new connections. Rotating connections ensures the pool eventually reflects the current network state. Five minutes is a reasonable starting point.
The pool is a resource manager. Treat it like one.
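What the cap means in practice: once MaxOpenConns connections are checked out, the next caller waits until one is returned or its own context expires. The sketch below shows this with a hypothetical do-nothing driver ("stub", stubDriver, stubConn) so it runs without a database; the helper names are assumptions made for the illustration.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"time"
)

// stubDriver and stubConn are hypothetical stand-ins for a real driver.
// Connections open instantly and do nothing.
type stubDriver struct{}

func (stubDriver) Open(dsn string) (driver.Conn, error) { return stubConn{}, nil }

type stubConn struct{}

func (stubConn) Prepare(query string) (driver.Stmt, error) {
	return nil, errors.New("stub: not implemented")
}
func (stubConn) Close() error { return nil }
func (stubConn) Begin() (driver.Tx, error) {
	return nil, errors.New("stub: not implemented")
}

func init() { sql.Register("stub", stubDriver{}) }

// secondCheckoutErr holds the only allowed connection, then tries to check
// out another one with a 50ms deadline and returns the resulting error.
func secondCheckoutErr() error {
	db, err := sql.Open("stub", "")
	if err != nil {
		return err
	}
	defer db.Close()
	db.SetMaxOpenConns(1)

	// Take the single connection and keep it, like a hung query would.
	first, err := db.Conn(context.Background())
	if err != nil {
		return err
	}
	defer first.Close()

	// The pool is full, so this call waits until the deadline passes.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	second, err := db.Conn(ctx)
	if err == nil {
		second.Close()
	}
	return err
}

// poolExhausted reports whether the second checkout failed on its deadline.
func poolExhausted() bool {
	return errors.Is(secondCheckoutErr(), context.DeadlineExceeded)
}
```

Without the deadline on the second checkout, that call would wait for as long as the first connection stays busy, which is the pile-up described at the start of this article.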
Context and query deadlines
Pool settings control the infrastructure. They do not control individual queries. If a query is slow, the pool settings won't stop it. You need a context to set a deadline for the query itself.
The context package is the standard way to pass deadlines and cancellation signals. Functions that perform I/O should accept a context.Context as the first parameter. The convention is to name it ctx. The context carries the deadline. When the deadline passes, the context is canceled. A driver that supports cancellation notices this and aborts the query; whether the work also stops on the database side depends on the driver and the database.
Here's how you use a context to limit query duration. You create a context with a timeout and pass it to QueryContext.
import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

func getUser(db *sql.DB, id int) (string, error) {
	// Create a context that cancels after 2 seconds
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	// Always defer cancel to release resources
	defer cancel()
	var name string
	// Pass ctx to the query so the driver can cancel on deadline
	err := db.QueryRowContext(ctx, "SELECT name FROM users WHERE id = $1", id).Scan(&name)
	if err != nil {
		return "", fmt.Errorf("query user: %w", err)
	}
	return name, nil
}
The context.WithTimeout function returns a context and a cancel function. The cancel function must be called to release resources associated with the context. Using defer cancel ensures the cleanup happens when the function returns.
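The example starts from context.Background(). When a function instead receives a ctx from its caller, as the convention suggests, context.WithTimeout can only shorten the time available: a child context never outlives its parent's deadline. A minimal sketch, with the hypothetical helpers effectiveBudget and childBudgetUnderShortParent:

```go
package main

import (
	"context"
	"time"
)

// effectiveBudget derives a child context with its own timeout from parent
// and reports how much time the child actually has left.
func effectiveBudget(parent context.Context, own time.Duration) time.Duration {
	ctx, cancel := context.WithTimeout(parent, own)
	defer cancel()

	deadline, ok := ctx.Deadline()
	if !ok {
		return 0 // not reached here: WithTimeout always sets a deadline
	}
	return time.Until(deadline)
}

// childBudgetUnderShortParent asks for an hour under a 2-second parent.
func childBudgetUnderShortParent() time.Duration {
	parent, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return effectiveBudget(parent, time.Hour)
}
```

Here the child requests an hour but ends up with a little under two seconds, because the parent's deadline comes first. This is why passing the caller's ctx down to the query keeps the whole request inside one overall budget.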
If the query takes longer than two seconds, the context expires. A driver that supports cancellation cancels the query, and the Scan call returns an error. You can check whether it was a timeout with errors.Is(err, context.DeadlineExceeded), although some drivers report their own error instead. The compiler rejects the program with undefined: context if you forget to import the package. It complains with cannot use ctx (variable of type context.Context) as string value in argument to db.QueryRowContext if you pass the context in the wrong position.
Context is the kill switch. Every query needs one.
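Telling a timeout apart from other failures can be sketched without a database. In the example below, slowOperation is a hypothetical stand-in for a query that honors its context, and classify and run are illustrative helpers. The error is wrapped with %w, as getUser does, to show that errors.Is still sees the underlying cause.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOperation is a hypothetical stand-in for a query: it finishes after
// work has elapsed, or returns ctx.Err() as soon as the context is done.
func slowOperation(ctx context.Context, work time.Duration) error {
	timer := time.NewTimer(work)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// classify maps an error to a coarse label a caller could log or act on.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return "other"
	}
}

// run executes slowOperation under a deadline and classifies the outcome.
func run(limit, work time.Duration) string {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	err := slowOperation(ctx, work)
	if err != nil {
		// Wrapping with %w keeps the original error reachable for errors.Is.
		err = fmt.Errorf("query user: %w", err)
	}
	return classify(err)
}
```

For example, run(20*time.Millisecond, 2*time.Second) yields "timeout", while run(2*time.Second, time.Millisecond) yields "ok". A real driver may surface its own error on cancellation, so treat this as a starting point rather than a complete classification.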
HTTP transport timeouts
Sometimes you talk to a database via HTTP. This happens with database proxies, cloud APIs, or when your database exposes a REST interface. In these cases, you configure timeouts on the http.Client transport.
The http.Transport struct controls the low-level network behavior. It handles TCP dialing, TLS handshakes, connection reuse, and idle timeouts. You should create one http.Client and reuse it across your application. Creating a new client with its own transport per request throws away connection reuse and is a performance anti-pattern.
Here's how you configure an HTTP client with comprehensive timeouts. The transport fields control different phases of the request.
import (
	"net"
	"net/http"
	"time"
)

func newHTTPClient() *http.Client {
	return &http.Client{
		// Timeout covers the entire request: dial, TLS, redirects, write, and read
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			// DialContext controls TCP connection establishment
			DialContext: (&net.Dialer{
				// Timeout for establishing the TCP connection
				Timeout: 5 * time.Second,
				// Interval between TCP keep-alive probes
				KeepAlive: 30 * time.Second,
			}).DialContext,
			// Attempt HTTP/2 even though a custom DialContext is set
			ForceAttemptHTTP2: true,
			// MaxIdleConns limits idle connections across all hosts;
			// MaxIdleConnsPerHost is the per-host limit
			MaxIdleConns: 100,
			// IdleConnTimeout closes idle connections after 90 seconds
			IdleConnTimeout: 90 * time.Second,
			// TLSHandshakeTimeout limits TLS negotiation time
			TLSHandshakeTimeout: 10 * time.Second,
			// ExpectContinueTimeout limits the wait for a 100-continue response
			ExpectContinueTimeout: 1 * time.Second,
		},
	}
}
The Timeout field on the http.Client is a hard deadline for the entire exchange. It covers dialing, any redirects, writing the request, reading the response headers, and reading the response body. If the total exceeds the timeout, the request fails.
The DialContext timeout controls the TCP handshake. If the server is unreachable, this timeout fails fast. The TLSHandshakeTimeout controls the TLS negotiation. If the server is slow to respond with certificates, this timeout prevents hanging.
The IdleConnTimeout controls how long idle connections stay in the pool. Servers and load balancers often close idle connections. If the client keeps them open, the next request might fail with a connection reset error. Setting this timeout ensures the client prunes stale connections.
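The effect of the client-level Timeout can be seen against a deliberately slow local server. The sketch below uses net/http/httptest from the standard library; fetchFromSlowServer and isTimeout are hypothetical helper names, and the example assumes the process is allowed to listen on the loopback interface.

```go
package main

import (
	"errors"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchFromSlowServer starts a local test server that delays its response by
// serverDelay, then requests it with a client limited to clientTimeout.
func fetchFromSlowServer(clientTimeout, serverDelay time.Duration) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(serverDelay):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done():
			// The client gave up; stop working on its behalf.
		}
	}))
	defer srv.Close()

	client := &http.Client{Timeout: clientTimeout}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// isTimeout reports whether err is a network-level timeout.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}
```

With a 50-millisecond client timeout and a one-second server delay, the request fails and isTimeout reports true. With a generous timeout and no delay, the same call succeeds.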
Convention aside: Don't pass a *string or *int to functions just to avoid a copy. Strings and integers are cheap to copy, so pass them by value. An http.Client is different: share one *http.Client across the application so its transport can reuse connections.
Goroutines are cheap. Channels are not magic. HTTP clients are expensive. Reuse them.
Common pitfalls and errors
Timeouts seem simple, but there are traps. The most common trap is confusing pool settings with query deadlines. Pool settings manage connections. Context deadlines manage queries. You need both.
Another trap is assuming sql.Open tells you whether the database is reachable. sql.Open returns an error if the driver name is unknown or, with some drivers, if the connection string is malformed. It does not return an error if the database server is down. The failure shows up on the first query or Ping instead. If you don't check the error from the query, your program might crash later with a nil pointer dereference. Always check errors. The verbose if err != nil { return err } pattern is standard in Go. It makes the unhappy path visible.
Goroutine leaks happen when you forget to handle context cancellation. If a goroutine waits on a channel that never receives a value and never gets closed, it blocks forever and the goroutine leaks. No panic tells you this happened. Always have a cancellation path. The worst goroutine bug is the one that never logs.
Compiler errors help you catch mistakes early. If you pass the wrong type to a function, the compiler complains with cannot use x (variable of type int) as string value in argument to f. If you import a package and don't use it, the compiler rejects the build with "fmt" imported and not used. Capturing a loop variable in a closure is different: it is not a compile error. Before Go 1.22, go vet flagged it with loop variable i captured by func literal; since Go 1.22, each iteration gets its own variable. Trust the compiler, and run go vet as well.
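A cancellation path usually means selecting on ctx.Done() alongside the channel the goroutine is waiting on. The sketch below is one way to write it; worker and workerStopsOnCancel are hypothetical names, and the jobs channel is intentionally never closed to mimic the leak scenario.

```go
package main

import (
	"context"
	"time"
)

// worker drains jobs until the channel is closed or ctx is canceled,
// then closes done so callers can confirm it exited.
func worker(ctx context.Context, jobs <-chan int, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case _, ok := <-jobs:
			if !ok {
				return // producer closed the channel
			}
			// Process the job here.
		case <-ctx.Done():
			return // cancellation path: exit even though jobs stays open
		}
	}
}

// workerStopsOnCancel starts a worker on a channel that is never closed,
// cancels the context, and reports whether the worker exited within wait.
func workerStopsOnCancel(wait time.Duration) bool {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan int) // never written to and never closed
	done := make(chan struct{})

	go worker(ctx, jobs, done)
	cancel()

	select {
	case <-done:
		return true
	case <-time.After(wait):
		return false
	}
}
```

Without the ctx.Done() case, this worker would block on jobs forever. With it, canceling the context is enough to let the goroutine return.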
Decision matrix
Timeouts are a tool. Pick the right tool for the job.
Use SetConnMaxLifetime when you need to rotate connections to handle network changes or certificate renewals.
Use SetMaxOpenConns when you need to protect the database from connection storms caused by traffic spikes.
Use SetMaxIdleConns when you want to reduce latency by keeping a pool of warm connections ready for reuse.
Use context.WithTimeout when you need to limit the duration of a single query or operation.
Use http.Transport timeouts when you are making HTTP requests to a database API or proxy and need control over dial, TLS, and idle behavior.
Use GODEBUG when you need to debug internal runtime behavior, never for production timeout configuration.
Use plain sequential code when you don't need concurrency. The simplest thing that works is usually the right thing.