The zombie connection problem
You have a chat application. A user walks out of Wi-Fi range. The TCP connection doesn't break immediately. The server keeps a goroutine running for that user, waiting for a message that will never come. Meanwhile, the client thinks it's still connected and tries to send a message, which fails silently or hangs. The server has a "zombie" connection. You need a way to detect this liveness gap without polling the database or guessing.
Network equipment kills idle connections. ISPs, NAT gateways, and load balancers drop TCP sessions that sit quiet for too long. The server never sees a FIN or RST packet. The socket stays open in the kernel, your goroutine stays blocked on ReadMessage, and memory leaks accumulate until the process crashes. A heartbeat solves this by forcing traffic through the connection at regular intervals. If the traffic stops flowing, you close the connection and reclaim the resources.
Heartbeats as protocol insurance
WebSockets include built-in control frames for this exact purpose: Ping and Pong. The server sends a Ping frame. The client must respond with a Pong frame. If the Pong doesn't arrive within a timeout, the server closes the connection. This works at the protocol level, not your application logic. The WebSocket library handles the frame encoding and decoding. You just register handlers and send the Ping.
Think of it like a library book due date. The connection is the book. The Ping is the librarian checking the due date. The Pong is the borrower renewing the book. If the borrower never shows up to renew, the librarian marks the book as overdue and reclaims the shelf space.
The minimal heartbeat setup
Here's the skeleton: upgrade the connection, register the Pong handler, and start a background goroutine to send pings. The Pong handler resets the read deadline every time the client answers.
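import (
	"context"
	"log"
	"net/http"
	"time"

	// Assumed: the calls in this article match the gorilla/websocket API.
	"github.com/gorilla/websocket"
)

// upgrader, pingPeriod, and writeWait are package-level declarations not shown here.
func handler(w http.ResponseWriter, r *http.Request) {
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Println("upgrade failed:", err)
		return
	}
	// Close the connection when the handler returns to prevent resource leaks.
	defer conn.Close()
	// The Pong handler runs inside ReadMessage whenever the client answers a Ping.
	conn.SetPongHandler(func(appData string) error {
		// Push the read deadline forward to keep the connection alive.
		conn.SetReadDeadline(time.Now().Add(pingPeriod))
		return nil
	})
	// Start the background goroutine that sends Ping frames.
	go pingLoop(conn, r.Context())
	// Block in the read loop (readLoop is defined below). Without a read, Pongs
	// are never processed and the deferred Close would run immediately.
	readLoop(conn)
}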
The ping loop runs in a separate goroutine. It sends a Ping frame periodically and exits if the write fails or the request context cancels.
// pingLoop sends Ping frames to maintain liveness and respects request cancellation.
func pingLoop(conn *websocket.Conn, ctx context.Context) {
	ticker := time.NewTicker(pingPeriod / 2)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// WriteControl is safe to call concurrently with WriteMessage.
			err := conn.WriteControl(websocket.PingMessage, nil, time.Now().Add(writeWait))
			if err != nil {
				return
			}
		}
	}
}
Goroutines are cheap. Zombie connections are expensive.
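The snippets above use pingPeriod and writeWait without defining them. The sketch below picks illustrative values (60 seconds and 10 seconds are assumptions, not values from this article) and adds a hypothetical helper, checkTimings, that encodes one rule of thumb: the Ping write timeout should be shorter than the ping interval, so a slow write cannot eat the whole window the client has to answer.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed values for illustration only; tune them for your deployment.
const (
	pingPeriod = 60 * time.Second // read deadline window, pushed forward by each Pong
	writeWait  = 10 * time.Second // maximum time allowed for a single Ping write
)

// checkTimings is a hypothetical sanity check. It returns an error when the
// write timeout is not shorter than the ping interval (half the deadline).
func checkTimings(deadline, write time.Duration) error {
	if deadline <= 0 || write <= 0 {
		return fmt.Errorf("durations must be positive")
	}
	interval := deadline / 2 // same ticker period that pingLoop uses
	if write >= interval {
		return fmt.Errorf("write timeout %v must be shorter than ping interval %v", write, interval)
	}
	return nil
}

// timingsOK reports whether the constants above pass the check.
func timingsOK() bool {
	return checkTimings(pingPeriod, writeWait) == nil
}

func main() {
	fmt.Println("constants valid:", timingsOK())
	fmt.Println("bad config:", checkTimings(10*time.Second, 10*time.Second))
}
```

Running a check like this at startup is optional, but it catches a misconfigured pair of durations before the first connection arrives.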
How the deadline pattern works
The robust pattern uses the read deadline as the heartbeat timer. You set a deadline on the connection using SetReadDeadline. The deadline is an absolute point in time, and it applies to every read until you change it. If no frame arrives by the deadline, ReadMessage returns a timeout error, and the connection cannot be read from again after that. The Pong handler pushes the deadline forward every time a Pong arrives.
The ping loop sends a Ping frame roughly halfway through the deadline period. This ensures the client has time to respond before the deadline expires. If the client responds with a Pong, the handler resets the deadline. If the client is dead, the Pong never arrives, the deadline expires, and ReadMessage returns an error. The read loop detects the error and returns, triggering defer conn.Close().
This pattern avoids race conditions. You don't need shared timestamps or mutexes. The deadline is part of the connection state. The Pong handler runs on the reading goroutine, inside ReadMessage, so the deadline is only ever updated by the goroutine that is blocked on the read. The read loop checks the deadline implicitly by blocking on that read.
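The deadline mechanics can be tried without a WebSocket library. The sketch below is an analogue, not the article's handler: it uses net.Pipe from the standard library as a stand-in connection, and one-byte writes stand in for Pong frames. The names readUntilSilent and demo are hypothetical. The peer sends three heartbeats and then goes quiet without closing, which is the zombie case.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readUntilSilent reads one-byte heartbeats from conn and pushes the read
// deadline forward after each one. It returns how many heartbeats arrived
// and the error that ended the loop.
func readUntilSilent(conn net.Conn, timeout time.Duration) (int, error) {
	buf := make([]byte, 1)
	seen := 0
	conn.SetReadDeadline(time.Now().Add(timeout))
	for {
		if _, err := conn.Read(buf); err != nil {
			return seen, err
		}
		seen++
		// Same job as the Pong handler: traffic arrived, extend the deadline.
		conn.SetReadDeadline(time.Now().Add(timeout))
	}
}

// demo simulates a peer that sends three heartbeats and then goes silent
// without closing. It reports the heartbeat count and whether the loop
// ended with a timeout error.
func demo() (int, bool) {
	server, client := net.Pipe()
	defer server.Close()
	defer client.Close()

	go func() {
		for i := 0; i < 3; i++ {
			time.Sleep(20 * time.Millisecond)
			if _, err := client.Write([]byte{1}); err != nil {
				return
			}
		}
		// No Close here: the peer has simply vanished.
	}()

	n, err := readUntilSilent(server, 200*time.Millisecond)
	var netErr net.Error
	return n, errors.As(err, &netErr) && netErr.Timeout()
}

func main() {
	n, timedOut := demo()
	fmt.Println("heartbeats:", n, "timed out:", timedOut)
}
```

No timestamp is shared between goroutines here either. The only state is the deadline, and only the reading goroutine touches it.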
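Context is plumbing. Run it through every long-lived call site. The ping loop checks ctx.Done() so it stops once the request context is cancelled, which net/http does when the handler returns. After the upgrade hijacks the connection, the HTTP server no longer watches the socket, so don't rely on the context alone to notice a client that left the page. The read loop's error ends the handler, and the cancelled context then stops the ping loop. This keeps the ping goroutine from outliving the connection.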
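The cancellation path can be exercised on its own. This sketch strips the ping loop down to a ticker, a context, and a send callback. tickUntilDone and runDemo are hypothetical names, and the context is the first parameter here because that is the usual Go convention, unlike pingLoop above, which takes the connection first.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// tickUntilDone calls send on every tick and returns how many sends
// succeeded. It stops when ctx is cancelled or send returns an error.
func tickUntilDone(ctx context.Context, interval time.Duration, send func() error) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	sent := 0
	for {
		select {
		case <-ctx.Done():
			return sent
		case <-ticker.C:
			if err := send(); err != nil {
				return sent
			}
			sent++
		}
	}
}

// runDemo starts the loop, cancels the context after a few ticks, and
// reports the send count and whether the goroutine actually exited.
func runDemo() (int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	done := make(chan int, 1)
	go func() {
		done <- tickUntilDone(ctx, 10*time.Millisecond, func() error { return nil })
	}()

	time.Sleep(55 * time.Millisecond)
	cancel() // stands in for the handler returning

	select {
	case n := <-done:
		return n, true
	case <-time.After(time.Second):
		return 0, false // the loop leaked
	}
}

func main() {
	sent, exited := runDemo()
	fmt.Println("sent:", sent, "exited:", exited)
}
```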
Production-ready read loop
A production handler needs to read messages, write messages, and manage the heartbeat all at once. The standard pattern uses a read loop that checks for stale connections and a write loop that sends application messages. The ping loop runs independently.
Here's the read loop: it blocks until a message arrives or the deadline expires. The Pong handler resets the deadline, so a missing Pong triggers a timeout error.
// readLoop processes incoming messages and detects stale connections via the read deadline.
func readLoop(conn *websocket.Conn) {
	// The deadline makes ReadMessage return an error if no frame arrives in time.
	conn.SetReadDeadline(time.Now().Add(pingPeriod))
	for {
		_, _, err := conn.ReadMessage()
		if err != nil {
			// Log only close codes we did not expect. A deadline timeout is not a
			// close error, so a stale connection exits here without a log line.
			if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseNormalClosure) {
				log.Println("read error:", err)
			}
			return
		}
	}
}
Trust the deadline. Let the read loop die when the silence gets too loud.
Pitfalls and compiler errors
The compiler won't stop you from blocking forever. If you omit the read deadline, ReadMessage waits indefinitely. The server holds the goroutine until the TCP stack gives up, which can take minutes or longer. You get a goroutine leak. The error you see when the connection finally dies is often "use of closed network connection" or a timeout from the OS.
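Calling WriteMessage from multiple goroutines at the same time is a bug. WriteMessage is not safe for concurrent use, and the compiler won't complain: depending on the library version, you get interleaved frames or a panic such as "concurrent write to websocket connection". Use WriteControl for pings. WriteControl can be called concurrently with WriteMessage because the library handles the synchronization. If you send pings with WriteMessage from the ping goroutine while another goroutine writes application messages, you hit exactly this race.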
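When several goroutines do need to send application messages, one common approach is to serialize them behind a mutex (the other is to funnel everything through a single writer goroutine). The sketch below shows the mutex variant against a plain io.Writer so it runs with the standard library alone. lockedWriter and countIntactLines are hypothetical names, and guarding a real connection's WriteMessage the same way is an assumption about how you would adapt it.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// lockedWriter lets only one goroutine at a time write to w.
type lockedWriter struct {
	mu sync.Mutex
	w  io.Writer
}

// Write holds the lock for the whole call so writes never interleave.
func (l *lockedWriter) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.w.Write(p)
}

// countIntactLines writes one line from each of n goroutines (n < 1000)
// and returns how many lines arrived whole.
func countIntactLines(n int) int {
	var buf bytes.Buffer
	lw := &lockedWriter{w: &buf}

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Fprintf formats first and then issues a single Write call.
			fmt.Fprintf(lw, "message-%03d\n", id)
		}(i)
	}
	wg.Wait()

	intact := 0
	for _, line := range bytes.Split(bytes.TrimSpace(buf.Bytes()), []byte("\n")) {
		if len(line) == len("message-000") && bytes.HasPrefix(line, []byte("message-")) {
			intact++
		}
	}
	return intact
}

func main() {
	fmt.Println("intact lines:", countIntactLines(100))
}
```

Remove the mutex and run with the race detector (go run -race) to see the unsynchronized version flagged.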
Forgetting defer conn.Close() is a common mistake. The connection holds file descriptors and memory. If the handler returns without closing the connection, the resources leak. The compiler doesn't enforce cleanup. You must add the defer explicitly.
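go vet, not the compiler, reports loop variable i captured by func literal when a closure started with go or defer captures a loop variable. Before Go 1.22, every iteration shared one variable, so goroutines spawned in a loop could all observe the final value. Go 1.22 gives each iteration its own copy, but if your module targets an older version, assign the loop variable to a local variable or pass it as an argument before handing it to the goroutine.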
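A minimal sketch of the safe form, which behaves the same on every Go version: the loop variable is passed to the goroutine as an argument, so each goroutine gets its own copy. collectIDs is a hypothetical helper, and the IDs stand in for per-connection values.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collectIDs starts one goroutine per ID and passes the ID as an argument,
// so no goroutine depends on the loop variable after the iteration ends.
func collectIDs(ids []int) []int {
	var (
		mu  sync.Mutex
		out []int
		wg  sync.WaitGroup
	)
	for _, id := range ids {
		wg.Add(1)
		go func(id int) { // this id shadows the loop variable with a private copy
			defer wg.Done()
			mu.Lock()
			out = append(out, id)
			mu.Unlock()
		}(id)
	}
	wg.Wait()
	sort.Ints(out) // goroutine order is not deterministic, so sort the result
	return out
}

func main() {
	fmt.Println(collectIDs([]int{3, 1, 2})) // prints [1 2 3]
}
```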
The worst goroutine bug is the one that never logs.
Decision matrix
Use WebSocket Ping/Pong when you maintain long-lived bidirectional connections and need to detect silent drops caused by NAT timeouts or network partitions. Use Server-Sent Events when your data flows only from server to client and you want to avoid the overhead of WebSocket control frames. Use HTTP long-polling when you must support legacy browsers or traverse proxies that block WebSocket upgrades. Use a plain read deadline without pings when connections are short-lived and the cost of heartbeat logic outweighs the benefit of early detection.