When a single goroutine brings down the process
You are running a Go service. It handles requests, processes background jobs, and stays alive for weeks. Then a malformed input hits a worker goroutine. The worker panics. The panic unwinds the worker's stack. Nothing catches it. The runtime prints the panic value and a stack trace, and the entire process exits, taking every other goroutine and every in-flight request with it. It does not matter that the main goroutine was healthy. One bad execution path took down the system.
In Go, a panic unwinds only the stack of the goroutine it occurs in, but an unrecovered panic is fatal to the whole program. Recovery is scoped to a goroutine: one goroutine cannot catch another's panic. If nothing in the panicking goroutine recovers, the runtime terminates the process, no matter what the other goroutines are doing. Even when you do recover, the goroutine may have been holding a lock or be the only sender on a channel, so the consequences can still ripple outward. You need a mechanism to catch the panic, log the failure, and decide whether to continue or abort.
The mechanics of panic and recover
A panic aborts the normal execution flow of a goroutine. It signals that something went wrong so badly the function cannot return a normal value. When a panic occurs, the runtime stops normal execution and begins unwinding the stack. It calls deferred functions in last-in-first-out order. If the panic reaches the top of the goroutine's stack without being stopped, the program prints the panic value and a stack trace, then exits with a non-zero status.
recover is the only built-in function that can stop a panic. It must be called directly by a deferred function. If you call recover in the ordinary function body, no panic is in progress at that point, so it returns nil and does nothing. The defer is the safety net. The panic is the fall. You must deploy the net before you jump.
When recover runs inside a deferred function during a panic, it returns the panic value and halts the unwinding. Execution does not resume at the point of the panic. Instead, the function that deferred the recovery returns to its caller, with zero-value results unless the deferred function sets named results. If no panic is in progress, recover returns nil. This distinction lets you check whether a panic occurred and handle it gracefully.
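A minimal sketch of how a recovered function hands control back to its caller. The helper name safeDivide and its named results are illustrative assumptions, not part of any library. The deferred function sets the named result err, which is the only way it can influence what the caller receives after a panic.

```go
package main

import "fmt"

// safeDivide is a hypothetical helper: it divides a by b and converts
// the runtime panic caused by integer division by zero into an error.
func safeDivide(a, b int) (result int, err error) {
	defer func() {
		// recover returns nil when no panic is in progress,
		// so the normal path leaves err untouched.
		if r := recover(); r != nil {
			// Setting a named result changes what the caller sees.
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	result = a / b // panics when b == 0
	return result, nil
}

func main() {
	fmt.Println(safeDivide(10, 2)) // prints "5 <nil>"
	fmt.Println(safeDivide(1, 0))  // result is 0, err is non-nil
}
```

The second call never reaches its return statement. The caller still gets a value back because recovery stopped the unwinding inside safeDivide.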
Convention aside: Go developers prefer returning errors for expected failures. Use if err != nil for missing files, network timeouts, or invalid input. Reserve panic for truly unexpected conditions like programming bugs, invariant violations, or initialization failures where continuing is impossible. Panics are for crashes, not control flow.
Minimal example: catching a panic
Here is the basic pattern: defer a function that calls recover, check the result, and log the failure.
package main

import (
	"fmt"
	"sync"
)

// worker runs a task that might panic and recovers gracefully.
func worker(id int, wg *sync.WaitGroup) {
	// Signal completion even if the task panics.
	defer wg.Done()
	// Deferred functions run when the function returns, even if it panics.
	defer func() {
		// recover only works when called directly by a deferred function.
		// It returns the panic value if a panic is in progress, else nil.
		if r := recover(); r != nil {
			fmt.Printf("goroutine %d caught panic: %v\n", id, r)
		}
	}()
	// Simulate a panic.
	panic("something went wrong")
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	// Spawn a goroutine that panics.
	go worker(1, &wg)
	// Wait for the goroutine to finish. A bare select {} here would end in
	// "fatal error: all goroutines are asleep - deadlock!" once the worker exits.
	wg.Wait()
}
The deferred anonymous function catches the panic. recover() returns the string "something went wrong". The if check confirms a panic occurred. The goroutine prints the message and exits cleanly, and the recovered panic does not take down the process.
Walkthrough: what happens at runtime
When worker calls panic, the runtime interrupts normal execution. It marks the goroutine as panicking and starts running deferred calls, most recently deferred first. The recovery closure is the most recently deferred call in worker, so it runs first.
Inside the deferred function, recover() checks the panic state. Since a panic is active, it returns the panic value and clears the panic state. The unwinding stops. The deferred function completes. The worker function returns normally. The goroutine finishes.
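The order of events can be traced with a small sketch. The function name trace and the step labels are made up for illustration. It defers the recovery first and two cleanup steps afterwards, then records the order in which they run during the panic.

```go
package main

import "fmt"

// trace records the order in which deferred calls run during a panic.
func trace() (steps []string) {
	defer func() {
		// Deferred first, so it runs last and stops the panic.
		if v := recover(); v != nil {
			steps = append(steps, fmt.Sprintf("recovered %v", v))
		}
	}()
	defer func() { steps = append(steps, "cleanup A") }()
	defer func() { steps = append(steps, "cleanup B") }()
	panic("boom")
}

func main() {
	fmt.Println(trace()) // prints [cleanup B cleanup A recovered boom]
}
```

Both cleanup steps run while the panic is still in progress. Only the last deferred call to run, the one that calls recover, ends it.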
If you remove the defer or call recover outside the defer, the behavior changes. The compiler accepts recover() anywhere in the code, but it returns nil unless it is called directly by a deferred function while a panic is in progress. You might write code that compiles fine but fails to catch the panic at runtime. The panic continues unwinding, reaches the top of the goroutine's stack, and crashes the process, regardless of which goroutine it started in.
recover returns a value of type any, the empty interface. You can use a type assertion or a type switch to inspect it. For example, if you panic with a custom error type, you can recover the value and check for that type. This lets you handle different panic values differently.
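As a sketch of inspecting the recovered value, the following uses a type switch. The names classify and errInvariant are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvariant is a hypothetical sentinel error used as a panic value.
var errInvariant = errors.New("invariant violated")

// classify runs f and describes what kind of value it panicked with.
func classify(f func()) (desc string) {
	defer func() {
		switch v := recover().(type) {
		case nil:
			desc = "no panic"
		case error:
			// Covers runtime errors and custom error types alike.
			desc = "error: " + v.Error()
		case string:
			desc = "string: " + v
		default:
			desc = fmt.Sprintf("other: %v", v)
		}
	}()
	f()
	return desc
}

func main() {
	fmt.Println(classify(func() {}))                      // no panic
	fmt.Println(classify(func() { panic("bad input") }))  // string: bad input
	fmt.Println(classify(func() { panic(errInvariant) })) // error: invariant violated
	fmt.Println(classify(func() { panic(42) }))           // other: 42
}
```

Matching on the error interface first is a design choice: runtime panics such as nil dereferences carry values that implement error, so one case handles both those and your own error types.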
Convention aside: The receiver name for methods is usually one or two letters matching the type, like (w *Worker). This keeps code concise. When writing recovery wrappers, keep the receiver name short and consistent.
Realistic patterns: HTTP handlers and error transformation
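In production code, you rarely recover just to log. You often want to transform a panic into an error, return a safe response, or propagate the failure to a supervisor. HTTP handlers are a common place to add recovery. The net/http server already recovers a panic in a handler, logs a stack trace, and aborts the connection, so the process survives. The client, however, sees a dropped connection instead of a clean error response, and goroutines started by the handler are not covered. Your own recovery gives you control over the response and the logging.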
Here is a middleware pattern that wraps an HTTP handler and recovers panics.
package main

import (
	"log"
	"net/http"
)

// panicHandler wraps an http.HandlerFunc to recover from panics.
func panicHandler(h http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// Defer recovery logic to catch panics in the handler.
		defer func() {
			// Named rec so it does not shadow the request r.
			if rec := recover(); rec != nil {
				// Log the panic details for debugging.
				log.Printf("panic in handler: %v", rec)
				// Return a 500 error to the client. This only helps if the
				// handler has not already written the response header.
				http.Error(w, "Internal Server Error", http.StatusInternalServerError)
			}
		}()
		// Call the original handler.
		h(w, r)
	}
}

func main() {
	// Wrap the handler with panic protection.
	handler := panicHandler(func(w http.ResponseWriter, r *http.Request) {
		panic("bad data")
	})
	// Start the server and report a startup failure instead of ignoring it.
	log.Fatal(http.ListenAndServe(":8080", handler))
}
The wrapper defers recovery before calling the handler. If the handler panics, the deferred function catches it, logs the panic, and sends a 500 response. The server stays alive and can handle the next request.
Another pattern is transforming a panic into an error value. This is useful in worker pools where you want to report failures without crashing the worker.
package main

import (
	"fmt"
)

// safeWorker runs a task and sends its result to a channel.
func safeWorker(id int, done chan<- error) {
	// Defer to convert panic to error and send to channel.
	defer func() {
		if r := recover(); r != nil {
			// Build an error from the panic value. Use %v, not %w:
			// r has type any, and %w requires an error operand.
			done <- fmt.Errorf("worker %d panicked: %v", id, r)
		} else {
			// No panic, signal success.
			done <- nil
		}
	}()
	// Simulate a panic.
	panic("fail")
}

func main() {
	// Buffered so the worker can send even if nobody is receiving yet.
	ch := make(chan error, 1)
	go safeWorker(1, ch)
	// Receive the result.
	err := <-ch
	if err != nil {
		fmt.Printf("worker failed: %v\n", err)
	}
}
The worker recovers the panic, converts it into an error with fmt.Errorf, and sends it to the channel. The format verb is %v rather than %w because the recovered value has type any, and %w only wraps operands that implement error. Here the value is a string. The caller receives the error and handles it. This turns an uncontrolled panic into a structured error flow.
Convention aside: context.Context always goes as the first parameter, conventionally named ctx. Functions that take a context should respect cancellation and deadlines. Use context for stopping operations, not panic. Panic is for crashes; context is for lifecycle management.
Pitfalls and common mistakes
Recovering panics introduces subtle bugs if you get the details wrong. Here are the most common traps.
Recovering in the wrong goroutine does nothing. Each goroutine has its own stack, and a panic unwinds only the stack of the goroutine it occurs in. A defer in the main goroutine cannot catch a panic in a worker goroutine. You must place the recovery logic inside the goroutine that might panic. If you defer in main and spawn a worker, the worker's panic never reaches main's deferred function. It reaches the top of the worker's stack unrecovered, and the runtime terminates the whole process. The compiler does not warn you about this. You get a stack trace and a dead process.
Swallowing panics hides bugs. If you recover and do nothing, you lose the stack trace and the panic value. The code continues as if nothing happened, but the state might be corrupted. Always log the panic or transform it into an error. The worst goroutine bug is the one that never logs.
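A minimal sketch of keeping the evidence when you recover. It assumes a hypothetical wrapper named runLogged and uses debug.Stack from the standard runtime/debug package to capture the stack trace alongside the panic value.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// runLogged is a hypothetical wrapper: it runs fn, and if fn panics it
// returns an error carrying the panic value and the stack trace.
func runLogged(fn func()) (err error) {
	defer func() {
		if v := recover(); v != nil {
			// debug.Stack returns the stack of the calling goroutine.
			// Called here, during the panic, it still includes the
			// frames of the function that panicked.
			err = fmt.Errorf("panic: %v\n%s", v, debug.Stack())
		}
	}()
	fn()
	return nil
}

func main() {
	if err := runLogged(func() { panic("corrupt state") }); err != nil {
		fmt.Println(err)
	}
}
```

In a real service you would likely send this to your logger rather than print it. The point is that the stack is captured at recovery time, before the information is gone.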
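Calling recover outside a deferred function returns nil. The compiler accepts this code, but the call has no effect. The rule is stricter than it sounds: recover must be called directly by the deferred function. If a deferred closure calls a helper and the helper calls recover, the helper's call returns nil and the panic keeps unwinding. A helper works only when it is itself the deferred function, as in defer handlePanic(), where handlePanic is your own function.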
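The difference between a direct and an indirect call can be shown in a few lines. The names tryRecover and demo are invented for this sketch.

```go
package main

import "fmt"

// tryRecover calls recover one frame too deep when it is invoked from
// inside a deferred function, so it returns nil there.
func tryRecover() any {
	return recover()
}

// demo reports which deferred function actually stopped the panic.
func demo() (caughtBy string) {
	// Deferred first, runs last: calls recover directly, so it works.
	defer func() {
		if v := recover(); v != nil {
			caughtBy = "direct call"
		}
	}()
	// Deferred second, runs first: recover is called by tryRecover,
	// not by the deferred function itself, so it returns nil.
	defer func() {
		if v := tryRecover(); v != nil {
			caughtBy = "helper call"
		}
	}()
	panic("boom")
}

func main() {
	fmt.Println(demo()) // prints "direct call"
}
```

The helper gets the first chance at the panic and misses it. The outer closure, which calls recover itself, is what keeps the program alive.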
Panic for control flow breaks expectations. Some developers use panic and recover to jump out of nested loops or return early from deep call stacks. This is an anti-pattern. It makes code hard to read and reason about. Use error returns, channels, or context cancellation instead. Panics should be rare and exceptional.
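Resource leaks occur when cleanup is not deferred. Deferred functions run on both normal return and panic, so a deferred unlock or close still runs when the function panics and a later recover stops the unwinding. Cleanup written as a plain statement after the risky call is skipped when that call panics, which leaves the lock held or the file open. Order matters when you have multiple defers: they run last-in-first-out, so a recovery deferred at the top of the function runs after every cleanup deferred below it. If you recover and then re-panic, the remaining deferred calls further up the stack still run. Trust defer for cleanup, but verify the order.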
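Here is a sketch of deferred cleanup surviving a recovered panic. The counter type and its addChecked method are hypothetical. The unlock is deferred after the recovery, so it runs first during unwinding and the mutex is free again by the time the caller sees the error.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// addChecked panics on a negative delta while holding the lock.
func (c *counter) addChecked(delta int) (err error) {
	// Deferred first, runs last: turns the panic into an error.
	defer func() {
		if v := recover(); v != nil {
			err = fmt.Errorf("addChecked: %v", v)
		}
	}()
	c.mu.Lock()
	// Deferred second, runs first: the lock is released during unwinding.
	defer c.mu.Unlock()
	if delta < 0 {
		panic("negative delta")
	}
	c.n += delta
	return nil
}

func main() {
	c := &counter{}
	fmt.Println(c.addChecked(-1)) // prints "addChecked: negative delta"
	// The lock was released, so this call does not deadlock.
	fmt.Println(c.addChecked(2)) // prints "<nil>"
	fmt.Println(c.n)             // prints 2
}
```

If the unlock were a plain statement at the end of the method instead of a deferred call, the first call would leave the mutex locked and the second call would block forever.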
Convention aside: gofmt is mandatory. Do not argue about indentation or brace placement. Let the tool decide. Most editors run gofmt on save. Focus on logic, not formatting.
Decision: when to use recover vs alternatives
Use recover in a deferred function when you need to catch a panic in the current goroutine and prevent process termination.
Use error returns (error type) for expected failure conditions like missing files, network timeouts, or invalid input.
Use context cancellation to stop long-running operations when a deadline passes or the caller gives up.
Use panic only for truly unrecoverable errors like programming bugs, invariant violations, or initialization failures where continuing is impossible.
Use a wrapper middleware or decorator when you want to protect multiple handlers or workers with a single recovery point.
Use a channel to propagate errors from goroutines when you need to report failures to a supervisor without crashing the worker.