The missing breadcrumb trail
You're debugging a production incident. The logs show connection refused, but your service has fifty places where it connects to databases, caches, and upstream APIs. The error message tells you what went wrong, not where. You spend an hour grepping for the string, only to realize the error bubbled up from a deep helper function and lost its origin on the way out.
Go's standard error interface gives you the message, but it doesn't carry the stack trace. To fix this, you need to build an error type that captures the call stack at the moment of failure. This turns a vague error into a precise diagnostic tool.
Why Go doesn't do this by default
A stack trace is a list of function calls leading to the error. It's the breadcrumb trail of execution. When a panic happens, Go prints one automatically. Regular errors are different. They are ordinary values, returned and passed around like any other, and nothing records where they were created.
The standard library keeps errors lightweight. Adding a stack trace costs memory and CPU because capturing the stack requires asking the runtime for the current program counter addresses. You trade a bit of performance for debugging clarity. The trade-off is usually worth it in complex systems where finding the source of an error matters more than saving a few nanoseconds.
Go encourages explicit error handling. The convention is to return errors and let the caller decide what to do. If every error captured a stack trace, the overhead would hit every error return, including routine ones such as io.EOF at the end of a read loop. Stack traces belong on errors that indicate a failure, not on every control flow decision.
Stack traces are breadcrumbs. Capture them at the source.
Capturing the stack
The core mechanism is runtime.Callers. This function walks the call stack and fills a buffer with program counter addresses. Each address points to a specific instruction in the binary. You store these addresses in your error struct and decode them later when you need to print the trace.
Here's the skeleton of a stack-trace error. It stores the message and a slice of program counters, then captures the stack when constructed.
package main

import (
	"fmt"
	"runtime"
)

// StackError implements the error interface and stores raw stack data.
type StackError struct {
	msg   string
	stack []uintptr // uintptr holds the program counter addresses.
}

// Error satisfies the error interface.
func (e *StackError) Error() string {
	return e.msg
}

// NewStackError captures the call stack at the point of creation.
func NewStackError(msg string) error {
	const depth = 32 // Buffer size for stack frames.
	var pcs [depth]uintptr
	// Skip 2 frames to exclude runtime.Callers and this function.
	n := runtime.Callers(2, pcs[:])
	return &StackError{msg: msg, stack: pcs[:n]}
}

func main() {
	err := NewStackError("disk full")
	fmt.Println(err)
}
The skip parameter in runtime.Callers is crucial. It tells the runtime how many frames to ignore before recording. You skip two: one for runtime.Callers itself and one for NewStackError. The result points to the caller of NewStackError. If you wrap this function, you need to adjust the skip count or pass it as a parameter.
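The skip adjustment for wrapped constructors can be sketched like this. NewStackErrorSkip, TopFunction, notFound, and lookup are hypothetical names introduced for illustration; none of them appear in the skeleton above. The idea is that a helper passes an extra skip of 1 so the trace starts at the helper's caller rather than at the helper itself.

```go
package main

import (
	"fmt"
	"runtime"
)

// StackError mirrors the type from the skeleton.
type StackError struct {
	msg   string
	stack []uintptr
}

func (e *StackError) Error() string { return e.msg }

// NewStackErrorSkip is a hypothetical variant of NewStackError.
// skip counts extra frames to drop above this function: 0 records
// the direct caller, 1 records the caller's caller, and so on.
func NewStackErrorSkip(msg string, skip int) *StackError {
	const depth = 32
	var pcs [depth]uintptr
	// The base of 2 skips runtime.Callers and NewStackErrorSkip.
	n := runtime.Callers(2+skip, pcs[:])
	return &StackError{msg: msg, stack: pcs[:n]}
}

// TopFunction returns the function name of the first recorded frame.
func (e *StackError) TopFunction() string {
	if len(e.stack) == 0 {
		return ""
	}
	frame, _ := runtime.CallersFrames(e.stack).Next()
	return frame.Function
}

// notFound is a hypothetical helper that wraps the constructor.
// Passing skip=1 attributes the error to whoever called notFound.
func notFound(key string) *StackError {
	return NewStackErrorSkip("not found: "+key, 1)
}

func lookup() *StackError {
	return notFound("user:42")
}

func main() {
	// Prints the name of lookup, not notFound.
	fmt.Println(lookup().TopFunction())
}
```

Taking the skip count as a parameter keeps the arithmetic in one place. Each wrapper only needs to know how many of its own frames sit between the constructor and the code that should be blamed.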
The compiler rejects the return statement if you forget the Error method. You'll see an error along the lines of cannot use &StackError{…} as error value, ending with *StackError does not implement error (missing method Error). The receiver name e follows the Go convention of a short, consistent receiver name rather than this or self.
Decoding and formatting
Capturing the stack is half the battle. You also need to turn those raw addresses into readable text. The runtime.CallersFrames function decodes the program counters. It returns an iterator because decoding can be slow and you might not need all frames. This lazy approach saves work if the error is never logged.
Here's how to decode the stack. You iterate over the frames and build a string.
import (
	"fmt"
	"runtime"
	"strings"
)

// Stack converts the raw stack data into a human-readable format.
func (e *StackError) Stack() string {
	var sb strings.Builder
	sb.WriteString("Stack trace:\n")
	if len(e.stack) == 0 {
		// Nothing was captured, so there are no frames to decode.
		return sb.String()
	}
	// CallersFrames creates an iterator to decode program counters.
	frames := runtime.CallersFrames(e.stack)
	for {
		frame, more := frames.Next()
		// Append file and line number for each frame.
		// fmt.Fprintf writes into the builder, which implements io.Writer.
		fmt.Fprintf(&sb, "%s:%d\n", frame.File, frame.Line)
		if !more {
			break
		}
	}
	return sb.String()
}
The frame.File field might be empty for assembly functions or internal runtime code. The frame.Line is the line number in the source file. Using strings.Builder avoids allocating intermediate strings during concatenation.
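As a sketch of the same decoding loop with function names included, the standalone program below prints each frame in a layout similar to what a panic shows: the function name on one line, then an indented file:line. The helpers capture and formatFrames are hypothetical names for this example, and the "unknown" placeholder for a missing function name is an arbitrary choice.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// capture records the stack of its caller. Hypothetical helper for this sketch.
func capture() []uintptr {
	var pcs [32]uintptr
	// Skip runtime.Callers and capture itself.
	n := runtime.Callers(2, pcs[:])
	return pcs[:n]
}

// formatFrames renders each frame as "function\n\tfile:line\n".
func formatFrames(pcs []uintptr) string {
	if len(pcs) == 0 {
		// With no program counters there is nothing to decode.
		return ""
	}
	var sb strings.Builder
	frames := runtime.CallersFrames(pcs)
	for {
		frame, more := frames.Next()
		name := frame.Function
		if name == "" {
			name = "unknown" // Function may be empty when it is not known.
		}
		fmt.Fprintf(&sb, "%s\n\t%s:%d\n", name, frame.File, frame.Line)
		if !more {
			break
		}
	}
	return sb.String()
}

func main() {
	fmt.Print(formatFrames(capture()))
}
```

The function name is often more useful than the file path when scanning a trace, because it survives file moves and makes frames from the same package easy to spot.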
Decode lazily. Format only when you print.
Integrating with the standard library
Many logging libraries support %+v to print extended details. Implementing fmt.Formatter lets your error type play nicely with fmt.Printf and loggers that respect the interface. This allows callers to print just the message or the full stack based on the format verb.
Here's the formatter implementation. It checks for the + flag to decide whether to include the stack.
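import "fmt"

// Format implements fmt.Formatter to support %+v.
func (e *StackError) Format(f fmt.State, c rune) {
	switch c {
	case 'v':
		if f.Flag('+') {
			// Print message and stack when the + flag is set.
			fmt.Fprintf(f, "%s\n%s", e.msg, e.Stack())
			return
		}
		// Plain %v prints only the message.
		fmt.Fprint(f, e.msg)
	case 's':
		// Once Format is defined, fmt no longer falls back to Error,
		// so %s must be handled here or it prints nothing.
		fmt.Fprint(f, e.msg)
	case 'q':
		// %q prints the message as a quoted string.
		fmt.Fprintf(f, "%q", e.msg)
	}
}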
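A quick way to check a Format method is to render the same error with several verbs. The program below is a self-contained sketch: it repeats a compact version of the StackError type so it can run on its own, and render is a hypothetical helper added only for the demonstration. This Format handles %s and %q explicitly, since fmt stops calling Error once a type implements fmt.Formatter.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

type StackError struct {
	msg   string
	stack []uintptr
}

func (e *StackError) Error() string { return e.msg }

func NewStackError(msg string) error {
	var pcs [32]uintptr
	n := runtime.Callers(2, pcs[:])
	return &StackError{msg: msg, stack: pcs[:n]}
}

// Stack renders one "file:line" per captured frame.
func (e *StackError) Stack() string {
	if len(e.stack) == 0 {
		return ""
	}
	var sb strings.Builder
	frames := runtime.CallersFrames(e.stack)
	for {
		frame, more := frames.Next()
		fmt.Fprintf(&sb, "%s:%d\n", frame.File, frame.Line)
		if !more {
			break
		}
	}
	return sb.String()
}

// Format prints message plus stack for %+v, the bare message for
// %v and %s, and a quoted message for %q.
func (e *StackError) Format(f fmt.State, c rune) {
	switch {
	case c == 'v' && f.Flag('+'):
		fmt.Fprintf(f, "%s\n%s", e.msg, e.Stack())
	case c == 'v' || c == 's':
		fmt.Fprint(f, e.msg)
	case c == 'q':
		fmt.Fprintf(f, "%q", e.msg)
	}
}

// render is a hypothetical helper that formats err with the given verb.
func render(verb string, err error) string {
	return fmt.Sprintf(verb, err)
}

func main() {
	err := NewStackError("disk full")
	fmt.Println(render("%v", err))  // disk full
	fmt.Println(render("%s", err))  // disk full
	fmt.Println(render("%q", err))  // "disk full"
	fmt.Println(render("%+v", err)) // message, then one file:line per frame
}
```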
When errors are wrapped, you need errors.As to find the stack error inside the chain. Wrapping with fmt.Errorf preserves the error chain but doesn't copy the stack trace to the wrapper. The stack remains attached to the original error.
import (
	"errors"
	"fmt"
)

func process() error {
	err := NewStackError("timeout")
	// Wrap adds context but keeps the original error in the chain.
	return fmt.Errorf("processing failed: %w", err)
}

func main() {
	err := process()
	var se *StackError
	if errors.As(err, &se) {
		// se points to the original StackError inside the wrapper.
		fmt.Println(se.Stack())
	}
}
The errors.As function walks the chain and assigns the first error whose type matches the target. The target must be a non-nil pointer, and because Error is defined on *StackError, se is declared as *StackError and passed as &se. This pattern lets you extract structured data from wrapped errors without losing the context added by wrappers.
Trust errors.As to dig through wrappers.
Pitfalls and performance
Stack traces allocate memory. Each StackError carries a slice of up to 32 program counters, and because the slice outlives the constructor, its backing array is allocated on the heap. In a tight loop, creating thousands of errors with stack traces can trigger the garbage collector. Use stack traces for errors that indicate a failure, not for control flow.
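One way to see the allocation cost is to count allocations per call. The sketch below uses testing.AllocsPerRun from the standard library, which can be called from an ordinary program. The sink variable and the allocsPlain and allocsStack helpers are hypothetical names for this example. Exact counts depend on the Go version and compiler optimizations, so none are stated here.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"testing"
)

type StackError struct {
	msg   string
	stack []uintptr
}

func (e *StackError) Error() string { return e.msg }

func NewStackError(msg string) error {
	var pcs [32]uintptr
	n := runtime.Callers(2, pcs[:])
	return &StackError{msg: msg, stack: pcs[:n]}
}

// sink keeps each result reachable so the calls are not optimized away.
var sink error

// allocsPlain reports average heap allocations per errors.New call.
func allocsPlain() float64 {
	return testing.AllocsPerRun(1000, func() { sink = errors.New("boom") })
}

// allocsStack reports average heap allocations per NewStackError call.
func allocsStack() float64 {
	return testing.AllocsPerRun(1000, func() { sink = NewStackError("boom") })
}

func main() {
	fmt.Printf("errors.New:    %.0f allocs/op\n", allocsPlain())
	fmt.Printf("NewStackError: %.0f allocs/op\n", allocsStack())
}
```

Allocation counts tell only part of the story, since the stack walk itself also takes CPU time. A benchmark with go test -bench on your own call depths gives a fuller picture.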
The runtime.Callers function walks the stack, which takes time. The cost is usually small, but it adds up in hot paths. If you need to return errors frequently, consider a flag or configuration to enable stack traces only in debug mode.
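A debug-only switch could look something like this. It is a minimal sketch, assuming a process-wide toggle named captureStacks and a constructor named NewError, both hypothetical. It uses atomic.Bool, which requires Go 1.19 or later. A real service might set the toggle from a command-line flag or environment variable at startup.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync/atomic"
)

// captureStacks is a hypothetical process-wide switch. It defaults to
// false, so stack capture is opt-in.
var captureStacks atomic.Bool

type StackError struct {
	msg   string
	stack []uintptr
}

func (e *StackError) Error() string { return e.msg }

// NewError returns a plain error when capture is off and a
// *StackError with a recorded stack when it is on.
func NewError(msg string) error {
	if !captureStacks.Load() {
		return errors.New(msg)
	}
	var pcs [32]uintptr
	n := runtime.Callers(2, pcs[:])
	return &StackError{msg: msg, stack: pcs[:n]}
}

// HasStack reports whether err, or anything it wraps, carries a stack.
func HasStack(err error) bool {
	var se *StackError
	return errors.As(err, &se) && len(se.stack) > 0
}

func main() {
	fmt.Println(HasStack(NewError("timeout"))) // false: capture is off
	captureStacks.Store(true)
	fmt.Println(HasStack(NewError("timeout"))) // true: capture is on
}
```

Callers see the same error interface either way, so turning capture on or off changes only what the logs can show, not how the errors are handled.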
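Frame filtering is often necessary. Internal library frames clutter the output. You might want to drop frames by function name, such as those in the runtime package, or by file path, such as those under a vendor/ directory. The Stack method can be extended to skip frames that match either kind of pattern. The version below filters on the function name.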
// StackFiltered returns the stack trace excluding internal frames.
func (e *StackError) StackFiltered() string {
	var sb strings.Builder
	frames := runtime.CallersFrames(e.stack)
	for {
		frame, more := frames.Next()
		// Keep only frames with a known function outside the runtime package.
		if frame.Function != "" && !strings.HasPrefix(frame.Function, "runtime.") {
			// fmt.Fprintf writes into the builder, which implements io.Writer.
			fmt.Fprintf(&sb, "%s:%d\n", frame.File, frame.Line)
		}
		if !more {
			break
		}
	}
	return sb.String()
}
The compiler fails with an error like cannot use x as error value if you try to return a type that doesn't implement the interface. Always verify the interface implementation. Unused imports are also compile errors, not warnings. If you import runtime but don't use it, the build fails with "runtime" imported and not used. Remove unused imports so the code compiles.
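A common Go idiom for verifying the interface implementation is a compile-time assertion. The sketch below assigns a typed nil pointer to a blank variable of type error, so the build breaks at that line if the Error method is ever removed or its signature changes.

```go
package main

import "fmt"

type StackError struct {
	msg   string
	stack []uintptr
}

func (e *StackError) Error() string { return e.msg }

// Compile-time check: this line fails to build if *StackError
// stops satisfying the error interface. It costs nothing at run time.
var _ error = (*StackError)(nil)

func main() {
	var err error = &StackError{msg: "disk full"}
	fmt.Println(err) // disk full
}
```

The same pattern works for any interface the type is meant to satisfy, such as fmt.Formatter once a Format method is added.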
Errors are values, not control flow. Don't pay for stacks you don't need.
When to add stack traces
Use a stack-trace error when the error originates deep in the call stack and you need to identify the caller for debugging. Use a simple string error when the error is local and the context is obvious from the log line. Use fmt.Errorf with %w when you need to wrap an error and preserve the chain for errors.Is checks. Use a panic when the program is in an unrecoverable state and continuing would cause data corruption. Use a custom error type with fields when you need to attach structured data like request IDs or correlation IDs.
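Two of these choices, wrapping with %w for errors.Is and a custom type with fields for errors.As, can be combined in one chain. The sketch below assumes a hypothetical sentinel ErrNotFound and a hypothetical RequestError type carrying a request ID. The names and the message format are illustrative only.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel for errors.Is checks.
var ErrNotFound = errors.New("not found")

// RequestError is a hypothetical structured error carrying a request ID.
type RequestError struct {
	RequestID string
	Err       error
}

func (e *RequestError) Error() string {
	return fmt.Sprintf("request %s: %v", e.RequestID, e.Err)
}

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *RequestError) Unwrap() error { return e.Err }

// fetch is a hypothetical operation that always fails.
func fetch(id string) error {
	// %w keeps ErrNotFound in the chain.
	cause := fmt.Errorf("load profile: %w", ErrNotFound)
	return &RequestError{RequestID: id, Err: cause}
}

// requestID extracts the ID from anywhere in the chain.
func requestID(err error) string {
	var re *RequestError
	if errors.As(err, &re) {
		return re.RequestID
	}
	return ""
}

// isNotFound reports whether the chain contains ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	err := fetch("req-123")
	fmt.Println(err)             // request req-123: load profile: not found
	fmt.Println(isNotFound(err)) // true
	fmt.Println(requestID(err))  // req-123
}
```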
Wrap errors to preserve context. Capture stacks to find the source.