Cancellation trees in Go
You are building a service that fetches user data from three different microservices. The frontend waits for all three to return. One service is slow or crashes. You don't want the other two hanging around wasting resources. You need a way to shout "stop" to everyone the moment one thing goes wrong.
Go does not provide a built-in data structure called a "cancellation tree." Instead, the language gives you context.Context. You build the tree by passing contexts down to child goroutines. When a parent cancels, the signal flows to every child that holds a derived context. This pattern is the backbone of cancellation in Go. It works for HTTP requests, CLI tools, background workers, and database transactions.
Think of a construction site. The foreman holds the master switch. If the foreman flips it, every worker stops. If a worker finds a gas leak, they flip the switch for their zone, and everyone in that zone stops. The switch is the context. Flipping it is cancellation. The wiring is the call stack. You don't need to track individual worker IDs. You just pass the switch handle down the chain.
The context tree
context.Context is an interface that carries deadlines, cancellation signals, and request-scoped values across API boundaries. Pass it explicitly as a function argument rather than storing it in a struct. By convention it is the first parameter and is named ctx. The standard library and most third-party packages follow this convention, so a function that can block or be cancelled almost always takes a context first.
The tree starts with a root context. context.Background() returns an empty root context for the top level of your program. context.TODO() is a placeholder when you don't have a context yet; it behaves like Background() but signals that you need to wire one up later.
You create child contexts using helper functions. context.WithCancel returns a derived context and a cancel function. Calling cancel closes the Done channel for that context and all its descendants. context.WithTimeout and context.WithDeadline add automatic cancellation after a duration or at a specific time. context.WithCancelCause (added in Go 1.20) allows you to attach an error to the cancellation signal, so downstream code knows why the work stopped.
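As a minimal sketch of the cause mechanism described above: the program below derives a child context with context.WithCancelCause, cancels it with a reason, and then reads both the generic error and the specific cause. The errQuota sentinel and the cancelWithCause helper are hypothetical names invented for this illustration, and the code assumes Go 1.20 or later.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errQuota is a hypothetical sentinel error used only for this illustration.
var errQuota = errors.New("quota exceeded")

// cancelWithCause derives a child context, cancels it with a cause, and
// returns what downstream code would observe on that child.
func cancelWithCause() (err, cause error) {
	parent := context.Background()

	// WithCancelCause requires Go 1.20 or later.
	child, cancel := context.WithCancelCause(parent)

	// The cancel function takes the reason as its argument.
	cancel(errQuota)

	// Done is closed once cancel has been called, so this does not block.
	<-child.Done()

	// Err reports the generic reason; Cause reports the specific one.
	return child.Err(), context.Cause(child)
}

func main() {
	err, cause := cancelWithCause()
	fmt.Println("err:", err)     // err: context canceled
	fmt.Println("cause:", cause) // cause: quota exceeded
}
```

Note that ctx.Err() still returns context.Canceled; the cause is extra information that only callers of context.Cause see.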
Context is plumbing. Run it through every long-lived call site.
Minimal example
Here is the simplest way to cancel a group of goroutines. You create a context, pass it to workers, and call cancel when you are done.
package main

import (
	"context"
	"fmt"
	"time"
)

// Worker simulates a task that checks for cancellation.
func Worker(ctx context.Context, id int) {
	select {
	case <-time.After(2 * time.Second):
		// The simulated work finished before anyone cancelled.
		fmt.Printf("Worker %d finished\n", id)
	case <-ctx.Done():
		// Context was cancelled; stop work immediately.
		fmt.Printf("Worker %d cancelled\n", id)
	}
}

func main() {
	// WithCancel creates a derived context and a cancel function.
	// Calling cancel closes the Done channel for all derived contexts.
	ctx, cancel := context.WithCancel(context.Background())

	for i := 0; i < 3; i++ {
		// Pass the context to each goroutine so they can listen for cancellation.
		go Worker(ctx, i)
	}

	time.Sleep(1 * time.Second)

	// Invoke cancel to signal all workers to stop.
	cancel()

	// Give the workers a moment to print before main exits.
	time.Sleep(100 * time.Millisecond)
}
The select statement waits on two channels. time.After sends a value after two seconds. ctx.Done() returns a channel that closes when the context is cancelled. When cancel() runs, the Done channel closes. The select picks the ctx.Done() case, and the goroutine exits.
This pattern scales. If Worker spawns sub-workers, it passes the same ctx. When the parent cancels, the sub-workers wake up too. The tree propagates the signal automatically. You never need to store a list of goroutine IDs. The context handles the topology.
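To make the tree shape concrete, here is a small sketch that mirrors the construction-site analogy: a root context with two zones, and one task inside zone A. Cancelling zone A stops the task beneath it but leaves the root and zone B untouched. The names zoneStates, zoneA, zoneB, and task are hypothetical and exist only for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// zoneStates builds a small tree (root -> zoneA, zoneB; zoneA -> task) and
// reports which contexts are cancelled after only zoneA's cancel is called.
func zoneStates() map[string]bool {
	root, cancelRoot := context.WithCancel(context.Background())
	defer cancelRoot()

	zoneA, cancelA := context.WithCancel(root)
	zoneB, cancelB := context.WithCancel(root)
	defer cancelB()

	// task is a grandchild of root, derived from zoneA.
	task, cancelTask := context.WithCancel(zoneA)
	defer cancelTask()

	// Cancel only zone A. The signal flows down to task, never up or sideways.
	cancelA()

	return map[string]bool{
		"root":  root.Err() != nil,
		"zoneA": zoneA.Err() != nil,
		"task":  task.Err() != nil,
		"zoneB": zoneB.Err() != nil,
	}
}

func main() {
	states := zoneStates()
	for _, name := range []string{"root", "zoneA", "task", "zoneB"} {
		fmt.Printf("%s cancelled: %v\n", name, states[name])
	}
}
```

Cancellation only ever travels from a context to its descendants. A worker that needs to stop its siblings has to cancel a context they all derive from, not its own.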
Realistic pattern with errgroup
Manual cancellation works, but real code often needs more. You want to wait for all goroutines to finish. You want to cancel everyone if one fails. You want to capture the error. The errgroup package from golang.org/x/sync handles this boilerplate.
Here is a helper function that simulates a network call. It respects cancellation.
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/sync/errgroup"
)

// FetchData simulates a network call that respects cancellation.
func FetchData(ctx context.Context, service string) error {
	select {
	case <-time.After(1 * time.Second):
		// Simulate failure after a delay.
		return fmt.Errorf("service %s unavailable", service)
	case <-ctx.Done():
		// Return the context error if cancelled.
		return ctx.Err()
	}
}
Now here is the main function using errgroup. It binds the group to the context and cancels automatically on the first error.
func main() {
	// WithCancelCause returns a cancel function that takes the reason as an argument.
	ctx, cancel := context.WithCancelCause(context.Background())
	// Defer cancel so the context's resources are released on every return path.
	// Passing nil records context.Canceled as the cause.
	defer cancel(nil)

	// errgroup.WithContext derives a context that is cancelled when the
	// first goroutine returns a non-nil error, or when Wait returns.
	g, ctx := errgroup.WithContext(ctx)

	services := []string{"auth", "billing", "inventory"}
	for _, svc := range services {
		// Copy the loop variable. Required before Go 1.22; harmless after.
		svc := svc
		g.Go(func() error {
			return FetchData(ctx, svc)
		})
	}

	// Wait blocks until every goroutine has returned, then reports the first non-nil error.
	if err := g.Wait(); err != nil {
		fmt.Printf("Operation failed: %v\n", err)
		// Recent versions of errgroup record the first error as the cancellation cause.
		if cause := context.Cause(ctx); cause != nil {
			fmt.Printf("Root cause: %v\n", cause)
		}
	}
}
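errgroup.WithContext returns a group and a derived context. Each call to g.Go starts a goroutine that the group tracks. When any goroutine returns a non-nil error, the group cancels the derived context, which signals the remaining goroutines to stop. g.Wait does not return early: it blocks until every goroutine has returned, then reports the first non-nil error. That is why each goroutine must watch ctx.Done(). Recent versions of errgroup also record that first error as the cancellation cause, so context.Cause(ctx) returns it; with older versions you get only context.Canceled. Either way, this is cleaner than managing a separate error channel or mutex.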
The worst goroutine bug is the one that never logs.
Pitfalls and errors
Cancellation trees are simple, but they have traps. The compiler and runtime will catch some mistakes, but others require discipline.
Before Go 1.22, a for loop reused a single variable across all iterations. A goroutine closure that captured it could see a later value, often the last one. The compiler accepted this code; go vet flagged it with loop variable i captured by func literal. Shadowing the variable with i := i or svc := svc fixes this, because the shadow creates a new variable for each iteration. From Go 1.22 on, each iteration gets its own variable as long as the module's go.mod declares go 1.22 or later, so the shadow is unnecessary but harmless.
Goroutine leaks happen when a goroutine waits on a channel that never closes. If you spawn a goroutine and don't pass the context, or the goroutine blocks on a channel without checking ctx.Done(), it leaks. The context is your escape hatch. Always check ctx.Done() in long-running loops or blocking calls. If you wrap a channel operation, use select with ctx.Done().
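The advice about wrapping channel operations can be sketched as follows. The producer below would block forever on its send once the consumer stops reading; putting the send inside a select with ctx.Done() gives it a way out. The produce and consumeFirst functions, and the done channel used to confirm the producer has exited, are hypothetical helpers written for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// produce sends increasing integers on out until ctx is cancelled.
// The select around the send is what prevents the leak: if the consumer
// stops reading, the goroutine can still exit through ctx.Done().
func produce(ctx context.Context, out chan<- int, done chan<- struct{}) {
	defer close(done)
	for i := 0; ; i++ {
		select {
		case out <- i:
		case <-ctx.Done():
			return
		}
	}
}

// consumeFirst reads n values, then cancels and waits for the producer to exit.
func consumeFirst(n int) []int {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan int)
	done := make(chan struct{})
	go produce(ctx, out, done)

	got := make([]int, 0, n)
	for len(got) < n {
		got = append(got, <-out)
	}

	cancel()
	<-done // closed only after produce has returned
	return got
}

func main() {
	fmt.Println(consumeFirst(3)) // [0 1 2]
}
```

With a bare out <- i instead of the select, the call to cancel would have no effect and the producer goroutine would stay blocked for the life of the process.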
Other mistakes never get past the compiler. Pass an int constant where a string is expected and you get an error along the lines of cannot use 1 (untyped int constant) as string value in argument to f. Use a package without importing it and you get undefined: pkg. Import one without using it and you get "pkg" imported and not used. These errors are straightforward. Fix them and move on.
Context values are a feature, not a free-for-all. context.WithValue lets you attach data to the context. Use it only for request-scoped data such as trace IDs or authentication tokens. Deadlines are not values; they have their own mechanism in WithDeadline and WithTimeout. Never use context to pass optional parameters to functions. That is a sign of bad API design. If a function needs a parameter, add it to the signature. The context package documentation says this explicitly, and the community follows it.
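One way to keep context values disciplined is sketched below: an unexported key type, a pair of small accessor functions, and ordinary parameters for everything the function actually needs. The names traceIDKey, withTraceID, traceID, handle, and run are hypothetical and chosen only for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// traceIDKey is an unexported key type, so keys defined in other packages
// can never collide with it.
type traceIDKey struct{}

// withTraceID attaches a request-scoped trace ID to the context.
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

// traceID reads the trace ID back, reporting whether one was set.
func traceID(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(traceIDKey{}).(string)
	return id, ok
}

// handle takes userID as a regular parameter because the function needs it.
// Only the trace ID, which is request-scoped metadata, rides in the context.
func handle(ctx context.Context, userID int) string {
	id, ok := traceID(ctx)
	if !ok {
		id = "none"
	}
	return fmt.Sprintf("trace=%s user=%d", id, userID)
}

// run calls handle once with a trace ID and once without.
func run() []string {
	withID := withTraceID(context.Background(), "abc123")
	return []string{
		handle(withID, 42),
		handle(context.Background(), 42),
	}
}

func main() {
	for _, line := range run() {
		fmt.Println(line)
	}
}
```

The accessor pair keeps the type assertion in one place, and handle still works when no trace ID is present.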
Trust gofmt. Argue logic, not formatting.
Decision matrix
Pick the right tool based on your needs. Context functions compose, so you can chain them. errgroup wraps context, so you can use both.
Use context.WithCancel when you need to stop a group of goroutines manually and don't need to propagate an error reason.
Use context.WithCancelCause when you need to propagate the error reason along with the cancellation signal so downstream code can inspect the cause.
Use errgroup.WithContext when you want automatic cancellation on the first error and want to collect errors easily without writing boilerplate.
Use context.WithTimeout when the operation must finish within a specific duration regardless of errors, and you want the context to cancel automatically.
Use context.WithDeadline when you have an absolute time limit, like a database transaction expiry or a scheduled job cutoff.
Use plain goroutines without context when the task is fire-and-forget and cancellation is irrelevant, such as logging a metric or updating a counter.
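As an illustration of the context.WithTimeout entry, the sketch below runs a simulated call under a time budget and checks whether the deadline was the reason it stopped. slowCall and callWithBudget are hypothetical helpers, and the millisecond values are arbitrary numbers picked for the demo.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowCall is a hypothetical stand-in for work that takes d to complete
// unless the context is cancelled first.
func slowCall(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// callWithBudget runs slowCall under a timeout and reports whether the
// deadline expired before the work finished.
func callWithBudget(workMs, budgetMs int) (timedOut bool, err error) {
	budget := time.Duration(budgetMs) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	// Always call cancel: it releases the timer when the work finishes early.
	defer cancel()

	err = slowCall(ctx, time.Duration(workMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded), err
}

func main() {
	timedOut, err := callWithBudget(200, 20)
	fmt.Println(timedOut, err) // true context deadline exceeded

	timedOut, err = callWithBudget(5, 500)
	fmt.Println(timedOut, err) // false <nil>
}
```

context.WithDeadline behaves the same way but takes an absolute time.Time instead of a duration.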