The orphan goroutine problem
You write an HTTP handler that fetches user data from a database and checks a cache. You spin up two goroutines to do the work in parallel. The handler returns the result. Everything looks fine until you deploy. The load balancer marks your instance unhealthy because the process is holding too many open file descriptors. You dig in and find hundreds of goroutines stuck in the background, waiting for a response that never comes because the parent request already finished and closed the connection. The goroutines are orphans. They have no parent to wait for them, no signal to stop, and no way to report errors back to the caller.
This is the core problem with raw concurrency. Goroutines are lightweight and easy to spawn, but they are also easy to lose. Without discipline, background tasks can outlive the code that created them. They hold memory, keep database connections open, and block on channels that no one will ever close. The process slowly leaks resources until it crashes. Structured concurrency solves this by enforcing a rule: a parent goroutine must wait for all its children to finish before it exits. The lifetime of the child is bounded by the lifetime of the parent.
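To make the leak concrete, here is a minimal sketch of how an orphan arises. The helper names `leakyFetch` and `leakedAfter` are hypothetical and exist only for this illustration; the "backend" is simulated with a sleep. The caller gives up after a timeout, and the worker is left blocked on a send that nobody will ever receive.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leakyFetch is a hypothetical helper: it starts a worker and gives up
// after timeout. The channel is unbuffered, so once the caller has left,
// the worker blocks on its send forever.
func leakyFetch(timeout time.Duration) (string, bool) {
	ch := make(chan string)
	go func() {
		time.Sleep(20 * time.Millisecond) // simulated slow backend
		ch <- "profile"                   // no receiver remains: blocks forever
	}()
	select {
	case v := <-ch:
		return v, true
	case <-time.After(timeout):
		return "", false // the caller leaves; the worker is now an orphan
	}
}

// leakedAfter calls leakyFetch n times and reports how many extra
// goroutines are still alive after the workers have had time to finish.
func leakedAfter(n int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		leakyFetch(time.Millisecond)
	}
	time.Sleep(100 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("leaked goroutines:", leakedAfter(10))
}
```

Nothing in this program reports an error. The only symptom is a goroutine count that never goes back down.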
Bounding the lifetime of children
Structured concurrency is a programming model where every concurrent task has a clear parent, and the parent does not exit until all its children have completed. The scope of the children is contained within the scope of the parent. If the parent needs to stop early, it cancels the children. If a child fails, the parent finds out. There are no stray goroutines still running after the function that spawned them returns.
Go does not enforce this structure at the compiler level. The language gives you raw goroutines and channels. You build the structure using sync.WaitGroup, context.Context, and careful channel design. The result is code where concurrency is visible in the control flow, not hidden in background threads that outlive their creators. The hierarchy matches the function call tree. When you read the code, you can see exactly which goroutines belong to which function.
Goroutines are cheap. Structure is free. Build the tree.
The minimal pattern
Here's the skeleton of structured concurrency: a wait group tracks active children, and a context carries the cancellation signal.
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// FetchData simulates a concurrent operation.
func FetchData(ctx context.Context, wg *sync.WaitGroup) {
	defer wg.Done() // Decrement counter when function returns.
	select {
	case <-time.After(100 * time.Millisecond):
		fmt.Println("Data fetched")
	case <-ctx.Done(): // Exit if parent cancels.
		fmt.Println("Cancelled")
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // Ensure resources are cleaned up.
	var wg sync.WaitGroup
	wg.Add(1)              // Reserve one slot in the counter.
	go FetchData(ctx, &wg) // Spawn child goroutine.
	wg.Wait()              // Block until counter reaches zero.
}
How the pieces fit together
The sync.WaitGroup maintains an internal counter. Calling Add(1) increments it. The goroutine calls Done() when it finishes, which decrements the counter. Wait() blocks the calling goroutine until the counter hits zero. The counter is atomic, so it handles concurrent increments and decrements safely. You can call Add multiple times before any goroutine starts. This is safe and recommended. Done is just Add(-1).
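As a small illustration of reserving the counter up front, the sketch below calls Add once with the full worker count before any goroutine starts. The function name `squareAll` is made up for this example; the point is only the ordering of Add, Done, and Wait.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll is a hypothetical helper that squares each input in its own
// goroutine and waits for all of them before returning.
func squareAll(nums []int) []int {
	out := make([]int, len(nums))
	var wg sync.WaitGroup
	wg.Add(len(nums)) // reserve every slot before any goroutine starts
	for i, n := range nums {
		go func(i, n int) {
			defer wg.Done() // equivalent to wg.Add(-1)
			out[i] = n * n  // each worker writes only its own index
		}(i, n)
	}
	wg.Wait() // returns once the counter is back to zero
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}))
}
```

Because every worker writes to a distinct slice element and the parent reads only after Wait returns, no extra locking is needed here.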
The defer statement schedules Done to run when the function returns. This covers normal returns and panics. If the goroutine panics, Done still runs, so the counter decrements. The panic will still crash the program unless you have a recover, but the WaitGroup won't leak. The structure guarantees that main only exits after FetchData completes or cancels. No goroutine escapes the scope of main.
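The interaction between defer, panic, and the counter can be sketched as follows. The names `safeWorker` and `runWorker` are hypothetical. This variant also recovers the panic and turns it into an error, which is one possible policy and not the only reasonable one.

```go
package main

import (
	"fmt"
	"sync"
)

// safeWorker is a hypothetical worker. Deferred calls run in LIFO order,
// so the recover closure runs first and wg.Done runs last, even when the
// body panics.
func safeWorker(wg *sync.WaitGroup, errs chan<- error, fail bool) {
	defer wg.Done()
	defer func() {
		if r := recover(); r != nil {
			errs <- fmt.Errorf("worker panicked: %v", r)
		}
	}()
	if fail {
		panic("boom")
	}
}

// runWorker starts one worker, waits for it, and returns its error, if any.
func runWorker(fail bool) error {
	var wg sync.WaitGroup
	errs := make(chan error, 1) // buffered so the send never blocks
	wg.Add(1)
	go safeWorker(&wg, errs, fail)
	wg.Wait()
	close(errs)
	return <-errs // a closed, empty channel yields nil
}

func main() {
	fmt.Println(runWorker(true))
	fmt.Println(runWorker(false))
}
```

Without the recover, the deferred Done would still run, but the panic would then terminate the whole process.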
The context.Context flows from parent to child. WithCancel creates a new node in the context tree. Calling cancel closes the done channel of that node and of every context derived from it. Any child listening on ctx.Done() sees the signal and stops. The select statement blocks until one channel is ready. If ctx.Done() fires, the select picks that case and the goroutine exits. This is how cancellation propagates. The parent calls cancel, the context closes, the children see the signal, and they stop.
Context is plumbing. Run it through every long-lived call site.
Why Go leaves structure to you
Go's designers chose not to bake structured concurrency into the language. Other languages do: Kotlin uses coroutines with structured scopes, and Python added asyncio.TaskGroup in version 3.11. Go sticks to goroutines and channels. The reasoning is simplicity and flexibility. Goroutines are lightweight and can be spawned anywhere. Enforcing a tree structure would require changes to the runtime or compiler that might restrict valid use cases.
Go prefers to give you the tools and let you build the patterns. The community has converged on WaitGroup and Context as the standard approach. Libraries like errgroup build on top of these primitives. This approach keeps the language small. You only pay for the structure you need. If you don't need structure, you don't use WaitGroup. If you need it, you add it. The trade-off is that you must remember to add it. The compiler won't stop you from leaking a goroutine. You rely on code review and testing to catch leaks.
Trust the primitives. Compose the pattern.
Realistic aggregation with errors
Here's a function that aggregates results from multiple sources, collects errors, and respects cancellation.
// AggregateResults combines data from multiple sources.
func AggregateResults(ctx context.Context) ([]byte, error) {
	var wg sync.WaitGroup
	// Buffered to hold errors from both children.
	errChan := make(chan error, 2)
	results := make([][]byte, 2)
	wg.Add(2)
	// Launch first worker.
	go func() {
		defer wg.Done()
		data, err := fetchProfile(ctx)
		if err != nil {
			errChan <- err // Report error without blocking.
			return
		}
		results[0] = data
	}()
	// Launch second worker.
	go func() {
		defer wg.Done()
		data, err := fetchPrefs(ctx)
		if err != nil {
			errChan <- err // Report error without blocking.
			return
		}
		results[1] = data
	}()
The error channel is buffered to size two because there are two workers, so each worker can deposit its error without blocking, whether or not the parent is reading yet. With an unbuffered channel, a worker would block forever on the send if the parent returned early or stopped reading. The go func() { wg.Wait(); close(errChan) }() idiom closes the channel exactly once, and only after every worker has called Done, so no worker can send on a closed channel.
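	// Wait for workers, then close the error channel.
	go func() {
		wg.Wait()
		close(errChan)
	}()
	// Drain the channel so we return only after every worker is done.
	// Keep the first error and discard the rest.
	var firstErr error
	for err := range errChan {
		if firstErr == nil {
			firstErr = err
		}
	}
	if firstErr != nil {
		return nil, firstErr
	}
	return merge(results), nil
}
The parent ranges over the channel. range blocks until the channel is closed and drained, and the channel closes only after wg.Wait() returns, so the loop ends only when every worker has finished. The loop keeps the first error and discards the rest, and the function returns that error after the workers are done rather than abandoning them mid-flight. Returning from inside the loop on the first error would leave the other worker running after its parent had gone, which is exactly the orphan this pattern is meant to prevent. Draining does not by itself stop the remaining worker early: for that, the caller's context must be cancelled, or the function can derive its own context with context.WithCancel and call cancel when the first error arrives.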
Notice the function signature. ctx is the first parameter, named ctx. This is the Go convention. Any function that starts a goroutine or calls another function that might block should accept a context. The context flows down the call stack, carrying deadlines and cancellation signals. Also, the error handling uses the standard if err != nil check. Go makes the unhappy path explicit, but the compiler will not stop you from discarding an error with _ or from ignoring a return value entirely, so the discipline is yours. The community prefers checking errors immediately, and linters such as errcheck flag the ones you miss.
Errors must escape. Cancel must propagate. The parent owns the lifecycle.
Pitfalls and panics
Misusing WaitGroup causes panics. If you call Done() more times than Add(), the program panics with sync: negative WaitGroup counter. This happens when Done runs twice for a single Add, for example an explicit wg.Done() on an early-return path in addition to a deferred one. Always pair Add and Done. Use defer wg.Done() to ensure the decrement happens exactly once.
If you call Add() inside the goroutine rather than before the go statement, you have a race condition. The parent can reach Wait() before the goroutine has been scheduled, see a zero counter, and return. The goroutine then runs Add, but it's too late. The parent has moved on and the goroutine is now an orphan. Always call Add before launching the goroutine. The compiler won't catch this. You need the race detector or careful logic.
Context leaks are subtle. If you forget to pass ctx to a database query, the query runs until timeout or completion, even if the parent cancelled. The goroutine holds a connection. Over time, connections exhaust. The server stops accepting requests. The fix is to pass ctx everywhere. Use ctx for any operation that might block.
Closing a channel twice panics. close(errChan) must happen exactly once. If two goroutines try to close the same channel, you get panic: close of closed channel. Use the go func() { wg.Wait(); close(ch) }() pattern to ensure a single closer. Only that one goroutine ever calls close, and the WaitGroup guarantees every sender has finished first, so no worker sends on a closed channel either.
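Both panics are easy to reproduce in isolation. The sketch below uses a hypothetical `catch` helper to recover each panic and return its message; `doubleDone` and `doubleClose` are likewise invented names that trigger the two misuse cases described above.

```go
package main

import (
	"fmt"
	"sync"
)

// catch runs f and returns the panic message, or "" if f returned normally.
func catch(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

// doubleDone calls Done twice for a single Add.
func doubleDone() {
	var wg sync.WaitGroup
	wg.Add(1)
	wg.Done()
	wg.Done() // the counter would drop below zero: panics
}

// doubleClose closes the same channel twice.
func doubleClose() {
	ch := make(chan error)
	close(ch)
	close(ch) // second close panics
}

func main() {
	fmt.Println(catch(doubleDone))
	fmt.Println(catch(doubleClose))
}
```

In production code the goal is to make these states unreachable, not to recover from them. The recover here exists only to show the messages.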
A leaked goroutine is a silent memory leak. Track every spawn.
Choosing the right tool
Use a sync.WaitGroup when you need to wait for a fixed set of goroutines to finish and you don't need to collect their results. Use a buffered channel when you need to collect results or errors from multiple goroutines and handle them in the parent. Use golang.org/x/sync/errgroup when you want structured concurrency with automatic error propagation and cancellation on the first error. Use sequential code when the operations are fast or dependent on each other, avoiding the complexity of concurrency entirely. Use a context.Context with a timeout when the operation must complete within a specific duration, regardless of the children's progress.
Concurrency adds complexity. Only pay the cost when it buys you speed or responsiveness.