How to Use go tool trace for Execution Tracing

Use go tool trace to visualize execution traces generated by runtime/trace, go test -trace, or the net/http/pprof trace endpoint.

When CPU profiling leaves you guessing

Your Go server handles requests fine under light load, but once traffic spikes, latency jumps. You run pprof, and the CPU profile looks empty. The goroutines aren't burning cycles; they're waiting. Something is blocking the scheduler, or a lock is contended, or the garbage collector is pausing the world for too long. CPU profiling tells you where time is spent computing. It doesn't tell you why time is spent waiting. You need a timeline of the runtime itself.

Execution tracing records events inside the Go runtime. It captures goroutine creation, scheduling, system calls, garbage collection, and network I/O. Think of it as a black box recorder for your program. Instead of measuring how much fuel the engine burned, you get a second-by-second log of every gear shift, every idle moment, and every time the engine stalled. The trace file contains a stream of events with timestamps. go tool trace reads that stream and renders an interactive timeline in your browser.

Tracing reveals the invisible.

What execution tracing captures

The runtime emits events whenever a goroutine changes state. When a goroutine blocks on a channel, the trace records a block event (GoBlock in the runtime's event vocabulary) with the timestamp and the stack trace. When the goroutine becomes runnable again, an unblock event (GoUnblock) follows. The tracer also records garbage collection phases, system calls, and network operations. You can add your own annotations with trace.NewTask, trace.WithRegion, and trace.Log to mark tasks, regions, and log points in your code. This lets you correlate runtime behavior with application logic.

While tracing is active, the runtime collects events in in-memory buffers and streams them to the io.Writer you passed to trace.Start. trace.Stop flushes what remains and returns only after all writes have completed. The file format is a compact binary. go tool trace parses the file, starts a local HTTP server, and opens your browser to an interactive viewer. The main timeline puts time on the x-axis and, by default, one row per logical processor (P) on the y-axis, with the goroutines that ran on each processor drawn as slices. You can zoom in to see individual scheduling decisions.

The trace file is your timeline.

Minimal trace example

Here's the minimal setup to capture a trace. You create a file, start the tracer, do work, and stop. The runtime/trace package provides the API. You must call trace.Start before the work and trace.Stop after. Use defer to ensure the trace stops even if the function panics.

package main

import (
	"os"
	"runtime/trace"
	"time"
)

func main() {
	// Create a file to store the trace data on disk.
	f, err := os.Create("trace.out")
	if err != nil {
		panic(err)
	}
	// Close the file when main returns.
	defer f.Close()

	// Start capturing goroutine scheduling, GC, and syscalls.
	if err := trace.Start(f); err != nil {
		panic(err)
	}
	// Ensure tracing stops even if the function panics.
	defer trace.Stop()

	// Sleep long enough to generate visible events in the trace viewer.
	time.Sleep(100 * time.Millisecond)
}

Run the program to generate the trace file. Then pass the file to go tool trace.

go run main.go
go tool trace trace.out

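The command starts a local server and opens your browser on a landing page with links to the available views. Open the main trace view. You'll see the main goroutine run briefly and then sleep. In the per-processor timeline the sleep shows up as a gap, because no goroutine is running during it. Click the slice just before the gap to see the goroutine's end stack trace. It points to time.Sleep.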

Start and stop are paired.

Walking through the runtime events

When you call trace.Start, the runtime begins recording events, and a background goroutine writes them to your io.Writer. Every goroutine state change adds to the stream, so the output keeps growing for as long as tracing is active. On a busy server the file gets large quickly, and large traces are slow to load and hard to navigate in the viewer. For long-running programs, capture traces for short windows or attach tasks to specific requests.

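The trace viewer has multiple views, and the exact labels vary between Go versions. The main timeline ("View trace") shows one row per logical processor, and each slice on a row is a goroutine running there. Gaps mean nothing was running on that processor. Click a slice to see the goroutine's ID and its start and end stack traces. The "Goroutine analysis" page groups goroutines by the function they started in and breaks each one's lifetime into execution, network wait, sync block, blocking syscall, and scheduler wait. This helps you find where a goroutine is blocked.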

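Recent Go releases also offer a thread-oriented timeline that shows OS threads instead of processors. Goroutines migrate between threads as the scheduler assigns work. If processors sit idle while goroutines are runnable, check GOMAXPROCS and look for goroutines stuck in syscalls that tie up threads. The "Scheduler latency profile" shows where goroutines waited longest between becoming runnable and actually running. Together these views help you diagnose thread starvation or syscall bottlenecks.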

Custom tasks are listed on the "User-defined tasks" page, grouped by task name. This helps isolate specific request paths. If you see a task taking too long, open it and zoom in to see the runtime events inside the task. You might find a GC pause, lock contention, or a slow syscall.

Custom tasks bridge the gap between runtime and app code.

Realistic example with user tasks

Real code needs more than a sleep. Here's an HTTP handler with a user-defined task. The trace viewer highlights custom tasks, making it easier to spot slow paths in your application logic. The handler simulates a request that blocks on I/O. The task wraps the handler logic.

package main

import (
	"fmt"
	"net/http"
	"runtime/trace"
	"time"
)

// slowHandler simulates a request that blocks on I/O.
func slowHandler(w http.ResponseWriter, r *http.Request) {
	// Create a user-defined task visible in the trace.
	// The task name appears in the viewer's task list.
	// NewTask returns a derived context and the task; it never returns an error.
	ctx, task := trace.NewTask(r.Context(), "slowHandler")
	defer task.End()

	// Block for 50ms to simulate waiting on a database.
	// The region ties the wait to the task in the trace viewer.
	trace.WithRegion(ctx, "simulatedDBWait", func() {
		time.Sleep(50 * time.Millisecond)
	})
	fmt.Fprintln(w, "done")
}

The main function starts the server, triggers a request, and captures the trace. The trace includes the handler's task. You can see the task duration and the sleep inside it.

package main

import (
	"net/http"
	"os"
	"runtime/trace"
	"time"
)

func main() {
	// Create a file to store the trace data on disk.
	f, err := os.Create("trace.out")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Start capturing goroutine scheduling, GC, and syscalls.
	if err := trace.Start(f); err != nil {
		panic(err)
	}
	// Ensure tracing stops even if the function panics.
	defer trace.Stop()

	http.HandleFunc("/slow", slowHandler)
	// Run the HTTP server so the main goroutine can trigger requests.
	go func() {
		_ = http.ListenAndServe(":8080", nil)
	}()

	// Give server time to start.
	time.Sleep(100 * time.Millisecond)

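	// Fire a request to generate handler events.
	resp, err := http.Get("http://localhost:8080/slow")
	if err != nil {
		panic(err)
	}
	// Close the body so the client connection is released.
	resp.Body.Close()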

	// Allow the handler to complete before stopping the trace.
	time.Sleep(200 * time.Millisecond)
}

Run the program and open the trace. In the viewer, open the "User-defined tasks" page. You'll see slowHandler listed as a task type. Click through to a task instance to see the goroutine running it. The sleep appears inside the task. This confirms the handler is slow due to the sleep.

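trace.NewTask takes a context.Context and returns a derived context together with the task. Pass the derived context to the functions the handler calls, so that their regions and log messages are attributed to the same task. The task ends when you call task.End, not when the context is cancelled, so always pair NewTask with a deferred End. This follows the convention of passing context.Context as the first parameter. Functions that take a context should respect cancellation and deadlines.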

Context is plumbing. Run it through every long-lived call site.

Reading the timeline

The trace viewer helps you answer specific questions. Is a goroutine stuck? Look for a long gap after its last slice, or a large blocked share on the "Goroutine analysis" page. Click the last slice before the gap to see the end stack trace. If the stack shows a channel operation, the goroutine is waiting on a channel. If the stack shows a syscall, the goroutine is waiting on the OS. Is the GC pausing too long? Look for "STW" (Stop The World) events. STW pauses freeze all goroutines. If GC cycles are frequent or STW pauses are long, look at your allocation rate and heap size. Is a lock contended? Open the synchronization blocking profile, or look for several goroutines whose end stacks sit in the same sync.Mutex Lock call. If multiple goroutines are waiting on the same mutex, the lock is a bottleneck.

User tasks help you correlate runtime events with application logic. If a task is slow, zoom in to see what's happening inside. You might find a GC pause, a lock contention, or a slow syscall. This helps you pinpoint the root cause. The trace viewer also shows network I/O. You can see when a connection is established, data is sent, and data is received. This helps you diagnose network latency.

Every wait has a stack trace. Read it.

Pitfalls and overhead

Tracing adds overhead. The runtime must record events and manage the buffer. Expect your program to run slower while tracing is active. This is fine for debugging, but avoid enabling traces in production traffic unless you have a specific incident to investigate. The overhead depends on the number of goroutines and the frequency of events. Programs with many goroutines or frequent syscalls will see higher overhead.

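If you forget to call trace.Stop, the last buffered events may never reach the file, and go tool trace may reject the truncated output or show an incomplete timeline. Always pair trace.Start with defer trace.Stop, and remember that os.Exit and log.Fatal skip deferred calls. trace.Start returns an error if tracing is already active, so only one trace can run at a time. The compiler won't catch this. Check the error: if err := trace.Start(f); err != nil { log.Fatal(err) }. Write failures are a separate problem. trace.Start takes an io.Writer, and a writer that fails later, such as a file closed too early, is not reported through that return value. Keep the file open until trace.Stop returns.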

If you import runtime/trace but never reference the package, the compiler rejects the build with an "imported and not used" error. Go requires every import to be referenced. Remove the import or use the package.

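You can also capture a trace without calling trace.Start yourself. For tests and benchmarks, go test -trace=trace.out writes an execution trace for the test binary. For a server that already imports net/http/pprof, request /debug/pprof/trace?seconds=5 and save the response to a file. GODEBUG does not produce execution traces: settings such as schedtrace=1000 and gctrace=1 print text summaries to standard error, which go tool trace cannot read. Programmatic control via runtime/trace is still preferred when you need to start and stop traces precisely.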

Tracing costs time. Measure the cost.

Decision matrix

Use go tool trace when you need to understand goroutine scheduling, GC pauses, or blocking I/O patterns. Use pprof with CPU profiling when you want to find hot functions consuming processor cycles. Use pprof with mutex profiling when you suspect lock contention is serializing your code. Use pprof with block profiling when you need a sample-based view of channel or lock waits without the overhead of full tracing. Use logging when you only need to track business logic events and don't care about runtime internals. Use go tool trace with a user-defined task when you can attach the task to a request's context.Context and want to isolate just that request's timeline. Use go test -trace or the net/http/pprof trace endpoint when you want a trace without adding trace.Start calls to the code.

Pick the tool that matches the symptom.

Where to go next