How to Use net/http/pprof for Live Profiling

Import net/http/pprof to expose profiling endpoints at /debug/pprof/ for live performance analysis.

Your server is slow, but you don't know why

Your API is responding in 400ms. The requirement is 50ms. You add time.Now() checks everywhere, but the logs just show timestamps, not the culprit. The bottleneck could be a regex, a database round-trip, lock contention, or an allocation storm. Guessing wastes time. You need to watch the program while it runs.

Go ships with a profiler built into the standard library. The package net/http/pprof registers HTTP handlers that expose runtime data. You get CPU profiles, heap snapshots, goroutine stacks, and mutex contention maps. The CPU and heap profiles work by sampling. For CPU, the runtime interrupts running code at regular intervals, records the call stack, and resumes. This gives you a statistical view of where time and memory go.

What pprof actually does

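The CPU profiler hooks into the Go runtime. While a profile is being collected, the runtime interrupts running code roughly every 10 milliseconds of CPU time and records the call stack of whatever is executing at that moment. Goroutines that are parked, for example waiting on I/O or a lock, are not running and do not show up. After the default 30 seconds of sampling, you have a map of where the CPU spent its time. Functions that appear more often in the samples are the hotspots.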

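Memory profiling works differently. The runtime samples allocations as they happen, by default about one for every 512 KB allocated (runtime.MemProfileRate), and records the call stack that made each one. When you request a profile, it aggregates how much memory each call site allocated and how much is still in use. This helps you find allocation storms and leaks.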

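The package also exposes goroutine stacks, mutex contention, and block events. These profiles help you debug concurrency issues. The goroutine profile shows where every goroutine is right now. The block profile shows where goroutines spent time waiting on locks, channels, and other synchronization. The mutex profile attributes contention to the code that held the contended lock. The block and mutex profiles are off by default: they stay empty until you enable them with runtime.SetBlockProfileRate and runtime.SetMutexProfileFraction.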
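
As a minimal sketch of turning those two profiles on, the program below enables mutex and block sampling before starting the server. The helper name enableContentionProfiles, the sampling values, and the localhost:6060 address are illustrative assumptions, not recommendations; pick rates that suit your workload.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
	"runtime"
)

// enableContentionProfiles turns on mutex and block sampling.
// It returns the previous mutex profile fraction so callers can restore it.
// (Hypothetical helper; the name is not part of any library.)
func enableContentionProfiles(mutexFraction, blockRateNs int) int {
	// Report on average 1 out of every mutexFraction contention events.
	prev := runtime.SetMutexProfileFraction(mutexFraction)
	// Sample on average one blocking event per blockRateNs nanoseconds blocked.
	runtime.SetBlockProfileRate(blockRateNs)
	return prev
}

func main() {
	// Assumed values: sample 1 in 5 contention events and
	// about one blocking event per 10 microseconds spent blocked.
	enableContentionProfiles(5, 10_000)

	log.Println("pprof with mutex and block profiles on localhost:6060")
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```

With this running, /debug/pprof/mutex and /debug/pprof/block start returning samples. Passing 0 to either runtime function turns that profile off again.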

The minimal setup

Here's the bare minimum to enable profiling. You import the package for its side effects and start a server.

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // Import for side effects: registers handlers on DefaultServeMux
)

func main() {
	// Profiling endpoints live at /debug/pprof/
	log.Println("Server starting with pprof enabled")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

The blank identifier _ tells the compiler you want the package's side effects, not its exported names. This is the standard pattern for registering handlers. The init function inside net/http/pprof runs when the package loads. It calls http.HandleFunc for paths like /debug/pprof/ and /debug/pprof/profile. Named profiles such as /debug/pprof/heap have no handler of their own; the index handler registered at /debug/pprof/ serves them. These handlers attach to the default multiplexer. When you pass nil to http.ListenAndServe, the server uses that default multiplexer. The endpoints are live immediately.

The blank identifier imports the package for its side effects. The handlers appear automatically.
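
One way to convince yourself that the import alone registers the handlers is to send requests through the default multiplexer in-process, without opening a port. The sketch below does that with net/http/httptest; the helper name pprofStatus and the /not-registered path are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // side effect: handlers registered on http.DefaultServeMux
)

// pprofStatus sends an in-process GET for path to the default mux
// and returns the HTTP status code. (Hypothetical helper.)
func pprofStatus(path string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	http.DefaultServeMux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	paths := []string{
		"/debug/pprof/",        // index page
		"/debug/pprof/heap",    // named profile, served by the index handler
		"/debug/pprof/cmdline", // registered directly
		"/not-registered",      // nothing registered here
	}
	for _, p := range paths {
		fmt.Println(p, pprofStatus(p))
	}
}
```

Remove the blank import and every /debug/pprof/ path falls through to the mux's not-found handler.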

How the runtime captures data

When you request a CPU profile, the runtime starts sampling. It uses a signal-driven mechanism on Unix systems: a timer delivers SIGPROF to threads that are running Go code. The signal handler runs on the interrupted thread, captures the stack trace of the goroutine running there, and stores it in a buffer. When the profile duration ends, the HTTP handler stops the profiler and the collected data is sent to the client.

The sampling rate is 100 Hz, which is the same 10-millisecond interval described above. The overhead is usually small and applies only while a profile is being collected. You can leave the endpoints enabled in production if you protect them.

Memory profiling does not need a timer. The runtime records a sample of allocations as they happen and notes when the garbage collector frees them. When you request a heap profile, it summarizes those records, reflecting the heap as of the most recently completed garbage collection. Building the summary is fast.

Profiling a live bottleneck

Here's a server with a slow handler to demonstrate CPU profiling. The handler burns CPU in a loop to create a visible spike.

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof"
)

// SlowHandler simulates a CPU-bound bottleneck.
func SlowHandler(w http.ResponseWriter, r *http.Request) {
	// Burn CPU to create a visible spike in the profile
	for i := 0; i < 10000000; i++ {
		_ = i * i
	}
	w.Write([]byte("done"))
}

func main() {
	http.HandleFunc("/slow", SlowHandler)
	log.Println("Hit /slow to generate load, then profile")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Go functions that handle HTTP requests follow the signature func(http.ResponseWriter, *http.Request). The naming convention for the response writer w and request r is universal in the community. Stick to these names so other developers recognize the pattern instantly.

To capture a profile, you need two terminals. One generates load, the other runs the profiler. The load has to keep running for the whole profiling window, otherwise the profile contains very few samples.

# Generate sustained load in one terminal (stop it with Ctrl-C once the profile is captured)
ab -n 100000 -c 4 http://localhost:8080/slow

# Capture a 30-second CPU profile in the other terminal
go tool pprof -seconds=30 http://localhost:8080/debug/pprof/profile

The ab command sends requests to the slow endpoint, four at a time. The go tool pprof command fetches the profile and opens an interactive UI. Inside the UI, type top to see the hottest functions. SlowHandler should appear at or near the top with most of the CPU time.

Profiling captures reality. Benchmarks capture a snapshot. Use pprof to find the live bottleneck.

Reading the output

The pprof UI provides several commands to explore the data. The top command lists the hottest functions, sorted by flat time by default; top -cum sorts by cumulative time instead. Flat time is the time spent in the function itself. Cumulative time includes time spent in callees. A function with high cumulative time but low flat time is calling something expensive.

The list command overlays the profile data on your source code. It shows exactly which lines are hot. This is invaluable for pinpointing the exact operation causing the slowdown.

# Inside the pprof UI
list SlowHandler

The output annotates each line with the number of samples. Lines with many samples are the hotspots. You can see the loop body consuming the CPU.

Heap profiles carry four sample types. alloc_objects and alloc_space count everything allocated since the program started, as a number of objects and as bytes. inuse_objects and inuse_space count what the garbage collector hasn't freed yet. A high alloc count with low inuse means you are generating garbage, not leaking it. The garbage collector is working hard to clean up temporary allocations. A high inuse count suggests a leak or long-lived data structures.

# Analyze heap allocations
go tool pprof -alloc_space http://localhost:8080/debug/pprof/heap

# Analyze memory currently in use
go tool pprof -inuse_space http://localhost:8080/debug/pprof/heap

The -alloc_space flag shows total allocated memory. The -inuse_space flag shows memory currently held. Use both to understand allocation patterns.
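
To make the alloc versus inuse distinction concrete, here is a small sketch that measures both quantities for two workloads. It reads runtime.MemStats rather than a pprof profile: TotalAlloc and HeapAlloc are not the pprof sample types, but they track the analogous "ever allocated" and "still live" quantities. The function names churn, retain, and measure are illustrative assumptions.

```go
package main

import (
	"fmt"
	"runtime"
)

var (
	retained [][]byte // keeps memory reachable, like a cache or a leak
	sink     []byte   // package-level sink so temporary buffers are heap-allocated
)

const chunk = 1 << 20 // 1 MiB

// churn allocates n temporary chunks that become garbage right away.
func churn(n int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, chunk)
	}
	sink = nil
}

// retain allocates n chunks and keeps them reachable.
func retain(n int) {
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, chunk))
	}
}

// measure runs f and reports the bytes allocated during f and the
// change in live heap bytes after a forced garbage collection.
func measure(f func()) (allocated, live int64) {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	f()
	runtime.GC()
	runtime.ReadMemStats(&after)
	allocated = int64(after.TotalAlloc - before.TotalAlloc)
	live = int64(after.HeapAlloc) - int64(before.HeapAlloc)
	return allocated, live
}

func main() {
	a, l := measure(func() { churn(64) })
	fmt.Printf("churn:  allocated %d MiB, live delta %d MiB\n", a>>20, l>>20)

	a, l = measure(func() { retain(16) })
	fmt.Printf("retain: allocated %d MiB, live delta %d MiB\n", a>>20, l>>20)
}
```

The churn workload allocates a lot and keeps almost nothing, which is the "high alloc, low inuse" pattern. The retain workload keeps what it allocates, which is what a leak or a growing cache looks like in an inuse_space view.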

Security and conventions

The endpoints expose internal state. If you leave them on a public server, anyone can download your heap profile or burn your CPU with repeated profile requests. Bind to localhost or add middleware.

Here's a middleware that restricts access to localhost.

package main

import (
	"log"
	"net"
	"net/http"
	_ "net/http/pprof"
)

// PprofMiddleware restricts access to localhost.
func PprofMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// RemoteAddr is "host:port" with an ephemeral client port,
		// so split it and compare only the host part.
		host, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		// Only allow requests from a loopback address (127.0.0.0/8 or ::1)
		if ip := net.ParseIP(host); ip == nil || !ip.IsLoopback() {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	// Serve the pprof handlers registered on the default mux,
	// wrapped with the security middleware
	mux.Handle("/debug/pprof/", PprofMiddleware(http.DefaultServeMux))

	log.Println("Secure pprof enabled")
	log.Fatal(http.ListenAndServe(":8080", mux))
}

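The middleware checks RemoteAddr, the address of the TCP peer. If the request doesn't come from localhost, it returns a 403 error. This pattern protects the profiling endpoints while keeping them accessible for local debugging. Behind a reverse proxy running on the same host, every request appears to come from localhost, so rely on authentication there instead.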

Never expose pprof to the public internet. Lock it behind a firewall or authentication middleware.

Pitfalls

Profiling adds overhead. The CPU sampler interrupts running threads briefly. The heap profiler aggregates data. If you profile a latency-sensitive service, the overhead might skew the results. Use the shortest profile duration needed to capture the behavior.

If you forget to import the package, the handlers don't register and requests to /debug/pprof/ return 404. The compiler rejects the program with "undefined: pprof" if you reference the package but don't import it. If you import it without the blank identifier and don't use it, you get "imported and not used".

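The blank identifier is required whenever your code uses none of the package's exported names. For the default setup, the side effects are the feature. The package does export its handlers, such as pprof.Index and pprof.Profile, for the case where you want to register them on your own multiplexer.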

Profiling captures a window of time. If the bottleneck is intermittent, you might miss it. Run the profiler during the problematic period. Use continuous profiling tools for long-term monitoring.

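Registering the handlers costs almost nothing, and the CPU profiler adds overhead only while a profile is being collected. Leave the endpoints on in production only if they are protected.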

When to use pprof

Use net/http/pprof when you need to profile a running server without restarting it. Use go test -bench when you want reproducible micro-benchmarks of isolated functions. Use runtime/pprof when you need to write profiles to a file programmatically without an HTTP server. Use an external, OS-level profiler such as Linux perf when you cannot modify or rebuild the binary to import either package.

pprof is for live diagnosis. Benchmarks are for regression testing. Pick the tool that matches the question.
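
For the runtime/pprof case, a minimal sketch looks like this. It profiles one function call and writes the result to a file that go tool pprof can open. The names busyWork and captureCPUProfile and the file name cpu.prof are illustrative assumptions.

```go
package main

import (
	"bytes"
	"log"
	"os"
	"runtime/pprof"
)

// busyWork burns some CPU so the profile has something to sample.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// captureCPUProfile runs work under the CPU profiler and returns the
// profile bytes (gzip-compressed protobuf, the format pprof reads).
func captureCPUProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	// StopCPUProfile returns only after all profile data has been written.
	pprof.StopCPUProfile()
	return buf.Bytes(), nil
}

func main() {
	data, err := captureCPUProfile(func() { _ = busyWork(500_000_000) })
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("cpu.prof", data, 0o644); err != nil {
		log.Fatal(err)
	}
	log.Println("wrote cpu.prof; inspect it with: go tool pprof cpu.prof")
}
```

This fits batch jobs and command-line tools, where there is no HTTP server to attach to and the interesting work has a clear start and end.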

Where to go next