How to Use CPU and Memory Profiles in CI

The green CI, red production trap

Your CI pipeline turns green. Every test passes. The linter is happy. You merge the PR with confidence. Two days later, the on-call pager screams. Production memory usage has climbed steadily for six hours until the OOM killer stepped in. Or the CPU load spiked by 40% because a new loop is spinning harder than expected. The tests didn't catch it because they run with tiny datasets, or they finish too fast to trigger the leak, or they simply don't measure performance.

Performance regressions hide in plain sight until the load is high enough to expose them. Catching them requires more than unit tests. You need profiling in CI to capture CPU and memory behavior automatically, compare it against a baseline, and fail the build when the cost goes up. Go ships with profiling built into the runtime. You don't need external agents or magic compiler flags. You wire runtime/pprof into your code, generate the data, and analyze it in the pipeline.

Tests check correctness. Profiles check cost. You need both.

Profiling is sampling, not tracing

Profiling takes snapshots of what your program is doing. It does not record every instruction. That would be tracing, and tracing adds too much overhead for production or CI. Profiling samples the state at regular intervals and builds a statistical map of where time and memory go.

CPU profiling samples the call stack at a fixed frequency, usually every 10 milliseconds. If a function appears in the stack often, it consumes a lot of CPU time. The profile aggregates these samples so you can see the hot paths.
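The arithmetic behind a sampled CPU profile is simple enough to sketch. The helper below is hypothetical (estimateCPUTime is not part of any Go package); it assumes each sample stands for one sampling period, which is how a 100 Hz profile turns sample counts into seconds.

```go
package main

import (
	"fmt"
	"time"
)

// estimateCPUTime converts a sample count into approximate CPU time,
// assuming every sample represents one sampling period (1/hz seconds).
// The name and signature are illustrative, not a runtime/pprof API.
func estimateCPUTime(samples, hz int) time.Duration {
	if hz <= 0 {
		return 0
	}
	return time.Duration(samples) * time.Second / time.Duration(hz)
}

func main() {
	// 250 samples at 100 Hz: each sample stands for 10ms of CPU time.
	fmt.Println(estimateCPUTime(250, 100))
}
```

The estimate is statistical: a function that runs for less than one sampling period may collect zero samples in one run and one sample in the next.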

Memory profiling looks at the heap. It captures how much memory is allocated, how much is in use, and where the allocations came from. A heap profile helps you find leaks, excessive allocations, or objects that stay alive longer than necessary.

Go writes these profiles in the pprof format, a gzip-compressed protocol buffer. The format is compact and includes enough metadata for go tool pprof to decode it later. You generate the file in your code, then analyze it with the tool.

Sampling gives you the map. Tracing gives you the transcript. Use sampling for CI.

Minimal profiling setup

Here's the skeleton: start the CPU profiler, run your code, stop it, then write a heap snapshot. The runtime/pprof package handles the heavy lifting. You just manage the file handles and the lifecycle.

package main

import (
	"os"
	"runtime"
	"runtime/pprof"
)

// ProfileDemo captures CPU and memory profiles to files for analysis.
func ProfileDemo() error {
	// Open file for CPU profile
	cpuFile, err := os.Create("cpu.prof")
	if err != nil {
		return err
	}
	// Deferred calls run in reverse order, so Close runs after
	// StopCPUProfile has flushed the remaining samples
	defer cpuFile.Close()

	// Start sampling CPU usage; fails if a profile is already running
	if err := pprof.StartCPUProfile(cpuFile); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()

	// Run the workload you want to measure
	doWork()

	// Open file for memory profile
	memFile, err := os.Create("mem.prof")
	if err != nil {
		return err
	}
	defer memFile.Close()

	// Run a GC first so the heap statistics are up to date
	runtime.GC()
	// Capture heap allocation snapshot at this point
	return pprof.WriteHeapProfile(memFile)
}

func doWork() {
	// Simulate CPU-bound work
	sum := 0
	for i := 0; i < 1000000; i++ {
		sum += i
	}
	_ = sum
}

Start, run, stop. The file holds the truth.

How the runtime captures data

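When you call pprof.StartCPUProfile, the runtime asks the operating system for a periodic timer signal (SIGPROF on Unix systems) at 100 Hz, so a signal arrives roughly every 10 milliseconds of CPU time. The signal handler records the stack of whatever is running on the interrupted thread; goroutines that are blocked or idle are not sampled. A background goroutine collects these samples and writes the finished profile to the writer you provided when you call StopCPUProfile.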

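StartCPUProfile samples at 100 Hz and takes no parameter to change that. A fixed rate keeps the overhead predictable and makes profiles from different runs comparable. The overhead is usually small enough for CI, but a 10-millisecond sampling period cannot resolve microsecond-scale effects, and short functions need many iterations before they show up at all.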

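pprof.WriteHeapProfile works differently. It does not force a garbage collection. It writes out allocation statistics that the runtime has been sampling all along, at a rate controlled by runtime.MemProfileRate, and those statistics reflect the heap as of the most recently completed garbage collection. Call runtime.GC() first if you want an up-to-date snapshot. The profile includes both alloc values (total allocations since the program started) and inuse values (memory currently held). You select one later with go tool pprof, for example -sample_index=alloc_space or -sample_index=inuse_space.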
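A minimal sketch of taking a heap snapshot follows. It calls runtime.GC() before writing so the statistics are current, as the example in the runtime/pprof documentation does. The helper name captureHeapProfile and the retained slice are assumptions made for illustration; the sketch writes into a buffer rather than a file so the result is easy to inspect.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// retained keeps allocations reachable so they count as in-use heap.
var retained [][]byte

// captureHeapProfile returns the current heap profile as bytes.
// It runs a GC first so the in-use statistics are up to date.
func captureHeapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Hold on to 64 MiB so the profile has something to report.
	for i := 0; i < 64; i++ {
		retained = append(retained, make([]byte, 1<<20))
	}
	data, err := captureHeapProfile()
	if err != nil {
		fmt.Fprintln(os.Stderr, "heap profile failed:", err)
		os.Exit(1)
	}
	fmt.Printf("heap profile: %d bytes\n", len(data))
}
```

In CI you would write the bytes to a file such as mem.prof and keep it as a build artifact.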

The profiler runs in the background. Your code runs in the foreground. They share the clock.

Automating regression detection in CI

In CI, you don't want to look at profiles manually. You want the pipeline to fail if performance degrades. The standard approach is to run a test that generates a profile, then compare that profile against a stored baseline. If the new profile shows significantly more CPU time or memory usage, the build fails.

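Here's a test that profiles a specific function. The test runner executes the code, and the profile file lands in the package directory, because go test runs each test binary with its package directory as the working directory. (go test can also write profiles for the whole test binary with the -cpuprofile and -memprofile flags; wiring runtime/pprof in by hand is useful when you want to profile one specific region.)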

package main

import (
	"os"
	"runtime/pprof"
	"testing"
)

// TestHeavyFunction profiles a function to detect performance regressions.
func TestHeavyFunction(t *testing.T) {
	// Create the CPU profile file in the package directory
	cpuFile, err := os.Create("cpu_test.prof")
	if err != nil {
		t.Fatal(err)
	}
	// Registered first, so it runs last: close only after the profile is flushed
	defer cpuFile.Close()

	// Start CPU profiling; fails if another CPU profile is already running
	if err := pprof.StartCPUProfile(cpuFile); err != nil {
		t.Fatal(err)
	}
	// Stop and flush the profile even if the test fails
	defer pprof.StopCPUProfile()

	// Run the function under test
	heavyComputation()
}

func heavyComputation() {
	// Placeholder for expensive logic
	for i := 0; i < 5000000; i++ {
		_ = i * i
	}
}

The CI script runs the test, invokes go tool pprof to analyze the output, and checks for regressions. The -diff_base flag compares the current profile against a baseline file and reports percentages relative to the baseline total. If the increase for the function exceeds a threshold, the script exits with an error.

#!/bin/bash
# ci_profile_check.sh runs tests and checks profiles against baselines
set -euo pipefail

# Run the test that generates the profile; the file lands in the package directory
go test -run '^TestHeavyFunction$' .

# Compare CPU profile against stored baseline
# -top prints the top functions; -diff_base shows the delta from the baseline
go tool pprof -top -diff_base baseline_cpu.prof cpu_test.prof > cpu_delta.txt

# Columns are: flat flat% sum% cum cum% name
# Take flat% for heavyComputation and strip the sign characters so bc can compare it
increase=$(awk '/heavyComputation/ {gsub(/[%+]/, "", $2); print $2; exit}' cpu_delta.txt)

# Fail if the increase is greater than 10% of the baseline total
if [ -n "$increase" ] && (( $(echo "$increase > 10" | bc -l) )); then
	echo "FAIL: heavyComputation CPU usage increased by ${increase}% of the baseline total"
	exit 1
fi

echo "PASS: CPU profile within acceptable range"

The analysis step uses go tool pprof to parse the binary profile. Profiles written by Go 1.9 and later embed symbol information, so function names resolve without the compiled binary. You still need the binary, built for the same architecture and with symbols intact, for instruction-level views such as disassembly. If you want the uninterpreted data, go tool pprof -raw prints the profile's samples and locations as text.
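The threshold check can also be sketched in Go, which makes it easier to unit test than shell text processing. The helper below is hypothetical; it assumes the report follows the column layout that go tool pprof -top prints (flat, flat%, sum%, cum, cum%, name), and the sample report in main is made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// flatPercent scans a pprof -top style text report and returns the flat%
// value of the first row whose function name ends with fn.
// Assumed columns: flat flat% sum% cum cum% name.
func flatPercent(report, fn string) (float64, bool) {
	for _, line := range strings.Split(report, "\n") {
		fields := strings.Fields(line)
		// Skip headers and any row that is not about fn.
		if len(fields) < 6 || !strings.HasSuffix(fields[5], fn) {
			continue
		}
		pct, err := strconv.ParseFloat(strings.TrimSuffix(fields[1], "%"), 64)
		if err != nil {
			continue
		}
		return pct, true
	}
	return 0, false
}

func main() {
	// Illustrative report text, not output captured from a real run.
	report := `      flat  flat%   sum%        cum   cum%
     0.30s 25.00% 25.00%      0.30s 25.00%  main.heavyComputation
     0.02s  1.67% 26.67%      0.02s  1.67%  runtime.mallocgc`

	const limit = 10.0 // hypothetical threshold, in percent
	if pct, ok := flatPercent(report, "heavyComputation"); ok && pct > limit {
		fmt.Printf("FAIL: heavyComputation delta %.2f%% exceeds %.2f%%\n", pct, limit)
		return
	}
	fmt.Println("PASS: CPU profile within acceptable range")
}
```

Matching on the name suffix lets the check ignore the package prefix, at the cost of also matching other functions whose names end the same way.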

Compare against a baseline. Fail on regression. Protect the user.

Pitfalls, errors, and conventions

Profiling in CI introduces specific failure modes. Watch for these.

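If os.Create fails and you ignore the error, you hand pprof.StartCPUProfile a nil *os.File. Writes to a nil file fail instead of producing data, so the run ends with no usable profile and little to explain why. Always check the error from os.Create before starting the profiler, and check the error that StartCPUProfile itself returns, which is non-nil when a CPU profile is already running. The compiler won't catch either case.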

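Forgetting to stop the CPU profiler loses data. StartCPUProfile starts a background goroutine that buffers samples until you call StopCPUProfile, which flushes them to the writer. Skip the stop call and the file can end up empty or truncated, and any later StartCPUProfile in the same process returns an error because profiling is already enabled. Always defer the stop call, and register it after the deferred file close so the flush happens before the file is closed.

An unstopped profiler silently produces an unusable file. Always defer the stop.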
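One way to make the stop hard to forget is to wrap start and stop in a single helper. The sketch below is an assumption about how you might structure it (startCPUProfile and busyWork are hypothetical names); it returns a stop function that flushes the profile first and closes the file second.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// startCPUProfile creates path, starts CPU profiling into it, and returns
// a stop function that flushes the profile and then closes the file.
func startCPUProfile(path string) (func() error, error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close() // profiling never started, so just release the file
		return nil, err
	}
	stop := func() error {
		pprof.StopCPUProfile() // returns only after all writes have completed
		return f.Close()
	}
	return stop, nil
}

// busyWork burns CPU so the profile has a chance to collect samples.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

func main() {
	stop, err := startCPUProfile("cpu_sketch.prof")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not start profile:", err)
		os.Exit(1)
	}
	defer func() {
		if err := stop(); err != nil {
			fmt.Fprintln(os.Stderr, "could not finish profile:", err)
		}
	}()
	fmt.Println(busyWork(50_000_000))
}
```

Because the caller receives a value it must use, forgetting the stop shows up as an unused-variable compile error instead of an empty file.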

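CI environments add noise. Background processes, shared hardware, and network latency can skew results. Repeat the measurement several times, for example with go test -count=3, and compare an average or median rather than a single run, or use a threshold that accounts for variance. Note that the test above overwrites cpu_test.prof on every run, so only the last profile survives unless you vary the file name. Don't fail on a 1% fluctuation.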
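A noise-tolerant comparison might look like the sketch below. It uses the median of several runs rather than the average, because a single slow outlier on a shared runner moves the median less. The function names, the sample numbers, and the 10% tolerance are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of xs without modifying the input.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// regressed reports whether the median of runs exceeds the baseline by
// more than tolerance, expressed as a fraction (0.10 means 10%).
func regressed(baseline float64, runs []float64, tolerance float64) bool {
	if baseline <= 0 || len(runs) == 0 {
		return false
	}
	return median(runs) > baseline*(1+tolerance)
}

func main() {
	baseline := 1.00 // hypothetical CPU seconds from the stored baseline

	noisy := []float64{1.02, 1.50, 0.99} // one outlier, median 1.02
	slower := []float64{1.20, 1.30, 1.25} // consistently slower, median 1.25

	fmt.Println("noisy runs regressed: ", regressed(baseline, noisy, 0.10))
	fmt.Println("slower runs regressed:", regressed(baseline, slower, 0.10))
}
```

With these numbers the outlier run does not trip the check, while the consistently slower runs do.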

pprof files are binary. You cannot diff them with git diff. Store baselines in version control, but analyze them with go tool pprof. If you need text your script can parse, use go tool pprof -top or -traces for formatted reports, or -raw for a plain-text dump of the samples, locations, and mappings. For structured access from Go code, the github.com/google/pprof/profile package parses the file directly.

Convention aside: the community accepts verbose error handling because it makes failure paths visible. Write if err != nil { t.Fatal(err) } instead of ignoring the error. In profiling code, a silent failure means you collect no data and the CI passes falsely.

Another convention: pprof profile files usually end in .prof. go tool pprof identifies a profile by its content, not its extension, so the name is there for people and for CI artifact rules. Pick one naming convention and stick to it so scripts and teammates can find the files.

Don't ignore errors. Name files clearly. Run multiple times.

When to profile and when to benchmark

Profiling and benchmarking solve different problems. Benchmarks measure throughput and allocation counts for a unit of work. Profiles show where time and memory are spent inside that work. Use the right tool for the question.

Use runtime/pprof in CI when you need to capture call stacks and detect regressions in complex workloads where the bottleneck might shift.

Use go test -bench with -benchmem when you only care about operations per second and bytes allocated per operation for small, isolated units.

Use the text reports from go tool pprof (-top, -traces, or -raw) when your CI pipeline needs data it can parse and feed into a dashboard, database, or custom alerting logic.

Use interactive go tool pprof locally when you are debugging a specific bottleneck and need to navigate the graph, zoom into functions, and explore the code manually.

Use net/http/pprof when you have a running service and want to profile it on demand without restarting it. You import the package and serve its handlers once, and after that no further code changes are needed.

Benchmarks measure speed. Profiles show why. Pick the tool that answers the question.

Where to go next

Profiling in Go means recording how much CPU time or memory your program uses while it runs. You start a recording session, let your code execute, and save the data to a file. Later, a tool reads that file and shows which parts of your code are slow or using too much memory, much as a doctor checks a patient's vital signs to find the source of an illness.