Benchmark code

Write benchmark functions with the `Benchmark` prefix and the signature `func BenchmarkXxx(b *testing.B)`, then run them with `go test -bench=. -benchmem` to measure execution time and memory allocations. The benchmark runner sets `b.N` to however many iterations it needs to obtain a stable timing result, so you must wrap your target code inside `for i := 0; i < b.N; i++`.

Here is a practical example of benchmarking a string concatenation function:

// benchmark_test.go
package main

import "testing"

func ConcatStrings(a, b string) string {
	return a + b
}

func BenchmarkConcatStrings(b *testing.B) {
	a := "Hello, "
	bStr := "World"
	
	// b.N is set by the testing framework to ensure accurate timing
	for i := 0; i < b.N; i++ {
		_ = ConcatStrings(a, bStr)
	}
}

Run the benchmark from your terminal:

go test -bench=. -benchmem

The output shows the time per operation (ns/op) and, with `-benchmem`, the bytes allocated per operation (B/op) and allocations per operation (allocs/op). To compare two implementations, write a benchmark for each (e.g., BenchmarkConcatOld and BenchmarkConcatNew), save the output of each run to a file, and compare the files with the `benchstat` tool from `golang.org/x/perf` to see the relative difference.
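As a sketch of such a comparison, here are two benchmarks side by side; the function names `concatOld` and `concatNew` and their implementations are illustrative, not taken from the text above:

```go
// compare_test.go (illustrative example)
package main

import (
	"strings"
	"testing"
)

// concatOld builds the result with repeated string concatenation,
// which reallocates the string on every +=.
func concatOld(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatNew uses strings.Builder, which grows a single buffer.
func concatNew(parts []string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

var parts = []string{"Hello", ", ", "World", "!"}

func BenchmarkConcatOld(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = concatOld(parts)
	}
}

func BenchmarkConcatNew(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = concatNew(parts)
	}
}
```

Because both benchmarks share the same input, the B/op and allocs/op columns make the allocation difference between the two approaches directly visible.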

Avoid common pitfalls: don't do per-iteration work inside the loop that you don't intend to measure, and don't let the compiler optimize the measured call away as dead code; assigning the result to a package-level variable is a common way to keep it live. If expensive setup precedes the code you want to measure, call `b.ResetTimer()` after the setup (or bracket excluded sections with `b.StopTimer()` and `b.StartTimer()`), though this is rarely needed for simple function benchmarks. Always ensure your benchmark data is realistic; benchmarking empty strings or zero-sized slices often yields misleadingly fast results that don't reflect production workloads.
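Both ideas can be sketched in one benchmark; the `sink` variable name and the setup step here are illustrative conventions, not required APIs:

```go
// sink_test.go (illustrative example)
package main

import (
	"strings"
	"testing"
)

// sink is a package-level variable: assigning the result here keeps
// the compiler from treating the measured call as dead code.
var sink string

func BenchmarkJoinWithSetup(b *testing.B) {
	// Setup that should not count toward the measurement.
	data := make([]string, 1000)
	for i := range data {
		data[i] = "item"
	}

	b.ResetTimer() // exclude the setup above from the timing
	for i := 0; i < b.N; i++ {
		sink = strings.Join(data, ",")
	}
}
```

Without the reset, the slice construction would be timed once alongside the `b.N` iterations of `strings.Join` and skew the per-operation figure.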

If you need to verify that your benchmark logic is correct before running the full suite, `go test -bench=. -benchtime=1x` runs each benchmark body for exactly one iteration, which is faster for quick sanity checks (note that `-count` controls how many times each benchmark is repeated, not how many iterations it runs). Benchmarks are sensitive to system load, so run them several times, for example with `-count=10`, and average out the noise if you are comparing marginal performance differences.