How to use the testing package

Write Test functions with *testing.T arguments and run them using go test to automate verification of your Go code.

The missing safety net

You write a function that parses a configuration string. It works on your machine. You push the code. Three hours later, a teammate changes a delimiter and breaks production. You could have caught that. Go ships with a testing package that turns verification into a first-class citizen. You do not need third-party assertion libraries or complex test runners. The standard library gives you everything you need to write reliable, fast, and parallelizable tests.

How the testing package actually works

The testing package looks for functions matching specific naming patterns. Functions starting with Test receive a *testing.T argument. Functions starting with Benchmark receive a *testing.B. Functions starting with Fuzz receive a *testing.F. The go test command compiles your package together with its *_test.go files into a temporary test binary, runs the matching functions, and prints a structured report.

The *testing.T type is your control panel. It tracks failures, logs output, and manages subtests. Test files must be named *_test.go: go test compiles them into the test binary, and a regular go build ignores them. The community convention is to declare those files in the same package as the code they verify. This placement gives tests access to unexported functions and types without exposing them to external consumers. You write tests alongside implementation, not in a separate tests directory.

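The test runner executes the test functions of a package sequentially by default. You can mark tests as parallel by calling t.Parallel(). A parallel test pauses until the sequential tests have finished, then runs alongside the other parallel tests, up to the limit set by the -parallel flag (GOMAXPROCS by default). Parallel tests must not share mutable state. They must not write to the same file, modify a global variable, or connect to a shared database without proper synchronization. go test also applies a default timeout of ten minutes to the whole test binary, adjustable with -timeout. When it expires, the binary panics, dumps the goroutine stacks, and the run is reported as failed.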
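The parallel behavior described above can be sketched as follows. This is a minimal illustration, not a prescribed layout: normalizeKey and the test name are hypothetical, chosen only because the function is pure and shares no state between cases.

```go
package main

import (
	"strings"
	"testing"
)

// normalizeKey is a hypothetical pure function used only for illustration.
func normalizeKey(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

// TestNormalizeKeyParallel runs every case as a parallel subtest.
func TestNormalizeKeyParallel(t *testing.T) {
	tests := []struct {
		name, in, want string
	}{
		{"trims spaces", "  Port ", "port"},
		{"lowercases", "HOST", "host"},
		{"empty", "", ""},
	}
	for _, tt := range tests {
		tt := tt // per-iteration copy; needed before Go 1.22, harmless after
		t.Run(tt.name, func(t *testing.T) {
			// Parallel pauses this subtest until the parent's function body
			// returns, then resumes it alongside the other parallel subtests.
			t.Parallel()
			if got := normalizeKey(tt.in); got != tt.want {
				t.Errorf("normalizeKey(%q) = %q; want %q", tt.in, got, tt.want)
			}
		})
	}
}
```

Each subtest reads only its own copy of tt and writes nothing shared, which is what makes the t.Parallel() call safe here.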

The testing package ships with the standard library. Treat tests as documentation that runs.

Your first test

Here is the simplest possible test: call a function, verify the result, report a failure if the math is wrong.

package main

import "testing"

// Add returns the sum of two integers.
func Add(a, b int) int {
    return a + b
}

// TestAdd verifies basic addition behavior.
func TestAdd(t *testing.T) {
    // t is the test controller. It records failures and handles output.
    got := Add(1, 2)
    // Compare result against expected value.
    if got != 3 {
        // t.Errorf logs the failure but lets the test continue running.
        t.Errorf("Add(1, 2) = %d; want 3", got)
    }
}

Save the code in a file whose name ends in _test.go and run it with go test. The go tool builds a temporary binary containing your package and the test runner. It picks up functions named TestXxx, where the character after Test is not a lowercase letter. It executes them and prints PASS or FAIL. If the assertion fails, the runner prints the file, line number, and the message you passed to t.Error. On failure the command exits with a non-zero status code, which is exactly what CI pipelines look for.

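You do not have to guard against panics yourself. If the code under test panics, the test binary prints the panic value and the stack trace, marks the running test as failed, and exits with a non-zero status. The remaining tests in that package do not run, so treat a panic as a crash, not as an assertion. You do not need to wrap every test in a recover; reach for recover only when a panic is the behavior you want to verify. The framework manages control flow. You handle logic.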

Scaling tests with tables and subtests

Hardcoding assertions works for one case. It breaks when you need to verify ten. The Go community standard is table-driven tests. You define a slice of structs containing inputs and expected outputs, then loop through them. Each iteration becomes a subtest via t.Run. Subtests run independently. If one fails, the others still execute. The output shows exactly which case broke.

Here is a table-driven test that verifies multiple inputs without repeating boilerplate.

package main

import "testing"

// TestAddTable verifies addition across multiple input combinations.
func TestAddTable(t *testing.T) {
    // Define test cases as a slice of structs for data-driven execution.
    tests := []struct {
        name string
        a, b int
        want int
    }{
        {"positive", 1, 2, 3},
        {"zero", 0, 0, 0},
        {"negative", -5, 5, 0},
    }

    // Iterate over cases and create isolated subtests.
    for _, tt := range tests {
        // t.Run creates a subtest that can fail independently.
        t.Run(tt.name, func(t *testing.T) {
            got := Add(tt.a, tt.b)
            // Fail fast on mismatch. t.Fatalf stops this subtest immediately.
            if got != tt.want {
                t.Fatalf("Add(%d, %d) = %d; want %d", tt.a, tt.b, got, tt.want)
            }
        })
    }
}

Table-driven tests scale. Hardcoded assertions do not.

You can also call t.Helper() inside wrapper functions that should not appear in failure output. When a helper function calls t.Error, the runner reports the file and line number of the caller, not the helper. This keeps failure messages pointed at the test case that broke. You register cleanup logic with t.Cleanup. The runner executes cleanup functions in reverse order of registration after the test and its subtests finish, even if the test fails. This pattern replaces manual defer chains and lets a helper release the resources it created.
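A minimal sketch of both calls working together, under stated assumptions: firstLine, writeFixture, and the file name app.conf are hypothetical names invented for this example.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
	"testing"
)

// firstLine is a hypothetical function under test: it returns s up to the
// first newline.
func firstLine(s string) string {
	if i := strings.IndexByte(s, '\n'); i >= 0 {
		return s[:i]
	}
	return s
}

// writeFixture is a hypothetical helper that writes content to dir/name and
// schedules the file's removal.
func writeFixture(t *testing.T, dir, name, content string) string {
	t.Helper() // failures in this function are attributed to the caller's line
	path := filepath.Join(dir, name)
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
		t.Fatalf("writing fixture: %v", err)
	}
	// Registered after the directory cleanup, so it runs before it.
	t.Cleanup(func() { os.Remove(path) })
	return path
}

func TestFirstLineFromFile(t *testing.T) {
	dir, err := os.MkdirTemp("", "fixtures-*")
	if err != nil {
		t.Fatalf("creating temp dir: %v", err)
	}
	t.Cleanup(func() { os.RemoveAll(dir) }) // registered first, runs last

	path := writeFixture(t, dir, "app.conf", "port=8080\nhost=localhost\n")
	data, err := os.ReadFile(path)
	if err != nil {
		t.Fatalf("reading fixture: %v", err)
	}
	if got := firstLine(string(data)); got != "port=8080" {
		t.Errorf("firstLine = %q; want %q", got, "port=8080")
	}
}
```

If os.WriteFile fails, the reported line is the writeFixture call inside TestFirstLineFromFile, which is usually the line you want to look at first.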

Measuring speed and finding edge cases

Correctness is only half the job. You also need to know how fast your code runs and whether it breaks under unexpected input. The testing package provides benchmarks and fuzzing without external dependencies.

Benchmarks measure execution time and, on request, memory allocation. You write a function starting with Benchmark that accepts a *testing.B and loops b.N times around the code being measured. The runner calls that function repeatedly with a growing b.N until a run lasts long enough to time reliably (one second by default, adjustable with -benchtime). You do not hardcode loop counts.

Here is a benchmark that measures addition performance under load.

package main

import "testing"

// BenchmarkAdd measures execution time and memory allocation for Add.
func BenchmarkAdd(b *testing.B) {
    // b.N is set by the runner. It grows until the run lasts long enough to time.
    for i := 0; i < b.N; i++ {
        // Run the target function repeatedly for accurate measurement.
        _ = Add(i, i+1)
    }
}

Run it with go test -bench=. (the dot is a regular expression that matches every benchmark). The output shows the number of iterations and the average nanoseconds per operation. You can add -benchmem to see bytes and allocations per operation. Benchmarks are not unit tests. They measure performance characteristics, not correctness. You run them before and after refactoring to verify you did not introduce regressions.
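A benchmark that allocates makes the -benchmem columns more interesting. The sketch below is illustrative only: joinKeys and the package-level sink variable are hypothetical names. Assigning the result to sink is a common precaution, assumed here, against the compiler discarding a call whose result is unused.

```go
package main

import (
	"strings"
	"testing"
)

// joinKeys is a hypothetical function under measurement.
func joinKeys(keys []string) string {
	return strings.Join(keys, ",")
}

// sink keeps each result reachable so the measured call is not discarded.
var sink string

// BenchmarkJoinKeys reports time and allocations per call to joinKeys.
func BenchmarkJoinKeys(b *testing.B) {
	keys := []string{"host", "port", "timeout", "retries"} // setup work

	b.ReportAllocs() // like -benchmem, but only for this benchmark
	b.ResetTimer()   // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		sink = joinKeys(keys)
	}
}
```

Running go test -bench=JoinKeys before and after a change to joinKeys gives two sets of ns/op and allocs/op figures to compare.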

Fuzzing finds edge cases by generating random inputs. You write a function starting with Fuzz that accepts a *testing.F. You provide seed values with f.Add, then define the fuzz target with f.Fuzz. The runner mutates the seeds, feeds them to your function, and watches for panics or assertion failures.

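Here is a fuzz test that checks a property of Add that must hold for every input. Integer overflow wraps silently in Go, so a target that only calls Add would never fail; the target has to assert something.

package main

import "testing"

// FuzzAdd checks that Add is commutative for generated integer pairs.
func FuzzAdd(f *testing.F) {
    // Seed the fuzzer with known valid inputs to bootstrap mutation.
    f.Add(1, 2)
    f.Add(-100, 100)
    f.Add(0, 0)

    // f.Fuzz defines the target function. The engine mutates the seeds into new args.
    f.Fuzz(func(t *testing.T, a, b int) {
        // Argument order must not change the result, even when the sum wraps.
        if got, want := Add(a, b), Add(b, a); got != want {
            t.Errorf("Add(%d, %d) = %d; Add(%d, %d) = %d", a, b, got, b, a, want)
        }
    })
}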

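Run it with go test -fuzz=FuzzAdd. The fuzzer runs until you stop it, until the duration given with -fuzztime elapses, or until it finds a failing input. It saves failing inputs under testdata/fuzz/FuzzAdd so you can reproduce bugs deterministically; a plain go test replays the seeds and those saved inputs as ordinary test cases. Fuzzing complements unit tests. It catches boundary conditions you did not think to write.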
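Fuzzing pays off most on code that parses input, such as the configuration string from the opening scenario. The sketch below assumes a hypothetical splitPair function that splits "key=value" at the first delimiter, and checks a round-trip property instead of fixed outputs.

```go
package main

import (
	"strings"
	"testing"
)

// splitPair is a hypothetical parser: it splits "key=value" at the first '='.
func splitPair(s string) (key, value string, ok bool) {
	i := strings.IndexByte(s, '=')
	if i < 0 {
		return "", "", false
	}
	return s[:i], s[i+1:], true
}

// FuzzSplitPair checks properties that must hold for any input string.
func FuzzSplitPair(f *testing.F) {
	f.Add("port=8080")
	f.Add("no delimiter")
	f.Add("a=b=c")

	f.Fuzz(func(t *testing.T, s string) {
		key, value, ok := splitPair(s)
		if !ok {
			// A rejected input must really lack the delimiter.
			if strings.Contains(s, "=") {
				t.Errorf("splitPair(%q) reported no delimiter", s)
			}
			return
		}
		// The key ends at the first delimiter, so it cannot contain one.
		if strings.Contains(key, "=") {
			t.Errorf("key %q contains the delimiter", key)
		}
		// Joining the parts must reproduce the input exactly.
		if got := key + "=" + value; got != s {
			t.Errorf("round trip of %q produced %q", s, got)
		}
	})
}
```

The three seeds double as regular test cases whenever the package is tested without the -fuzz flag.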

Common traps and compiler feedback

The test runner is strict about signatures, but the check comes from go test, not from the compiler. A TestAdd that takes no argument, or one that takes a *testing.B, is still valid Go. go test rejects it while building the test binary, with a message along the lines of wrong signature for TestAdd, must be: func TestAdd(t *testing.T). The runner only accepts functions that match the exact signature for their prefix, so the mistake surfaces before any test runs.

A frequent mistake is using panic instead of t.Fatal. A panic takes down the entire test binary: the runner prints the panic and a stack trace, reports the running test as failed, and never starts the tests that were still waiting. Use t.Fatalf to stop only the current test. Its deferred calls and t.Cleanup functions still run, and the remaining tests proceed.

Another trap is sharing state between parallel tests. If two tests write to the same file or modify a global variable, you get race conditions. The test runner does not serialize parallel tests. You must isolate them. Use t.TempDir() for temporary files. Use local variables instead of package-level state. Run go test -race to detect data races during development.

The compiler also enforces import hygiene. Import the testing package without using it and the build fails with "testing" imported and not used. A misnamed test is quieter. A function called testAdd is just an ordinary unexported function, so go test skips it without any message, and go vet flags names such as Testadd as malformed. When you do get an error, it points directly at the mismatch. Read it literally. Fix the name or the signature. Move on.

Let the framework manage control flow. Do not fight it with panics.

Choosing the right verification tool

Use TestXxx when you need to verify correctness against known inputs and outputs. Use t.Run subtests when a single logical test covers multiple variations that should fail independently. Use BenchmarkXxx when you need to measure execution time and memory allocation under load. Use FuzzXxx when you want the fuzzing engine to generate inputs that expose edge cases and panics. Use t.Parallel() when tests are completely isolated and do not share mutable state. Use t.Cleanup when you need to release files, connections, or temporary resources after a test finishes.

Pick the right tool for the verification job. Measure what matters.

Where to go next