How the Go Garbage Collector Works (Tricolor Mark and Sweep)

Go uses a concurrent tricolor mark-and-sweep garbage collector to automatically reclaim unused memory with minimal pause times.

The memory tax you don't see

You're running a Go service. It handles requests fine for an hour. Then latency spikes. You check the metrics and see memory usage climbing, then dropping sharply, then climbing again. The CPU graph shows periodic spikes that match the memory drops. The garbage collector is working, but it's fighting your allocation pattern. Understanding what happens under the hood helps you stop fighting the GC and start writing code that plays nice with it.

The runtime manages memory for you. You allocate with new, make, or a composite literal. The compiler decides whether each value can live on the stack or must go to the heap, and the runtime eventually reclaims heap memory that is no longer reachable. The algorithm is called tricolor mark-and-sweep. It runs concurrently with your code, keeping pauses short and latency predictable.

Tricolor mark and sweep

The GC classifies objects into three sets: white, gray, and black. White means unvisited. The object might be live, or it might be trash. Gray means visited but not fully processed. The object is live, but the GC hasn't checked what it points to. Black means fully processed. The object is live, and everything it points to has been accounted for.

The goal is to find all live objects and turn them black. Anything that remains white at the end is dead and gets swept away. The GC starts with roots. Roots are pointers the runtime knows are definitely alive: global variables and the local variables on goroutine stacks. The GC marks the objects that roots point to as gray. Then it processes the gray set. For every gray object, the GC scans its pointers. Any object pointed to is marked gray if it was white. Once all pointers are scanned, the object turns black. This continues until the gray set is empty.
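The marking loop described above can be sketched as a toy simulation. This is not the runtime's implementation: the object type, the color constants, and the helper names are all hypothetical, and the "heap" is just a slice of five objects.

```go
package main

import "fmt"

// color is the marking state of an object in this toy model.
type color int

const (
	white color = iota // not visited yet; garbage if still white at the end
	gray               // reached, but its pointers are not scanned yet
	black              // reached and fully scanned
)

// object is a hypothetical heap object: a name plus outgoing pointers.
type object struct {
	name string
	refs []*object
	c    color
}

// mark shades the roots gray, then drains the gray set.
func mark(roots []*object) {
	var grayList []*object
	for _, r := range roots {
		if r.c == white {
			r.c = gray
			grayList = append(grayList, r)
		}
	}
	for len(grayList) > 0 {
		obj := grayList[len(grayList)-1]
		grayList = grayList[:len(grayList)-1]
		for _, ref := range obj.refs {
			if ref.c == white {
				ref.c = gray
				grayList = append(grayList, ref)
			}
		}
		obj.c = black // every pointer has been looked at
	}
}

// sweep returns the names of objects that stayed white, in heap order.
func sweep(heap []*object) []string {
	var dead []string
	for _, o := range heap {
		if o.c == white {
			dead = append(dead, o.name)
		}
	}
	return dead
}

// demoHeap builds a -> b -> c plus an unreachable cycle d <-> e.
func demoHeap() (heap, roots []*object) {
	a, b, c := &object{name: "a"}, &object{name: "b"}, &object{name: "c"}
	d, e := &object{name: "d"}, &object{name: "e"}
	a.refs = []*object{b}
	b.refs = []*object{c}
	d.refs = []*object{e}
	e.refs = []*object{d}
	return []*object{a, b, c, d, e}, []*object{a}
}

func main() {
	heap, roots := demoHeap()
	mark(roots)
	fmt.Println("unreachable:", sweep(heap)) // prints "unreachable: [d e]"
}
```

Note that d and e point at each other and are still collected. Reachability from the roots is what matters, not whether something still references the object.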

Go does not use generational collection. All objects are treated equally. The algorithm is non-moving. Objects stay where they are allocated. This avoids the cost of copying memory, but it can lead to fragmentation over time. The allocator limits fragmentation by grouping objects into size classes, so a freed slot can be reused by the next object of the same class. Memory that stays unused is returned to the OS by a background scavenger.

A manual trigger for learning

Here's the simplest way to trigger a collection cycle manually. You rarely need this in production, but it helps see the mechanism in action.

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Allocate a large slice to force heap allocation.
	// Stack allocation is faster but limited to the function scope.
	data := make([]byte, 1024*1024)
	data[0] = 1 // Keep the compiler from optimizing the allocation away.

	// runtime.GC() forces a full garbage collection cycle.
	// This is a blocking call for the caller, though the GC runs concurrently.
	runtime.GC()

	// MemStats captures heap metrics after the collection.
	var stats runtime.MemStats
	runtime.ReadMemStats(&stats)
	fmt.Printf("Heap in use: %d bytes\n", stats.HeapInuse)
}

Calling runtime.GC() blocks the calling goroutine until the collection finishes. If you call this in a hot path, you stall your request. The compiler won't stop you, but your latency metrics will scream. Use this only for benchmarks or tests where you need to measure steady-state memory usage.
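One way to observe that runtime.GC() returns only after a cycle has finished is to compare the NumGC counter before and after the call. The helper names below are hypothetical. This is a small sketch for tests and experiments, not something to put on a request path.

```go
package main

import (
	"fmt"
	"runtime"
)

// cyclesDuring reports how many GC cycles completed while fn ran.
func cyclesDuring(fn func()) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	fn()
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

// forcedCycles counts the cycles completed during one runtime.GC() call.
func forcedCycles() uint32 {
	return cyclesDuring(runtime.GC)
}

func main() {
	fmt.Println("cycles completed during runtime.GC():", forcedCycles())
}
```

The count is at least one. It can be higher if a background cycle happened to finish in the same window.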

How the runtime keeps up

The GC runs concurrently with the mutator, which is your application code. This is the hard part. If your code changes pointers while the GC is marking, the GC might miss a live object. Go uses write barriers to solve this. A write barrier is a small piece of code the compiler inserts around pointer writes. It does real work only while a mark phase is running. It tells the GC, "Hey, I just changed a pointer, check this out."

Without write barriers, your code could store a pointer to a white object into a black object and then drop every other reference to it. The GC never rescans black objects, so the white object would stay white and be reclaimed even though it is still reachable. The write barrier prevents this by shading the objects involved in the write, so the GC sees the change. The barrier has a small cost, but it's paid only on pointer writes, not reads.
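The hazard can be replayed in a toy model. The sketch below uses a simplified insertion-style barrier that shades the target of every pointer write. Go's real barrier is more elaborate, and every type and function name here is hypothetical.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

// object has a single outgoing pointer to keep the model small.
type object struct {
	name string
	ref  *object
	c    color
}

// collector holds the gray worklist of this toy marker.
type collector struct {
	work    []*object
	barrier bool // whether pointer writes go through the write barrier
}

// shade turns a white object gray and queues it for scanning.
func (gc *collector) shade(o *object) {
	if o != nil && o.c == white {
		o.c = gray
		gc.work = append(gc.work, o)
	}
}

// writePointer stands in for a plain "dst.ref = target" in mutator code.
func (gc *collector) writePointer(dst, target *object) {
	if gc.barrier {
		gc.shade(target) // simplified insertion barrier
	}
	dst.ref = target
}

// drain finishes marking by scanning every queued object.
func (gc *collector) drain() {
	for len(gc.work) > 0 {
		o := gc.work[len(gc.work)-1]
		gc.work = gc.work[:len(gc.work)-1]
		gc.shade(o.ref)
		o.c = black
	}
}

// survives replays the race and reports whether w ends up marked.
func survives(barrier bool) bool {
	a := &object{name: "a", c: black}        // already fully scanned
	w := &object{name: "w"}                  // still white
	c := &object{name: "c", ref: w, c: gray} // the only path to w so far
	gc := &collector{work: []*object{c}, barrier: barrier}

	gc.writePointer(a, w)   // mutator stores w into the black object
	gc.writePointer(c, nil) // mutator removes the only other path to w

	gc.drain()
	return w.c != white
}

func main() {
	fmt.Println("without barrier, w survives:", survives(false)) // false
	fmt.Println("with barrier, w survives:", survives(true))     // true
}
```

In the first run w is reachable from a but stays white, which is exactly the premature reclamation the barrier exists to prevent.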

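The GC also scans goroutine stacks. Stacks are roots, but they change as goroutines execute. In current Go releases the collector scans stacks concurrently: it briefly suspends one goroutine at a time to scan that goroutine's stack while the rest of the program keeps running. This is not the stop-the-world pause. The whole program stops only twice per cycle, once when marking starts and the write barrier is switched on, and once when marking finishes. Both pauses are typically well under a millisecond and do not grow with heap size.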
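Pause durations do not have to be taken on faith. runtime.MemStats records recent stop-the-world pauses in the PauseNs circular buffer. The sketch below reads the latest entry; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// lastPause returns the most recent stop-the-world pause and the number
// of completed GC cycles.
func lastPause() (time.Duration, uint32) {
	var s runtime.MemStats
	runtime.ReadMemStats(&s)
	if s.NumGC == 0 {
		return 0, 0
	}
	// PauseNs is a circular buffer of 256 entries. The runtime
	// documentation places the latest one at (NumGC+255)%256.
	return time.Duration(s.PauseNs[(s.NumGC+255)%256]), s.NumGC
}

// measure forces one cycle so there is at least one pause to report.
func measure() (time.Duration, uint32) {
	runtime.GC()
	return lastPause()
}

func main() {
	pause, cycles := measure()
	fmt.Printf("cycles=%d last pause=%v total so far see PauseTotalNs\n", cycles, pause)
}
```

ReadMemStats itself stops the world briefly, so sample it on a timer rather than per request.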

Sweep runs concurrently too. It walks the heap and reclaims white objects, making their memory available for new allocations. Returning memory to the OS is a separate step handled by a background scavenger, which uses system calls to release pages that have stayed unused. The runtime uses madvise on Linux to tell the OS that memory is no longer needed. The OS may reclaim the pages immediately, or it may keep them for reuse. This depends on the system configuration.
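The difference between memory that is free inside the runtime and memory that has gone back to the OS shows up in two MemStats fields, HeapIdle and HeapReleased. The sketch below uses debug.FreeOSMemory to force the release instead of waiting for the scavenger. The sink variable and releaseDemo helper are hypothetical, and the numbers vary by platform and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink is a package-level variable so the buffer is heap allocated.
var sink []byte

// releaseDemo allocates 16 MiB, drops it, and asks the runtime to hand
// unused memory back to the OS.
func releaseDemo() (idle, released uint64) {
	sink = make([]byte, 16<<20)
	sink[0] = 1
	sink = nil // drop the only reference

	// FreeOSMemory forces a collection and then tries to return as much
	// memory to the OS as possible.
	debug.FreeOSMemory()

	var s runtime.MemStats
	runtime.ReadMemStats(&s)
	return s.HeapIdle, s.HeapReleased
}

func main() {
	idle, released := releaseDemo()
	fmt.Printf("idle=%d KiB released=%d KiB\n", idle/1024, released/1024)
}
```

HeapIdle minus HeapReleased estimates how much memory the runtime is holding on to for reuse. Like runtime.GC(), FreeOSMemory is a diagnostic tool, not something to call routinely.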

The GC runs in the background. Your code pays a small tax for write barriers, but you get low latency. Trust the write barrier.

GC pressure in a service

In a real service, you care about allocation rate. High allocation rate means the GC has to work harder. Let's look at a handler that creates a temporary buffer on every request.

package main

import (
	"log"
	"net/http"
	"runtime/debug"
)

// handleRequest allocates a buffer for processing.
// Frequent large allocations increase GC workload.
func handleRequest(w http.ResponseWriter, r *http.Request) {
	// The buffer is passed to an interface method, so escape analysis
	// will most likely place it on the heap. It becomes garbage once the
	// handler returns, and a later GC cycle reclaims it.
	buf := make([]byte, 64*1024)
	if _, err := w.Write(buf); err != nil {
		log.Printf("write response: %v", err)
	}
}

func main() {
	// SetGCPercent changes the trigger threshold.
	// A value of 50 starts a cycle once new allocations reach about 50%
	// of the heap that was live after the previous collection.
	// Lower values reduce memory footprint but increase CPU overhead.
	debug.SetGCPercent(50)

	http.HandleFunc("/", handleRequest)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

The GOGC environment variable is preferred over programmatic changes in most cases, because it keeps tuning in deployment configuration instead of scattering it through the code. GOGC defaults to 100, which means the GC starts a new cycle when the heap has grown to roughly twice the size that was live after the last collection. Lower values trigger GC more often, reducing heap size but increasing CPU usage. Higher values do the opposite. Tuning GOGC is a trade-off between memory and CPU.
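The arithmetic behind GOGC can be worked through with a small helper. This is a simplification under stated assumptions: it uses target = live + live * GOGC / 100, ignores the contribution of GC roots that newer runtimes add, and assumes a non-negative GOGC value. The heapTarget name is hypothetical.

```go
package main

import "fmt"

// heapTarget estimates the heap size at which the next cycle finishes,
// given the live heap after the previous cycle. Assumes gogc >= 0.
func heapTarget(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	const mib = 1 << 20
	live := uint64(100 * mib)
	for _, gogc := range []int{50, 100, 200} {
		fmt.Printf("GOGC=%d live=100 MiB target=%d MiB\n",
			gogc, heapTarget(live, gogc)/mib)
	}
	// Prints targets of 150, 200, and 300 MiB.
}
```

Halving GOGC from 100 to 50 saves 50 MiB of headroom in this example, and the price is that collections start after half as much new allocation.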

Allocation rate drives GC cost. Reduce allocations to reduce GC work.
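One common way to cut allocations for buffers like the one in the handler is sync.Pool. The sketch below is a minimal illustration, not a drop-in replacement for the handler: bufPool, bufSize, and process are hypothetical names, and the pool stores *[]byte so that putting a buffer back does not itself allocate.

```go
package main

import (
	"fmt"
	"sync"
)

const bufSize = 64 * 1024

// bufPool hands out reusable 64 KiB buffers.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, bufSize)
		return &b
	},
}

// process borrows a buffer, copies the payload into it, and returns the
// number of bytes copied. The buffer goes back to the pool on return.
func process(payload []byte) int {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)

	// A pooled buffer may still hold data from a previous use, so only
	// the first n bytes are meaningful here.
	n := copy(*bp, payload)
	return n
}

func main() {
	fmt.Println(process([]byte("hello"))) // prints 5
}
```

The pool is a cache, not a guarantee. The runtime may drop pooled items at any collection, so New still runs from time to time. Measure before and after, because a pool only helps when the allocation rate is the actual bottleneck.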

Pitfalls and leaks

Calling runtime.GC() in production is a mistake. It blocks the caller and disrupts the GC's internal pacing. The runtime manages the GC cycle automatically based on allocation rate and GOGC. Interfering with this causes latency spikes and wasted CPU.

Goroutine leaks are a common source of memory growth. A goroutine leak happens when a goroutine blocks forever, for example on a channel that never receives a value and never gets closed. The goroutine stays alive, and so does everything it references. The GC cannot reclaim that memory because the goroutine's stack still holds the references. Always have a cancellation path. Use context.Context to signal shutdown. The worst goroutine bug is the one that never logs.
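A cancellation path can be as small as one extra case in a select. In the sketch below, the worker and runAndCancel names are hypothetical. The jobs channel is deliberately never written to or closed, which is the situation that would leak the goroutine without the ctx.Done() case.

```go
package main

import (
	"context"
	"fmt"
)

// worker consumes jobs until the channel closes or ctx is cancelled.
// It closes done on exit so the caller can confirm it stopped.
func worker(ctx context.Context, jobs <-chan int, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case <-ctx.Done():
			return // cancellation path: nothing is left blocked
		case j, ok := <-jobs:
			if !ok {
				return
			}
			_ = j // process the job here
		}
	}
}

// runAndCancel starts a worker on a channel nobody feeds, cancels it,
// and waits for it to exit.
func runAndCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan int) // never written, never closed
	done := make(chan struct{})

	go worker(ctx, jobs, done)
	cancel()
	<-done // would block forever if worker ignored ctx

	return ctx.Err() == context.Canceled
}

func main() {
	fmt.Println("worker exited after cancel:", runAndCancel())
}
```

Once the worker returns, its stack is no longer a root and everything it referenced becomes collectable.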

If you access a runtime.MemStats field that doesn't exist, the compiler rejects the program with an error saying the type has no such field or method. If you forget to import runtime, you get undefined: runtime. If you import a package and never use it, you get "runtime" imported and not used. The compiler catches these errors early. Go is strict about unused code.

The GC reclaims unreachable memory. It cannot fix logical leaks. If memory grows forever, the cause is a reference your code still holds, not the GC. Profile your application to find leaks. Use go tool pprof to inspect heap profiles and goroutine stacks.
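A heap profile for go tool pprof can be captured from inside the program with the runtime/pprof package. The sketch below collects the profile into memory. The heapProfile helper is a hypothetical name, and where you write the bytes is up to you.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile returns a heap profile in the format go tool pprof reads.
func heapProfile() ([]byte, error) {
	// Run a collection first so the profile reflects up-to-date
	// statistics about live objects.
	runtime.GC()

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	profile, err := heapProfile()
	if err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of heap profile\n", len(profile))
}
```

Write the bytes to a file and open it with go tool pprof to see which call sites hold the most memory. For a running HTTP service, importing net/http/pprof exposes the same data over HTTP, which is usually more convenient than writing files.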

Tuning and tools

Use the GOGC environment variable when you need to tune the balance between memory usage and CPU overhead for a long-running service.

Use runtime.GC() when you are writing a benchmark or test and need to force a collection to measure steady-state memory usage.

Use runtime.ReadMemStats when you need programmatic access to heap metrics for custom monitoring or alerts.

Use sync.Pool when you have short-lived objects that are allocated and discarded frequently, to reduce allocation pressure.

Use go tool pprof when you suspect allocation hotspots and need to identify which functions are driving GC work.

Tune the GC only when metrics show a problem. Default settings work for most workloads.

Where to go next