Interface Performance in Go: The Cost of Dynamic Dispatch

The profiler does not lie

You run go test -bench=. -cpuprofile=cpu.out on your event processor. The pprof flame graph shows a single function eating forty percent of CPU time. You look at the source code. It is just calling event.Process(). The function body contains three lines of arithmetic and a slice append. The profiler is not lying. The cost is not the math. The cost is the interface.

Go interfaces are cheap enough to ignore in most everyday code. They are not free everywhere. On a hot path, the abstraction layer can introduce measurable overhead. Understanding exactly where that overhead comes from lets you place interfaces where they belong and keep hot paths concrete.

How interfaces actually work

An interface value in Go is not a single pointer. It is a pair of machine words. For a non-empty interface, the first word points to an itab and the second points to the actual data. Itab stands for interface table. It holds pointers to the interface type and the concrete type descriptor, plus one function pointer for each method the interface declares, filled in with the concrete type's implementation.
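The two-word layout can be observed directly. The following is a minimal sketch, assuming the standard gc toolchain; the helper name InterfaceWords is hypothetical and exists only for illustration.

```go
package main

import (
    "fmt"
    "unsafe"
)

// Shape and Circle mirror the types used later in this article.
type Shape interface {
    Area() float64
}

type Circle struct {
    Radius float64
}

func (c Circle) Area() float64 {
    return 3.14159 * c.Radius * c.Radius
}

// InterfaceWords (hypothetical helper) reports how many pointer-sized
// words a Shape interface value occupies.
func InterfaceWords() int {
    var s Shape = Circle{Radius: 1}
    return int(unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0)))
}

func main() {
    // The Circle itself is a single float64, but the interface value
    // that wraps it is two words wide regardless of what it holds.
    fmt.Println("words in a Shape value:", InterfaceWords())
}
```

The size of the interface value does not depend on the concrete type stored in it, which is why the data has to live behind a pointer.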

Think of the itab like a building directory. You know the building address, but you do not know which office you need until you check the directory board. The directory tells you exactly where to go. Go uses the same pattern. When you call a method on an interface, the generated code loads the function pointer from the itab and jumps to it.

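That lookup adds indirection. More importantly, it blocks inlining. Inlining is the compiler's favorite optimization. It copies a function body directly into the caller, removes the call overhead, and lets the optimizer propagate constants, eliminate dead code, and keep values in registers. An interface call usually draws a hard line around that optimization. The compiler cannot inline what it cannot see at compile time. The exception is devirtualization: when the compiler can prove the concrete type at the call site, or when profile-guided optimization identifies a dominant one, it can turn the interface call back into a direct call.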

Interfaces are contracts, not performance tools.

The cost of not knowing

Here is the simplest way to see the difference. The compiler treats these two functions completely differently.

package main

// Shape defines a geometric contract.
type Shape interface {
    Area() float64
}

// Circle implements Shape with a fixed radius.
type Circle struct {
    Radius float64
}

// Area calculates the area for a Circle.
func (c Circle) Area() float64 {
    return 3.14159 * c.Radius * c.Radius
}

// ProcessShape accepts an interface, forcing dynamic dispatch.
func ProcessShape(s Shape) float64 {
    return s.Area() // Runtime lookup required, prevents inlining
}

// ProcessCircle accepts a concrete type, enabling full optimization.
func ProcessCircle(c Circle) float64 {
    return c.Area() // Compiler inlines this directly
}

When the compiler sees ProcessCircle, it knows the exact type. It can copy the Area body into ProcessCircle, keep c.Radius in a register, and eliminate the function call entirely. The generated code can be as small as two floating-point multiplies.

When it sees ProcessShape, it generates an indirect call. The CPU has to load the itab, read the method pointer, and jump to an address it only learns at run time. Modern CPUs predict direct branches well. Indirect branches are also predicted well when a call site always sees the same concrete type, but they become hard to predict when the types vary from call to call. A misprediction stalls the pipeline. In a tight loop running millions of times, those stalled cycles add up. Even when prediction is perfect, the compiler usually cannot inline the method, so it misses opportunities to optimize the surrounding code.
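One way to measure the gap on your own machine is a small benchmark. This is a sketch, not a definitive methodology: SumIface, SumConcrete, and makeInputs are hypothetical names, and it uses testing.Benchmark so it runs as an ordinary program. No timings are shown here because they depend on hardware and Go version.

```go
package main

import (
    "fmt"
    "testing"
)

type Shape interface {
    Area() float64
}

type Circle struct {
    Radius float64
}

func (c Circle) Area() float64 {
    return 3.14159 * c.Radius * c.Radius
}

// SumIface calls Area through the interface on every iteration.
func SumIface(shapes []Shape) float64 {
    var total float64
    for _, s := range shapes {
        total += s.Area()
    }
    return total
}

// SumConcrete calls Area on a known type, so the compiler may inline it.
func SumConcrete(circles []Circle) float64 {
    var total float64
    for _, c := range circles {
        total += c.Area()
    }
    return total
}

// makeInputs (hypothetical helper) builds the same data in both forms.
func makeInputs(n int) ([]Shape, []Circle) {
    shapes := make([]Shape, n)
    circles := make([]Circle, n)
    for i := range circles {
        circles[i] = Circle{Radius: float64(i%7 + 1)}
        shapes[i] = circles[i]
    }
    return shapes, circles
}

// sink keeps the compiler from discarding the benchmarked work.
var sink float64

func main() {
    shapes, circles := makeInputs(1024)

    iface := testing.Benchmark(func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            sink = SumIface(shapes)
        }
    })
    concrete := testing.Benchmark(func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            sink = SumConcrete(circles)
        }
    })

    fmt.Println("interface:", iface)
    fmt.Println("concrete: ", concrete)
}
```

Because every element here is a Circle, the branch predictor sees a single target, so this sketch mostly measures the lost inlining. Mixing several concrete types into the slice would add the misprediction cost on top.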

The CPU hates guessing. Give it direct jumps.

A realistic pipeline

Real code rarely lives in isolated functions. It lives in pipelines, handlers, and loops. Here is a typical event processor that takes a hit from interface dispatch.

package main

// Event defines the contract for all log entries.
type Event interface {
    Serialize() []byte
}

// ProcessBatch handles a slice of events.
func ProcessBatch(events []Event) [][]byte {
    out := make([][]byte, len(events))
    for i, e := range events {
        out[i] = e.Serialize() // Dynamic dispatch on every iteration
    }
    return out
}

This pattern is common in message queues, HTTP middleware, and plugin systems. The interface keeps the code flexible. It also forces a runtime lookup on every loop iteration. If Serialize is small, the dispatch overhead can dwarf the actual work.

The fix is not to delete the interface. The fix is to push the interface to the boundary. Accept the interface at the API edge, recover the concrete type with a type assertion or a type switch, and run the hot path with concrete values. Go developers follow a simple convention here: accept interfaces, return structs. You take the flexible contract at the door, but you work with concrete data inside the room.
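A minimal sketch of that idea follows, assuming one concrete type dominates the batch. LogEvent and RawEvent are hypothetical types invented for this example; only Event and ProcessBatch come from the code above.

```go
package main

import "fmt"

// Event is the contract accepted at the API boundary.
type Event interface {
    Serialize() []byte
}

// LogEvent is a hypothetical concrete type assumed to dominate the traffic.
type LogEvent struct {
    Msg string
}

func (l LogEvent) Serialize() []byte {
    return []byte(l.Msg)
}

// RawEvent is a second hypothetical type that takes the fallback path.
type RawEvent []byte

func (r RawEvent) Serialize() []byte {
    return r
}

// ProcessBatch still accepts the interface, but recovers the concrete
// type before calling the method.
func ProcessBatch(events []Event) [][]byte {
    out := make([][]byte, len(events))
    for i, e := range events {
        switch v := e.(type) {
        case LogEvent:
            // v has a static type here, so this is a direct call
            // that the compiler is free to inline.
            out[i] = v.Serialize()
        default:
            // Unknown types keep working through dynamic dispatch.
            out[i] = e.Serialize()
        }
    }
    return out
}

func main() {
    out := ProcessBatch([]Event{LogEvent{Msg: "started"}, RawEvent("raw")})
    for _, b := range out {
        fmt.Println(string(b))
    }
}
```

The type switch is not free either, since it compares type words on every iteration. Whether it beats plain dispatch depends on how much the inlined body saves, so benchmark both versions before keeping it.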

Profile first. Optimize second. Readability survives longer than micro-optimizations.

When optimization backfires

Chasing interface overhead often leads to over-engineering. Developers replace clean abstractions with massive type switches, duplicate logic across concrete types, or pass raw function pointers everywhere. The code becomes harder to test and slower to maintain. The performance gain is usually negligible unless the profiler proves otherwise.

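You can verify inlining behavior without guessing. Run go build -gcflags="-m" to see the compiler's optimization decisions. The output reports which functions can be inlined and which calls were actually inlined. Passing the flag twice (-gcflags="-m -m") also explains why a function was rejected, for example because it exceeds the inlining budget. A call through an interface never shows up as inlined unless the compiler managed to devirtualize it first. If you force a concrete type but the caller passes an interface, the compiler rejects the program with an error along the lines of cannot use e (variable of interface type Event) as ConcreteEvent value in argument (the exact wording varies by Go version). That error is a feature. It forces you to write the type assertion explicitly, which stops you from accidentally hiding performance regressions behind silent type conversions.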

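Another common trap is the typed nil. An interface is nil only when both its type word and its data word are nil. If you pass a typed nil value like (*ConcreteEvent)(nil) to an interface parameter, the interface is not nil, because the type word is set. A plain e == nil check lets it straight through. What happens next depends on the receiver. If the method has a value receiver, the call panics at run time with a message like value method called using nil *ConcreteEvent pointer. If it has a pointer receiver, the method runs and panics only when it dereferences the nil receiver. The compiler catches neither case. Check the concrete pointer for nil before you wrap it in an interface, or design your methods to tolerate nil receivers.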
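The typed nil behavior is easy to reproduce. In this sketch Serialize is given a pointer receiver with a nil guard, which is one possible design and not the only one. IsNilEvent is a hypothetical helper that knows about a single concrete type.

```go
package main

import "fmt"

type Event interface {
    Serialize() []byte
}

type ConcreteEvent struct {
    Msg string
}

// Serialize uses a pointer receiver and tolerates a nil receiver,
// so calling it on a typed nil does not panic.
func (c *ConcreteEvent) Serialize() []byte {
    if c == nil {
        return nil
    }
    return []byte(c.Msg)
}

// IsNilEvent (hypothetical helper) treats both a nil interface and a
// nil *ConcreteEvent wrapped in an interface as "no event".
func IsNilEvent(e Event) bool {
    if e == nil {
        return true
    }
    p, ok := e.(*ConcreteEvent)
    return ok && p == nil
}

func main() {
    var p *ConcreteEvent // nil pointer
    var e Event = p      // interface with a type word but nil data

    fmt.Println(e == nil)           // false: the type word is set
    fmt.Println(IsNilEvent(e))      // true
    fmt.Println(len(e.Serialize())) // 0: the nil guard returns early
}
```

A helper like this only scales to types it knows about. Returning a plain nil interface from constructors, rather than a typed nil pointer, avoids the problem at the source.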

If you are allocating large numbers of short-lived objects on the hot path, consider sync.Pool to reduce garbage collection pressure. The arena package exists only as an experiment behind GOEXPERIMENT=arenas and is not covered by the Go compatibility promise, so treat it as a last resort. Pooling does not remove dynamic dispatch, but it does reduce the allocation noise around it.
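As a rough illustration of the pooling idea, the sketch below reuses scratch buffers through sync.Pool. The function name SerializePooled and the choice of bytes.Buffer are assumptions made for this example, not part of the event processor described above.

```go
package main

import (
    "bytes"
    "fmt"
    "sync"
)

// bufPool hands out reusable scratch buffers.
var bufPool = sync.Pool{
    New: func() any { return new(bytes.Buffer) },
}

// SerializePooled (hypothetical) builds its output in a pooled buffer
// and returns a copy, so the buffer can be reused safely afterwards.
func SerializePooled(msg string) []byte {
    buf := bufPool.Get().(*bytes.Buffer)
    buf.Reset()
    defer bufPool.Put(buf)

    buf.WriteString("event:")
    buf.WriteString(msg)

    // Copy out of the buffer: its backing array goes back to the pool.
    out := make([]byte, buf.Len())
    copy(out, buf.Bytes())
    return out
}

func main() {
    fmt.Println(string(SerializePooled("started")))
    fmt.Println(string(SerializePooled("stopped")))
}
```

The returned slice is still a fresh allocation. The pool only saves the intermediate buffer growth, so it pays off when building the output takes several writes.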

The compiler optimizes what it can see. Keep the interface out of the hot path.

Choosing the right boundary

Performance and flexibility are not mutually exclusive. They just live in different places. Use the right tool for the layer you are writing.

Use an interface when you are building a library boundary or testing with mocks. Use a concrete type when you are inside a tight loop or a performance-critical path. Use a type switch when you genuinely need different behavior per type and the overhead is acceptable. Use a function value when you want to pass a single behavior without defining an interface, keeping in mind that calling it is still an indirect call. Stick with the interface when the profiler shows the dispatch accounts for a negligible share of CPU time, say less than one percent.

Measure the flame graph. Trust the data, not the intuition.

Where to go next

Dynamic dispatch is like asking a receptionist to route your call instead of dialing a direct line: flexible, but slightly slower on every call. To apply this to your own code, start from the profile, confirm with -gcflags="-m" that the hot call is not being inlined, and only then move the interface out to the boundary. Benchmark before and after, and keep the change only if the numbers improve.