When the compiler optimizes against you
You spent an afternoon shaving microseconds off a tight loop. You refactored a messy block into a clean helper function. The code looks beautiful. You run the benchmark. Performance dropped by 20%. The work used to sit directly in the loop body; now the helper is too large for the compiler to inline, and every iteration pays for a call. Or worse, the compiler did inline it, and the larger loop body changed the register allocation and hurt instruction-cache locality. You need a way to tell the compiler what to do.
Or you are writing low-level code that wraps a system call. The function must not trigger stack growth. If the runtime tries to expand the stack while you are holding registers in a fragile state, the program corrupts memory or panics. You need to disable the stack check.
Go gives you compiler directives. They are escape hatches for the optimizer. They let you override the compiler's decisions when you have information the compiler lacks.
Directives in plain words
The Go compiler inlines small functions automatically to make your code fast. Inlining means the compiler copies the function body directly into the caller instead of making a function call. This removes the overhead of setting up a new stack frame and passing arguments. It also lets the optimizer see through the function boundary to reorder instructions or eliminate dead code.
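To make the idea concrete, here is a minimal sketch of what inlining amounts to. The names square, sumSquares, and sumSquaresInlined are hypothetical and exist only for illustration; the second loop is a hand-written approximation of what the compiler may produce, not its actual output.

```go
package main

import "fmt"

// square is a hypothetical helper, small enough that the compiler
// will normally consider it for inlining.
func square(x int) int {
	return x * x
}

// sumSquares calls the helper once per iteration.
func sumSquares(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += square(i)
	}
	return total
}

// sumSquaresInlined is the hand-written equivalent of sumSquares after
// inlining: the body of square is pasted into the loop, so no call remains.
func sumSquaresInlined(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i * i
	}
	return total
}

func main() {
	// Both versions compute the same value; only the generated code differs.
	fmt.Println(sumSquares(10), sumSquaresInlined(10)) // prints "385 385"
}
```

Building with go build -gcflags=-m prints the compiler's inlining decisions, which is one way to check whether a helper like square was actually inlined.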
Usually, inlining is good. Sometimes, it breaks your assumptions. A function might get inlined into a massive loop, making the loop too large for the CPU's instruction cache. The CPU spends cycles fetching instructions from L2 cache instead of L1. Performance tanks. Or you might need a distinct stack frame for profiling. If the function inlines, the frame disappears from the stack trace.
//go:noinline tells the compiler to leave the function alone. It forces the compiler to generate a call instruction.
//go:nosplit is rarer and more dangerous. It removes the stack check from the function, so the runtime never grows the stack on entry to it. A Go function normally starts by checking whether the stack has room for its frame. If not, it calls the runtime to allocate a larger stack. With //go:nosplit, that check vanishes. The function runs with whatever stack space is left, which in practice is a small reserve the toolchain keeps free for this purpose. If it needs more than that, the build fails at link time or, in the worst case, the program corrupts memory. This matters for signal handlers, syscall wrappers, and runtime internals where stack growth could corrupt state.
The compiler optimizes for speed. You optimize for correctness. Directives bridge the gap.
Minimal syntax
Directives are comments that start with //go:, with no space between the slashes and go. They go in the comment block directly above the function declaration, as its last line. No blank lines. No other comments between the directive and the function.
Here's the simplest usage: force one function to stay separate and disable the stack check for another.
// ForceNoInline is never copied into its callers.
//
//go:noinline
func ForceNoInline() int {
	// Every call site emits a real call to this function.
	// The function keeps its own symbol and its own stack frame.
	// Inlining is disabled regardless of function size or call count.
	return 1 + 2
}

// DisableStackGrowth omits the stack check from its prologue.
//
//go:nosplit
func DisableStackGrowth() {
	// The runtime will not grow the stack on entry to this function.
	// Its frame must fit in the stack space that is already available;
	// the linker reports a nosplit stack overflow if it cannot.
	// This is unsafe for general application code.
}
gofmt keeps these comments. Since Go 1.19 it also moves //go: directives to the end of the doc comment, after a blank // line, so you don't need to worry about formatting tools stripping them. The convention is to place the directive on its own line, right before the func keyword.
How the compiler processes directives
When the compiler parses the source, it scans for comments starting with //go:. These are treated as compiler directives, not regular comments. The parser attaches the directive to the function node in the abstract syntax tree.
For //go:noinline, the inlining pass checks a flag on the function. If the flag is set, the compiler never considers the function as an inlining candidate, whatever its size. The result is a standard function call. The caller places the arguments in registers or on the stack, as the calling convention requires, calls the function, and reads back the result. The stack frame exists. The symbol exists in the binary.
For //go:nosplit, the compiler modifies the function prologue. Normally, the compiler emits code to compare the stack pointer against a limit. If there is too little room left, it jumps to a runtime helper called morestack. That helper allocates a larger stack, copies the old contents over, and restarts the function. With //go:nosplit, the compiler omits the comparison and the jump. The function assumes there is enough stack.
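As a small illustration of a function that is a reasonable fit for the directive, the sketch below marks a leaf function as nosplit. The name mix and its arithmetic are hypothetical; the point is only that the body makes no calls and has no large locals, so its frame stays tiny.

```go
package main

import "fmt"

// mix is a hypothetical leaf helper: it calls nothing and keeps no large
// local variables, so it needs almost no stack of its own.
//
//go:nosplit
func mix(a, b, c uintptr) uintptr {
	return (a ^ b) + c
}

func main() {
	fmt.Println(mix(1, 2, 3)) // prints "6"
}
```

One way to inspect the result is to dump the assembly with go build -gcflags=-S and look at the function's TEXT line, which should carry a NOSPLIT flag. The toolchain may already omit the check for very small leaf functions on its own, so a larger function makes a clearer comparison.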
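The toolchain also enforces a limit on //go:nosplit functions, but it does so at link time and it checks stack depth, not allocation. Every goroutine stack keeps a small reserve, a few hundred bytes, below the point where the check would fire. The linker adds up the frames along each chain of nosplit calls and fails the build with a nosplit stack overflow error if a chain cannot fit in that reserve. The directive by itself does not forbid heap allocation. A //go:nosplit function may call an ordinary function, and that callee performs its own stack check on entry, so the stack can still grow inside the call.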
Inlining is a performance feature. Noinline is a debugging and sizing tool.
Realistic examples
Forcing a stack frame for tracing
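You are building a library that captures stack traces for debugging. You want the trace to include the function that requested the trace. If that function inlines, it has no physical frame of its own. runtime.CallersFrames can usually reconstruct inlined frames from the inlining tables, but code that inspects raw program counters directly cannot account for inlining. The caller appears to call runtime.Callers directly. You lose context.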
import "runtime"

// GetStackTrace captures the call stack including this frame.
//
//go:noinline
func GetStackTrace() []runtime.Frame {
	// Without noinline this function might have no physical frame.
	// Tools that read raw program counters could then miss it.
	// Keeping this frame helps identify where the trace was requested.
	var pc [32]uintptr
	// skip=1 starts the trace at GetStackTrace itself.
	n := runtime.Callers(1, pc[:])
	frames := runtime.CallersFrames(pc[:n])
	var result []runtime.Frame
	for {
		frame, more := frames.Next()
		result = append(result, frame)
		if !more {
			break
		}
	}
	return result
}
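The //go:noinline directive guarantees that GetStackTrace has a physical frame, so it appears in the trace however the trace is decoded. Profiles can benefit too. pprof normally attributes samples to inlined functions using the same inlining tables, but once a helper has been optimized together with its caller the split can be hard to read. You might see a hot spot in a large function and not realize it comes from a specific helper. Forcing noinline keeps the attribution unambiguous.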
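A short usage sketch may help here. It repeats GetStackTrace so the program is self-contained and adds a hypothetical caller named handleRequest, which is not part of any real API. The printed file paths and line numbers depend on where the program is built, so no output is shown.

```go
package main

import (
	"fmt"
	"runtime"
)

// GetStackTrace captures the call stack, starting at its own frame.
//
//go:noinline
func GetStackTrace() []runtime.Frame {
	var pc [32]uintptr
	n := runtime.Callers(1, pc[:]) // skip=1 starts at GetStackTrace
	frames := runtime.CallersFrames(pc[:n])
	var result []runtime.Frame
	for {
		frame, more := frames.Next()
		result = append(result, frame)
		if !more {
			break
		}
	}
	return result
}

// handleRequest is a hypothetical caller, used only to give the trace
// a recognizable second frame.
//
//go:noinline
func handleRequest() []runtime.Frame {
	return GetStackTrace()
}

func main() {
	// The first line printed is GetStackTrace, the second handleRequest.
	for _, f := range handleRequest() {
		fmt.Printf("%s (%s:%d)\n", f.Function, f.File, f.Line)
	}
}
```

Both functions are marked noinline in this sketch so that each one keeps a physical frame regardless of what the inliner would otherwise decide.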
Low-level syscall wrapper
You are writing a wrapper for a system call. The wrapper must not trigger stack growth. Growing a goroutine stack means copying it to a new, larger allocation. The runtime fixes up ordinary pointers into the stack, but addresses held as uintptr values are not adjusted, and any address already handed to the kernel goes stale. The syscall interface requires arguments in registers or at fixed stack offsets. Stack growth at the wrong moment corrupts that layout.
import "syscall"

// RawSyscall performs a system call without a stack check.
//
//go:nosplit
func RawSyscall(trap, a1, a2, a3 uintptr) (r1, r2 uintptr, err syscall.Errno) {
	// Stack growth during a syscall can invalidate addresses passed to the kernel.
	// This function must fit within the stack space that is already available.
	// The linker verifies that the nosplit call chain fits in the stack reserve.
	// Placeholder body: a real wrapper hands off to assembly or runtime internals.
	return 0, 0, 0
}
This pattern appears in the runtime and syscall packages. Application code rarely needs //go:nosplit. If you find yourself using it, you are likely doing something very low-level. The linker's nosplit stack overflow error is a gift. It stops you from shipping code that overruns the stack at runtime.
Pitfalls and errors
Binary size bloat
//go:noinline changes your binary's size, and the direction depends on the function. Inlining copies the function body into every call site. The code repeats, but each copy skips the call and can be optimized in context. If you force noinline, the compiler generates the body once and emits a call at every site. For large functions that usually makes the binary smaller and is kinder to the instruction cache. For tiny functions the call sequence can be bigger than the body it replaces, so the binary grows and the call overhead dominates the run time.
Use //go:noinline only when you have measured the impact. Profile before and after. If the binary size grows significantly and performance drops, remove the directive.
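One possible way to take that measurement is sketched below. It uses testing.Benchmark from the standard library so it can run as an ordinary program; addInline, addNoInline, and sink are hypothetical names chosen for this illustration, and the numbers will differ from machine to machine.

```go
package main

import (
	"fmt"
	"testing"
)

// addInline is small enough that the compiler will normally inline it.
func addInline(a, b int) int { return a + b }

// addNoInline has the same body but always costs a real call.
//
//go:noinline
func addNoInline(a, b int) int { return a + b }

// sink receives the results so the loops are not optimized away.
var sink int

func benchInline(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total = addInline(total, i)
	}
	sink = total
}

func benchNoInline(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total = addNoInline(total, i)
	}
	sink = total
}

func main() {
	// Each call runs the function with increasing b.N until timing is stable.
	fmt.Println("inline:  ", testing.Benchmark(benchInline))
	fmt.Println("noinline:", testing.Benchmark(benchNoInline))
}
```

A micro-benchmark like this only shows the cost of the call itself. The effect on a real program also depends on instruction-cache pressure, so it is worth repeating the comparison on the actual workload.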
Nosplit and allocation
//go:nosplit by itself does not make the compiler reject heap allocation. The directive's documented effect is limited to omitting the stack check, and the cmd/compile documentation lists no restriction on creating a slice, map, or channel inside a //go:nosplit function. The code builds.
Calling other functions that allocate is also allowed. An ordinary callee performs its own stack check on entry, so the stack can still grow inside the call. If the caller's correctness depends on the stack staying put, that growth is exactly the bug the directive was meant to prevent, and nothing in the build will warn you.
Defer statements follow the same rule. They are not rejected, but they add bookkeeping to the frame and run extra calls on the way out.
The check the toolchain does perform is on stack depth. Treat allocation-free, call-free bodies as a discipline you enforce yourself.
Nosplit stack overflow
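If a chain of //go:nosplit functions needs more stack than the reserve provides, the linker normally catches it and fails the build with a nosplit stack overflow error. If an overflow slips past that check, there is no recovery. The stack check is gone. The function writes past the stack limit and corrupts whatever memory lies below it. You get a crash if you are lucky and silent corruption if you are not.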
This usually happens if the function has large local variables or calls other functions that use deep stack frames. Keep //go:nosplit functions small. Avoid recursion. Avoid large arrays on the stack.
Noinline on already-large functions
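If a function is too large to inline, //go:noinline does nothing. The compiler already decided not to inline it. The directive is a no-op. You won't get an error. The function just stays as a separate call. To see what the compiler decided without the directive, build with go build -gcflags=-m, which prints its inlining decisions.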
Nosplit is a landmine. Only step on it if you know exactly where the ground is solid.
Decision matrix
Use //go:noinline when you need a distinct stack frame for profiling or stack traces.
Use //go:noinline when inlining causes binary size bloat that hurts instruction cache performance.
Use //go:noinline when you are benchmarking a specific function and need to stop the compiler from inlining it and optimizing the work away. The measurement then includes the call overhead.
Use //go:nosplit when you are writing low-level runtime code where stack growth would corrupt state or arguments.
Use //go:nosplit when implementing signal handlers or syscall wrappers that must not trigger stack expansion.
Use the default compiler behavior when you are writing application code. The optimizer knows more about your code than you do.
Directives are overrides. Use them to fix a bug, not to guess at optimization.