When the scheduler misses the hardware
You deploy a Go service to a modern cloud instance. The machine has sixteen logical cores. Your metrics show the application using only two. You scale up the instance, but throughput stays flat. The bottleneck is not your code. It is the scheduler's view of the hardware. Go's runtime limits how many operating system threads can run goroutines at the same time. That limit is controlled by a single function: runtime.GOMAXPROCS.
What GOMAXPROCS actually controls
Goroutines are lightweight. Each one starts with a stack of only a few kilobytes, so you can spawn hundreds of thousands of them without exhausting memory. The operating system cannot handle that many threads. Context switching between them would grind the CPU to a halt. Go solves this with a user-space scheduler. The scheduler takes your goroutines and multiplexes them onto a smaller pool of OS threads. GOMAXPROCS caps how many threads in that pool can execute Go code at the same time. Threads that are blocked in system calls do not count against the cap.
Think of a restaurant kitchen. Goroutines are incoming orders. OS threads are the chefs cooking them. GOMAXPROCS is the number of chefs you put on the floor. If you open fifty chef stations but only have five stoves, the extra stations sit empty. If you open two stations for fifty orders, the queue backs up and customers leave. The scheduler needs to know how many stoves you actually have.
Before Go 1.5, the default was one. Every Go program executed Go code on a single OS thread at a time unless you explicitly changed the setting. That caused widespread confusion. Go 1.5 changed the default to the number of logical CPUs available to the process. Today, you almost never need to touch this setting. The runtime detects your hardware and sets the limit automatically.
Trust the scheduler. It knows your machine better than you do.
The basic interface
Here is the simplest way to read and change the setting. The function takes an integer and returns the previous value.
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Pass 0 to read the current setting without changing it
	// The runtime returns the active thread pool size
	current := runtime.GOMAXPROCS(0)
	fmt.Printf("Scheduler is using %d OS threads\n", current)

	// Pass a positive integer to change the pool size
	// The function returns the old value for logging or restoration
	old := runtime.GOMAXPROCS(2)
	fmt.Printf("Changed from %d to 2 threads\n", old)
}
The return value exists because configuration changes sometimes need to be tracked. You can log the previous state or restore it later. The convention in Go is to call this function once, at the very top of main, before any goroutines are spawned. Changing it while the program is already running works, but the call stops the world: every goroutine pauses while the runtime resizes its scheduling state. That pause adds overhead. Set it early and leave it alone.
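The returned value makes a scoped override easy to write, which is mostly useful in tests and experiments rather than in production startup code. The following is a minimal sketch; withMaxProcs and currentMaxProcs are hypothetical helper names, not part of the standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// withMaxProcs is a hypothetical helper: it sets GOMAXPROCS to n, runs fn,
// and restores the previous value when fn returns.
func withMaxProcs(n int, fn func()) {
	prev := runtime.GOMAXPROCS(n) // returns the value in effect before the call
	defer runtime.GOMAXPROCS(prev)
	fn()
}

// currentMaxProcs reads the setting without changing it.
func currentMaxProcs() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	before := currentMaxProcs()
	withMaxProcs(1, func() {
		fmt.Println("inside:", currentMaxProcs())
	})
	fmt.Println("restored:", currentMaxProcs() == before)
}
```

Because each call to runtime.GOMAXPROCS with a new value pauses the whole program briefly, a helper like this belongs around coarse units of work, not inside hot paths.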
If you pass the wrong type, for example runtime.GOMAXPROCS("4"), the compiler rejects the program with cannot use "4" (untyped string constant) as int value in argument to runtime.GOMAXPROCS. Go's type system catches this kind of mistake at compile time, before the value ever reaches the scheduler.
How the scheduler uses the value
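When your program starts, the Go runtime initializes its scheduler. It reads the GOMAXPROCS value and creates that many scheduling contexts, which the runtime calls Ps (processors). An OS thread must hold a P to run Go code, and threads are started on demand rather than all at once. Each running thread executes a scheduler loop. The loop pulls goroutines from its P's local queue or from a global queue. If a goroutine waits on network I/O, it is parked with the runtime's network poller and the thread picks up the next goroutine. If a goroutine blocks in a system call, the thread blocks with it, and the runtime hands the P to another thread so the remaining goroutines keep running. If a goroutine does heavy computation, the thread runs it until a preemption point or until the runtime preempts it so that other goroutines get a turn.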
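Preemption of a compute-heavy goroutine can be observed directly. The sketch below restricts the program to a single scheduling slot, starts a goroutine that spins in a tight loop, and relies on the runtime to interrupt it so the caller can run again. It assumes Go 1.19 or later (for atomic.Bool) on a platform with asynchronous preemption, which arrived in Go 1.14; spinOnOneP is a hypothetical name.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spinOnOneP sets GOMAXPROCS to 1, starts a goroutine that busy-loops, and
// returns how many iterations the loop completed before it was told to stop.
// The caller can only deliver the stop signal if the runtime preempts the
// spinning goroutine, because there is just one slot for running Go code.
func spinOnOneP(waitMillis int) int64 {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop atomic.Bool
	result := make(chan int64)
	go func() {
		var spins int64
		for !stop.Load() {
			spins++ // no blocking operations inside the loop
		}
		result <- spins
	}()

	time.Sleep(time.Duration(waitMillis) * time.Millisecond)
	stop.Store(true)
	return <-result
}

func main() {
	fmt.Println("spinner iterations:", spinOnOneP(50))
}
```

On Go versions before 1.14 a loop like this could monopolize the only slot indefinitely, which is why older code sometimes sprinkled runtime.Gosched calls into long computations.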
The scheduler uses work-stealing. If one thread finishes its local queue, it steals work from a busy thread. This keeps all threads fed. GOMAXPROCS caps how many threads participate in this dance. The runtime never runs Go code on more threads than this number at once. Threads blocked in system calls or cgo calls do not count against the limit, so the process can own more OS threads than GOMAXPROCS suggests.
You can query the setting at any time. Passing zero returns the current value. This is useful for startup diagnostics. Many production services log their effective concurrency during initialization. It helps you verify that the runtime sees the hardware you expect.
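A startup diagnostic along these lines might look like the following sketch. schedulerSummary is a hypothetical helper, and the key names in the output are illustrative rather than any standard format. It reports the logical CPU count next to the effective setting so a mismatch is visible at a glance.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// schedulerSummary is a hypothetical helper that formats the values a
// service might log once during initialization.
func schedulerSummary() string {
	env, set := os.LookupEnv("GOMAXPROCS")
	if !set {
		env = "(unset)"
	}
	return fmt.Sprintf("logical_cpus=%d gomaxprocs=%d gomaxprocs_env=%s",
		runtime.NumCPU(),      // logical CPUs usable by this process
		runtime.GOMAXPROCS(0), // current setting, read without changing it
		env)
}

func main() {
	fmt.Println(schedulerSummary())
}
```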
The scheduler is designed to be self-tuning. You rarely need to override it.
Containers and cgroup detection
Containers change the math. The host might have sixty-four cores, but your Kubernetes pod is limited to four CPUs. If Go sizes itself from the host's CPU count, it will try to run Go code on sixty-four threads. The kernel enforces the pod's CPU limit by throttling the process once it uses up its quota, which shows up as extra context switching and latency spikes.
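Go 1.25 added cgroup awareness. On Linux, the runtime now reads the CPU bandwidth limit of the cgroup that contains the process and lowers the default GOMAXPROCS to match when that limit is below the number of logical CPUs. It checks both cgroup v1 and v2 hierarchies. This covers most modern container runtimes. Programs built with earlier Go versions see only the host's CPU count, so they need an explicit setting. You can override the default with an environment variable in either case. This is the standard approach for containerized deployments on older toolchains.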
# Override the runtime's auto-detection for a containerized service
# Useful when cgroup limits are misconfigured or when testing locally
GOMAXPROCS=4 ./my-service
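The runtime already applies the GOMAXPROCS environment variable at startup, so the override itself needs no code. If you want to validate the value yourself or adjust it dynamically inside the program, you can read the environment variable and pass it to the runtime. This pattern appears in services that run in mixed environments.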
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

func main() {
	// Check for an explicit override in the environment
	// LookupEnv returns false if the variable is not set
	if val, exists := os.LookupEnv("GOMAXPROCS"); exists {
		n, err := strconv.Atoi(val)
		// Validate the parsed integer before applying it
		if err == nil && n > 0 {
			runtime.GOMAXPROCS(n)
		}
	}

	// Log the effective concurrency for observability
	// Passing 0 reads the current value without modifying it
	fmt.Printf("Running with %d scheduler threads\n", runtime.GOMAXPROCS(0))
}
The environment variable takes precedence over the cgroup detection. This gives you a safety valve when the runtime's automatic detection misses a constraint. Most production deployments rely on the default behavior and skip the manual override entirely.
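To make the cgroup limit concrete: on cgroup v2, the CPU bandwidth limit lives in a file named cpu.max that holds a quota and a period in microseconds, or the word max in place of the quota when no limit is set. The sketch below parses that content into a CPU count. cpuLimitFromCPUMax is a hypothetical helper for checking what limit a container actually has; it is not how the runtime implements its detection, and it takes the file contents as a string rather than assuming a particular mount path.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuLimitFromCPUMax parses the contents of a cgroup v2 cpu.max file, which
// holds "<quota> <period>" in microseconds, or "max <period>" when unlimited.
// It returns the limit in CPUs, and false when there is no limit or the
// input is malformed.
func cpuLimitFromCPUMax(contents string) (float64, bool) {
	fields := strings.Fields(contents)
	if len(fields) != 2 || fields[0] == "max" {
		return 0, false
	}
	quota, err := strconv.ParseFloat(fields[0], 64)
	if err != nil || quota <= 0 {
		return 0, false
	}
	period, err := strconv.ParseFloat(fields[1], 64)
	if err != nil || period <= 0 {
		return 0, false
	}
	return quota / period, true
}

func main() {
	// Sample contents for a container limited to four CPUs.
	limit, ok := cpuLimitFromCPUMax("400000 100000\n")
	fmt.Println(limit, ok) // prints "4 true"
}
```

A fractional result such as 1.5 is common for pods with millicore limits. How to round it into an integer GOMAXPROCS value is a policy decision that this sketch leaves open.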
Let the runtime read the cgroups. Only override when the infrastructure lies.
Common mistakes and runtime behavior
Setting GOMAXPROCS too high creates thrashing. The CPU spends more time switching between threads than executing goroutines. Cache lines get evicted. Latency jumps. Setting it too low leaves cores idle. The scheduler cannot parallelize work across hardware that it does not know about.
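The effect of both extremes is easiest to judge by measuring it. The sketch below times a fixed CPU-bound workload under several settings. timeWorkload and burn are hypothetical helpers, the workload is synthetic, and the numbers depend entirely on the machine, so treat it as a starting point for your own measurements rather than a benchmark.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
	"time"
)

// burn performs a fixed amount of arithmetic and returns the result so the
// compiler cannot discard the loop.
func burn(iterations int) uint64 {
	var x uint64 = 1
	for i := 0; i < iterations; i++ {
		x = x*6364136223846793005 + 1442695040888963407
	}
	return x
}

// timeWorkload runs `workers` goroutines of CPU-bound work with GOMAXPROCS
// set to procs and returns the wall-clock time taken. The previous setting
// is restored before returning.
func timeWorkload(procs, workers, iterations int) time.Duration {
	prev := runtime.GOMAXPROCS(procs)
	defer runtime.GOMAXPROCS(prev)

	var sink atomic.Uint64
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sink.Add(burn(iterations))
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	for _, procs := range []int{1, 2, runtime.NumCPU()} {
		elapsed := timeWorkload(procs, 8, 50_000_000)
		fmt.Printf("GOMAXPROCS=%d elapsed=%v\n", procs, elapsed)
	}
}
```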
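The runtime does not validate your input beyond ignoring values below one. If you pass runtime.GOMAXPROCS(1000) on a four-core machine, the runtime accepts it and allows up to a thousand threads to run Go code at once. The threads are started on demand, but given enough runnable goroutines the operating system ends up time-slicing far more threads than it has cores, and your program slows down. You will not get a compiler error. The compiler only checks types. The runtime accepts the value and trusts you.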
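The runtime's handling of unusual arguments can be checked with a few calls. According to the runtime package documentation, a value below one leaves the setting unchanged, and any positive value is accepted regardless of core count (the old upper limit was removed in Go 1.10). The sketch below uses a hypothetical helper, tryMaxProcs, that applies a value, records what ends up in effect, and restores the original.

```go
package main

import (
	"fmt"
	"runtime"
)

// tryMaxProcs applies n, reports the value that ends up in effect, and then
// restores the setting that was active before the call.
func tryMaxProcs(n int) int {
	prev := runtime.GOMAXPROCS(n)
	effective := runtime.GOMAXPROCS(0)
	runtime.GOMAXPROCS(prev)
	return effective
}

func main() {
	fmt.Println("start:", runtime.GOMAXPROCS(0))
	fmt.Println("after GOMAXPROCS(-1):", tryMaxProcs(-1))     // same as start
	fmt.Println("after GOMAXPROCS(1000):", tryMaxProcs(1000)) // 1000
}
```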
Another common mistake is assuming GOMAXPROCS limits the total number of goroutines. It does not. You can run a million goroutines on two OS threads. The scheduler will time-slice them. The goroutines will just take longer to finish. The setting controls parallelism, not concurrency.
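The distinction is easy to demonstrate. In the sketch below, a hypothetical helper named runMany starts a large number of goroutines while GOMAXPROCS is set to two, and every one of them still runs to completion.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// runMany starts n goroutines with GOMAXPROCS set to procs and returns how
// many of them ran to completion. The previous setting is restored on return.
func runMany(n, procs int) int64 {
	prev := runtime.GOMAXPROCS(procs)
	defer runtime.GOMAXPROCS(prev)

	var finished atomic.Int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			finished.Add(1)
		}()
	}
	wg.Wait()
	return finished.Load()
}

func main() {
	fmt.Println(runMany(100000, 2)) // prints 100000
}
```

The goroutine count is bounded by memory, not by GOMAXPROCS; the setting only decides how many of them can make progress at the same instant.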
If you change the value after goroutines are already running, the change takes effect, but not for free. The runtime stops the world, resizes its scheduling state, and then resumes, so every goroutine pauses briefly while that happens. Afterwards the scheduler starts new threads or parks existing ones to match the new limit. This is why early initialization is the standard practice.
Go favors explicit configuration over hidden magic. The runtime package is rarely imported in application code. When you do use it, the community expects you to set values once at startup and leave them alone. The scheduler is designed to be self-tuning. Trust the defaults. Argue logic, not thread counts.
When to touch the setting
Use the default GOMAXPROCS when you are running on bare metal or a standard VM. The runtime detects the logical cores and picks a sensible limit automatically. Use runtime.GOMAXPROCS(n) when you are debugging a specific concurrency bottleneck and need to isolate thread contention. Use the GOMAXPROCS environment variable when deploying to containers whose cgroup limits the runtime might miss or misread. Use runtime.GOMAXPROCS(0) during startup logging to verify that the scheduler's configuration matches your infrastructure. When your workload is I/O bound, write plain blocking code and leave the setting alone: the scheduler already handles blocking calls efficiently.
The scheduler handles the heavy lifting. You handle the business logic.