The scraper that broke the server
You are writing a scraper that needs to hit a hundred endpoints. In Python, threads feel clunky and the GIL slows you down. In C++, spawning a thousand threads eats all your memory and the OS scheduler chokes. You switch to Go, slap a go keyword in front of your function, and suddenly you are handling ten thousand concurrent requests without breaking a sweat. The magic is not in the keyword. It is in what the Go runtime does behind the scenes to make goroutines fundamentally different from the threads you know from other languages.
The cost of concurrency
Operating system threads are heavy. The kernel schedules them, assigns them CPU time, and manages their memory. Each thread reserves a fixed stack up front, commonly on the order of 1 MB to 8 MB depending on the platform, and every switch between threads goes through the kernel. If you spawn too many threads, the system grinds to a halt because the scheduler spends more time switching between them than executing code. The OS also limits the number of threads a process can create.
Goroutines are the Go runtime's answer to this problem. They are user-space threads. The Go runtime creates a small pool of OS threads and multiplexes thousands of goroutines onto them. The runtime decides which goroutine runs on which thread. You get the illusion of massive concurrency without the cost of massive OS threads. A goroutine starts with a tiny stack, typically 2KB, which grows dynamically as needed. This is why a single machine can host hundreds of thousands of goroutines, or millions if it has the memory for them.
Goroutines are cheap. The OS threads are the constraint.
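To make the cost claim concrete, here is a minimal sketch that starts a large number of goroutines and waits for all of them. The helper name spawnAll and the figure of 100,000 are illustrative assumptions, not a benchmark; actual memory use depends on the Go version and on what each goroutine does.

```go
package main

import (
	"fmt"
	"sync"
)

// spawnAll starts n goroutines, has each one mark its own slot in a slice,
// and returns how many slots were marked once every goroutine has finished.
func spawnAll(n int) int {
	marks := make([]bool, n)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(slot int) {
			defer wg.Done()
			// Each goroutine writes a distinct element, so no lock is needed.
			marks[slot] = true
		}(i)
	}
	wg.Wait()

	count := 0
	for _, m := range marks {
		if m {
			count++
		}
	}
	return count
}

func main() {
	// 100,000 is an arbitrary illustrative figure.
	fmt.Println(spawnAll(100000)) // prints 100000
}
```

Trying the same thing with one OS thread per task would likely run into thread limits or memory pressure long before reaching that count.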
How goroutines work
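When you write go f(), the compiler emits a call into the runtime, which allocates a small stack for the new goroutine and places it on a scheduler run queue. The scheduler later picks up the goroutine and runs it on one of the available OS threads. GOMAXPROCS sets how many OS threads may execute Go code at the same time, and it defaults to the number of logical CPUs available to the process. Threads blocked in system calls do not count against that limit, so the total number of OS threads can be higher.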
The scheduler uses a work-stealing algorithm. Each scheduling context (the runtime calls it a P, and there are GOMAXPROCS of them) has a local queue of runnable goroutines, and an OS thread must hold a P to run Go code. If a P's queue runs dry, its thread steals goroutines from another P's queue. This keeps all CPUs busy and minimizes contention. If a goroutine blocks on a channel, a mutex, a timer, or network I/O, the runtime parks it and runs another goroutine on the same thread. If a goroutine blocks in a system call, its thread blocks with it, so the runtime hands the P to another thread, creating one if needed, to keep the CPU busy. The OS never sees your goroutines. It only sees the OS threads. The Go runtime handles all the complexity.
package main
import (
	"fmt"
	"runtime"
	"time"
)
// doWork simulates a task that takes time.
// It prints the current GOMAXPROCS setting alongside the goroutine ID.
func doWork(id int) {
	// runtime.GOMAXPROCS(0) reports the current setting without changing it:
	// the number of OS threads that may execute Go code simultaneously.
	procs := runtime.GOMAXPROCS(0)
	fmt.Printf("Goroutine %d running. Max procs: %d\n", id, procs)
	time.Sleep(100 * time.Millisecond)
}
func main() {
	// Set GOMAXPROCS to 1 so only one goroutine executes Go code at a time.
	// All five goroutines still make progress, which shows they are not OS threads.
	runtime.GOMAXPROCS(1)
	// Launch multiple goroutines.
	// They interleave on a single scheduling slot; a sleeping goroutine gives it up.
	for i := 0; i < 5; i++ {
		go doWork(i)
	}
	// Wait for goroutines to finish.
	// In real code, use sync.WaitGroup or channels.
	time.Sleep(500 * time.Millisecond)
}
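The parking behavior described above can be observed with a small timing sketch. It assumes a hypothetical helper, elapsedOnOneP, that pins GOMAXPROCS to 1 and starts many goroutines that each sleep. If sleeping goroutines held on to the thread, the total time would be roughly the sum of the sleeps; because the runtime parks them, it should be close to a single sleep. The exact figure varies by machine, so none is shown.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// elapsedOnOneP sets GOMAXPROCS to 1, starts n goroutines that each sleep
// for d, waits for all of them, and returns the total wall-clock time.
// The previous GOMAXPROCS value is restored before returning.
func elapsedOnOneP(n int, d time.Duration) time.Duration {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Sleeping parks the goroutine; the thread is free to run others.
			time.Sleep(d)
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	elapsed := elapsedOnOneP(50, 20*time.Millisecond)
	// Run back to back, the sleeps would add up to one second.
	fmt.Printf("50 goroutines x 20ms finished in %v\n", elapsed.Round(time.Millisecond))
}
```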
Minimal example
A goroutine runs a function concurrently with the code that launched it. The go keyword starts the goroutine and returns immediately. The main goroutine continues without waiting. If the main goroutine exits, the program terminates and all other goroutines are killed. You must coordinate goroutines using channels, sync.WaitGroup, or context cancellation.
package main
import (
	"fmt"
	"time"
)
// printMessage prints a greeting after a delay.
// The delay simulates work without blocking the main flow.
func printMessage(msg string) {
	time.Sleep(100 * time.Millisecond)
	fmt.Println(msg)
}
func main() {
	// Launch a goroutine to run printMessage concurrently.
	// The runtime allocates a small stack and schedules the function.
	go printMessage("Hello from goroutine")
	// Main continues immediately.
	// Without waiting, main exits and the program terminates.
	fmt.Println("Main is running")
	time.Sleep(200 * time.Millisecond)
}
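Sleeping in main only works if you guess the duration correctly. A minimal sketch of an explicit completion signal follows, using a channel that is closed when the goroutine returns. The helper name runAndWait is an assumption made for this illustration.

```go
package main

import "fmt"

// runAndWait starts fn in a goroutine and blocks until fn has returned.
func runAndWait(fn func()) {
	done := make(chan struct{})
	go func() {
		// Closing the channel signals completion to the waiting receiver.
		defer close(done)
		fn()
	}()
	<-done // blocks until done is closed
}

func main() {
	runAndWait(func() { fmt.Println("Hello from goroutine") })
	fmt.Println("Main exits only after the goroutine finished")
}
```

For more than one goroutine, sync.WaitGroup expresses the same idea with a counter instead of a channel per task.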
Real-world pattern: Fan-out with channels
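In a real service, you often need to fan out work and collect results. Goroutines handle the concurrency, and channels handle the communication. This follows the CSP model: do not communicate by sharing memory; instead, share memory by communicating. Channels are type-safe and synchronized. A send blocks until a receiver is ready or the buffer has room, and a receive blocks until a value is available, which coordinates goroutines automatically. A context carries the deadline or cancellation signal to every goroutine involved. The example below shows that cancellation side of the pattern with a single task running under a timeout.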
package main
import (
	"context"
	"fmt"
	"time"
)
// fetchTask simulates an async operation.
// It checks ctx.Done() to respect cancellation.
func fetchTask(ctx context.Context, id int) error {
	select {
	case <-ctx.Done():
		// Context cancelled or deadline exceeded.
		// Return early to free resources.
		return ctx.Err()
	case <-time.After(50 * time.Millisecond):
		fmt.Printf("Task %d done\n", id)
		return nil
	}
}
func main() {
	// Create a context with a timeout.
	// This limits how long the goroutine can run.
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	// Spawn a goroutine that respects the context.
	go func() {
		if err := fetchTask(ctx, 1); err != nil {
			fmt.Println("Error:", err)
		}
	}()
	// Block until the 100 ms timeout fires.
	// The task finishes first, after about 50 ms.
	<-ctx.Done()
}
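The fan-out and collect half of the pattern can be sketched as follows. The function fanOut is a hypothetical helper, and squaring a number stands in for a real I/O call. Each input gets its own goroutine, results come back on one shared channel, and a separate goroutine closes that channel once every worker has sent.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fanOut squares each input in its own goroutine and collects the results
// from a shared channel. Squaring stands in for a real I/O call.
func fanOut(inputs []int) []int {
	results := make(chan int)
	var wg sync.WaitGroup
	for _, n := range inputs {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			results <- n * n
		}(n)
	}

	// Close results once every worker has sent, so the range below ends.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []int
	for r := range results {
		out = append(out, r)
	}
	sort.Ints(out) // arrival order is nondeterministic, so sort for stable output
	return out
}

func main() {
	fmt.Println(fanOut([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```

Closing the channel from a dedicated goroutine keeps the collector loop simple: it does not need to know how many results to expect.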
Context is plumbing. Run it through every long-lived call site.
Pitfalls and runtime errors
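Goroutines are powerful, but they introduce concurrency bugs. A common one is a deadlock. If every goroutine is blocked, for example all of them waiting on a channel that nobody will ever send to, the runtime aborts the program with fatal error: all goroutines are asleep - deadlock! This is a fatal error rather than a panic, so recover cannot catch it. It typically happens when a receiver ranges over a channel that is never closed or when goroutines wait on each other in a cycle. The check fires only when all goroutines are stuck; if just some of them are, the program keeps running and the stuck ones become leaks.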
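A small sketch of the never-closed-channel case, with the fix applied. The names produce and sum are illustrative. The receiver ranges over the channel, so the producer has to close it when it is done; removing the close would leave main blocked forever and trigger the fatal error described above.

```go
package main

import "fmt"

// produce sends the values 1..n on ch and then closes it.
func produce(n int, ch chan<- int) {
	// Without this close, the range loop in sum never ends.
	defer close(ch)
	for i := 1; i <= n; i++ {
		ch <- i
	}
}

// sum receives from a producer goroutine until the channel is closed.
func sum(n int) int {
	ch := make(chan int)
	go produce(n, ch)
	total := 0
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(4)) // prints 10
}
```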
Goroutine leaks are another danger. A leak occurs when a goroutine waits forever on a channel or context that never completes. The goroutine consumes memory and keeps resources alive. Always provide a cancellation path. Use context.Context to signal cancellation. The convention is to pass ctx as the first parameter, named ctx. Functions that accept a context should check ctx.Done() and return early if cancelled.
// processItem handles a single item.
// It returns immediately if the context is cancelled.
func processItem(ctx context.Context, item string) error {
	select {
	case <-ctx.Done():
		// Cancelled. Stop processing.
		return ctx.Err()
	default:
		// Continue processing.
		return nil
	}
}
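To show the leak itself and its cancellation path, here is a minimal sketch of a producer goroutine that would block on its send forever once the consumer stops reading. The names counter and firstN are assumptions for this example. The ctx.Done() case is what lets the goroutine exit.

```go
package main

import (
	"context"
	"fmt"
)

// counter sends increasing integers on the returned channel until ctx is
// cancelled, then closes the channel and exits.
func counter(ctx context.Context) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; ; i++ {
			select {
			case out <- i:
			case <-ctx.Done():
				// Cancellation path: without it, this goroutine would block
				// on the send forever once the consumer stops reading.
				return
			}
		}
	}()
	return out
}

// firstN reads n values from counter, cancels it, and returns their sum.
// It returns only after the producer goroutine has closed its channel.
func firstN(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	ch := counter(ctx)
	sum := 0
	for i := 0; i < n; i++ {
		sum += <-ch
	}
	cancel()
	// Drain until the producer notices the cancellation and closes ch.
	for range ch {
	}
	return sum
}

func main() {
	fmt.Println(firstN(3)) // prints 3 (0 + 1 + 2)
}
```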
Data races happen when multiple goroutines access the same memory without synchronization and at least one of them writes. The compiler does not catch data races. Run the race detector with go run -race or go test -race. It instruments memory accesses at runtime and prints a WARNING: DATA RACE report with the stack traces of the conflicting accesses, so it only finds races on code paths that actually execute. Fix races by using channels, mutexes, or atomic operations.
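As an illustration of two of those fixes, the sketch below increments a shared counter from many goroutines, once guarded by a sync.Mutex and once with sync/atomic. The function names are made up for this example. Dropping the lock in the first version would produce the kind of race the detector reports.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countWithMutex increments a plain int from n goroutines, guarded by a mutex.
func countWithMutex(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			total++ // without the lock, this read-modify-write is a data race
			mu.Unlock()
		}()
	}
	wg.Wait()
	return total
}

// countWithAtomic does the same with an atomic add instead of a lock.
func countWithAtomic(n int) int64 {
	var (
		wg    sync.WaitGroup
		total int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&total, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&total)
}

func main() {
	fmt.Println(countWithMutex(1000), countWithAtomic(1000)) // prints 1000 1000
}
```

An atomic is enough for a single counter; a mutex is the usual choice once several fields have to change together.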
The worst goroutine bug is the one that never logs.
Decision matrix
Concurrency adds complexity. Choose the right tool for the job.
Use a goroutine when you have independent I/O calls that can run while others wait. Use a worker pool when you need bounded concurrency to protect a downstream service. Use runtime.LockOSThread when you must call C code that relies on thread-local storage. Use sequential code when the computation is fast and simple; concurrency adds overhead that rarely pays off for microseconds of work.
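For the bounded-concurrency case, a worker pool can be sketched like this. The function runPool and the doubling "work" are stand-ins chosen for illustration. A fixed number of workers read from one jobs channel, so no more than that many calls are in flight at once; the peak counter exists only to make that bound visible.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool processes jobs with a fixed number of worker goroutines.
// It returns the sum of the results and the highest number of jobs that
// were in flight at the same time.
func runPool(workers int, jobs []int) (total int, peak int) {
	jobCh := make(chan int)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		inFlight int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobCh {
				mu.Lock()
				inFlight++
				if inFlight > peak {
					peak = inFlight
				}
				mu.Unlock()

				result := j * 2 // stand-in for a call to the downstream service

				mu.Lock()
				total += result
				inFlight--
				mu.Unlock()
			}
		}()
	}
	for _, j := range jobs {
		jobCh <- j
	}
	close(jobCh) // lets the workers' range loops finish
	wg.Wait()
	return total, peak
}

func main() {
	total, peak := runPool(3, []int{1, 2, 3, 4, 5})
	fmt.Println(total, peak <= 3) // prints 30 true
}
```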