How to Limit the Number of Concurrent Goroutines

The traffic jam problem

You write a script to process a list of one thousand URLs. You spawn a goroutine for each one. Your CPU spikes to one hundred percent. The target server sees a sudden flood of requests and starts returning 429 Too Many Requests. Your program crashes or slows to a crawl because it is fighting itself. You need a way to say, "Only ten goroutines run at the same time. The rest wait their turn."

The semaphore pattern in plain English

Go does not ship with a dedicated semaphore type in the standard library. The idiomatic solution uses a buffered channel as a counting mechanism. Think of the channel like a parking lot with exactly ten spots. Every goroutine needs a parking spot before it can start working. When a goroutine finishes, it leaves the lot and frees up a spot for the next one. The channel capacity acts as the hard limit. Sending to the channel claims a spot. Receiving from the channel releases it.

This pattern is called a counting semaphore. It does not pass data between goroutines. It only passes control. The empty struct type struct{} is the standard token because its values occupy zero bytes. You are not moving values. You are moving permission.
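
To make the parking-lot picture concrete, here is a small sketch that claims and frees spots on a two-slot channel. The helper names tryAcquire and release are hypothetical, chosen for illustration; they are not standard library functions. The non-blocking select is used only so the example can show a full lot without hanging.

```go
package main

import (
	"fmt"
	"unsafe"
)

// tryAcquire claims a slot without blocking and reports whether it succeeded.
// Hypothetical helper for illustration only.
func tryAcquire(sem chan struct{}) bool {
	select {
	case sem <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees one slot that was claimed earlier.
func release(sem chan struct{}) {
	<-sem
}

func main() {
	sem := make(chan struct{}, 2) // a lot with two spots

	fmt.Println(tryAcquire(sem), tryAcquire(sem)) // both spots are free
	fmt.Println(tryAcquire(sem))                  // the lot is full
	fmt.Println(len(sem), cap(sem))               // spots in use, total spots

	release(sem)
	fmt.Println(tryAcquire(sem)) // a spot opened up

	// The token carries no data and takes no space.
	fmt.Println(unsafe.Sizeof(struct{}{}))
}
```

The built-in len and cap functions report how many tokens are currently held and how many exist in total, which can be handy when debugging.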

Goroutines are cheap. Channels are not magic.

A minimal semaphore

Here is the simplest implementation. The channel capacity sets the concurrency ceiling. The main goroutine blocks on the send when the limit is reached. The workers release the token when they exit.

package main

import (
	"fmt"
	"time"
)

// limitConcurrency demonstrates the buffered channel semaphore pattern.
func limitConcurrency() {
	// Capacity of 3 means only three goroutines can hold a token at once.
	sem := make(chan struct{}, 3)

	for i := 0; i < 10; i++ {
		// Block until a slot is available, then claim it.
		sem <- struct{}{}

		go func(id int) {
			// Release the slot when the goroutine exits.
			defer func() { <-sem }()

			fmt.Printf("Working on %d\n", id)
			time.Sleep(500 * time.Millisecond)
		}(i)
	}

	// Wait for the stragglers: once main holds every slot itself,
	// no worker can still be running.
	for i := 0; i < cap(sem); i++ {
		sem <- struct{}{}
	}
}

func main() {
	limitConcurrency()
}

How the runtime handles the blocking

The main goroutine loops ten times. On the first three iterations, sem <- struct{}{} succeeds immediately because the buffer has empty slots. The three worker goroutines start executing. On the fourth iteration, the channel is full. The send operation blocks. The main goroutine pauses and yields the CPU.

One of the workers finishes its sleep and runs the deferred receive. That operation frees a buffer slot. The scheduler wakes the main goroutine. It claims the newly freed slot and spawns the fourth worker. This cycle repeats until all ten workers have started. The final loop then waits for the last workers to release their slots, which guarantees the program does not exit while background work is still in flight.
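
One way to check that the cap really holds is to count how many workers are running at any moment. The sketch below is an illustration, not part of the original example: the function name peakConcurrency is made up, and it uses sync.WaitGroup for the final wait so that the counter logic stays separate from the semaphore.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// peakConcurrency runs `tasks` short jobs under a semaphore of size `limit`
// and returns the highest number of jobs observed running at once.
func peakConcurrency(tasks, limit int) int64 {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak int64

	for i := 0; i < tasks; i++ {
		sem <- struct{}{} // blocks while `limit` jobs are in flight
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }()

			n := atomic.AddInt64(&running, 1)
			// Record a new peak if this job pushed the count higher.
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			time.Sleep(10 * time.Millisecond)
			atomic.AddInt64(&running, -1)
		}()
	}

	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak:", peakConcurrency(10, 3))
}
```

Because each job increments the counter after claiming a slot and decrements it before releasing, the reported peak can never exceed the limit.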

The compiler enforces type safety on the channel. If you try to send a string to a chan struct{}, the compiler rejects this with cannot use "hello" (untyped string constant) as struct{} value in send. The token type must match exactly.

Context is plumbing. Run it through every long-lived call site.

Real-world usage: bounded network calls

Production code rarely sleeps for demonstration. It makes HTTP requests, queries databases, or reads files. You need to respect cancellation, handle errors visibly, and keep the concurrency cap intact. The context.Context parameter always goes first, conventionally named ctx. Functions that take a context should respect cancellation and deadlines.

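package main

import (
	"context"
	"fmt"
	"net/http"
)

// fetchWithLimit demonstrates applying the semaphore to HTTP requests.
func fetchWithLimit(ctx context.Context, urls []string) {
	// Limit to five concurrent network calls.
	sem := make(chan struct{}, 5)

	for _, url := range urls {
		// Claim a concurrency slot before making the request.
		sem <- struct{}{}

		go func(target string) {
			// Release the slot on every exit path, including early returns.
			defer func() { <-sem }()

			// Binding ctx to the request lets cancellation abort the call.
			req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
			if err != nil {
				fmt.Println("bad request:", err)
				return
			}

			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				fmt.Println("fetch failed:", err)
				return
			}
			defer resp.Body.Close()

			fmt.Println(target, resp.Status)
		}(url)
	}

	// Wait for in-flight requests: once every slot is claimed here,
	// no worker can still be running.
	for i := 0; i < cap(sem); i++ {
		sem <- struct{}{}
	}
}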

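The if err != nil check is verbose by design. The community accepts the boilerplate because it makes the unhappy path visible. A goroutine has no caller to return an error to, so this example logs each failure and exits, which keeps a network failure from being swallowed silently. The defer resp.Body.Close() releases the response's resources even if the code never looks at the body. For the transport to reuse the connection, the body should also be read to EOF before it is closed.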
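
The body handling can be sketched as follows. This is a minimal illustration under stated assumptions: fetchStatus and demo are hypothetical names, and a throwaway httptest server stands in for a real endpoint so the example runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchStatus issues a GET, drains the body, and returns the status code.
// Hypothetical helper for illustration only.
func fetchStatus(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	// Read the body to EOF so the transport can reuse the connection.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, err
	}
	return resp.StatusCode, nil
}

// demo starts a local test server and fetches it once.
func demo() (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	return fetchStatus(srv.Client(), srv.URL)
}

func main() {
	fmt.Println(demo())
}
```

Draining with io.Copy into io.Discard is a common choice when the caller only cares about the status code.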

Goroutine leaks happen when a goroutine blocks on a channel operation that can never complete, for example a receive from a channel that nobody will ever send on or close. Always have a cancellation path.
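
A cancellation path for the semaphore itself might look like the sketch below, which wraps the send in a select so that a cancelled context unblocks the waiter. The names acquire and demo are hypothetical, and the 50 millisecond timeout is an arbitrary value picked for the demonstration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// acquire claims a slot, or gives up when ctx is cancelled or times out.
// Hypothetical helper for illustration only.
func acquire(ctx context.Context, sem chan struct{}) error {
	select {
	case sem <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo acquires the only slot, then tries again against a short deadline.
func demo() (first, second error) {
	sem := make(chan struct{}, 1)
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	first = acquire(ctx, sem) // the only slot is free
	// Nothing releases the slot, so this call waits until the deadline.
	second = acquire(ctx, sem)
	return first, second
}

func main() {
	fmt.Println(demo())
}
```

If a slot is free and the context is already done, select picks one of the two cases at random, so callers that must not start work after cancellation should check ctx.Err() as well.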

Where things go wrong

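The semaphore pattern is simple, but it has sharp edges. The classic mistake is capturing the loop variable. If you write go func() { process(url) }() without passing url as a parameter, then before Go 1.22 every goroutine shares a single url variable and reads whatever value it holds when the goroutine finally runs, often the last element. The compiler accepts this code. The tool that flags it is go vet, with loop variable url captured by func literal. Since Go 1.22, each iteration gets its own copy of the variable in modules that declare go 1.22 or later, so the bug disappears. Passing the variable explicitly still freezes its value at spawn time and keeps the code correct on older toolchains.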
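
Passing the loop variable as an argument can be sketched like this. The function name collect is made up for the example, and the mutex and sort exist only to make the result deterministic enough to inspect.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts one goroutine per item and passes the item as an argument,
// so each goroutine works on its own copy regardless of Go version.
func collect(items []string) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)

	for _, item := range items {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			mu.Lock()
			out = append(out, s)
			mu.Unlock()
		}(item) // the value is copied here, at spawn time
	}

	wg.Wait()
	sort.Strings(out) // goroutines finish in any order
	return out
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"}))
}
```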

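Another failure mode is returning without releasing the token. A panic is not the problem here: deferred calls still run while the goroutine unwinds, so the token is released, although an unrecovered panic then crashes the whole program. The real hazard is releasing manually instead of with defer. If a conditional branch returns early and skips the release, that slot is lost for good. Lose enough slots and the channel stays full, so the main goroutine blocks forever on the next send. If every goroutine is blocked at that point, the runtime prints fatal error: all goroutines are asleep - deadlock! and terminates. If anything else is still running, such as a server loop, the program hangs without any message.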

Do not pass the URL to the goroutine as a *string. A string value is only a small header, a pointer and a length, so passing it by value is already cheap. The struct{} token is cheaper still. Keep the channel payload minimal.

Trust the scheduler. Argue logic, not formatting.

Picking the right concurrency tool

Concurrency in Go has several patterns. Each solves a different constraint. Match the tool to the problem.

Use a buffered channel semaphore when you need a simple, dependency-free way to cap concurrency for a batch of tasks. Use a worker pool with a fixed number of goroutines when tasks are long-lived or require heavy initialization like database connections. Use sync.WaitGroup alone when you only need to track completion and do not care about limiting parallelism. Use errgroup from golang.org/x/sync when you want concurrency limiting (through its SetLimit method), error propagation, and context cancellation in one package. Use sequential code when the work is inherently serial, so that each step needs the result of the previous one, or when the overhead of goroutines outweighs the benefit.
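
For comparison with the semaphore, here is a minimal worker pool sketch using only the standard library. The function name squareAll and the squaring workload are placeholders. The point is the shape: a fixed set of goroutines started once, all pulling work from a shared channel.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll processes inputs with a fixed pool of `workers` goroutines.
func squareAll(inputs []int, workers int) []int {
	jobs := make(chan int)
	results := make([]int, len(inputs))
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls indexes until the jobs channel is closed.
			// Every index is handled by exactly one worker, so the writes
			// to results never touch the same element twice.
			for i := range jobs {
				results[i] = inputs[i] * inputs[i]
			}
		}()
	}

	for i := range inputs {
		jobs <- i
	}
	close(jobs) // signals the workers to exit their loops
	wg.Wait()
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}, 2))
}
```

Unlike the semaphore version, the number of goroutines here is fixed no matter how many tasks arrive, which suits workers that hold expensive state.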

Accept interfaces, return structs. Keep your concurrency boundaries explicit.

Where to go next

To limit the number of concurrent goroutines, use a buffered channel with a fixed capacity as a ticket system. Only a specific number of tickets exist, so only that many workers can run at once. When a worker finishes, it returns its ticket, allowing another worker to start.