How to Use semaphore.Weighted for Concurrency Limiting

The bottleneck problem

You are building a data pipeline that fetches records from three different microservices. You spawn a goroutine for each request. The target servers see fifty simultaneous connections, return 429 Too Many Requests errors, and your program floods the logs with failures. You need a way to throttle the goroutines without rewriting the entire loop. You do not want to serialize everything into a single thread. You want parallelism with a hard ceiling.

A weighted semaphore solves this. It tracks capacity in abstract units instead of simple yes-or-no slots. You define a total budget. Each task declares how much of that budget it needs. The semaphore blocks each task until enough budget is free, and the task returns its share when the work finishes. The golang.org/x/sync/semaphore package provides this exact primitive.

What a weighted semaphore actually does

Think of a parking garage with a fixed number of spaces. A compact car takes one space. An RV takes three. The gate does not care about the vehicle type. It only cares that the total occupied spaces never exceed the garage limit. When a car leaves, it frees its exact number of spaces. The next vehicle in line checks if enough spaces are free. If yes, it enters. If no, it waits.

In Go, the weight represents how much of your system's budget a single operation consumes. A lightweight API call might cost one unit. A heavy database query that opens multiple connections might cost four. The semaphore tracks how much of the budget is in use and keeps a first-in-first-out queue of waiters. When Acquire is called, the semaphore checks the remaining budget. If enough is free and nobody is already waiting, it reserves the weight and lets the goroutine proceed. If not, the goroutine joins the back of the queue and blocks. When Release is called, the budget is returned and the semaphore wakes waiters from the front of the queue for as long as the next one fits.

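The implementation is small: a mutex guards the counter and a linked list of waiters, and each waiter blocks on its own channel until Release hands it capacity. A channel is allocated only when a caller actually has to wait, so the uncontended path is a lock, a comparison, and an addition. Acquire also takes a context, so a waiting goroutine can give up on cancellation or a deadline, which is essential for production services that need to shut down gracefully.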

Semaphores are capacity trackers, not message buses. Treat them as budget managers.

The minimal pattern

Here is the simplest way to wire a weighted semaphore into a loop. The example reserves capacity, runs a task, and guarantees the capacity returns.

package main

import (
	"context"
	"fmt"
	"sync"
	"time"

	"golang.org/x/sync/semaphore"
)

// Worker simulates a task that consumes weight units of the shared budget
func Worker(ctx context.Context, id int, weight int64, sem *semaphore.Weighted) {
	// Reserve the requested capacity before starting work
	if err := sem.Acquire(ctx, weight); err != nil {
		fmt.Printf("worker %d cancelled: %v\n", id, err)
		return
	}
	// Guarantee capacity returns even if the function panics
	defer sem.Release(weight)

	fmt.Printf("worker %d running with weight %d\n", id, weight)
	time.Sleep(100 * time.Millisecond)
}

func main() {
	ctx := context.Background()
	// Total capacity of 4 units
	sem := semaphore.NewWeighted(4)

	// Spawn tasks with varying resource costs
	var wg sync.WaitGroup
	for i := 1; i <= 6; i++ {
		w := int64(i%3 + 1) // weights cycle through 2, 3, 1
		wg.Add(1)
		go func(id int, weight int64) {
			defer wg.Done()
			Worker(ctx, id, weight, sem)
		}(i, w)
	}
	// Wait for every worker instead of sleeping for a guessed duration
	wg.Wait()
}

The NewWeighted call sets the hard ceiling. The Acquire call blocks until the budget is available or the context fires. The defer ensures the budget returns even if the function exits early. The weights vary between one and three, demonstrating how the semaphore tracks partial consumption of the total capacity.

Always pair Acquire with defer Release. Capacity leaks are silent killers.

Step by step execution

When the program starts, semaphore.NewWeighted(4) allocates a struct that records a capacity of four, nothing in use, and an empty wait queue. The loop spawns six goroutines. Each goroutine calls Acquire with a weight of one, two, or three; i%3 + 1 gives workers 1 through 6 the weights 2, 3, 1, 2, 3, 1.

The scheduler does not guarantee which goroutine reaches Acquire first, so treat the following as one possible run in which they arrive in loop order. Worker 1 requests two units. Four are free, so two remain and it proceeds. Worker 2 requests three units. Only two are free, so the request cannot be fulfilled; the goroutine joins the FIFO queue and blocks. Worker 3 requests one unit. One unit would fit, but a waiter is already queued, so worker 3 lines up behind it rather than jumping ahead. Workers 4, 5, and 6 queue up the same way.

Worker 1 finishes its simulated work and the deferred sem.Release(weight) runs. All four units are free again. Release walks the queue from the front: worker 2 takes three units, worker 3 takes the remaining one, and worker 4, which needs two, does not fit, so the walk stops there. Each later Release repeats the walk until all tasks complete. Stopping at the first waiter that does not fit is what keeps a large request from being starved by a stream of small ones.

Context integration happens inside Acquire. While a goroutine waits, Acquire selects on both its wake-up signal and ctx.Done(). If the context is cancelled or its deadline passes first, Acquire removes the goroutine from the queue and returns ctx.Err(), which is context.DeadlineExceeded for a deadline and context.Canceled for an explicit cancel. A waiter that gave up was never granted capacity, so there is nothing to return and you must not call Release for it. This prevents goroutines from hanging forever if the semaphore is misconfigured or if some other task stops releasing capacity.

The convention in Go is to pass context.Context as the first parameter to any function that might block. The semaphore package follows this rule. Your wrapper functions should too.

Track the budget, not the goroutines. Let the queue handle the waiting.

Real world usage: API fan-out

Production code rarely calls Acquire directly in a loop. It wraps the semaphore in a service layer that handles retries, logging, and error aggregation. Here is a realistic pattern for fanning out HTTP requests while capping how many are in flight against a downstream service. Keep in mind that a semaphore bounds concurrency, not requests per second.

package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"sync"

	"golang.org/x/sync/semaphore"
)

// Fetcher handles throttled HTTP requests
type Fetcher struct {
	sem *semaphore.Weighted
}

// NewFetcher creates a client with a hard concurrency ceiling
func NewFetcher(maxWeight int64) *Fetcher {
	return &Fetcher{sem: semaphore.NewWeighted(maxWeight)}
}

// Get performs a throttled request and returns the response body
func (f *Fetcher) Get(ctx context.Context, url string, weight int64) ([]byte, error) {
	// Block until capacity is available or context is cancelled
	if err := f.sem.Acquire(ctx, weight); err != nil {
		return nil, fmt.Errorf("acquire capacity: %w", err)
	}
	defer f.sem.Release(weight)

	// Build the request with the inherited context
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, fmt.Errorf("execute request: %w", err)
	}
	defer resp.Body.Close()

	// Surface throttling and other non-2xx responses as errors
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}

	// Read the whole body; a single Read call may return only part of it
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("read body: %w", err)
	}
	return body, nil
}

func main() {
	ctx := context.Background()
	fetcher := NewFetcher(5)

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			_, err := fetcher.Get(ctx, fmt.Sprintf("https://httpbin.org/delay/%d", id%3+1), 2)
			if err != nil {
				fmt.Printf("request %d failed: %v\n", id, err)
			}
		}(i)
	}
	wg.Wait()
}

The Fetcher struct holds the semaphore. The Get method acquires capacity, performs the HTTP call, and releases capacity. The defer ensures the release happens even if http.DefaultClient.Do panics. The context flows from the caller through Acquire and into http.NewRequestWithContext. If the caller cancels the context, both the semaphore wait and the HTTP request terminate cleanly.

The receiver name is f, matching the type Fetcher. Go convention favors one or two letter receiver names that mirror the type initial. The error wrapping uses fmt.Errorf with %w, which preserves the original error chain for errors.Is and errors.As checks downstream.

Throttle at the edge, not in the middle. Let the semaphore guard the boundary.

Where things go wrong

Weighted semaphores are simple, but they expose a few sharp edges. The most common mistake is forgetting to release capacity. If a goroutine acquires weight and returns early without calling Release, that weight never comes back. Each leak shrinks the usable budget until every new Acquire waits forever. The compiler will not catch this. You will only notice when your service stops responding under load. Always use defer sem.Release(weight) immediately after a successful Acquire.

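Do not count on the package to validate weights for you. Acquire does not reject a negative weight; it silently inflates the available budget, and the skewed accounting can later trip the panic semaphore: released more than held, which Release raises when more is returned than is currently held. A weight larger than the total capacity is just as dangerous: it can never be granted, so Acquire blocks until the context is done, which is forever with context.Background(). Both usually happen when you calculate weight dynamically from user input or misconfigured flags. Validate the weight before calling Acquire, or clamp it to a minimum of one and a maximum of the semaphore's capacity.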
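One way to guard the call site is a small clamping helper. This is a sketch, not part of the semaphore package: the name clampWeight is hypothetical, and capping at the capacity is an added assumption based on the fact that a request larger than the whole budget can never be granted. It assumes capacity is at least one.

```go
package main

import "fmt"

// clampWeight forces w into the range [1, capacity].
func clampWeight(w, capacity int64) int64 {
	if w < 1 {
		return 1
	}
	if w > capacity {
		return capacity
	}
	return w
}

func main() {
	const capacity = 4
	fmt.Println(
		clampWeight(-3, capacity), // below the minimum
		clampWeight(2, capacity),  // already valid
		clampWeight(9, capacity),  // above the total budget
	) // 1 2 4
}
```

Clamping keeps the program running but hides the bad input. Where a wrong weight indicates a configuration error, returning an error instead may be the better choice.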

Another trap is mixing Acquire with unbuffered channels. A worker that sends its result on an unbuffered channel while still holding capacity does not release until a receiver takes the value. If the receiver only starts reading after every worker has finished, or is itself waiting on the semaphore, nothing moves and the program deadlocks. Keep the semaphore isolated from message passing. Use it strictly for capacity control. Hand results off in a way that cannot block while capacity is held: a channel buffered for every result, a slice with one slot per task, or a send that happens after Release.

Context cancellation is often misunderstood. If you pass a context with a deadline to Acquire, the goroutine backs out of the queue when the deadline fires. No capacity was granted, and the function returns an error. If you ignore that error and proceed to do work anyway, you have bypassed the throttle, and a deferred Release would hand back capacity you never held. Always check the error from Acquire and return immediately if it is not nil.

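The compiler catches the easy mistakes. A missing import fails with undefined: semaphore. An unused one fails with "golang.org/x/sync/semaphore" imported and not used. Passing an int where an int64 is expected fails with a message along the lines of cannot use x (variable of type int) as int64 value in argument to sem.Acquire; the exact wording varies by Go version. These are straightforward. The runtime failures are the ones that matter.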

Guard the release. Validate the weight. Respect the context.

Picking the right concurrency tool

Go provides several ways to limit concurrency. The right choice depends on how your tasks consume resources and how strict your boundaries need to be.

Use semaphore.Weighted when tasks consume different amounts of a shared resource and you need fine-grained budget control. Use a buffered channel as a semaphore when every task costs exactly one unit and you want zero external dependencies; the buffer size is the limit, and an unbuffered channel would admit nothing. Use a worker pool with a fixed goroutine count when you need strict concurrency bounds but do not need dynamic weight allocation. Use sequential execution when the bottleneck is CPU-bound and parallelism adds more overhead than it saves.
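For comparison, the buffered-channel option needs nothing outside the standard library. The sketch below is illustrative: runLimited and its peak counter exist only to show that the buffer size acts as the ceiling.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLimited runs n unit-cost tasks with at most limit in flight and
// returns the peak number of tasks observed running at once.
func runLimited(n, limit int) int64 {
	slots := make(chan struct{}, limit) // buffer size is the concurrency ceiling
	var running, peak atomic.Int64
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			slots <- struct{}{}        // acquire: blocks while the buffer is full
			defer func() { <-slots }() // release: frees one slot

			// Count this task only while it holds a slot.
			now := running.Add(1)
			for {
				old := peak.Load()
				if now <= old || peak.CompareAndSwap(old, now) {
					break
				}
			}
			time.Sleep(5 * time.Millisecond)
			running.Add(-1)
		}()
	}
	wg.Wait()
	return peak.Load()
}

func main() {
	fmt.Printf("peak concurrency: %d (limit 3)\n", runLimited(20, 3))
}
```

This works well for uniform tasks. It has no natural way to express a task that costs three slots, and a blocked send ignores context cancellation unless you wrap it in a select, which is where semaphore.Weighted earns its dependency.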

The weighted semaphore shines when your workload is heterogeneous. It collapses complex throttling logic into a single budget counter. It integrates with context cancellation out of the box. It avoids the boilerplate of managing channel buffers and select statements for throttle control.

Pick the tool that matches your resource model. Do not overengineer the throttle.

Where to go next

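Think of a weighted semaphore as a ticket system where different tasks need different numbers of tickets to run. You set a total limit on available tickets, and each task grabs the specific amount it needs before starting and hands it back when it finishes. This prevents too many heavy tasks from running at once while letting several light tasks share the capacity a heavy one would occupy. Because waiters are served in order, a small task never jumps ahead of a large one that is already queued.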