The thundering herd problem
Your API receives three requests for the same user profile within a fifty-millisecond window. The cache is cold. Without protection, your server fires three identical database queries. The database groans. Response times spike. This is the thundering herd problem, and it happens constantly in production when caches miss or external services slow down.
singleflight solves this by ensuring only one execution of a function runs for a specific key. Other concurrent requests for that same key wait and share the result. Think of a classroom where several students realize they have the same question. Only one student raises their hand and walks to the whiteboard to solve it. The others stay seated and wait. When the first student finishes, everyone gets the answer. No one else walks to the whiteboard.
The package lives in golang.org/x/sync/singleflight. It is not part of the standard library, but the Go team maintains it and production systems rely on it heavily. The core type is Group. Each group tracks active flights by key. When you call Do, you pass a string key and a function. The group checks if a flight is already active for that key. If yes, it queues the caller. If no, it launches the function and marks the key as active.
How it actually works
singleflight does not cache results. It only deduplicates calls that overlap in time. Once the function returns, the result is handed to every waiter and then discarded. The next time someone calls Do with that key, the group treats it as a fresh request. This design keeps memory usage predictable. You do not need to worry about stale data or eviction policies. You only protect against simultaneous execution.
Under the hood, the group uses a mutex and a map to track active flights. When the first caller arrives, it locks the mutex, checks the map for the key, finds nothing, inserts a call record that holds a wait group, unlocks, and runs the function. Subsequent callers lock the mutex, find the key, bump a duplicate counter on the record, unlock, and block on the wait group. When the function returns, the group removes the key from the map and releases the wait group, and every waiter reads the stored result. The entire lifecycle happens in memory with minimal allocation. The zero value of Group is ready to use and cheap to create, but you typically declare it at package level so all handlers share the same deduplication namespace.
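The mechanism described above can be sketched with nothing but the standard library. This is a simplified illustration, not the package's actual source: miniGroup and call are hypothetical names, and the sketch leaves out panic handling and the channel-based API.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// call holds the state of one in-flight execution.
type call struct {
	wg   sync.WaitGroup
	val  any
	err  error
	dups int // number of callers that joined this flight
}

// miniGroup is a hypothetical, simplified stand-in for singleflight.Group.
type miniGroup struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn once per overlapping set of callers that use the same key.
func (g *miniGroup) Do(key string, fn func() (any, error)) (any, error, bool) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		// A flight is active: record the duplicate and wait for it.
		c.dups++
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err, true
	}
	// No flight yet: register one and run the function ourselves.
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()

	g.mu.Lock()
	delete(g.m, key) // the next caller starts a fresh flight
	g.mu.Unlock()
	c.wg.Done() // release every waiter

	return c.val, c.err, c.dups > 0
}

func main() {
	var g miniGroup
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			v, _, shared := g.Do("user:42", func() (any, error) {
				time.Sleep(50 * time.Millisecond)
				return "profile_data", nil
			})
			fmt.Println(v, shared)
		}()
	}
	wg.Wait()
}
```

The map entry lives only as long as the flight, which is why the group never grows with the number of distinct keys you have seen.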
Minimal example
Here is the simplest way to wire it up. You declare a package-level Group, then call Do with a key and a function that returns any and error.
package main

import (
	"fmt"
	"time"

	"golang.org/x/sync/singleflight"
)

// group tracks concurrent flights across the package
var group singleflight.Group

func main() {
	// Key identifies the logical operation. Same key means shared result.
	key := "user:42"

	// Do blocks until the function finishes. It returns the value, error, and a shared flag.
	result, err, shared := group.Do(key, func() (any, error) {
		// Simulate a slow external call that takes real time
		time.Sleep(100 * time.Millisecond)
		return "profile_data", nil
	})

	// Check for errors before using the result
	if err != nil {
		fmt.Println("failed:", err)
		return
	}

	// shared reports whether the result was delivered to more than one caller.
	// With a single call, as here, it is false.
	fmt.Printf("got %v, was shared: %v\n", result, shared)
}
Walking through the runtime
The Do method returns three values. The first is the result, typed as any. You will need to type-assert it back to your expected type. The second is the error. The third is a boolean called shared. It reports whether the result was delivered to more than one caller. A caller that joined an in-flight call always sees shared as true. The goroutine that actually executed the function also sees true if at least one other caller joined while it was running, and false if it ran alone. So shared tells you that deduplication happened, not who did the work.
When you run the minimal example, the main goroutine calls Do. The group sees no active flight for "user:42", so it runs the closure. The closure sleeps for one hundred milliseconds, then returns. shared comes back false because no other caller joined the flight. If you spawn five goroutines that all call Do with the same key while one flight is in progress, the closure runs once. The four waiters block, receive the result, and print shared: true. So does the goroutine that ran the closure, because its result was handed to others. They all finish in roughly one hundred milliseconds, and the slow call executes once instead of five times.
Within a single flight this behavior is predictable. The group guarantees that the function runs exactly once per flight, although which caller ends up running it depends on scheduling. It also guarantees that all waiters receive the same return values. That means the same map or pointer, not a copy, so treat shared results as read-only. If the function returns an error, every waiter gets that error. Panics are not converted into errors. In current versions of the package, if the function panics, the group recovers just long enough to clean up its bookkeeping and then panics again in the goroutine that ran the function and in every goroutine waiting in Do. Unless you recover, the program crashes, so either keep panic-prone code out of the closure or recover inside it and return an error.
Realistic HTTP handler
Production code rarely deals with hardcoded strings. You usually wrap external API calls or database queries. Here is how you structure a handler that fetches configuration data while protecting the downstream service from duplicate requests.
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"

	"golang.org/x/sync/singleflight"
)

// configGroup deduplicates concurrent config fetches
var configGroup singleflight.Group

// fetchConfig retrieves settings from a slow external service.
func fetchConfig(ctx context.Context) (map[string]string, error) {
	// Timeout caps the whole request. ctx carries the caller's cancellation and deadline.
	client := &http.Client{Timeout: 2 * time.Second}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://api.example.com/config", nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// Placeholder: a real implementation would decode resp.Body into the map here
	return map[string]string{"theme": "dark", "lang": "en"}, nil
}

// HandleConfig serves the configuration endpoint.
func HandleConfig(w http.ResponseWriter, r *http.Request) {
	key := "global:config:v1"

	// The closure captures the context of whichever request starts the flight.
	// singleflight itself never cancels a running flight.
	val, err, _ := configGroup.Do(key, func() (any, error) {
		return fetchConfig(r.Context())
	})

	// Handle errors before type assertion
	if err != nil {
		http.Error(w, "config fetch failed", http.StatusBadGateway)
		return
	}

	// Type assert the any back to the expected map without panicking on a mismatch
	cfg, ok := val.(map[string]string)
	if !ok {
		http.Error(w, "unexpected config type", http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "theme: %s, lang: %s", cfg["theme"], cfg["lang"])
}

// main wires the handler to a route. The path and port are illustrative.
func main() {
	http.HandleFunc("/config", HandleConfig)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Pitfalls and sharp edges
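singleflight is straightforward, but it has sharp edges. The biggest one is context cancellation. Do takes no context, and singleflight never cancels a running flight when a waiter's context expires. If you pass a request context into the function, the function must check ctx.Done() or rely on the underlying client's timeout. If you forget to respect the context, the flight will run to completion even if the original HTTP request was aborted, and the database query or API call will still execute. A caller blocked in Do also keeps waiting until the flight finishes, no matter what happens to its own context. To let a waiter give up early, use DoChan and select on the result channel and ctx.Done(). The reverse also bites: the closure captures the context of whichever caller started the flight, so if that one request is canceled, every waiter receives its cancellation error. You must design your closure to honor deadlines, and decide whose deadline the shared work should follow.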
Another trap is key design. The key must uniquely identify the logical operation. If you use a generic key like "fetch" for different user IDs, you will accidentally share results across unrelated requests. The compiler will not catch this. You will silently serve one user's data to another. Always include the distinguishing parameters in the key, like "user:42" or "cache:product:sku-883". Treat the key like a database primary key.
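A small helper per operation keeps key construction in one place. The function names below are hypothetical, and the key formats simply follow the examples above.

```go
package main

import (
	"fmt"
	"strconv"
)

// userProfileKey builds the singleflight key for one user's profile.
// Including the ID keeps different users on different flights.
func userProfileKey(userID int64) string {
	return "user:" + strconv.FormatInt(userID, 10)
}

// productKey builds the singleflight key for one product lookup.
func productKey(sku string) string {
	return "cache:product:" + sku
}

func main() {
	fmt.Println(userProfileKey(42))    // prints "user:42"
	fmt.Println(productKey("sku-883")) // prints "cache:product:sku-883"
}
```

Giving each operation its own prefix also stops two unrelated lookups from colliding when they share one group.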
Type assertion errors are common when you forget that Do returns any. If you try to use the result without asserting, the compiler rejects the program with an error along the lines of invalid operation: cannot index val (variable of type interface{}). Assert right after you have checked the error. A single-value assertion that fails at runtime panics. Use the comma-ok idiom to handle it gracefully: cfg, ok := val.(map[string]string); if !ok { ... }.
Go developers expect if err != nil { return err } boilerplate because it makes the unhappy path visible. The community accepts the verbosity. When working with singleflight, check the error before touching the any return value. The convention keeps your code readable and prevents a panic from asserting on a nil value. Trust the boilerplate. It pays off when debugging production outages.
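The comma-ok idiom fits naturally into a small conversion helper. In this sketch, asConfig is a hypothetical name. It turns a mismatched type into an ordinary error instead of a panic.

```go
package main

import "fmt"

// asConfig converts the untyped value returned by Do into the expected map.
// It returns an error instead of panicking when the type does not match.
func asConfig(val any) (map[string]string, error) {
	cfg, ok := val.(map[string]string)
	if !ok {
		return nil, fmt.Errorf("unexpected config type %T", val)
	}
	return cfg, nil
}

func main() {
	cfg, err := asConfig(map[string]string{"theme": "dark"})
	fmt.Println(cfg["theme"], err) // prints "dark <nil>"

	_, err = asConfig("not a map")
	fmt.Println(err) // prints "unexpected config type string"
}
```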
Context always goes as the first parameter, conventionally named ctx. Functions that take a context should respect cancellation and deadlines. The singleflight closure captures the context from the outer scope, but the group itself does not monitor it. You are responsible for wiring the timeout logic into the actual network or database call.
When to reach for singleflight
You have several tools for controlling concurrency and caching. Pick the right one based on your exact constraint.
Use singleflight when you need to deduplicate concurrent calls for the same logical operation and you want to protect a downstream service from thundering herds. Use sync.Once when you need to run a function exactly once for the lifetime of the program, such as initializing a global logger or starting a background worker. Use sync.Map or a standard map with a sync.RWMutex when you need to cache results permanently or for a long duration, not just during concurrent overlap. Use errgroup when you want to fan out multiple independent tasks and wait for all of them to finish or fail fast on the first error. Use plain sequential code when you do not need concurrency. The simplest thing that works is usually the right thing.
Deduplication is a safety valve, not a cache. Design your keys carefully and let the group handle the waiting.