How to Build a Caching Proxy in Go

The dashboard that broke the API

You built a dashboard that polls /api/stats every second. The upstream server starts returning 503 errors after a minute. The data doesn't change that fast. You need a buffer between your client and the upstream. A caching proxy sits in the middle, remembers responses, and serves them instantly without bothering the origin.

The middleman with a sticky note

A caching proxy is a middleman with a sticky note pad. When a request comes in, it checks the pad. If the answer is written down, it hands it back immediately. If not, it runs to the upstream server, gets the answer, writes it on the pad, and gives it to the client. Next time, it's instant.

In Go, this means an HTTP server that intercepts requests, checks a map, and only forwards what it doesn't have. The map lives in memory. Access is fast. The trade-off is that the cache disappears when the process restarts, and the cache grows until you evict entries. For many internal tools, that trade-off is acceptable.

Minimal proxy

Here's the skeleton: a handler that checks a sync.Map before calling the upstream.

package main

import (
	"io"
	"log"
	"net/http"
	"sync"
)

// cache holds response bodies keyed by request path.
// sync.Map is safe for concurrent use without explicit locks.
var cache sync.Map

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", handler)
	// Listen on port 8080. ListenAndServe only returns on failure, so log the error and exit.
	log.Fatal(http.ListenAndServe(":8080", mux))
}

// handler serves cached responses or fetches from upstream.
func handler(w http.ResponseWriter, r *http.Request) {
	// Load returns the value and a boolean indicating presence.
	if val, ok := cache.Load(r.URL.Path); ok {
		// Type assertion to retrieve the stored bytes.
		body := val.([]byte)
		w.Write(body)
		return
	}

	// Cache miss: proxy the request to the upstream server.
	resp, err := http.Get("http://example.com" + r.URL.Path)
	if err != nil {
		http.Error(w, "upstream error", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Read the entire body to cache it.
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		http.Error(w, "read error", http.StatusInternalServerError)
		return
	}

	// Store the body in the cache.
	cache.Store(r.URL.Path, body)
	w.Write(body)
}

How the cache flows

When the server starts, cache is empty. The first request for /data hits handler. cache.Load returns nil and false. The code calls http.Get. The upstream responds. io.ReadAll pulls the body into memory. cache.Store saves it. The response goes to the client.

The second request for /data hits handler. cache.Load finds the key. It returns the bytes. The code writes them and returns. No network call.

There's a trap in this flow. You cannot cache an http.Response object directly. The Body field is a stream. Once you read it, the stream is exhausted. If you store the response and try to read the body again later, you get nothing. You must read the body into a byte slice before storing it. The code above does this with io.ReadAll.

sync.Map is specialized. It trades write performance for read performance. It uses a read map and a dirty map internally. Reads are lock-free for most cases. Writes are slower. For a cache, reads dominate. This is the right tool. A plain map with a sync.Mutex would lock on every read, which adds latency under load.

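Go makes errors ordinary return values, so the if err != nil pattern appears everywhere. It is verbose by design, and the community accepts the boilerplate because it keeps the unhappy path visible. The compiler helps only partly: it rejects a variable such as err that is declared and never used, but it does not stop you from discarding return values altogether. The handler above does exactly that with w.Write(body), and the program still compiles. Checking those errors is up to you or a linter.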

Caching is fast. Network is slow. Check the map first.

Real-world handler

Here's a more realistic handler: it caches status codes and headers, uses a client with a timeout, and respects context cancellation.

// This version of the handler needs these imports (plus any that main uses).
import (
	"context"
	"io"
	"net/http"
	"sync"
	"time"
)

// entry holds the full response state for caching.
type entry struct {
	status int
	header http.Header
	body   []byte
}

// cache stores *entry values keyed by path.
var cache sync.Map

// client enforces an overall timeout so the proxy doesn't hang.
var client = &http.Client{Timeout: 5 * time.Second}

// handler serves cached entries and forwards misses to the upstream.
func handler(w http.ResponseWriter, r *http.Request) {
	// Only GET is handled: the cache key is the path alone,
	// and no request body is forwarded upstream.
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}

	if val, ok := cache.Load(r.URL.Path); ok {
		e := val.(*entry)
		copyHeaders(w.Header(), e.header)
		w.WriteHeader(e.status)
		w.Write(e.body)
		return
	}

	// Derive from the client's context so a disconnect cancels the upstream call.
	ctx, cancel := context.WithTimeout(r.Context(), 3*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://example.com"+r.URL.Path, nil)
	if err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	resp, err := client.Do(req)
	if err != nil {
		http.Error(w, "upstream error", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Read the whole body so it can be stored and replayed.
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		http.Error(w, "read error", http.StatusInternalServerError)
		return
	}

	cache.Store(r.URL.Path, &entry{
		status: resp.StatusCode,
		header: resp.Header,
		body:   body,
	})

	copyHeaders(w.Header(), resp.Header)
	w.WriteHeader(resp.StatusCode)
	w.Write(body)
}

// copyHeaders copies headers from src to dst.
// The value slices are shared with src, not duplicated.
func copyHeaders(dst, src http.Header) {
	for k, v := range src {
		dst[k] = v
	}
}

The realistic handler adds three layers of safety. First, it stores the status code and headers. A proxy that caches only the body replays every hit as a 200 with a sniffed Content-Type, so a cached 404 or JSON response reaches the client looking like something else. The client needs the full response.

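Second, it uses a custom http.Client with a timeout. http.Get uses http.DefaultClient, which has no timeout. If the upstream hangs, the goroutine serving that request hangs with it. net/http starts a goroutine per connection rather than drawing from a fixed pool, so stuck requests pile up, each holding memory and a socket, until the process runs out of file descriptors or memory. A timeout aborts the upstream call and returns an error instead.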

Third, it uses context.WithTimeout. A context carries deadlines and cancellation signals. The handler derives its context from r.Context(), so the client's cancellation propagates to the upstream: if the client disconnects, the upstream request is cancelled, which prevents wasted work. The 3-second deadline is shorter than the client's 5-second Timeout, so in practice the context fires first and the Timeout acts as a backstop. Functions that take a context should respect cancellation and deadlines. By convention, context.Context is the first parameter and is named ctx.

The helper copyHeaders loops over the source headers and assigns them to the destination. http.Header is a map[string][]string. ResponseWriter exposes its header map only through the Header() method and offers no way to swap in a different map, so you cannot assign the cached map wholesale. You must copy the entries into the map that w.Header() returns.

Context is plumbing. Run it through every long-lived call site.

Pitfalls and panics

sync.Map stores interface{}. You must type assert when loading. If you assert the wrong type, the program panics at runtime. The compiler cannot check this because the type is erased when stored.

If you store []byte and load as string, the program crashes with panic: interface conversion: interface {} is []uint8, not string (the runtime prints []byte as []uint8). Always store and load the same type. Use a struct like entry to group related data and avoid mismatched types.

The cache grows forever. Every unique path adds an entry. If the upstream serves dynamic content, the cache fills with useless data. Memory usage climbs until the process gets killed by the OOM killer. You need an eviction strategy. A simple approach is to limit the cache size and delete random entries when full. A better approach is to use a library that supports LRU eviction or TTL expiration.

Forgetting to close the response body leaks connections. The http.Client reuses TCP connections via a pool. If you don't close the body, the connection never returns to the pool, so each new request opens another one. The default transport places no cap on connections per host, so nothing blocks; the proxy leaks sockets and goroutines until it runs out of file descriptors. Always defer resp.Body.Close().

The worst goroutine bug is the one that never logs.

When to cache

Use a caching proxy when the upstream is slow or rate-limited and the data changes infrequently.

Use sync.Map for the cache when you have high concurrent reads and occasional writes, avoiding the overhead of a mutex lock on every access.

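Use a map with sync.RWMutex when you need a consistent view of the whole cache, such as counting entries or evicting in bulk. sync.Map does offer a Range method, but Range does not necessarily correspond to a consistent snapshot of the map, and there is no Len method.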

Use an external cache like Redis when the cache must survive proxy restarts, or when it would outgrow available memory.

Use plain forwarding without caching when the data is dynamic per-request, like user-specific dashboards or real-time stock prices.
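For the sync.RWMutex option in the list above, a minimal sketch might look like the following. The type lockedCache and its methods are hypothetical names, not part of the proxy code shown earlier.

```go
package main

import (
	"fmt"
	"sync"
)

// lockedCache is a hypothetical plain map guarded by a sync.RWMutex.
type lockedCache struct {
	mu    sync.RWMutex
	items map[string][]byte
}

func newLockedCache() *lockedCache {
	return &lockedCache{items: make(map[string][]byte)}
}

// Get takes the read lock, so concurrent readers do not block each other.
func (c *lockedCache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	body, ok := c.items[key]
	return body, ok
}

// Set takes the write lock, which excludes readers and other writers.
func (c *lockedCache) Set(key string, body []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = body
}

// Len reports the entry count as of the moment the read lock is held.
func (c *lockedCache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.items)
}

// Clear drops every entry in one step under the write lock.
func (c *lockedCache) Clear() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items = make(map[string][]byte)
}

func main() {
	c := newLockedCache()
	c.Set("/a", []byte("1"))
	c.Set("/b", []byte("2"))
	fmt.Println("entries:", c.Len())
	c.Clear()
	fmt.Println("entries after clear:", c.Len())
}
```

Because every operation goes through the same lock, Len and Clear see a consistent state of the map, at the cost of writers briefly blocking readers.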

Where to go next

A caching proxy in Go is a middleman server that sits between your clients and the upstream. On each request it checks an in-memory map: a hit is served immediately, and a miss is fetched from the upstream, stored, and returned. Repeated requests get faster because the upstream is contacted only once per path. From here, the natural next steps are the gaps named above: add TTL or LRU eviction so the cache stays bounded, and move to an external store such as Redis if the cache has to survive restarts.