How to Implement Compression Middleware (gzip) in Go

The bandwidth bottleneck

You ship a new API endpoint. It returns a fifty kilobyte JSON payload. On a fast local network, the response takes two milliseconds. On a throttled mobile connection, it takes four seconds. The client already told your server it can handle compressed data. The Accept-Encoding header is right there in the request. Your server ignores it and sends the raw bytes anyway. Every megabyte you skip compressing is money leaving your pocket and latency entering your user experience.

How the compression handshake works

HTTP compression is a negotiated exchange. The client announces which algorithms it supports in the Accept-Encoding header. The server picks one, compresses the response body on the fly, and announces the chosen algorithm in the Content-Encoding header. The browser or HTTP client decompresses it before rendering or parsing. Go makes this straightforward because the standard library already ships with compress/gzip and a flexible http.ResponseWriter interface. You do not need a third-party router or a heavy framework. You just need to wrap the response writer in a compression stream before handing it to your handler.

The http.ResponseWriter interface defines three methods: Header(), Write(), and WriteHeader(). Middleware works by intercepting the w parameter and swapping it with a custom writer that implements the same interface. When your handler calls w.Write(data), the call routes through your wrapper first. The wrapper compresses the bytes and forwards them to the original network socket. The application logic never knows the difference.

Gzip is a streaming format. It does not require the entire payload in memory before it starts writing. The compressor maintains a sliding window of recent bytes, finds repeating patterns, and emits a smaller representation. This means your server can start sending compressed data to the client while your database query or template engine is still generating the rest of the response. Memory usage stays flat regardless of payload size.

Compression middleware belongs near the edge of your request pipeline. Place it after authentication and routing, but before your business logic generates the payload. This ordering ensures you only compress responses that actually reach the client, and you avoid wasting CPU on requests that get rejected early.

Wrap the writer early. Let the stream do the work.

The minimal wrapper

Here is the simplest working wrapper. It checks the request header, swaps the response writer for a gzip stream, and delegates the rest to your actual handler.

package main

import (
	"compress/gzip"
	"net/http"
	"strings"
)

// GzipMiddleware wraps an http.Handler to compress responses when the client supports it.
func GzipMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Skip compression if the client did not advertise gzip support.
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			next.ServeHTTP(w, r)
			return
		}

		// Tell the client the response body will be gzip encoded.
		w.Header().Set("Content-Encoding", "gzip")
		// Remove Content-Length because compression changes the byte count.
		w.Header().Del("Content-Length")

		// Create a gzip writer that streams directly into the original response writer.
		gz := gzip.NewWriter(w)
		// Ensure the stream flushes and closes before the request finishes.
		defer gz.Close()

		// Hand the compressed stream to the downstream handler.
		next.ServeHTTP(gz, r)
	})
}

Walking through the request lifecycle

The wrapper intercepts the request before your business logic runs. It reads Accept-Encoding and looks for the substring gzip. If the header is missing or only lists br or deflate, the middleware steps aside and calls next.ServeHTTP(w, r) with the original writer. The request proceeds uncompressed.

When gzip is supported, the middleware sets Content-Encoding to gzip. It also deletes Content-Length. You cannot know the final compressed size until the stream finishes, so leaving the original length header would cause HTTP clients to truncate or hang. The gzip.NewWriter(w) call creates a streaming compressor. It does not buffer the entire response in memory. It writes chunks to the underlying network socket as your handler generates them. The defer gz.Close() call is mandatory. Without it, the gzip trailer never gets written, and the client receives a truncated archive that fails to decompress.

Go conventions favor explicit error handling, but gzip.Writer methods return errors that are rarely actionable in middleware. The underlying http.ResponseWriter handles network drops at the transport layer. If you try to capture the error from gz.Write, you will mostly catch closed-connection errors that the HTTP server already manages. The community accepts this silence because the compression layer is transparent to the application logic.

The strings.Contains check is intentionally simple. HTTP headers can contain multiple values separated by commas, like Accept-Encoding: gzip, deflate, br. The substring match covers all valid placements without parsing the full header syntax. If you need strict RFC compliance, you can split the header and iterate, but the substring approach handles ninety-nine percent of real-world traffic correctly.

Headers travel first. Payload follows. Keep the order strict.

Production-ready compression

Production handlers often need to flush partial responses, stream events, or set custom headers after writing begins. The basic wrapper breaks if your handler calls Flush() or expects a http.ResponseWriter that supports streaming. Here is a version that preserves the flusher interface and handles panic recovery.

package main

import (
	"compress/gzip"
	"net/http"
	"strings"
)

// GzipMiddleware wraps an http.Handler with gzip compression and panic recovery.
func GzipMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			next.ServeHTTP(w, r)
			return
		}

		w.Header().Set("Content-Encoding", "gzip")
		w.Header().Del("Content-Length")

		// Wrap the response writer to preserve the Flusher interface.
		gz := gzip.NewWriter(w)
		defer func() {
			// Recover from panics to ensure the gzip stream closes cleanly.
			if err := recover(); err != nil {
				gz.Close()
				panic(err)
			}
			gz.Close()
		}()

		// Create a custom writer that delegates Flush to the underlying response.
		flusher := &gzipFlusher{Writer: gz, ResponseWriter: w}
		next.ServeHTTP(flusher, r)
	})
}

// gzipFlusher combines gzip.Writer with http.Flusher support.
type gzipFlusher struct {
	*gzip.Writer
	http.ResponseWriter
}

// Flush writes buffered data to the network and resets the gzip buffer.
func (f *gzipFlusher) Flush() {
	f.Writer.Flush()
	if fl, ok := f.ResponseWriter.(http.Flusher); ok {
		fl.Flush()
	}
}

The gzipFlusher struct embeds both *gzip.Writer and http.ResponseWriter. Go's method promotion automatically routes Write() and Header() calls to the correct underlying type. The custom Flush() method bridges the gap. It pushes the gzip buffer to the network, then checks if the original response writer supports flushing. If it does, the method calls it. This keeps Server-Sent Events and chunked responses working correctly.

Panic recovery sits inside a deferred closure. Standard defer statements run even when a panic occurs, but the panic propagates upward. The closure captures the panic, forces the gzip stream to close, and re-panics. This guarantees the trailer gets written before the HTTP server catches the panic and returns a 500 status.

Go conventions dictate that middleware should be lightweight and composable. Do not add logging, metrics, or authentication inside the compression wrapper. Keep concerns separate. Chain middleware in the order that matches the request lifecycle. If you need to pass request-scoped data through the chain, use context.Context as the first parameter in your handler functions, conventionally named ctx. The compression layer does not touch the context. It only transforms the response stream.

Stream responsibly. Flush when the client waits.

Pitfalls and compiler traps

Forgetting to import strings or compress/gzip triggers immediate compiler rejections. The compiler complains with undefined: strings or undefined: gzip if the import list is incomplete. If you accidentally pass the wrong type to next.ServeHTTP, you get cannot use gz (variable of type *gzip.Writer) as http.ResponseWriter value in argument. The type system catches this before the server starts.

Runtime panics usually come from mismanaged streams. Calling w.Header().Set() after next.ServeHTTP begins writing will panic with http: Header method called after WriteHeader. The middleware must set headers before handing control to the downstream handler. Another common trap is wrapping a handler that already sets Content-Encoding. Double compression turns your JSON into unreadable binary garbage. Always verify that your handlers do not pre-emptively set encoding headers.

Memory leaks happen when defer gz.Close() is missing or when a panic bypasses the defer chain. The gzip trailer contains a CRC32 checksum and the original size. Without it, the client waits for more data until the connection times out. The panic recovery wrapper above ensures the stream closes even when your handler crashes.

Do not pass a *string for configuration values. Strings are already cheap to pass by value. If you need to toggle compression per-route, pass a boolean flag or a configuration struct by value. The compiler will reject any attempt to mutate an immutable string pointer in a concurrent context.

Trust gofmt. Argue logic, not formatting. The compression wrapper follows standard Go layout rules. Run gofmt on save. Let the tool decide indentation and brace placement. It keeps the codebase consistent across teams and reduces merge conflicts.

The worst goroutine bug is the one that never logs. The worst middleware bug is the one that silently drops headers.

When to compress and when to skip it

Use gzip middleware when your API returns large JSON, XML, or HTML payloads and your clients support standard compression. Use a dedicated compression library like github.com/andybalholm/brotli when you need better compression ratios and your traffic profile justifies the extra CPU cost. Use raw http.ResponseWriter without middleware when you are serving static files through a reverse proxy like Nginx or Caddy, which handles compression at the infrastructure layer. Use plain sequential code when you don't need concurrency: the simplest thing that works is usually the right thing.

Where to go next

Compression middleware (gzip) in Go compresses your server's responses to save bandwidth. It checks if the client supports gzip compression; if yes, it shrinks the data before sending it. Think of it like putting a letter in a vacuum-sealed bag before mailing it to save space.