Logging Best Practices for Go Microservices

Implement structured JSON logging with context propagation and trace IDs to effectively debug and monitor Go microservices.

When logs become noise

Your payment service crashes at 03:00. You pull up the logs and see ten thousand lines of INFO: processing order from five different instances. The error you need is buried somewhere in the middle, mixed with health checks and routine updates. You grep for the order ID, but the ID isn't in the logs because the developer who wrote that handler forgot to include it. You spend an hour correlating timestamps across three services just to find out the database connection timed out. This is the reality of microservices without structured logging.

Structured logging treats every log line as a data record, not a text string. Instead of writing User 123 logged in at 10:00, you emit a JSON object with fields like user_id: 123, event: login, timestamp: 2024-05-20T10:00:00Z. This lets you query logs by field. You can ask "show me all errors for user 123" without regex. The analogy is a shipping label. A plain text log is a handwritten note on a box. A structured log is a barcode scanner event. The scanner records the exact time, location, and handler ID in a database. You can filter, sort, and join those events instantly.

Logs are not for humans to read line-by-line. They are for machines to index and query.

The standard library solution

Go 1.21 introduced log/slog in the standard library. It replaces the old log package for structured needs. slog provides a handler-based design. You create a handler that defines the output format, then use the logger to emit records. The handler marshals the records to JSON, text, or any custom format.

Here's the simplest structured logger: initialize a JSON handler, set it as the default, and log with attributes.

package main

import (
	"log/slog"
	"os"
)

func main() {
	// JSON handler outputs parseable objects to stdout.
	// Container orchestrators like Kubernetes capture stdout automatically.
	handler := slog.NewJSONHandler(os.Stdout, nil)
	// Set the global default so any package calling slog.Info uses this handler.
	slog.SetDefault(slog.New(handler))
	// Emit a log with structured attributes.
	// Attributes are key-value pairs, not formatted strings.
	slog.Info("server starting", "port", 8080, "env", "production")
}

When you call slog.Info, the handler marshals the attributes to JSON. The timestamp is added automatically. The level is included. The output looks like {"time":"...","level":"INFO","msg":"server starting","port":8080,"env":"production"}. This is machine-readable. You can pipe this to a log aggregator. The aggregator parses the JSON and indexes the fields. Now you can query port=8080 directly.
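To see what the aggregator sees, here is a minimal sketch that logs into an in-memory buffer and decodes the line again. The helper name logLine is made up for this example. It stands in for the parsing step that a real log pipeline performs.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// logLine emits one record through a JSON handler into a buffer and returns
// the decoded fields, roughly the way a log aggregator would see them.
func logLine(msg string, args ...any) (map[string]any, error) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Info(msg, args...)

	fields := map[string]any{}
	if err := json.Unmarshal(buf.Bytes(), &fields); err != nil {
		return nil, err
	}
	return fields, nil
}

func main() {
	fields, err := logLine("server starting", "port", 8080, "env", "production")
	if err != nil {
		panic(err)
	}
	// Each attribute is an addressable field, not a substring of a message.
	fmt.Println(fields["msg"], fields["port"], fields["env"], fields["level"])
}
```

Every attribute comes back as its own key. That is what makes a query like port=8080 possible without regex.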

Convention aside: logs go to os.Stdout, not files. Writing to files inside a container is an anti-pattern. The container runtime captures stdout, and the platform's logging agent takes care of rotation and shipping to your log store. Your job is to write JSON to stdout. The infrastructure handles the rest.

Set the handler once. Log everywhere.

Correlating requests with context

In a microservice, you need to correlate logs across requests and services. You do this with a trace ID. The trace ID travels in the HTTP headers and gets attached to the context. Every log within that request includes the trace ID. This lets you filter all logs for a single request, even if the request spans multiple services.

Here's a realistic handler: extract the trace ID, bind it to the logger, and pass the logger through the call chain.

package main

import (
	"context"
	"log/slog"
	"net/http"
)

// traceIDKey holds the context key for the trace ID.
// Using a custom type prevents key collisions with other packages.
type traceIDKey struct{}

// extractTraceID reads the trace ID from the context or generates one.
// It guarantees an ID is present; uniqueness depends on the generator.
func extractTraceID(ctx context.Context) string {
	if id, ok := ctx.Value(traceIDKey{}).(string); ok {
		return id
	}
	// Generate an ID if missing. generateRandomString is a placeholder;
	// in production, build the ID from a crypto/rand source.
	return "trace-" + generateRandomString()
}

// attachTraceID creates a new context with the trace ID.
// Context values flow down the call stack to all child functions.
func attachTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

// handleRequest processes an HTTP request with structured logging.
// The logger is enriched with the trace ID for correlation.
func handleRequest(w http.ResponseWriter, r *http.Request) {
	// Extract or generate trace ID.
	traceID := extractTraceID(r.Context())
	// Create a context carrying the trace ID.
	ctx := attachTraceID(r.Context(), traceID)
	// Bind the trace ID to the logger instance.
	// All subsequent logs from this logger include the trace ID automatically.
	logger := slog.With("trace_id", traceID)
	logger.Info("request received", "method", r.Method, "path", r.URL.Path)
	// Simulate business logic.
	processOrder(ctx, logger)
	logger.Info("request completed")
}

// processOrder performs work and logs progress.
// It receives the enriched logger to maintain context.
func processOrder(ctx context.Context, logger *slog.Logger) {
	logger.Info("validating order")
	// Check context for cancellation.
	// Long-running tasks must respect context deadlines.
	select {
	case <-ctx.Done():
		logger.Warn("order processing cancelled", "err", ctx.Err())
		return
	default:
		// Continue processing.
	}
	logger.Info("order validated")
}

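// generateRandomString is a placeholder for UUID generation.
// It returns a fixed value, so every generated ID is identical.
// Replace it before relying on generated IDs for correlation.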
func generateRandomString() string {
	return "abc123"
}

Convention aside: context.Context always goes as the first parameter. Name it ctx. Functions that take a context should respect cancellation and deadlines. Here the logger is passed explicitly as the second parameter. Passing *slog.Logger is a common way to carry log context through deep call stacks.

Convention aside: attribute keys use snake_case. It is a common convention in log pipelines and makes querying easier in tools like Loki or Elasticsearch. Use trace_id, not traceId.
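The handler above assumes something upstream already put the trace ID into the context. Here is one way that step could look, as a sketch. The header name X-Trace-ID, the middleware withTrace, and the wrapper traceHandler are assumptions for illustration, not part of slog or net/http. The wrapper copies the ID from the context onto every record, so code that calls slog.InfoContext gets the field without holding a bound logger.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"os"
)

// traceIDKey is an unexported context key type to avoid collisions.
type traceIDKey struct{}

// traceHeader is an assumed header name; use whatever your gateway sets.
const traceHeader = "X-Trace-ID"

// newTraceID returns 16 random bytes from crypto/rand, hex encoded.
func newTraceID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "unavailable"
	}
	return hex.EncodeToString(b)
}

// withTrace is hypothetical middleware: it reuses the caller's trace ID or
// mints one, then stores it in the request context.
func withTrace(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(traceHeader)
		if id == "" {
			id = newTraceID()
		}
		ctx := context.WithValue(r.Context(), traceIDKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// traceHandler wraps any slog.Handler and adds trace_id from the context.
type traceHandler struct{ slog.Handler }

func (h traceHandler) Handle(ctx context.Context, r slog.Record) error {
	if id, ok := ctx.Value(traceIDKey{}).(string); ok {
		r = r.Clone() // avoid sharing state with the caller's copy
		r.AddAttrs(slog.String("trace_id", id))
	}
	return h.Handler.Handle(ctx, r)
}

// WithAttrs and WithGroup re-wrap so derived loggers keep the behavior.
func (h traceHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return traceHandler{Handler: h.Handler.WithAttrs(attrs)}
}

func (h traceHandler) WithGroup(name string) slog.Handler {
	return traceHandler{Handler: h.Handler.WithGroup(name)}
}

// traceIDSeenBy sends one in-memory request through withTrace and returns
// the ID the inner handler found in its context.
func traceIDSeenBy(headerValue string) string {
	var seen string
	h := withTrace(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen, _ = r.Context().Value(traceIDKey{}).(string)
		slog.InfoContext(r.Context(), "request received", "path", r.URL.Path)
	}))
	req := httptest.NewRequest(http.MethodGet, "/orders", nil)
	if headerValue != "" {
		req.Header.Set(traceHeader, headerValue)
	}
	h.ServeHTTP(httptest.NewRecorder(), req)
	return seen
}

func main() {
	slog.SetDefault(slog.New(traceHandler{Handler: slog.NewJSONHandler(os.Stdout, nil)}))
	traceIDSeenBy("req-7f3a") // reuses the caller's ID
	traceIDSeenBy("")         // mints a new one
}
```

Both styles can coexist. Use the bound logger when a function already receives one, and the context-aware handler for code that only has ctx.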

Context carries the truth. Bind it early, pass it down.

Dynamic levels and grouping

You can change log levels at runtime without restarting the service. This is useful for debugging production issues. You expose an endpoint that sets the level to Debug, investigate, then set it back to Info. slog provides LevelVar for this purpose.

Here's how to change log levels at runtime using a LevelVar.

package main

import (
	"log/slog"
	"os"
)

// level holds the current log level.
// LevelVar allows changing the level dynamically at runtime.
// Its zero value is Info, and it is safe for concurrent use.
var level slog.LevelVar

func main() {
	// Handler uses the LevelVar for filtering.
	// The handler checks the variable on every log call.
	handler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
		Level: &level,
	})
	slog.SetDefault(slog.New(handler))
	// Start at Info level.
	level.Set(slog.LevelInfo)
	slog.Info("initial state")
	// Switch to Debug for troubleshooting.
	level.Set(slog.LevelDebug)
	slog.Debug("this appears now")
}
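The endpoint mentioned earlier could look like the sketch below. The path /admin/log-level, the level query parameter, and the helper callLevelEndpoint are assumptions for this example. The demo drives the handler in memory with httptest. In a real service you would mount levelEndpoint on an internal, authenticated mux.

```go
package main

import (
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"os"
)

// level is shared by the log handler and the admin endpoint.
var level = new(slog.LevelVar)

// levelEndpoint is a hypothetical admin handler. It accepts
// PUT /admin/log-level?level=debug and updates the shared LevelVar.
func levelEndpoint(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPut {
		http.Error(w, "use PUT", http.StatusMethodNotAllowed)
		return
	}
	// Level.UnmarshalText parses names such as "debug", "INFO", or "warn".
	var l slog.Level
	if err := l.UnmarshalText([]byte(r.URL.Query().Get("level"))); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	level.Set(l)
	fmt.Fprintf(w, "log level set to %s\n", l)
}

// callLevelEndpoint sends one in-memory request and returns the status code.
func callLevelEndpoint(method, name string) int {
	req := httptest.NewRequest(method, "/admin/log-level?level="+name, nil)
	rec := httptest.NewRecorder()
	levelEndpoint(rec, req)
	return rec.Code
}

func main() {
	opts := &slog.HandlerOptions{Level: level}
	slog.SetDefault(slog.New(slog.NewJSONHandler(os.Stdout, opts)))

	slog.Debug("hidden at the default Info level")
	callLevelEndpoint(http.MethodPut, "debug")
	slog.Debug("visible after the switch")
	callLevelEndpoint(http.MethodPut, "info")
}
```

An unknown level name leaves the current level untouched and returns 400, so a typo during an incident cannot silence the service.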

You can also group related attributes using slog.Group. This creates a nested object in the JSON output. Grouping keeps logs clean when you have many attributes. For example, group HTTP request details or database connection info.

Here's how to group attributes and customize keys using ReplaceAttr.

package main

import (
	"log/slog"
	"os"
)

func main() {
	// ReplaceAttr rewrites attributes one at a time before they are written.
	// It can rename or drop an attribute, but it cannot add a new one.
	handler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			// Rename only the built-in, top-level message key.
			if len(groups) == 0 && a.Key == slog.MessageKey {
				a.Key = "message"
			}
			return a
		},
	})
	// Add the service name to every log line with With, not ReplaceAttr.
	slog.SetDefault(slog.New(handler).With("service", "payment-service"))
	// Group HTTP details into a nested object.
	slog.Info("request", slog.Group("http", "method", "GET", "path", "/orders"))
}

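Convention aside: slog.Error does not attach an error on its own. It takes a message and attributes like every other level, so pass the error value as an attribute, for example slog.Error("charge failed", "err", err). You don't need to format the error as a string. The built-in handlers call Error() and write the resulting text. The output holds only that text, so wrap errors with %w where they occur if you want the full chain in the message.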
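A short sketch of error logging with an explicit attribute. The function chargeCard, the sentinel errTimeout, and the order ID are hypothetical. The wrapped error is passed as a value under the err key, and errors.Is runs against the value before it is flattened to text.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"os"
)

// errTimeout is a hypothetical sentinel error for this example.
var errTimeout = errors.New("connection timed out")

// chargeCard simulates a failing call and wraps the cause with %w so the
// chain stays inspectable with errors.Is and errors.As.
func chargeCard(orderID string) error {
	return fmt.Errorf("charge order %s: %w", orderID, errTimeout)
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	if err := chargeCard("ord_42"); err != nil {
		// Pass the error value as an attribute. The handler writes err.Error().
		logger.Error("charge failed",
			"order_id", "ord_42",
			"err", err,
			"timeout", errors.Is(err, errTimeout),
		)
	}
}
```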

Customize sparingly. Stick to the defaults unless you have a specific requirement.

Pitfalls and runtime behavior

Common pitfalls trip up teams. High-cardinality and unbounded attributes destroy log storage. Logging user_id is usually fine because the value is short and you query by it. Logging request_body with unique JSON payloads creates millions of large, unique values. An aggregator that indexes every value pays for each one, and storage costs explode. Log summaries, not raw data. If you need the raw body, sample it or write it to a separate object store.

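Log injection breaks parsers. If you write user input into a log line without escaping, a user can inject newlines or JSON fragments that look like separate records. slog's built-in JSON and text handlers escape attribute values and the message alike, so the risk comes from custom handlers and from code that formats lines by hand. Keep user input out of the message anyway. A constant message can be grouped and searched, and the variable part belongs in an attribute.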
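A small sketch of the safe pattern, with a made-up hostile username. The helper renderLogin is hypothetical. It logs a constant message, passes the untrusted value as an attribute, and returns the raw line so you can inspect it.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// renderLogin logs a constant message with the untrusted value as an
// attribute and returns the raw output line.
func renderLogin(username string) string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Info("login attempt", "username", username)
	return buf.String()
}

func main() {
	// Hypothetical hostile input: a newline followed by a forged record.
	hostile := "alice\n{\"level\":\"ERROR\",\"msg\":\"forged\"}"
	// The JSON handler escapes the newline and the quotes, so the output
	// stays a single record.
	fmt.Print(renderLogin(hostile))
}
```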

Synchronous logging blocks handlers. slog's built-in handlers write to their io.Writer on the calling goroutine. If stdout is a pipe and the reader falls behind, writes can block that goroutine. For high-throughput services, use a buffered channel or an async logger. You wrap the handler to send records to a channel, and a background goroutine writes them out. Decide up front whether a full queue blocks the caller or drops the record.
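Here is a minimal sketch of such a wrapper. The names asyncHandler, newAsync, and flushedLines are made up for this example, and the policy is an assumption: when the queue is full the record is dropped instead of blocking the caller. Each queued entry carries the handler that should write it, so loggers derived with With keep their attributes.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"log/slog"
	"os"
)

// errQueueFull is returned by Handle when a record is dropped.
var errQueueFull = errors.New("log queue full, record dropped")

// entry pairs a record with the handler that should write it.
type entry struct {
	h slog.Handler
	r slog.Record
}

// asyncHandler is a hypothetical wrapper that queues records instead of
// writing them on the caller's goroutine.
type asyncHandler struct {
	next  slog.Handler
	queue chan entry
}

func (a *asyncHandler) Enabled(ctx context.Context, l slog.Level) bool {
	return a.next.Enabled(ctx, l)
}

// Handle never blocks: when the queue is full the record is dropped.
// The request context is not kept; records are written with Background.
func (a *asyncHandler) Handle(_ context.Context, r slog.Record) error {
	select {
	case a.queue <- entry{h: a.next, r: r.Clone()}:
		return nil
	default:
		return errQueueFull
	}
}

func (a *asyncHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return &asyncHandler{next: a.next.WithAttrs(attrs), queue: a.queue}
}

func (a *asyncHandler) WithGroup(name string) slog.Handler {
	return &asyncHandler{next: a.next.WithGroup(name), queue: a.queue}
}

// newAsync starts the background writer. The returned stop function closes
// the queue and waits for pending records. Do not log after calling it.
func newAsync(next slog.Handler, size int) (slog.Handler, func()) {
	a := &asyncHandler{next: next, queue: make(chan entry, size)}
	done := make(chan struct{})
	go func() {
		defer close(done)
		for e := range a.queue {
			_ = e.h.Handle(context.Background(), e.r)
		}
	}()
	return a, func() { close(a.queue); <-done }
}

// flushedLines logs n records through a queue of the given size into an
// in-memory buffer, stops the writer, and counts the lines written.
func flushedLines(n, size int) int {
	var buf bytes.Buffer
	h, stop := newAsync(slog.NewJSONHandler(&buf, nil), size)
	logger := slog.New(h).With("service", "payment-service")
	for i := 0; i < n; i++ {
		logger.Info("order processed", "seq", i)
	}
	stop()
	return bytes.Count(buf.Bytes(), []byte("\n"))
}

func main() {
	h, stop := newAsync(slog.NewJSONHandler(os.Stdout, nil), 1024)
	defer stop() // flush pending records before exit
	slog.SetDefault(slog.New(h))
	slog.Info("server starting", "port", 8080)
}
```

Dropping keeps request latency flat but loses records under pressure. If every record matters, replace the select with a plain channel send and accept that a full queue blocks the caller.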

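The compiler rejects programs with undefined symbols. If you forget to import log/slog, you get undefined: slog. Unsupported attribute types fail more quietly. The JSON handler marshals values it does not recognize with encoding/json. When that fails, for example on a struct with an exported channel or func field, the handler does not panic. It writes an error string that starts with !ERROR in place of the value, and the data you wanted is gone. A struct that holds a mutex is a different trap: passing it by value copies the lock, which go vet reports. Check how custom types appear in your logs before you depend on them.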
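One way to control how a custom type appears is to implement slog.LogValuer. The sketch below uses a hypothetical Account type with a channel field that encoding/json cannot marshal and a token that should stay out of the logs. The helper loggedAccount exists only to show the result.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// Account is a hypothetical type. Updates cannot be marshaled by
// encoding/json, and Token should never reach the logs.
type Account struct {
	ID           string
	BalanceCents int64
	Token        string
	Updates      chan string
}

// LogValue implements slog.LogValuer and lists exactly what gets logged.
func (a *Account) LogValue() slog.Value {
	return slog.GroupValue(
		slog.String("id", a.ID),
		slog.Int64("balance_cents", a.BalanceCents),
	)
}

// loggedAccount logs the account through a JSON handler and returns the
// fields that ended up under the "account" key.
func loggedAccount(a *Account) (map[string]any, error) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Info("account loaded", "account", a)

	var rec struct {
		Account map[string]any `json:"account"`
	}
	err := json.Unmarshal(buf.Bytes(), &rec)
	return rec.Account, err
}

func main() {
	acct := &Account{
		ID:           "acct_1",
		BalanceCents: 1250,
		Token:        "secret",
		Updates:      make(chan string),
	}
	fields, err := loggedAccount(acct)
	if err != nil {
		panic(err)
	}
	// Only the fields named in LogValue appear.
	fmt.Println(fields)
}
```

The handler resolves LogValue before it writes anything, so the channel and the token never reach the encoder.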

Logs are data. Treat them like metrics: bounded, structured, and queryable.

Decision matrix

Use log/slog when you need structured logging in Go 1.21 or later. Use a third-party logger like zap or zerolog when you need extreme performance and can tolerate a larger dependency footprint. Use plain fmt.Println only for development scripts or debugging one-off tools. Use log levels to filter noise: Info for business events, Debug for internal state, Error for failures that require attention. Use trace IDs to correlate requests across service boundaries. Use slog.With to attach static context to a logger for a specific scope.

Pick the tool that matches your volume. Structure beats verbosity every time.

Where to go next