How to Use OpenTelemetry for Metrics and Traces in Go

Initialize OpenTelemetry Tracer and Meter providers in Go, then wrap HTTP handlers and database connections to automatically export traces and metrics.

The invisible thread through your service

A user clicks a button on your frontend. The request hits your Go service, which queries a PostgreSQL database, calls a payment API, and returns a JSON response. The response takes two seconds. The user complains. You check the logs and see timestamps, but nothing connects the database query to the payment call. You have no idea where the time went.

Telemetry solves that gap. It stitches together every hop a request makes, attaches timing data to each step, and aggregates the results into numbers you can graph. OpenTelemetry gives you a vendor-neutral way to collect both traces and metrics without rewriting your code when you switch observability backends.

What telemetry actually does

This article covers two of OpenTelemetry's signals: traces and metrics. Traces follow a single request through your system. Each step in the request is a span. Spans link together into a trace, which gives you a timeline of exactly where time was spent. Metrics are numeric aggregates: counters, gauges, and histograms. They answer questions like how many requests arrive per second, where the p99 latency sits, or how many database connections are open.

Think of traces as a flight itinerary. You see the departure airport, the arrival airport, the layover duration, and the total travel time. Think of metrics as the cockpit dashboard. You see fuel burn rate, engine temperature, and altitude. You need both to understand what is happening.

OpenTelemetry defines the data model and the SDK. The SDK runs in your process, collects spans and metric points, and ships them to an exporter. The exporter speaks a protocol like OTLP over HTTP or gRPC. Your backend receives the data and stores it. The separation means you can swap backends without touching your application code.

Traces are cheap to start but expensive to store at scale. Metrics are cheap to aggregate but lose individual request context. Use traces to debug latency. Use metrics to monitor health.

The minimal setup

Here is the smallest program that initializes both a tracer and a meter, then wraps an HTTP handler so every request automatically generates telemetry data.

package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	"go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/resource"
	"go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.26.0"
)

func main() {
	ctx := context.Background()

	// Attach process metadata so every span and metric carries service identity
	res := resource.NewWithAttributes(
		semconv.SchemaURL,
		semconv.ServiceName("demo-service"),
	)

	// Create an HTTP exporter that ships trace data to localhost:4318
	traceExporter, err := otlptracehttp.New(ctx)
	if err != nil {
		log.Fatalf("failed to create trace exporter: %v", err)
	}

	// Batch spans to reduce network overhead and avoid per-span HTTP calls
	tracerProvider := trace.NewTracerProvider(
		trace.WithBatcher(traceExporter),
		trace.WithResource(res),
	)

	// Register the provider globally so otel.Tracer() finds it automatically
	otel.SetTracerProvider(tracerProvider)

	// Create an HTTP exporter for metric data points
	metricExporter, err := otlpmetrichttp.New(ctx)
	if err != nil {
		log.Fatalf("failed to create metric exporter: %v", err)
	}

	// Push metrics on a timer instead of waiting for shutdown; the periodic reader defaults to a 60-second interval
	meterProvider := metric.NewMeterProvider(
		metric.WithReader(metric.NewPeriodicReader(metricExporter)),
		metric.WithResource(res),
	)

	// Register the meter provider globally for otel.Meter() calls
	otel.SetMeterProvider(meterProvider)

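	// Flush both providers on SIGINT or SIGTERM, then exit. signal.Notify
	// replaces the default signal behavior, so the process must terminate itself.
	go func() {
		sig := make(chan os.Signal, 1)
		signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
		<-sig
		if err := tracerProvider.Shutdown(ctx); err != nil {
			log.Printf("tracer provider shutdown: %v", err)
		}
		if err := meterProvider.Shutdown(ctx); err != nil {
			log.Printf("meter provider shutdown: %v", err)
		}
		os.Exit(0)
	}()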

	// Wrap the handler so OpenTelemetry injects context and creates spans
	handler := otelhttp.NewHandler(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	}), "demo.handler")

	log.Fatal(http.ListenAndServe(":8080", handler))
}

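Run this with an OTLP collector listening on port 4318. The Go OTLP HTTP exporters default to https://localhost:4318, so for a local collector without TLS, set OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 or pass otlptracehttp.WithInsecure() and otlpmetrichttp.WithInsecure() to the constructors. Every request to localhost:8080 then produces a trace and emits HTTP server metrics. The code follows the Go convention of checking errors immediately with if err != nil { log.Fatalf(...) }. The community accepts the boilerplate because it keeps failure paths visible.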

Providers must shut down explicitly. If you skip the shutdown call, buffered spans and metric points drop on the floor. Trust the shutdown sequence. Flush before exit.
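One way to make the flush explicit is to run every shutdown call under a shared deadline and report the failures instead of discarding them. The sketch below uses only the standard library. The names shutdownFunc, shutdownAll, flushOK, and flushFail are hypothetical; the two stubs stand in for tracerProvider.Shutdown and meterProvider.Shutdown, which have the same func(context.Context) error shape.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// shutdownFunc matches the shape of a provider's Shutdown method.
type shutdownFunc func(ctx context.Context) error

// shutdownAll calls every function under one shared deadline and joins
// the errors so that no failed flush is silently dropped.
func shutdownAll(timeout time.Duration, fns ...shutdownFunc) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var errs []error
	for _, fn := range fns {
		if err := fn(ctx); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// flushOK and flushFail are stand-ins for real provider shutdown methods.
func flushOK(ctx context.Context) error   { return nil }
func flushFail(ctx context.Context) error { return errors.New("collector unreachable") }

func main() {
	if err := shutdownAll(5*time.Second, flushOK, flushFail); err != nil {
		fmt.Println("shutdown finished with errors:", err)
	}
}
```

The deadline matters because a shutdown call can otherwise wait on an unreachable collector for as long as the exporter keeps retrying.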

How the pieces connect at runtime

When the program starts, the tracer provider and meter provider allocate internal buffers and background goroutines. The batcher for traces collects spans and ships them in chunks. The periodic reader for metrics aggregates data points and pushes them on a fixed interval. Both exporters open HTTP connections to the collector.

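When a request arrives, otelhttp.NewHandler intercepts it. It asks the globally registered propagator to extract an incoming trace context from the headers. The default global propagator is a no-op, so call otel.SetTextMapPropagator(propagation.TraceContext{}) at startup if you want to continue traces started upstream. If no context is found, the handler starts a new trace. It creates a server span named after the operation ("demo.handler" above), attaches it to the request context, and passes the enriched context to your handler function.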
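To make the header extraction less abstract, here is a simplified parser for the W3C traceparent header, the format most OpenTelemetry setups use on the wire. It is an illustration only, not the SDK's implementation: the type traceParent and the function parseTraceParent are hypothetical names, and the sketch skips checks a real propagator performs, such as rejecting all-zero IDs and uppercase hex.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// traceParent holds the fields of a W3C "traceparent" header.
type traceParent struct {
	TraceID string // 32 hex characters
	SpanID  string // 16 hex characters, the caller's span
	Sampled bool   // lowest bit of the flags byte
}

// parseTraceParent handles version 00 headers of the form
// "00-<trace-id>-<parent-id>-<flags>".
func parseTraceParent(h string) (traceParent, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, errors.New("unsupported traceparent format")
	}
	if len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return traceParent{}, errors.New("traceparent field has the wrong length")
	}
	if _, err := hex.DecodeString(parts[1] + parts[2]); err != nil {
		return traceParent{}, fmt.Errorf("trace or span ID is not hex: %w", err)
	}
	flags, err := hex.DecodeString(parts[3])
	if err != nil {
		return traceParent{}, fmt.Errorf("flags are not hex: %w", err)
	}
	return traceParent{TraceID: parts[1], SpanID: parts[2], Sampled: flags[0]&0x01 == 1}, nil
}

func main() {
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("no usable parent, start a new trace:", err)
		return
	}
	fmt.Printf("continue trace %s under parent span %s (sampled=%t)\n", tp.TraceID, tp.SpanID, tp.Sampled)
}
```

When parsing fails, the caller simply starts a fresh trace, which mirrors the fallback described above.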

Inside your handler, you call otel.Tracer("mycomponent").Start(ctx, "operation"). The SDK creates a child span linked to the parent. When the span ends, it records elapsed time, status, and any attributes you attached. The batcher queues the span. When the queue fills or the flush interval fires, the exporter serializes the spans to protobuf and sends them over HTTP.
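The queue-then-flush behavior can be modeled in a few lines. This is a deliberately reduced sketch, not the SDK's batch processor: the batcher type and its fields are hypothetical, spans are plain strings, and an explicit Flush call stands in for both the interval timer and shutdown.

```go
package main

import (
	"fmt"
	"sync"
)

// batcher collects finished spans and hands them to export in chunks.
type batcher struct {
	mu       sync.Mutex
	pending  []string
	maxBatch int
	export   func(batch []string)
}

// Add queues one span and flushes when the batch is full.
func (b *batcher) Add(span string) {
	b.mu.Lock()
	b.pending = append(b.pending, span)
	full := len(b.pending) >= b.maxBatch
	b.mu.Unlock()
	if full {
		b.Flush()
	}
}

// Flush exports whatever is queued; an empty queue exports nothing.
func (b *batcher) Flush() {
	b.mu.Lock()
	batch := b.pending
	b.pending = nil
	b.mu.Unlock()
	if len(batch) > 0 {
		b.export(batch)
	}
}

func main() {
	b := &batcher{maxBatch: 3, export: func(batch []string) {
		fmt.Println("exporting", len(batch), "spans:", batch)
	}}
	for _, s := range []string{"http.request", "db.query", "payment.call", "render"} {
		b.Add(s)
	}
	b.Flush() // stands in for the interval timer or shutdown
}
```

The export call happens outside the lock so a slow exporter does not block goroutines that are only trying to enqueue a span.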

Metrics work differently. You create an instrument once with counter, err := meter.Int64Counter("requests_total"), check the error, and then call counter.Add(ctx, 1) on each request. The SDK records the increment in an in-memory aggregator. The periodic reader snapshots the aggregator, converts the data to OTLP metric format, and pushes it. There is no per-call network overhead.
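A rough model of that in-memory aggregation, assuming cumulative sums keyed by instrument name, looks like this. The counterAggregator type is hypothetical and far simpler than the SDK's aggregation pipeline, which also tracks attributes and temporality.

```go
package main

import (
	"fmt"
	"sync"
)

// counterAggregator keeps a running total per instrument name.
type counterAggregator struct {
	mu     sync.Mutex
	totals map[string]int64
}

func newCounterAggregator() *counterAggregator {
	return &counterAggregator{totals: make(map[string]int64)}
}

// Add is the hot path: a lock and a map update, no network I/O.
func (a *counterAggregator) Add(name string, n int64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.totals[name] += n
}

// Snapshot copies the totals so a reader can export them
// without holding the lock during the network call.
func (a *counterAggregator) Snapshot() map[string]int64 {
	a.mu.Lock()
	defer a.mu.Unlock()
	out := make(map[string]int64, len(a.totals))
	for k, v := range a.totals {
		out[k] = v
	}
	return out
}

func main() {
	agg := newCounterAggregator()
	for i := 0; i < 1000; i++ {
		agg.Add("requests_total", 1)
	}
	// A periodic reader would take this snapshot on each tick.
	fmt.Println("requests_total =", agg.Snapshot()["requests_total"])
}
```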

Context is plumbing. Run it through every long-lived call site. If a function does not accept a context.Context, it cannot join the caller's trace. Make the context the first parameter and name it ctx by convention. Keep the signature consistent.
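The plumbing idea can be shown with the standard library alone. In this sketch a trace ID travels inside the context; traceIDKey, withTraceID, traceIDFrom, loadOrder, handle, and handleDetached are all hypothetical names, and a real tracer stores a full span rather than a string.

```go
package main

import (
	"context"
	"fmt"
)

// traceIDKey is an unexported key type, so other packages cannot collide with it.
type traceIDKey struct{}

func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

func traceIDFrom(ctx context.Context) string {
	id, ok := ctx.Value(traceIDKey{}).(string)
	if !ok {
		return "none"
	}
	return id
}

// loadOrder takes ctx first, so it can see whatever the caller attached.
func loadOrder(ctx context.Context) string {
	return "load order, trace=" + traceIDFrom(ctx)
}

// handle passes its context down, keeping the call inside the trace.
func handle(id string) string {
	ctx := withTraceID(context.Background(), id)
	return loadOrder(ctx)
}

// handleDetached shows the failure mode: a fresh context loses the trace.
func handleDetached() string {
	return loadOrder(context.Background())
}

func main() {
	fmt.Println(handle("abc123"))
	fmt.Println(handleDetached())
}
```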

Adding real workloads

Real services call databases, message queues, and external APIs. Each call needs a span. Each span needs the parent context. Here is how you propagate context to a database query and record a custom metric.

package main

import (
	"database/sql"
	"log"
	"net/http"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
	"go.opentelemetry.io/otel/metric"
)

// HandleRequest demonstrates context propagation and custom metric recording.
func HandleRequest(db *sql.DB) http.HandlerFunc {
	meter := otel.Meter("demo.handler")

	// Instrument constructors return (instrument, error); create each one once
	counter, err := meter.Int64Counter("db.query.count")
	if err != nil {
		log.Fatalf("failed to create counter: %v", err)
	}
	histogram, err := meter.Float64Histogram("db.query.duration", metric.WithUnit("s"))
	if err != nil {
		log.Fatalf("failed to create histogram: %v", err)
	}

	return func(w http.ResponseWriter, r *http.Request) {
		// Create a child span under the span carried by the request context
		ctx, span := otel.Tracer("demo.db").Start(r.Context(), "db.query")
		defer span.End()

		// Record the query start time for duration calculation
		start := time.Now()

		// Execute the query with the enriched context
		row := db.QueryRowContext(ctx, "SELECT status FROM orders WHERE id = $1", 42)
		var status string
		if err := row.Scan(&status); err != nil {
			span.RecordError(err)
			span.SetStatus(codes.Error, "query failed")
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}

		// Calculate elapsed time and attach it as a span attribute
		elapsed := time.Since(start).Seconds()
		span.SetAttributes(attribute.Float64("db.duration", elapsed))

		// Metric attributes are passed as measurement options, not bare key-values
		attrs := metric.WithAttributes(attribute.String("status", status))
		counter.Add(ctx, 1, attrs)
		histogram.Record(ctx, elapsed, attrs)

		w.Write([]byte("order: " + status))
	}
}

The handler extracts the request context, starts a child span, and passes that context to QueryRowContext. A plain database/sql driver ignores the span in the context. An instrumented driver or wrapper can read it and create its own child spans. Either way, the db.query span still captures timing and errors. The metric calls record data in memory without blocking the request on the network. The histogram aggregates latency into buckets so you can estimate p50, p90, and p99 values later.
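To see why a histogram can answer percentile questions without storing every request, consider this bucket-based sketch. The histogram type, newHistogram, and quantileUpperBound are hypothetical, the bucket boundaries are made up for the example, and the estimate is coarse: it returns the upper bound of the bucket that contains the requested quantile.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// histogram counts observations per bucket instead of storing raw values.
type histogram struct {
	bounds []float64 // ascending upper bounds, in seconds
	counts []uint64  // len(bounds)+1; the last bucket catches overflow
	total  uint64
}

func newHistogram(bounds ...float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// Record places v in the first bucket whose upper bound is >= v.
func (h *histogram) Record(v float64) {
	i := sort.SearchFloat64s(h.bounds, v)
	h.counts[i]++
	h.total++
}

// quantileUpperBound returns the upper bound of the bucket holding the
// q-th quantile, or +Inf if it falls in the overflow bucket.
func (h *histogram) quantileUpperBound(q float64) float64 {
	if h.total == 0 {
		return 0
	}
	rank := uint64(math.Ceil(q * float64(h.total)))
	if rank == 0 {
		rank = 1
	}
	var seen uint64
	for i, c := range h.counts {
		seen += c
		if seen >= rank && i < len(h.bounds) {
			return h.bounds[i]
		}
	}
	return math.Inf(1)
}

func main() {
	h := newHistogram(0.01, 0.1, 1)
	for i := 0; i < 8; i++ {
		h.Record(0.005) // fast queries
	}
	h.Record(0.05)
	h.Record(0.5) // one slow outlier
	fmt.Println("p50 <=", h.quantileUpperBound(0.5), "s")
	fmt.Println("p99 <=", h.quantileUpperBound(0.99), "s")
}
```

The precision of the answer depends entirely on where the bucket boundaries sit, which is why boundary choice is worth a moment of thought for latency histograms.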

Public names start with a capital letter. Private names start lowercase. HandleRequest is capitalized, so it would be exported if this code lived in an importable package; package main itself cannot be imported. The internal helper variables are unexported. Follow the capitalization rule and the compiler enforces visibility for you.

Where things break

Telemetry adds background goroutines and network calls. Misconfiguration turns invisible overhead into visible latency. The most common failure is forgetting to shut down providers. The compiler will not catch it. The runtime will silently drop buffered data when the process exits. Always call Shutdown(ctx) on both providers during graceful termination.

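Exporters can fail. Network partitions happen. otlptracehttp.New(ctx) reports problems it can detect at construction time, such as invalid configuration. If you ignore that error, the program compiles but ships no data. The compiler rejects unused imports with imported and not used, but it will not warn you about a failed exporter constructor. Check the error. Fail fast. Failures that happen later, during export, go to the global error handler, which you can replace with otel.SetErrorHandler.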

Context cancellation can leak spans. If a request times out and its context is cancelled, goroutines that watch the context return early. If one of them returns without calling span.End(), the span never ends. The SDK only exports spans that have ended, so the trace shows a gap exactly where you most need data. Always attach a timeout to long-running operations and end spans with defer.
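The defer pattern is easy to check with a stand-in span. In the sketch below, fakeSpan, slowCall, and runWithTimeout are hypothetical; fakeSpan only records whether End was called, which is the property that matters here.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// fakeSpan stands in for a real span and records whether it was ended.
type fakeSpan struct{ ended bool }

func (s *fakeSpan) End() { s.ended = true }

// slowCall ends its span on every exit path because End is deferred.
func slowCall(ctx context.Context, span *fakeSpan, work time.Duration) error {
	defer span.End()
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err() // cancelled or timed out; the deferred End still runs
	}
}

// runWithTimeout bounds slowCall with a deadline and reports the outcome.
func runWithTimeout(timeout, work time.Duration) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	span := &fakeSpan{}
	err := slowCall(ctx, span, work)
	return span.ended, err
}

func main() {
	ended, err := runWithTimeout(10*time.Millisecond, time.Second)
	fmt.Println("span ended:", ended, "error:", err)
}
```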

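Unbuffered channels in custom exporters block the calling goroutine. The SDK expects exporters to return quickly. If your exporter writes to a blocking channel, the goroutine that calls it stalls. With the batcher, that means the span queue fills and new spans are dropped. With a synchronous processor such as trace.WithSyncer, request handlers stall too. Buffer the channel or use a worker pool to drain it. The worst goroutine bug is the one that never logs. Add a logger to your exporter fallback path.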
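One hedged way to keep a custom exporter from blocking is a buffered channel with a non-blocking send that logs and counts what it drops. The queueExporter type below is hypothetical and does not implement the SDK's exporter interface; it only shows the hand-off pattern.

```go
package main

import (
	"fmt"
	"log"
	"sync/atomic"
)

// queueExporter hands spans to a background worker through a buffered channel.
type queueExporter struct {
	queue   chan string
	dropped atomic.Int64
}

func newQueueExporter(size int) *queueExporter {
	return &queueExporter{queue: make(chan string, size)}
}

// Export never blocks: if the buffer is full, the span is dropped,
// counted, and logged so the loss is visible.
func (e *queueExporter) Export(span string) bool {
	select {
	case e.queue <- span:
		return true
	default:
		n := e.dropped.Add(1)
		log.Printf("export queue full, dropped span %q (total dropped: %d)", span, n)
		return false
	}
}

func main() {
	exp := newQueueExporter(2)
	// No worker is draining the queue here, so the third send is dropped.
	for _, s := range []string{"a", "b", "c"} {
		fmt.Println(s, "queued:", exp.Export(s))
	}
	fmt.Println("dropped:", exp.dropped.Load())
}
```

Dropping is a trade-off: it protects request latency at the cost of telemetry completeness, so the drop counter is worth exposing as a metric of its own.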

Go conventions keep telemetry code readable. gofmt decides indentation and spacing. Run it on save. Receiver names are one or two letters matching the type. Use (h *Handler) ServeHTTP(...) instead of (this *Handler). The underscore discards values intentionally. Use result, _ := ... only when you have verified the second return value is safe to drop. Never drop errors silently.

When to reach for OpenTelemetry

Use OpenTelemetry when your service spans multiple processes and you need to follow a single request across boundaries. Use structured logging with slog when you only need key-value records and do not care about request correlation. Use a simple counter or gauge library when you only track aggregate throughput and latency without distributed context. Use a full commercial APM agent when you want zero configuration and are willing to accept vendor lock-in. Use plain sequential code when you do not need observability: the simplest thing that works is usually the right thing.

Where to go next

OpenTelemetry adds a standard way to track how your Go application is performing by recording the path of each request (traces) and measuring key signals like error counts and latency (metrics). Think of it as installing a dashboard in your car that shows exactly where you drove and how fast you were going at every moment. You use it when you need to find out why your service is slow or why it misbehaves in production.