The missing timeline
A request hits your Go service. It validates a payload, queries a database, and forwards data to a payment provider. The payment provider times out. Your logs show three separate error messages from three different files. You know something failed, but you cannot see the chain of events. You do not know how long the database call took, or whether the timeout happened before or after the validation step. You are looking at a pile of receipts without a receipt book.
OpenTelemetry tracing solves this by recording every step of a request on a single timeline. It records the exact start and end times of every step, links parent calls to child calls, and carries metadata across service boundaries. The Go implementation leans heavily on context.Context. The context travels with your request, and OpenTelemetry attaches a span to it. A span is just a time-boxed record of one operation. When the operation finishes, the span ends and the SDK hands its data to an exporter.
Tracing is not magic. It is disciplined bookkeeping. You get out exactly what you put in.
How tracing actually works
Tracing relies on two moving parts: the API and the SDK. The API defines the interface for creating spans and propagating context. The SDK provides the actual implementation that records data and sends it somewhere. You import the API in your application code so you are not locked into a specific vendor. You configure the SDK at startup to point at your backend, whether that is a local console, a cloud provider, or an open-source collector.
Think of a span like a stopwatch at a track meet. The starter pistol fires, the clock starts, the runner completes the lap, and the clock stops. The time, the runner's name, and the lane number get written to a results sheet. In Go, the starter pistol is tracer.Start, the runner is your function, and the results sheet is the exporter. The context.Context is the clipboard that passes the stopwatch from one official to the next.
OpenTelemetry stores the current span inside the context under an unexported key. When you call tracer.Start, the SDK reads the parent span from the context, creates a child span, and returns a new context that carries the child. This chain continues until the request finishes. You never touch the key directly. You only use the OpenTelemetry functions that read and write it, such as tracer.Start and trace.SpanFromContext.
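The mechanism is easier to see in miniature. The sketch below is a toy built only on the standard library, not the OpenTelemetry implementation: spanKey, toySpan, start, path, and demoChain are hypothetical names invented for illustration. It shows how an unexported key type and context.WithValue are enough to build a parent-to-child chain.

```go
package main

import (
	"context"
	"fmt"
)

// spanKey is an unexported type, so no other package can read or
// overwrite the value stored under it.
type spanKey struct{}

// toySpan is a hypothetical stand-in for a real span.
type toySpan struct {
	name   string
	parent *toySpan
}

// start reads the current span from ctx, creates a child that points to it,
// and returns a new context carrying the child. The input ctx is unchanged.
func start(ctx context.Context, name string) (context.Context, *toySpan) {
	parent, _ := ctx.Value(spanKey{}).(*toySpan) // nil when ctx has no span
	child := &toySpan{name: name, parent: parent}
	return context.WithValue(ctx, spanKey{}, child), child
}

// path walks the parent links from the root down to s.
func path(s *toySpan) string {
	if s.parent == nil {
		return s.name
	}
	return path(s.parent) + " > " + s.name
}

// demoChain builds a three-level chain and returns its path.
func demoChain() string {
	ctx, _ := start(context.Background(), "process-order")
	ctx, _ = start(ctx, "query-db")
	_, leaf := start(ctx, "scan-rows")
	return path(leaf)
}

func main() {
	fmt.Println(demoChain()) // process-order > query-db > scan-rows
}
```

Dropping the returned context at any level breaks the chain at that point, which is why the real API returns the context and the span together.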
The minimal setup
Here is the simplest way to wire OpenTelemetry into a Go program. It uses the standard library context, registers a console exporter, and records one span.

package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
	"go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	// stdouttrace prints spans to the terminal for debugging
	exp, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
	if err != nil {
		log.Fatal(err)
	}

	// The provider manages the lifecycle of all tracers in the process
	provider := trace.NewTracerProvider(trace.WithSyncer(exp))

	// Register it globally so otel.Tracer() finds it automatically
	otel.SetTracerProvider(provider)

	// Shutdown flushes pending spans before the process exits
	defer func() {
		if err := provider.Shutdown(context.Background()); err != nil {
			log.Printf("tracer provider shutdown: %v", err)
		}
	}()

	// Tracer names should match your service or component
	tracer := otel.Tracer("order-service")

	// Start attaches a new span to the context and returns both
	ctx, span := tracer.Start(context.Background(), "process-order")
	defer span.End()

	log.Println("Handling order")
	_ = ctx // a real program passes ctx on to the functions it calls
}

The code above compiles and runs. You will see a JSON document printed to stdout showing the span name, start time, end time, and trace ID. The defer span.End() call guarantees the span closes even if the function panics. The provider.Shutdown call flushes the exporter buffer so you do not lose the last few spans when the program exits.
Always check exporter errors at startup. If you ignore a failed initialization, or never call otel.SetTracerProvider at all, your traces vanish silently. The compiler will not stop you. The global default is a no-op provider that drops everything.
What happens under the hood
When otel.SetTracerProvider runs, it replaces the default no-op provider with your configured SDK. Tracers returned by otel.Tracer("name") are then backed by the SDK, including tracers obtained before registration, because the global provider delegates to whatever you register. When you call tracer.Start, the SDK allocates a span, generates a new span ID, and embeds the span inside a new context.Context. A root span also gets a new trace ID, while a child span inherits the trace ID of its parent. The original context is left unchanged.
Context propagation in Go follows a firm convention rather than a compiler rule. Functions that perform work accept context.Context as their first parameter, conventionally named ctx. You pass the enriched context down the call stack. When a child function needs to record its own step, it calls tracer.Start(ctx, "child-step"). The SDK reads the parent span from ctx, creates a child span, and returns a new context containing the child.
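When spans are exported depends on the span processor. trace.WithSyncer tells the SDK to export each span synchronously as it ends, which is fine for development. Production setups use trace.WithBatcher, which queues finished spans and exports them in batches from a background goroutine to reduce network overhead. The batcher handles buffering and export timeouts, and by default it drops spans when its queue is full rather than blocking your code. Retries, where supported, are the exporter's job. You do not need to manage goroutines for the exporter yourself.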
Context is plumbing. Run it through every long-lived call site.
Propagating across boundaries
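Traces break when you cross process boundaries unless you propagate them. An HTTP request leaves your service and hits a downstream API. The downstream service needs to know it is part of the same trace. OpenTelemetry solves this with W3C Trace Context headers. A propagator injects traceparent and tracestate headers into outgoing requests. The downstream service extracts them and continues the chain.
You do not parse these headers manually. You register a propagator at startup with otel.SetTextMapPropagator, typically propagation.TraceContext, because the default global propagator is a no-op. Then you use the otelhttp instrumentation package, or call the propagator's Inject and Extract methods yourself. The traceparent header is plain text: a version, a trace ID, a parent span ID, and trace flags, written as lowercase hex fields separated by hyphens. Baggage travels in its own baggage header, handled by propagation.Baggage. With an instrumented HTTP client, you just pass the context to the request.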
If you skip propagation, every service starts a new trace. You get a forest of disconnected trees instead of a single timeline. The dashboard becomes useless for debugging distributed failures.
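To make the header less mysterious, here is a minimal sketch of what a version 00 traceparent value contains, using only the standard library. The traceParent type and parseTraceParent function are hypothetical helpers written for this illustration. In a real service the propagator does this work, and it performs checks this sketch skips, such as rejecting all-zero IDs.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// traceParent holds the four fields of a version 00 traceparent header.
type traceParent struct {
	Version  string
	TraceID  string // 16 bytes as 32 hex characters
	ParentID string // 8 bytes as 16 hex characters
	Sampled  bool   // lowest bit of the trace-flags field
}

// parseTraceParent is a hypothetical helper for illustration. It accepts
// only version 00 and skips some checks a real propagator performs.
func parseTraceParent(header string) (traceParent, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 {
		return traceParent{}, fmt.Errorf("want 4 fields, got %d", len(parts))
	}
	wantLen := [4]int{2, 32, 16, 2}
	for i, p := range parts {
		if len(p) != wantLen[i] || p != strings.ToLower(p) {
			return traceParent{}, fmt.Errorf("field %d is malformed: %q", i, p)
		}
		if _, err := hex.DecodeString(p); err != nil {
			return traceParent{}, fmt.Errorf("field %d is not hex: %w", i, err)
		}
	}
	if parts[0] != "00" {
		return traceParent{}, fmt.Errorf("unsupported version %q", parts[0])
	}
	flags, _ := hex.DecodeString(parts[3]) // already validated above
	return traceParent{
		Version:  parts[0],
		TraceID:  parts[1],
		ParentID: parts[2],
		Sampled:  flags[0]&0x01 == 0x01,
	}, nil
}

func main() {
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("trace ID:", tp.TraceID)
	fmt.Println("parent span ID:", tp.ParentID)
	fmt.Println("sampled:", tp.Sampled)
}
```

The downstream service reuses the trace ID and treats the parent span ID as the parent of the first span it creates, which is what stitches the two processes into one timeline.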
A realistic HTTP handler
Real services handle concurrent requests. Each request gets its own context, and each context carries its own trace. Here is how you wire tracing into an HTTP handler that calls a downstream service.

package main

import (
	"context"
	"log"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
)

var tracer = otel.Tracer("http-handler")

// HandleOrder processes an incoming HTTP request with tracing
func HandleOrder(w http.ResponseWriter, r *http.Request) {
	// r.Context() is the request's own context. It only carries an upstream
	// trace if middleware such as otelhttp has extracted the headers.
	ctx := r.Context()

	// Start a span that covers the entire request lifecycle
	ctx, span := tracer.Start(ctx, "HandleOrder")
	defer span.End()

	// Attach request metadata to the span for filtering later
	span.SetAttributes(
		attribute.String("http.method", r.Method),
		attribute.String("http.path", r.URL.Path),
	)

	// Simulate downstream work
	if err := callPaymentService(ctx); err != nil {
		span.RecordError(err)
		span.SetStatus(codes.Error, err.Error())
		http.Error(w, "payment failed", http.StatusBadGateway)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// callPaymentService demonstrates passing context to a helper
func callPaymentService(ctx context.Context) error {
	ctx, span := tracer.Start(ctx, "callPaymentService")
	defer span.End()

	// In production, you would pass ctx to http.NewRequestWithContext
	return nil
}

func main() {
	// Provider setup is omitted here, so spans are dropped until one is
	// registered. The route and port are placeholders.
	http.HandleFunc("/orders", HandleOrder)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

The handler creates a span tied to the request context. It records HTTP attributes so you can filter traces by path or method later. When the downstream call fails, span.RecordError attaches the error to the span as an event, and span.SetStatus marks the span as failed. The context flows into callPaymentService, which creates a child span. Each span ends when its function returns, the child first and then the handler's, and is handed to the exporter.
Go conventions matter here. Always put ctx first. Always name it ctx. The compiler does not enforce either rule, but every Go reader expects them. Functions that accept a context should respect cancellation. If ctx.Done() fires, your code should stop work and return early. The span will still end via defer, but the work stops cleanly.
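Respecting cancellation usually comes down to a select on ctx.Done(). The sketch below uses only the standard library. chargeCard and runWithTimeout are hypothetical names, and the timer stands in for a real network call.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// chargeCard is a hypothetical slow call. It returns early with ctx.Err()
// if the context is cancelled before the simulated work completes.
func chargeCard(ctx context.Context, work time.Duration) error {
	timer := time.NewTimer(work)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil // work finished in time
	case <-ctx.Done():
		return ctx.Err() // deadline hit or caller cancelled
	}
}

// runWithTimeout wraps chargeCard in a deadline. Both arguments are in
// milliseconds to keep the example easy to call.
func runWithTimeout(timeoutMs, workMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return chargeCard(ctx, time.Duration(workMs)*time.Millisecond)
}

func main() {
	fmt.Println(runWithTimeout(20, 1000)) // context deadline exceeded
	fmt.Println(runWithTimeout(1000, 10)) // <nil>
}
```

In a traced function, the returned error is what you would pass to span.RecordError, so the timeline shows that the step was cut short rather than completed.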
Where things go wrong
Tracing looks simple until you miss a detail. The most common mistake is dropping the context. If you call tracer.Start and ignore the returned context, your child functions will not see the span. The trace breaks into disconnected islands. The compiler will not stop you. You just get orphaned spans in your dashboard.
Another trap is forgetting to end the span. The garbage collector does not end spans for you. If you remove defer span.End(), the span is never handed to the span processor, so it never reaches the exporter. It goes missing from the trace, and any children it had point at a parent your backend never received. Nothing panics and nothing logs. A trace with unexplained gaps is usually the only symptom.
You might also see the compiler reject code with cannot use ctx (variable of type context.Context) as string value in argument if you pass arguments in the wrong order, for example tracer.Start("step", ctx) instead of tracer.Start(ctx, "step"). Span attributes are typed as well. You cannot attach an arbitrary value to a span. Build each attribute with the attribute package, such as attribute.String or attribute.Int, and pass it to span.SetAttributes.
Goroutine leaks happen when you spawn a background worker and forget to pass a cancellable context. The worker waits on a channel that never closes. The span never ends. The trace never completes. Always derive a child context with context.WithCancel or context.WithTimeout for background work, and call the cancel function when the parent request finishes.
The worst goroutine bug is the one that never logs.
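The fix for the leaking worker can be sketched with the standard library alone. worker and workerStops are hypothetical names, and the job channel is left open on purpose to reproduce the leak condition.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker processes jobs until the channel closes or ctx is cancelled.
// Without the ctx.Done() case it would block forever on a channel that
// never closes.
func worker(ctx context.Context, jobs <-chan int) {
	for {
		select {
		case <-ctx.Done():
			return
		case job, ok := <-jobs:
			if !ok {
				return
			}
			fmt.Println("processed job", job)
		}
	}
}

// workerStops starts a worker on a channel that is never closed and reports
// whether the worker exits once the timeout (in milliseconds) expires.
func workerStops(timeoutMs int) bool {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond)
	defer cancel() // releases the timer even if we return early

	jobs := make(chan int) // deliberately never closed
	done := make(chan struct{})
	go func() {
		defer close(done) // a traced worker would also defer span.End() here
		worker(ctx, jobs)
	}()

	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false // the worker leaked
	}
}

func main() {
	fmt.Println("worker stopped:", workerStops(50)) // worker stopped: true
}
```

Because the worker returns when the context expires, any deferred span.End() in it runs and the trace can complete.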
When to reach for tracing
Tracing is powerful, but it is not a replacement for every observability tool. Pick the right instrument for the job.
Use OpenTelemetry tracing when you need to follow a single request across multiple functions or services. Use structured logging with slog when you need to record discrete events with key-value pairs for auditing or debugging. Use metrics when you need to aggregate counts, averages, or percentiles over time. Use simple log.Println when you are writing a short script that will never run in production. Use distributed tracing when latency spikes are intermittent and you cannot reproduce them locally. Use error wrapping with fmt.Errorf and %w when you need to preserve error chains without the overhead of a full trace.
Tracing adds overhead. Every span allocation, attribute set, and context propagation step costs CPU cycles. Keep spans coarse. Do not trace every loop iteration. Trace the boundaries: HTTP handlers, database queries, external API calls, and message queue consumers. If a function takes less than a few milliseconds, skip the span. The dashboard will thank you.