How to Integrate Go with Grafana Dashboards

Integrate Go with Grafana by exposing a Prometheus /metrics endpoint from your service, letting Prometheus scrape it, and adding Prometheus to Grafana as a data source.

The missing link between Go and Grafana

You ship a Go service. It runs fine in staging. In production, it starts chewing RAM and spawning goroutines like a chain letter. You stare at a blank terminal and wonder where the bottleneck lives. Grafana dashboards answer that question, but Go does not speak Grafana natively. You need a bridge.

A common suggestion is to wire Grafana directly to net/http/pprof. That approach mixes two different tools. The pprof package gives you point-in-time snapshots for debugging memory leaks or CPU hotspots. It does not emit continuous time-series data. Grafana needs a steady stream of numbers to draw lines on a chart. The industry standard for that stream is Prometheus.

How the pipeline actually works

Grafana is a visualization layer. It does not collect data. It reads from a data source, and the most common source for Go services is Prometheus. Prometheus is a time-series database that pulls metrics over HTTP. Your job is to make your Go program count what matters, format those counts in a specific text format, and serve them on a dedicated endpoint. Prometheus visits that endpoint on a schedule, stores the numbers, and Grafana draws the charts.

Think of it like a weather station. Your Go application is the sensor array measuring temperature, humidity, and wind speed. Prometheus is the central station that comes by on a fixed schedule to read the gauges. Grafana is the public display board that shows the trends. You only need to build the gauges and leave the door open for the station to read them.

Go gives you three paths to build those gauges. The standard library includes expvar, which exposes variables in JSON. The prometheus/client_golang library exposes metrics in the Prometheus text format. The net/http/pprof package exposes profiling endpoints for debugging. Grafana dashboards live on the Prometheus path.
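For contrast with the Prometheus path, here is a minimal sketch of the expvar route using only the standard library. The variable name jobs_processed and the helper fetchVars are hypothetical, chosen for illustration. The sketch serves one in-memory request through expvar's handler and prints the JSON that a client would receive.

```go
package main

import (
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// jobsProcessed is a hypothetical counter published under the name "jobs_processed".
var jobsProcessed = expvar.NewInt("jobs_processed")

// fetchVars sends one in-memory request through expvar's handler
// and returns the JSON body it produces.
func fetchVars() string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/debug/vars", nil)
	expvar.Handler().ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	jobsProcessed.Add(3)
	// The body is a JSON object that also contains the built-in
	// "cmdline" and "memstats" variables.
	fmt.Println(fetchVars())
}
```

The output is JSON, not the Prometheus text format, which is why this path needs an extra translation step before Prometheus can scrape it.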

Goroutines are cheap. Metrics are cheap. The scrape cycle is the only moving part.

The minimal instrumented server

Here is the simplest way to expose a metric. You create a counter, register it, increment it in your logic, and serve the /metrics endpoint.

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestCounter tracks total HTTP requests processed by the service.
var requestCounter = prometheus.NewCounter(
	prometheus.CounterOpts{
		Name: "http_requests_total",
		Help: "Total number of HTTP requests processed.",
	},
)

func main() {
	// Register the counter so the default registry can collect it.
	prometheus.MustRegister(requestCounter)

	// Serve Prometheus metrics on a dedicated path.
	http.Handle("/metrics", promhttp.Handler())

	// Serve a simple endpoint that increments the counter.
	http.HandleFunc("/ping", func(w http.ResponseWriter, r *http.Request) {
		requestCounter.Inc()
		w.WriteHeader(http.StatusOK)
	})

	// Start the server on port 8080 and exit if the listener fails.
	log.Fatal(http.ListenAndServe(":8080", nil))
}

The promhttp.Handler() function does the heavy lifting. It walks the default registry, formats every registered metric into the Prometheus text exposition format, and writes it to the HTTP response. You do not need to manually serialize JSON or build query parameters. The handler respects the Accept header and falls back to the text format automatically.

Metrics are just numbers with labels. Keep the cardinality low.

What happens under the hood

When you call requestCounter.Inc(), the value lives in a thread-safe atomic variable inside the process memory. Nothing is sent over the network yet. The metric stays in your Go process until Prometheus knocks on the /metrics door.

Prometheus runs a scrape loop. It opens an HTTP connection to your endpoint, reads the response body, and parses the text format. Each line in that response represents a metric sample. The format looks like this:

# HELP http_requests_total Total number of HTTP requests processed.
# TYPE http_requests_total counter
http_requests_total 42

The # HELP line documents what the metric measures. The # TYPE line tells Prometheus how to interpret the number: a counter that can only go up, a gauge that can go up and down, or a histogram or summary that describes a distribution. The final line is the actual value. Prometheus stores that value with a timestamp, repeats the process at the configured scrape interval (fifteen seconds is a common choice), and builds a time series.
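To make the structure of that response concrete, here is a rough standard-library sketch that reads the three lines back into a struct. The sample type and the parseSimple function are hypothetical, and they handle only the unlabeled, single-sample case shown above. A real consumer would rely on Prometheus's own parser instead of hand-rolled string handling.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample holds one unlabeled metric read from the text format.
type sample struct {
	Name  string
	Type  string
	Help  string
	Value float64
}

// parseSimple understands "# HELP", "# TYPE", and one "name value" line.
// It ignores labels, timestamps, and multiple metrics on purpose.
func parseSimple(text string) (sample, error) {
	var s sample
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "# HELP "):
			// Layout: # HELP <name> <docstring>
			parts := strings.SplitN(strings.TrimPrefix(line, "# HELP "), " ", 2)
			if len(parts) == 2 {
				s.Help = parts[1]
			}
		case strings.HasPrefix(line, "# TYPE "):
			// Layout: # TYPE <name> <type>
			parts := strings.Fields(strings.TrimPrefix(line, "# TYPE "))
			if len(parts) == 2 {
				s.Type = parts[1]
			}
		case strings.HasPrefix(line, "#"):
			continue // any other comment line
		default:
			fields := strings.Fields(line)
			if len(fields) < 2 {
				return s, fmt.Errorf("malformed sample line: %q", line)
			}
			v, err := strconv.ParseFloat(fields[1], 64)
			if err != nil {
				return s, err
			}
			s.Name, s.Value = fields[0], v
		}
	}
	return s, sc.Err()
}

func main() {
	text := "# HELP http_requests_total Total number of HTTP requests processed.\n" +
		"# TYPE http_requests_total counter\n" +
		"http_requests_total 42\n"
	s, err := parseSimple(text)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (%s) = %g\n", s.Name, s.Type, s.Value)
}
```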

Grafana queries Prometheus using the PromQL query language. You write a query like rate(http_requests_total[5m]) to see the average per-second request rate, computed over a five-minute window. Grafana polls Prometheus, gets the aggregated numbers, and renders the panel. Your Go code never talks to Grafana directly. The separation of concerns keeps your service fast and your dashboards flexible.
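The arithmetic behind a rate query can be sketched in a few lines. The version below is a simplification under stated assumptions: it averages the counter's increase across the samples in a window and treats any drop in value as a process restart. The real PromQL rate() also extrapolates to the window boundaries, which this sketch skips. The point type and simpleRate function are hypothetical names.

```go
package main

import "fmt"

// point is one scraped counter sample: time in seconds and counter value.
type point struct {
	T float64
	V float64
}

// simpleRate returns the average per-second increase across the samples.
// A decrease between two samples is treated as a counter reset, so the
// new value counts as the increase since the restart.
func simpleRate(samples []point) float64 {
	if len(samples) < 2 {
		return 0
	}
	var increase float64
	for i := 1; i < len(samples); i++ {
		delta := samples[i].V - samples[i-1].V
		if delta < 0 {
			delta = samples[i].V // counter restarted from zero
		}
		increase += delta
	}
	elapsed := samples[len(samples)-1].T - samples[0].T
	if elapsed <= 0 {
		return 0
	}
	return increase / elapsed
}

func main() {
	// Scrapes every 15 seconds; the process restarts before the fourth scrape.
	samples := []point{{0, 100}, {15, 130}, {30, 160}, {45, 30}, {60, 60}}
	fmt.Println("requests per second:", simpleRate(samples))
}
```

The reset handling is the reason counters work well across restarts: the absolute value drops to zero, but the computed rate stays continuous.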

Context is plumbing. Run it through every long-lived call site.

Wiring it into a real service

Production services rarely run a single http.ListenAndServe call. They use routers, middleware, and separate ports for metrics. Here is how you wire metrics into a realistic setup without blocking your main application logic.

package main

import (
	"context"
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// processDuration tracks how long the background job takes to run.
var processDuration = prometheus.NewHistogram(
	prometheus.HistogramOpts{
		Name:    "job_duration_seconds",
		Help:    "Time spent processing the background job.",
		Buckets: prometheus.DefBuckets,
	},
)

// runJob simulates a background task that takes variable time.
func runJob(ctx context.Context) {
	// Start a timer that will be recorded when the job finishes.
	timer := prometheus.NewTimer(processDuration)
	defer timer.ObserveDuration()

	// Simulate work that respects context cancellation.
	select {
	case <-time.After(500 * time.Millisecond):
		// Job completed successfully.
	case <-ctx.Done():
		// Job was cancelled before finishing.
	}
}

func main() {
	prometheus.MustRegister(processDuration)

	// Metrics server runs on a separate port to isolate scrape traffic.
	metricsMux := http.NewServeMux()
	metricsMux.Handle("/metrics", promhttp.Handler())

	// Start the metrics server in a background goroutine.
	go func() {
		if err := http.ListenAndServe(":9090", metricsMux); err != nil {
			// Log the error and exit the process if the listener fails.
			log.Fatalf("metrics server: %v", err)
		}
	}()

	// Main application loop.
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	for {
		runJob(ctx)
		time.Sleep(2 * time.Second)
	}
}

The metrics server runs on port 9090 in a background goroutine while your application logic runs on the main goroutine. This isolation prevents Prometheus scrape traffic from competing with your user-facing requests. Port 9090 is also the default listen port of Prometheus itself, so pick a different port if both run on the same host. The prometheus.NewTimer helper automatically calculates the elapsed time and records it when the deferred call runs. The histogram buckets group durations into ranges, which keeps memory usage predictable even under high load.
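To see why bucketed storage stays bounded, consider this standard-library sketch of cumulative buckets. The sketchHistogram type is hypothetical and not thread-safe. It mirrors the cumulative "less than or equal" semantics of the exposed format rather than the library's internal storage, and the bounds are copied from prometheus.DefBuckets.

```go
package main

import "fmt"

// defBuckets copies the upper bounds, in seconds, of prometheus.DefBuckets.
var defBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

// sketchHistogram is a hypothetical stand-in for a real histogram.
// It keeps one counter per bound plus one for +Inf, so its size is
// fixed no matter how many values are observed.
type sketchHistogram struct {
	bounds []float64
	counts []uint64 // counts[i] = observations <= bounds[i]; last entry is +Inf
	sum    float64
}

func newSketchHistogram(bounds []float64) *sketchHistogram {
	return &sketchHistogram{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// observe increments every bucket whose upper bound is >= v.
func (h *sketchHistogram) observe(v float64) {
	for i, b := range h.bounds {
		if v <= b {
			h.counts[i]++
		}
	}
	h.counts[len(h.bounds)]++ // the +Inf bucket counts everything
	h.sum += v
}

func main() {
	h := newSketchHistogram(defBuckets)
	for _, seconds := range []float64{0.003, 0.2, 0.5, 3.0} {
		h.observe(seconds)
	}
	for i, b := range h.bounds {
		fmt.Printf("le=%g count=%d\n", b, h.counts[i])
	}
	fmt.Printf("le=+Inf count=%d\n", h.counts[len(h.bounds)])
}
```

A million observations still occupy twelve counters here. The trade-off is resolution: a 0.3 second and a 0.5 second job land in the same bucket.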

Histograms compress time into buckets. Pick your boundaries before you see the data.

Where things go sideways

Instrumentation looks simple until you hit production. The most common mistake is mixing up metric types. Counters only increase. They reset to zero when your process restarts. Gauges can go up and down. They represent current state, like queue depth or active connections. Histograms and summaries measure distributions, like latency or payload size. Using a gauge for a running total breaks rate calculations, because rate() assumes that any decrease is a restart. Using a counter for current state does not work at all, because a counter has no way to go down.

The compiler will not save you from semantic mistakes. It only catches type mismatches. If you register a metric under a name that is already registered, prometheus.MustRegister panics at runtime with the message "duplicate metrics collector registration attempted". If you forget to register a metric, the /metrics endpoint simply omits it. You get a silent dashboard with empty panels.

Another trap is high cardinality. Cardinality is the number of unique label combinations. If you add a user_id label to every request, you can generate millions of time series. Prometheus keeps every active series in memory, so a large enough explosion can exhaust it. Keep labels static and low cardinality: method, status, endpoint (the route template, not the raw path). Never put dynamic strings like request IDs, email addresses, or full URLs into metric labels. Put those in logs instead.

Security is the final hurdle. The /metrics endpoint exposes internal state. Leaving it open on the public internet gives attackers visibility into your architecture. Bind the metrics server to 127.0.0.1 when the scraper runs on the same host, or bind it to a private interface and restrict access with firewall rules or network policies. Prometheus should scrape from inside your cluster or VPC.
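Binding to the loopback interface takes one change to the listen address. The sketch below uses only the standard library, with hypothetical helpers listenLoopback and isLoopback and a placeholder handler standing in for the real metrics handler. Port 0 asks the operating system for any free port, which keeps the example from colliding with a running service.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// listenLoopback opens a TCP listener that accepts connections
// from the local machine only.
func listenLoopback(port int) (net.Listener, error) {
	return net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
}

// isLoopback reports whether the listener is bound to a loopback address.
func isLoopback(ln net.Listener) bool {
	addr, ok := ln.Addr().(*net.TCPAddr)
	return ok && addr.IP.IsLoopback()
}

func main() {
	ln, err := listenLoopback(0)
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		// Placeholder body; a real service would mount its metrics handler here.
		fmt.Fprintln(w, "# metrics go here")
	})
	go http.Serve(ln, mux) // returns once the listener is closed

	resp, err := http.Get("http://" + ln.Addr().String() + "/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("bound to", ln.Addr(), "loopback:", isLoopback(ln), "status:", resp.StatusCode)
}
```

A loopback bind only helps when the scraper shares the host or network namespace with the service, for example as a sidecar. Otherwise the private-interface option applies.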

The worst metric is the one that lies. Verify your labels before you ship.

Which tool fits your stack

Use expvar when you need zero dependencies and only want to expose process-level counters or custom variables in JSON format. Use prometheus/client_golang when you need Grafana dashboards, advanced metric types, and industry-standard scraping. Use net/http/pprof when you are debugging a live memory leak or CPU hotspot and need a point-in-time profile. Use plain structured logging when you need to track discrete events, request traces, or high-cardinality data that does not belong in a time-series database.

Pick the tool that matches the question you are asking.

Where to go next