The load balancer thinks you're dead
Kubernetes keeps restarting your pod until it lands in CrashLoopBackOff, even though the process was running fine each time. The load balancer drops traffic because your readiness probe keeps failing. The service is alive, but the infrastructure thinks it's a ghost. This happens when you forget to register the gRPC health service. gRPC doesn't come with a heartbeat by default. You have to install one.
Think of gRPC health checking like a "We're Open" sign on a shop window. The shop might be full of customers, cash register working, lights on. But if the sign says "Closed," the delivery driver won't bring the pizza. In gRPC, the "sign" is a specific RPC defined by the standard gRPC health checking protocol, in the grpc.health.v1.Health service. Load balancers, service meshes, and orchestrators call this method to decide whether to send real traffic. If the method isn't registered, the caller gets an Unimplemented error, which a prober treats exactly like a service that is down.
The health service defines two RPCs: Check and Watch. Check is a simple request-response call. Watch opens a server stream that pushes status updates. Load balancers often use Check for periodic polling. Service meshes might use Watch to get instant notifications when a service transitions to NOT_SERVING. The standard implementation handles both methods. You don't need to write stream logic manually.
gRPC ships without a heartbeat. You must register the health service or the load balancer assumes you're dead.
Registering the health service
Here's the bare minimum to make a gRPC server answer health probes. The code creates a server, instantiates the health checker, sets the initial status, and registers the service.
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	"google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	// Listen on a local port for gRPC traffic
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}

	// Create the gRPC server instance
	s := grpc.NewServer()

	// Instantiate the health server provided by the gRPC package
	hs := health.NewServer()

	// Set the default status to SERVING so the server accepts traffic immediately
	hs.SetServingStatus("", grpc_health_v1.HealthCheckResponse_SERVING)

	// Register the health service so the gRPC server knows how to route health checks
	grpc_health_v1.RegisterHealthServer(s, hs)

	log.Println("Server started with health checking enabled")
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
The health.NewServer function returns a struct that implements the grpc_health_v1.HealthServer interface. This struct holds a map of service names to their current status, and it starts with the empty string key "" already set to SERVING. When you call SetServingStatus, you update that map. The empty string key represents the server as a whole, so the explicit call in the example is redundant but makes the intent visible. When a client calls the Check RPC, the health server looks up the requested service name in the map. If the name exists, it returns the stored status. If the name is missing, Check fails with a NotFound error. Only Watch reports a missing name as a status, SERVICE_UNKNOWN.
The RegisterHealthServer function binds this logic to the gRPC server's routing table. Without this registration, the server has no handler for the health service. A client calling the health check receives an error along the lines of rpc error: code = Unimplemented desc = unknown service grpc.health.v1.Health. Probers treat that as a failed check, which usually removes the backend from the pool or trips a circuit breaker.
Register the health service or the load balancer assumes you're dead.
Checking dependencies in real code
Real services depend on databases, caches, or other APIs. Your health check should reflect the state of those dependencies, not just whether the process is running. A healthy process with a dead database is a broken service.
Here's a health checker that monitors a database connection and updates the gRPC status accordingly. The checker runs in a background goroutine and respects context cancellation.
package main

import (
	"context"
	"sync"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	"google.golang.org/grpc/health/grpc_health_v1"
)

// Database is assumed to be defined elsewhere in the package and to expose
// a Ping() error method.

// HealthChecker tracks dependencies and updates gRPC status
type HealthChecker struct {
	hs *health.Server
	db *Database
	mu sync.Mutex
}

// NewHealthChecker creates a checker and registers the health service
func NewHealthChecker(s *grpc.Server, db *Database) *HealthChecker {
	hs := health.NewServer()
	// Start as SERVING; the checker will update status if dependencies fail
	hs.SetServingStatus("", grpc_health_v1.HealthCheckResponse_SERVING)
	grpc_health_v1.RegisterHealthServer(s, hs)
	return &HealthChecker{hs: hs, db: db}
}

// CheckDependencies runs a loop to verify external services.
// It blocks until ctx is cancelled, so call it in its own goroutine.
func (hc *HealthChecker) CheckDependencies(ctx context.Context) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			hc.mu.Lock()
			// Ping the database to see whether it is reachable
			if err := hc.db.Ping(); err != nil {
				// Mark as NOT_SERVING if the database is unreachable
				hc.hs.SetServingStatus("", grpc_health_v1.HealthCheckResponse_NOT_SERVING)
			} else {
				// Restore SERVING status when the database is back
				hc.hs.SetServingStatus("", grpc_health_v1.HealthCheckResponse_SERVING)
			}
			hc.mu.Unlock()
		}
	}
}
The checker holds a mutex around each ping-and-update cycle. In grpc-go, health.Server already guards its status map with its own lock, so SetServingStatus is safe to call from multiple goroutines. The extra mutex only ensures that two overlapping checks can't interleave their ping and update if CheckDependencies is ever started more than once. The context parameter allows the checker to stop when the server shuts down. If you forget the cancellation path, the checker goroutine keeps running until the process exits. The worst goroutine bug is the one that never logs. A leaked health checker keeps pinging the database and holds a reference to the connection, so the connection can't be released during a graceful shutdown.
Go functions that take a context should put it as the first parameter, named ctx. This convention lets tools and readers spot the cancellation path immediately. The receiver name on the checker's methods is hc, a short abbreviation matching the type. Go convention prefers one or two letter receiver names. Don't use this or self.
A healthy process with a dead database is a broken service. Update the status map when dependencies fail.
Service names and status codes
The SetServingStatus method accepts a service name and a status enum. The service name can be an empty string for the whole server, or a specific name like OrderService. The health checking protocol suggests the fully qualified form, package_names.ServiceName, but any string works as long as client and server agree on it. If you register multiple services, you can set different statuses for each. A load balancer might check OrderService specifically to route traffic only to healthy instances.
If you don't set a status for a specific service, Check on that name fails with a NotFound error, and Watch reports SERVICE_UNKNOWN. This distinction matters when you have a monolithic gRPC server hosting multiple logical services. You can mark one service as NOT_SERVING while keeping others SERVING. The status values you set are SERVING, NOT_SERVING, and UNKNOWN. SERVING means the service is ready to handle traffic. NOT_SERVING means the service is registered but currently unavailable. UNKNOWN is the enum's zero value and means no definite status has been recorded.
Kubernetes supports gRPC health checks via the grpc probe type in the pod spec. The probe calls the Check method on the specified port, optionally with a service name. If the response is SERVING, the probe succeeds. If the response is any other status, or the call returns an error, the probe fails. Envoy proxy also supports gRPC health checks. It polls the Check method and updates the cluster membership based on the response. The health check implementation is compatible with these tools out of the box. You don't need a wrapper or adapter.
The empty string is the server. Specific names are services. Match the client request to the registered key.
Common pitfalls
If you forget to call RegisterHealthServer, the gRPC server doesn't know the health service exists, and every health check fails with code Unimplemented (in grpc-go the description reads unknown service grpc.health.v1.Health). Probers count that as a failed check and typically pull the backend from the pool.
Another common mistake is setting the status under one service name and checking another. health.NewServer starts with "" set to SERVING, so if you only ever call hs.SetServingStatus("my-service", grpc_health_v1.HealthCheckResponse_NOT_SERVING), a client checking "" still sees SERVING. In the other direction, a client that checks a name you never set gets a NotFound error from Check. Always keep the empty string status current for general readiness, or ensure your client checks the exact service name you registered.
Go code should run through gofmt. The health check code follows standard formatting rules. Don't argue about indentation; let the tool decide. Most editors run gofmt on save. The community accepts the formatting because it eliminates style debates and keeps the codebase consistent.
The standard health server handles 99% of cases. Don't reinvent the check method.
When to use the health service
Use health.NewServer when you need standard gRPC health checking for load balancers and Kubernetes probes. Use a custom HealthServer implementation when you need complex logic that the standard server doesn't support, such as per-request validation or dynamic status calculation. Use a separate HTTP health endpoint when your infrastructure requires HTTP probes instead of gRPC. Use a dedicated health check package when you are building a microservice framework that manages multiple dependencies.