How to Implement Graceful Shutdown in Go Microservices

Implement graceful shutdown in Go by catching OS signals, canceling a context, and calling server.Shutdown to finish active requests before exiting.

The death of a process

You deploy a new version of your Go microservice. The orchestrator stops the old pod. It sends a SIGTERM signal. The old process drops every active connection instantly. Users get 502 Bad Gateway errors. A database transaction rolls back halfway through. The logs show a sudden cutoff with no explanation.

The process didn't crash. It was killed. But it died messy.

Graceful shutdown is the art of dying politely. Instead of vanishing instantly, the service stops accepting new work, finishes the tasks it already started, closes its database connections, and then exits. The orchestrator sees the process exit cleanly and moves on. Users see their requests complete or fail with a proper error, not a broken pipe.

How graceful shutdown works

Think of a restaurant closing for the night. The manager doesn't just flip the sign to "Closed" and throw customers out the door. The manager stops taking new reservations. The kitchen stops cooking new orders. The waiters let the current diners finish their meals. Once the last plate is cleared and the last check is paid, the staff locks the door and goes home.

In Go, the signal is the manager flipping the sign. The context is the message telling the kitchen and waiters to wrap up. The shutdown method is the process of clearing the tables.

The core mechanism relies on three parts:

  1. OS Signals: The operating system delivers SIGINT (interrupt) when you press Ctrl+C. A process manager or orchestrator sends SIGTERM (terminate) when it wants the app to stop.
  2. Context Cancellation: A context.Context carries a cancellation signal through your application. When the signal arrives, you cancel the context. Every goroutine that watches ctx.Done() sees the cancellation and can stop.
  3. Server Shutdown: The http.Server has a Shutdown method. It stops accepting new connections, closes idle ones, and waits for active requests to finish.

Minimal example

Here's the simplest way to add graceful shutdown to an HTTP server. You listen for signals in the background, block until a signal arrives, and then call Shutdown.

package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	// Define the handler. In a real app, this would be your router.
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, World"))
	})

	// Create the server with a handler.
	server := &http.Server{
		Addr:    ":8080",
		Handler: mux,
	}

	// Spawn the listener in a goroutine so main can block on signals.
	// If we called ListenAndServe directly, main would block here and never check for shutdown.
	go func() {
		// ListenAndServe returns http.ErrServerClosed when Shutdown is called.
		// We ignore that specific error because it means the shutdown succeeded.
		if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %s\n", err)
		}
	}()

	// Create a channel to receive OS signals.
	// Buffered to 1 so the signal send doesn't block if main isn't reading yet.
	quit := make(chan os.Signal, 1)
	
	// Register the channel to receive SIGINT (Ctrl+C) and SIGTERM (kill).
	signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
	
	// Block until a signal arrives. This line pauses execution until the channel receives a value.
	<-quit
	log.Println("Shutting down server...")

	// Create a context with a timeout.
	// This gives active requests 5 seconds to finish before Shutdown gives up waiting.
	// Anything still running after that is cut off when the process exits.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Shutdown stops the server gracefully.
	// It stops accepting new connections and waits for active ones to complete.
	if err := server.Shutdown(ctx); err != nil {
		log.Fatalf("Could not stop server: %v", err)
	}
	log.Println("Server exited")
}

Goroutines are cheap. Channels are not magic.

Walking through the shutdown

When the program runs, signal.Notify registers the quit channel to receive signals. The program then hits <-quit and blocks. The HTTP server runs in the background, handling requests normally.

When the orchestrator sends SIGTERM, the quit channel receives the signal. The <-quit line unblocks. The program prints "Shutting down server..." and creates a context with a 5-second timeout.

server.Shutdown(ctx) does the heavy lifting. It closes the underlying listener, so no new TCP connections can be accepted, and it closes idle keep-alive connections. Connections with a request in flight remain open. The server waits for all active HTTP handlers to return. If every handler finishes within 5 seconds, Shutdown returns normally. If a handler takes longer than 5 seconds, the context expires and Shutdown stops waiting. It does not interrupt the handler. The handler keeps running until the process exits, which cuts it off mid-request.

If Shutdown succeeds, it returns nil. If the context times out, it returns the context's error, context.DeadlineExceeded. The if err != nil check catches that timeout, and log.Fatalf logs it and exits with a non-zero status.

Realistic example with background workers

Microservices rarely just serve HTTP. They often have background workers, database connections, or gRPC servers. All of these need to stop when the signal arrives. You can't just shut down the HTTP server and hope the workers die. You need a central context that flows through every part of the app.

Here's a realistic pattern. The main function creates a context and passes it to every component. When the signal arrives, the context is cancelled. Every component sees the cancellation and stops.

package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runWorker simulates a background task that processes jobs.
// It accepts a context as the first parameter, following Go convention.
func runWorker(ctx context.Context) {
	log.Println("Worker started")
	
	// Simulate a loop that processes jobs.
	for {
		select {
		// Check if the context was cancelled.
		// The channel returned by ctx.Done() is closed once cancel() is called.
		case <-ctx.Done():
			log.Println("Worker received shutdown signal, exiting")
			return
		default:
			// Do work here.
			// In a real app, this might be a blocking call to a job queue.
			// We sleep to simulate work without blocking the shutdown check.
			time.Sleep(100 * time.Millisecond)
		}
	}
}

func main() {
	// Create a context that will be cancelled on shutdown.
	// We use context.Background() as the parent.
	ctx, cancel := context.WithCancel(context.Background())
	
	// Ensure the context is cancelled when main exits.
	// This is a safety net to clean up resources even if we panic.
	defer cancel()

	// Start the background worker.
	// It receives the context and will stop when the context is cancelled.
	go runWorker(ctx)

	// Setup HTTP server.
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Handlers that do long work should pass r.Context() to the calls they make.
		// This handler returns immediately, so it doesn't need to.
		w.Write([]byte("Service is running"))
	})

	server := &http.Server{
		Addr:    ":8080",
		Handler: mux,
	}

	// Start the server in a goroutine.
	go func() {
		if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %s\n", err)
		}
	}()

	// Listen for shutdown signals.
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
	<-quit
	log.Println("Shutting down...")

	// Cancel the context.
	// This unblocks the worker and any other goroutine waiting on ctx.Done().
	cancel()

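	// Shutdown the HTTP server with a timeout.
	// We create a new context here specifically for the shutdown deadline.
	// The worker context was already cancelled, so the worker exits at its next ctx.Done() check.
	// Note that main does not wait for the worker; a real service would track it, for example with a sync.WaitGroup.
	// The server context gives active HTTP requests time to finish.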
	shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer shutdownCancel()

	if err := server.Shutdown(shutdownCtx); err != nil {
		log.Fatalf("Server forced to shutdown: %v", err)
	}
	log.Println("Server exited")
}

Context is plumbing. Run it through every long-lived call site.

Pitfalls and errors

Graceful shutdown looks simple until you hit the edge cases. These are the common traps.

Ignoring http.ErrServerClosed

When you call server.Shutdown, the ListenAndServe call returns http.ErrServerClosed. This is not a failure. It's the expected result of a successful shutdown. If you treat this error as fatal, your app logs a fatal error and exits with a non-zero status on every clean shutdown.

The compiler won't stop you. You have to check the error manually. The pattern err != http.ErrServerClosed filters out the expected error. If you forget this check, your logs fill with false alarms every time you deploy.

Context timeout too short

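If you set the shutdown timeout to 1 second, but a request takes 2 seconds to process, Shutdown gives up after 1 second and returns an error. The request is then cut off mid-stream when the process exits. The client sees a connection reset or an unexpected end of the response. The request fails.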

Set the timeout based on your slowest expected request. If you have a long-running report generation, you might need 30 seconds. If you have fast API calls, 5 seconds is plenty. Monitor your request durations and adjust the timeout accordingly.

Goroutine leaks

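If main waits for a goroutine that is blocked on a channel nobody will ever close or send to, the process hangs forever. The HTTP server shuts down, the context is cancelled, but the goroutine is stuck waiting on the channel. The process doesn't exit. The orchestrator waits for its grace period to run out and force-kills the process. If main doesn't wait at all, the opposite happens: the process exits and the goroutine dies silently in the middle of its work.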

Always ensure every goroutine has a cancellation path. Use context.Context or a done channel. Never block indefinitely on a channel without a select on ctx.Done().

The worst goroutine bug is the one that never logs.

Forgetting to pass context to handlers

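If your HTTP handler starts long-running work, and that work doesn't check the request context, it keeps running after the client has gone away. If the server is shutting down, it keeps running after Shutdown gives up waiting, until the process exits and kills it mid-work.

Pass the request context to any function that does work on behalf of the request. The request context is cancelled when the client disconnects or the handler returns. Shutdown does not cancel it on its own, so work that must stop at shutdown also needs to watch the application context, for example by setting the server's BaseContext field. This prevents orphaned goroutines.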

Decision matrix

Graceful shutdown involves several tools. Pick the right one for the job.

Use http.Server.Shutdown when you have an HTTP server and want to finish active requests before exiting. This is the standard way to stop a web service.

Use context.WithTimeout when you need to enforce a deadline on the shutdown process. This prevents the server from hanging forever if a request is stuck.

Use signal.Notify when you need to react to OS signals manually. This is how you detect that the process is being killed.

Use a background goroutine with context when you have long-running tasks like database syncs or worker pools. The context allows these tasks to stop cleanly when the signal arrives.

Use os.Exit when you need to kill the process immediately. This is rare and usually indicates a critical failure. It bypasses all shutdown logic and defers.

Don't fight the signal. Listen, cancel, and exit.

Where to go next