The closing handshake
You are running a background process that caches compiled Go code. It sits idle most of the time, waiting for the go command to ask it to store or retrieve artifacts. When the build finishes, the parent process tells the cache daemon to shut down. If the daemon just vanishes mid-write, the cache can end up corrupted. If it ignores the signal, the parent hangs forever. You need a handshake. Receive the stop command. Finish the current task. Send an acknowledgment. Exit cleanly.
How graceful shutdown actually works
Graceful shutdown is the practice of stopping a program without dropping active work or leaking resources. Think of a coffee shop at closing time. The manager flips the sign to closed. No new customers are let in. The baristas finish the drinks already on the counter. They wipe the machines, lock the register, and turn off the lights. The shop closes, but nothing spills on the floor.
In Go, this pattern relies on three pieces. A trigger tells the program to stop. A coordination mechanism lets active tasks finish. A cleanup step releases files, network connections, and goroutines. The trigger can be an OS signal, an HTTP endpoint, or a message over standard input. The coordination is usually a context.Context or a sync.WaitGroup. The cleanup is explicit code that runs before the process ends.
Go favors explicit control flow over hidden magic. You will not find a global shutdown hook that automatically drains your workers. You write the loop that listens for the stop signal. You write the code that waits for goroutines to finish. You write the function that closes the database connection. The verbosity is intentional. It forces you to think about resource lifecycles instead of hoping the runtime guesses correctly.
Convention aside: The community accepts the if err != nil { return err } boilerplate because it makes the unhappy path visible. Graceful shutdown follows the same philosophy. You check every channel close. You verify every context cancellation. You never assume a background task finished just because you told it to stop.
The minimal subprocess loop
The scenario above boils down to a subprocess reading JSON commands from standard input. Here is a stripped-down version that handles the close protocol correctly.
package main
import (
	"encoding/json"
	"fmt"
	"os"
)
// Request represents a command sent to the subprocess.
type Request struct {
	ID      string `json:"id"`
	Command string `json:"command"`
}
// Response acknowledges a request before the process exits.
type Response struct {
	ID string `json:"id"`
}
// runLoop reads commands from stdin and handles the close protocol.
func runLoop() {
	decoder := json.NewDecoder(os.Stdin)
	// Keep reading until an error occurs or the process exits.
	for {
		var req Request
		// Block here waiting for the next JSON payload.
		if err := decoder.Decode(&req); err != nil {
			// Stream closed or malformed input. Safe to stop.
			return
		}
		// Check for the shutdown command.
		if req.Command == "close" {
			// Acknowledge the request so the parent knows we saw it.
			resp := Response{ID: req.ID}
			if err := json.NewEncoder(os.Stdout).Encode(resp); err != nil {
				// Log the failure but do not crash the shutdown sequence.
				fmt.Fprintf(os.Stderr, "failed to send close ack: %v\n", err)
			}
			// Exit cleanly after the handshake completes.
			os.Exit(0)
		}
		// Handle other commands like put or get here.
	}
}
func main() {
	runLoop()
}
Convention aside: Public names start with a capital letter. Private names start lowercase. The fields ID and Command are capitalized because encoding/json only encodes and decodes exported struct fields; the struct tags map them to the lowercase keys on the wire. The parent is a separate process and sees only the JSON, never the Go types. The runLoop function is unexported because it is an internal implementation detail. Go uses capitalization for visibility, not keywords like public or private.
What happens under the hood
The program starts by creating a JSON decoder attached to standard input. The for loop calls decoder.Decode, which blocks until bytes arrive. When the parent process sends {"id":"123","command":"close"}, the decoder unmarshals it into the Request struct. The if statement catches the close command. Before terminating, the program encodes a matching Response back to standard output. This handshake prevents the parent from assuming the child crashed. Finally, os.Exit(0) terminates the process immediately.
Notice the error handling on the decoder. If the stream closes or the input is malformed, Decode returns an error. The loop returns naturally, so any deferred functions run on the way out. If you skip the acknowledgment and call os.Exit directly, the parent reads end-of-file instead of a response and cannot tell a clean shutdown from a crash. The protocol requires the response first.
The decoder maintains an internal buffer. It reads ahead from standard input to find JSON boundaries. This means your program does not wake up on every single byte. It waits for a complete value. When the parent closes its end of the pipe, Decode returns io.EOF. The loop exits gracefully. You do not need to poll for data. The blocking call handles the waiting for you.
Convention aside: gofmt is mandatory. Do not argue about indentation or brace placement. Let the tool decide. Most editors run it on save. Your code will look identical to every other Go project in the ecosystem. Trust the formatter. Argue logic, not whitespace.
Coordinating multiple workers
Real services rarely run a single blocking loop. They spawn goroutines to handle concurrent requests, manage database connections, and listen on network sockets. Graceful shutdown here means stopping the listener, draining active requests, and waiting for goroutines to finish.
package main
import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)
// Service manages the lifecycle of a long-running worker.
type Service struct {
	wg   sync.WaitGroup
	quit chan struct{}
}
// NewService creates a service ready to handle background work.
func NewService() *Service {
	return &Service{
		// quit is only ever closed, never sent on, so it needs no buffer.
		quit: make(chan struct{}),
	}
}
// Start begins processing work in a separate goroutine.
func (s *Service) Start(ctx context.Context) {
	s.wg.Add(1)
	go func() {
		// Ensure the wait group counter decrements when the goroutine exits.
		defer s.wg.Done()
		// Simulate continuous background work.
		for {
			select {
			case <-ctx.Done():
				// Context cancelled. Stop accepting new work.
				fmt.Println("worker: context cancelled, finishing up")
				return
			case <-s.quit:
				// Explicit quit signal received.
				fmt.Println("worker: quit signal received, finishing up")
				return
			default:
				// Do actual work here.
				time.Sleep(100 * time.Millisecond)
			}
		}
	}()
}
// Shutdown signals the service to stop and waits for completion.
// It must be called at most once: closing a closed channel panics.
func (s *Service) Shutdown() {
	// Close the channel to unblock all waiting goroutines.
	close(s.quit)
	// Block until every goroutine calls wg.Done().
	s.wg.Wait()
	fmt.Println("service: all workers finished")
}
func main() {
	ctx, cancel := context.WithCancel(context.Background())
	// Cancel the context when main returns to clean up resources.
	defer cancel()
	svc := NewService()
	svc.Start(ctx)
	// Listen for OS signals like Ctrl+C or SIGTERM.
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
	// Block until a signal arrives.
	sig := <-sigCh
	fmt.Printf("received signal %v, initiating graceful shutdown\n", sig)
	// Cancel the context to stop blocking operations.
	cancel()
	// Wait for goroutines to drain.
	svc.Shutdown()
}
Convention aside: The receiver name is usually one or two letters matching the type. (s *Service) is correct. (this *Service) or (self *Service) breaks the style guide. Keep it short. It reduces visual noise and matches the standard library.
The sync.WaitGroup tracks how many goroutines are still running. You call Add(1) before launching the goroutine, so Wait() can never observe a zero counter while a worker is still starting. The goroutine calls Done() when it finishes. Wait() blocks until the counter reaches zero. Put your cleanup code after Wait() and it runs only after every worker has exited. You never close a database connection while a query is still executing.
Convention aside: context.Context always goes as the first parameter. Conventionally named ctx. Functions that take a context should respect cancellation and deadlines. Pass it down to every long-lived call site. Context is plumbing. Run it through every blocking operation.
Where things go wrong
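Graceful shutdown breaks when you ignore the coordination step. If you call os.Exit anywhere, including inside a goroutine, the entire process dies instantly. Deferred functions never run. Buffered writes are never flushed. Open database transactions are abandoned mid-flight. The subprocess loop above gets away with os.Exit(0) only because it has no deferred cleanup and no other goroutines. In anything larger, return from main or use a shutdown coordinator instead of forcing an exit.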
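One common way to keep os.Exit away from deferred cleanup is to move the program body into a function that returns an exit code, and call os.Exit only from main. The sketch below assumes a hypothetical run function that receives its cleanup as a parameter so the effect is easy to observe.

```go
package main

import (
	"fmt"
	"os"
)

// run holds the program logic and returns an exit code. Because it returns
// normally, its deferred calls execute before control reaches os.Exit.
func run(cleanup func()) int {
	defer cleanup()
	fmt.Println("working")
	return 0
}

func main() {
	code := run(func() { fmt.Println("cleanup ran") })
	// os.Exit is the last statement, after every deferred call in run finished.
	os.Exit(code)
}
```

Running it prints "working" followed by "cleanup ran". Calling os.Exit inside run instead would skip the second line.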
Another common mistake is writing to standard output after the parent has gone away. If the parent process has already closed its end of the pipe, json.NewEncoder(os.Stdout).Encode hits a broken pipe. By default you do not even get an error back: a write to a broken pipe on standard output or standard error kills a Go program with SIGPIPE unless it has called signal.Notify for that signal. The compiler cannot catch this at build time. You only see it at runtime. Compile-time errors are a different category. If you forget to import the encoding/json package, the compiler rejects the program with undefined: json. If you try to assign a string where a struct is expected, you get cannot use "close" (untyped string constant) as Request value in assignment. These errors force you to align your types before the program ever runs.
Goroutine leaks are the silent killer. If a worker waits on a channel that never closes, sync.WaitGroup.Wait() blocks forever. The process hangs during shutdown. Always provide a cancellation path. Use a select statement with a context or a quit channel. When the signal arrives, the goroutine breaks out of its loop and calls wg.Done().
Convention aside: _ discards a value intentionally. result, _ := someFunc() says you considered the second return value and chose to drop it. Use it sparingly with errors. Ignoring an error during shutdown is usually a bug. Log it or return it. Do not hide it behind an underscore.
The worst goroutine bug is the one that never logs. If your shutdown routine hangs, add a timeout. Run wg.Wait() in its own goroutine that closes a channel when it returns, then select on that channel and a time.After channel. If the workers do not finish within, say, five seconds, print a stack trace and force exit. Production systems need a hard stop deadline.
Picking the right trigger
Use OS signals when you are building a standalone daemon that runs on a server or desktop. Use a JSON or text protocol over standard input when your program acts as a subprocess for another tool like go or git. Use an HTTP endpoint when your service is exposed to a network and needs remote administration. Use context cancellation when you need to coordinate multiple goroutines inside a single process. Use a simple boolean flag when you have a single-threaded loop and want to avoid the overhead of channels. If another goroutine, such as a signal handler, sets that flag, make it a sync/atomic value to avoid a data race.
Convention aside: Accept interfaces, return structs. If your shutdown logic needs to be tested, define an interface for the trigger mechanism. Return a concrete struct that implements it. This keeps your production code simple and your test code flexible.
Do not pass a *string for configuration values. Strings are already cheap to pass by value. The pointer adds indirection without saving memory. Keep your shutdown parameters as plain values unless you need to mutate them in place.