The tap vs the bucket
Imagine a log viewer in your browser. The user clicks "Show Errors" and the screen starts filling with red lines instantly. The server doesn't sit there collecting every error from the past hour into a giant list before sending anything. It finds the first error, sends it. Finds the second, sends it. The client renders as it receives. This is server streaming. One request, many responses, delivered over time.
Standard RPC works like a waiter bringing the whole meal at once. You order, the kitchen cooks everything, plates it, and the waiter walks out with a heavy tray. If the kitchen takes ten minutes, you wait ten minutes. Server streaming is like a tapas bar. You order, and the kitchen sends out small plates as soon as they're ready. The connection stays open. The client keeps reading until the server signals it's done.
In gRPC terms, the client sends one request, and the server calls Send on a stream object multiple times. The underlying HTTP/2 connection carries all the messages in a single stream. The client receives each message as it arrives, without waiting for the server to finish.
Defining the stream
The contract starts in the .proto file. You mark the response type with the stream keyword. This tells the code generator to produce a streaming method: in Go, the server handler receives a stream object to send on instead of returning a single message.
Here's the proto definition for a log streaming service. The stream keyword on the response creates the server-side stream capability.
syntax = "proto3"; // protobuf language version
package example; // namespace for generated code
option go_package = "path/to/your/proto"; // placeholder; must match the Go import path you use

service LogService {
  // stream on the response means the server sends multiple messages
  // the client receives them as they arrive
  rpc GetLogs (LogRequest) returns (stream LogEntry);
}

message LogRequest {
  string level = 1; // filter by severity
}

message LogEntry {
  string message = 1; // the actual log text
  int64 timestamp = 2; // when it happened
}
Run protoc with the Go and gRPC plugins to generate the code. The generator creates a LogService_GetLogsServer stream type. It holds the connection state and provides methods like Send, Context, SetHeader, SendHeader, and SetTrailer. Unlike a unary handler, a server-streaming handler does not take context.Context as its first argument. Its signature is GetLogs(req, stream) error, and you get the context by calling stream.Context(). That context carries cancellation signals and deadlines.
The server implementation
Here's the server implementation. The method receives the request and a stream object. You loop and call Send to push messages to the client.
package main

import (
	"fmt"
	"log"
	"net"

	"google.golang.org/grpc"

	pb "path/to/your/proto" // replace with actual import
)

// logServer implements the generated service interface
type logServer struct {
	pb.UnimplementedLogServiceServer // default Unimplemented responses for RPCs not defined here
}

// GetLogs streams log entries to the client.
// Streaming handlers take no context argument; call stream.Context() when you need one.
func (s *logServer) GetLogs(req *pb.LogRequest, stream pb.LogService_GetLogsServer) error {
	// loop a fixed number of times for demonstration
	for i := 0; i < 5; i++ {
		entry := &pb.LogEntry{
			Message:   fmt.Sprintf("Error %d", i),
			Timestamp: int64(i),
		}
		// Send pushes the message to the client over the open connection
		// it blocks if the client is slow to read
		if err := stream.Send(entry); err != nil {
			return err // client disconnected or network failure
		}
	}
	return nil // signals the stream is complete
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	srv := grpc.NewServer()
	pb.RegisterLogServiceServer(srv, &logServer{})
	log.Println("server listening on :50051")
	// Serve blocks until the server stops; report the reason instead of dropping it
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
Embed UnimplementedLogServiceServer in your struct. The generated code expects this by default. If you add a new RPC to the proto file later, the generated interface gains a new method. Embedding the unimplemented server supplies a default for it, so your build still compiles. Calls to a method you have not written yet get a standard Unimplemented status instead of breaking the build. It buys you time to implement the new method.
Check the error from Send. The client can disconnect at any moment. If you ignore the error, the server keeps sending into the void. The error tells you the stream is broken. Return it to clean up resources.
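The other half of the exchange is the client's read loop. Here's a minimal sketch of it, using only the standard library so it runs on its own. The entryStream interface and fakeStream type are hypothetical stand-ins for the generated client stream (pb.LogService_GetLogsClient), which exposes the same Recv method. The local LogEntry struct mirrors the proto message.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// LogEntry mirrors the proto message; real code uses the generated type.
type LogEntry struct {
	Message   string
	Timestamp int64
}

// entryStream is a hypothetical stand-in for the generated client stream.
type entryStream interface {
	Recv() (*LogEntry, error)
}

// drain reads entries until the stream ends. io.EOF means the server
// handler returned nil; any other error is a real failure.
func drain(stream entryStream, handle func(*LogEntry)) error {
	for {
		entry, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			return nil // clean end of stream
		}
		if err != nil {
			return err
		}
		handle(entry)
	}
}

// fakeStream replays a fixed slice and then reports io.EOF.
type fakeStream struct {
	entries []*LogEntry
	next    int
}

func (f *fakeStream) Recv() (*LogEntry, error) {
	if f.next >= len(f.entries) {
		return nil, io.EOF
	}
	e := f.entries[f.next]
	f.next++
	return e, nil
}

func main() {
	s := &fakeStream{entries: []*LogEntry{{Message: "Error 0"}, {Message: "Error 1"}}}
	err := drain(s, func(e *LogEntry) { fmt.Println(e.Message) })
	fmt.Println("stream finished, err =", err)
}
```

With a real connection, the stream passed to drain would come from calling GetLogs on the generated client. The loop itself stays the same.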
Headers and trailers
Streams support metadata. SetHeader queues metadata to send with the first message. SendHeader sends metadata immediately. SetTrailer adds metadata to the end of the stream. Trailers are useful for checksums or final statistics. The client reads headers before the first message and trailers after the stream closes.
Use SendHeader when the client needs information before processing messages. For example, send a total count or a correlation ID. Use SetTrailer for data computed after the stream finishes. The server calls SetTrailer before returning from the method. The client reads trailers after receiving io.EOF.
Headers go first. Trailers go last. Messages go in between.
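The ordering can be sketched without a network. This is an illustration under stated assumptions, not the gRPC API itself: MD copies the shape of grpc's metadata.MD (a map from string to a slice of strings), serverStream lists only the methods the sketch uses, and recorder is a hypothetical fake that logs call order. The correlation-id and entries-sent keys are made up for the example, and Send takes a plain string for brevity.

```go
package main

import "fmt"

// MD has the same underlying shape as grpc's metadata.MD.
type MD map[string][]string

// serverStream is a hypothetical subset of the generated server stream.
type serverStream interface {
	SendHeader(MD) error
	Send(msg string) error
	SetTrailer(MD)
}

// streamWithMetadata sends a header, then the messages, then sets a
// trailer holding a statistic known only after the loop finishes.
func streamWithMetadata(stream serverStream, msgs []string) error {
	if err := stream.SendHeader(MD{"correlation-id": {"req-123"}}); err != nil {
		return err
	}
	sent := 0
	for _, m := range msgs {
		if err := stream.Send(m); err != nil {
			return err
		}
		sent++
	}
	// must be set before the handler returns
	stream.SetTrailer(MD{"entries-sent": {fmt.Sprint(sent)}})
	return nil
}

// recorder is a fake stream that records the order of calls.
type recorder struct{ events []string }

func (r *recorder) SendHeader(md MD) error {
	r.events = append(r.events, "header")
	return nil
}

func (r *recorder) Send(msg string) error {
	r.events = append(r.events, "msg:"+msg)
	return nil
}

func (r *recorder) SetTrailer(md MD) {
	r.events = append(r.events, "trailer")
}

func main() {
	r := &recorder{}
	if err := streamWithMetadata(r, []string{"a", "b"}); err != nil {
		fmt.Println("stream failed:", err)
		return
	}
	fmt.Println(r.events) // prints [header msg:a msg:b trailer]
}
```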
Cancellation and leaks
In production, you stream from a source like a database cursor, a file, or a channel. The client might cancel the request before the stream finishes. You need to respect cancellation to avoid wasting resources.
Here's a realistic example. The server reads from a channel and sends to the client. It checks ctx.Done() to handle cancellation.
// StreamLogs reads from a source and respects client cancellation.
// Assumes the proto also declares an rpc StreamLogs with the same shape as GetLogs,
// and that "time" is added to the imports.
func (s *logServer) StreamLogs(req *pb.LogRequest, stream pb.LogService_StreamLogsServer) error {
	ctx := stream.Context() // cancelled when the client goes away or the handler returns
	// simulate a source of data, like a channel from a worker
	logChan := make(chan *pb.LogEntry, 10)
	// start a goroutine to feed the channel
	go func() {
		defer close(logChan)
		for i := 0; ; i++ {
			entry := &pb.LogEntry{Message: fmt.Sprintf("Log %d", i)}
			select {
			case <-ctx.Done():
				return // client cancelled, stop generating
			case logChan <- entry:
				// sending inside the select means a full buffer cannot block forever
			}
			time.Sleep(100 * time.Millisecond)
		}
	}()
	// main goroutine sends to the client
	for {
		select {
		case <-ctx.Done():
			return ctx.Err() // return cancellation error
		case entry, ok := <-logChan:
			if !ok {
				return nil // source exhausted
			}
			if err := stream.Send(entry); err != nil {
				return err // send failed
			}
		}
	}
}
The select statement multiplexes two channels: ctx.Done() and logChan. If the client cancels, ctx.Done() fires. The server returns immediately. The goroutine feeding logChan also checks ctx.Done(). This prevents a goroutine leak. If the handler exits without stopping the feeder, the feeder eventually blocks forever on the channel send. A blocked goroutine is never garbage collected, so its stack and everything it references stay in memory until the process exits.
Context is the kill switch. Wire it through every loop.
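You can check that the kill switch works without gRPC in the picture. The sketch below uses only the standard library. The names feed and consumeThenCancel are made up for the example, and a plain context.WithCancel plays the role of the stream's context. The feeder signals on a second channel when it exits, so the caller can confirm it did not leak.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const exitTimeout = time.Second // how long to wait for the feeder to stop

// feed produces values until ctx is cancelled. The send sits inside the
// select, so the goroutine cannot block forever once nobody is reading.
func feed(ctx context.Context, out chan<- int, stopped chan<- struct{}) {
	defer close(stopped) // signal that this goroutine has exited
	for i := 0; ; i++ {
		select {
		case <-ctx.Done():
			return
		case out <- i:
		}
	}
}

// consumeThenCancel reads n values, cancels the context, and reports
// whether the feeder goroutine exited within exitTimeout.
func consumeThenCancel(n int) (received int, feederExited bool) {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan int) // unbuffered: the feeder waits for the reader
	stopped := make(chan struct{})
	go feed(ctx, out, stopped)

	for received < n {
		<-out
		received++
	}
	cancel() // stands in for the client cancelling the RPC

	select {
	case <-stopped:
		return received, true
	case <-time.After(exitTimeout):
		return received, false
	}
}

func main() {
	received, exited := consumeThenCancel(3)
	fmt.Println("received:", received, "feeder exited:", exited)
}
```

The same check makes a useful unit test for a real handler: cancel the context, then assert that every helper goroutine reports back.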
Pitfalls and errors
Server streaming introduces specific failure modes. Ignoring Send errors is the most common mistake. The compiler won't stop you from ignoring the error, and nothing crashes when the client drops: every later Send simply fails too. The handler keeps producing data nobody reads, burning CPU and holding goroutines and upstream resources such as database cursors until the loop ends on its own.
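The cost of an ignored error is easy to measure with a fake. In this sketch, flakyStream is a hypothetical stand-in for a server stream whose client disconnects after a few messages, and errClientGone is an invented error value. Send takes a plain string to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// sender is a hypothetical stand-in for the stream's Send method.
type sender interface {
	Send(msg string) error
}

var errClientGone = errors.New("client disconnected")

// flakyStream accepts failAfter messages, then fails every Send.
type flakyStream struct {
	failAfter int
	attempts  int
}

func (f *flakyStream) Send(string) error {
	f.attempts++
	if f.attempts > f.failAfter {
		return errClientGone
	}
	return nil
}

// sendChecked stops at the first failed Send and returns the error.
func sendChecked(s sender, n int) error {
	for i := 0; i < n; i++ {
		if err := s.Send(fmt.Sprintf("Error %d", i)); err != nil {
			return err
		}
	}
	return nil
}

// sendUnchecked ignores the error and keeps doing useless work.
func sendUnchecked(s sender, n int) {
	for i := 0; i < n; i++ {
		_ = s.Send(fmt.Sprintf("Error %d", i))
	}
}

func main() {
	checked := &flakyStream{failAfter: 3}
	err := sendChecked(checked, 1000)
	fmt.Println("checked attempts:", checked.attempts, "err:", err)

	unchecked := &flakyStream{failAfter: 3}
	sendUnchecked(unchecked, 1000)
	fmt.Println("unchecked attempts:", unchecked.attempts)
}
```

The checked loop stops after the fourth attempt. The unchecked loop runs all 1000, and in a real handler each of those iterations might also hit a database or a disk.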
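If you implement the interface manually without embedding the unimplemented server, adding a new RPC to the proto breaks your build. (Skipping the embed is only possible when the code is generated with the plugin's require_unimplemented_servers=false option, since embedding is mandatory by default.) The compiler rejects the registration with an error like cannot use &logServer{} as pb.LogServiceServer value in argument: *logServer does not implement pb.LogServiceServer (missing method NewMethod). Embedding the unimplemented server trades that compile-time failure for an Unimplemented status at runtime, so a forgotten method shows up when it is called rather than when you build.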
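gRPC runs over HTTP/2, which has flow control. If the client reads slowly, the flow-control window fills and the server's Send call blocks. This is intentional. It prevents the server from flooding the client's memory. This is backpressure. A blocked Send stalls only its own handler, because gRPC runs each RPC in its own goroutine, so the rest of the server keeps working. Send unblocks with an error when the stream's context is cancelled or its deadline passes. Do not call Send on the same stream from several goroutines at once; it is not safe.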
Backpressure is a feature. Let the slow client throttle the fast server.
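A bounded channel shows the same effect in miniature. This is a model of the behavior, not HTTP/2 flow control itself: the channel buffer stands in for the flow-control window, the blocking channel send stands in for Send, and the maxLead function and consumerDelay constant are invented for the sketch. It measures how far a fast producer can get ahead of a slow consumer.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

const consumerDelay = 2 * time.Millisecond // hypothetical per-message cost

// maxLead pushes total items through a channel with the given buffer size
// to a deliberately slow consumer. It returns the largest gap observed
// between items produced and items fully consumed.
func maxLead(total, buffer int) int64 {
	ch := make(chan int, buffer)
	var consumed int64
	done := make(chan struct{})

	go func() { // slow consumer, standing in for a slow client
		defer close(done)
		for range ch {
			time.Sleep(consumerDelay)
			atomic.AddInt64(&consumed, 1)
		}
	}()

	var lead int64
	for produced := int64(1); produced <= int64(total); produced++ {
		ch <- int(produced) // blocks once the buffer is full, like Send
		if gap := produced - atomic.LoadInt64(&consumed); gap > lead {
			lead = gap
		}
	}
	close(ch)
	<-done // wait for the consumer to finish
	return lead
}

func main() {
	fmt.Println("max lead with buffer 2:", maxLead(20, 2))
}
```

The gap can never exceed the buffer size plus the one item the consumer is working on. However fast the producer is, the slow side sets the pace.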
When to stream
Pick the right RPC pattern for the job. Server streaming shines when the response is large or arrives over time.
Use server streaming when the response size is unknown or unbounded, like a log tail or a database cursor. Use server streaming when the client needs to start processing data before the server finishes computation. Use unary RPC when the response fits in memory and the client waits for the full result. Use client streaming when the client uploads a large file in chunks. Use bidirectional streaming when both sides exchange messages continuously, like a chat app. Use HTTP long-polling when the infrastructure cannot support HTTP/2 or gRPC.
Check Send errors. The client can vanish at any moment.