The contract that survives version changes
You built a Go service. It sends JSON to a frontend. Everything works. You add a last_login field to the user object. The frontend crashes because it tries to parse a string as an integer, or worse, it silently ignores the new data and shows stale information. You roll back. You realize that every time you change the data shape, you risk breaking every client that reads it. You need a way to define data once, generate code for multiple languages, and add fields without breaking old clients.
Protocol Buffers solve this with a schema file. You write a .proto file that describes the data structure. A compiler generates Go structs and serialization methods. The wire format uses field numbers, not names, so adding a field never breaks old readers. The schema becomes the single source of truth for your data contract.
How the schema works
Think of a .proto file as a form with numbered boxes. Box 1 holds the ID. Box 2 holds the name. When you send the form over the network, you only transmit the box number and the value. The receiver looks at the number to decide what to do with the value.
If you add Box 3 for email later, the old receiver sees Box 1 and Box 2, processes them, and ignores Box 3 because it doesn't know about it. The message still works. The labels on the boxes are just for humans. The numbers are for machines. That's why field numbers are permanent. Changing a number after deployment changes the wire format and corrupts data for old clients.
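The numbered-box idea can be sketched with nothing but the Go standard library. The snippet below hand-encodes three fields using the protobuf key layout (field number shifted left by three bits, combined with a wire type) and decodes them with a reader that only knows boxes 1 and 2. It is an illustrative simplification, not the real library: it supports only two wire types, and the names appendInt, appendString, oldUser, and decodeOld are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Wire types used in this sketch (the protobuf encoding defines more).
const (
	wireVarint = 0 // int32, int64, bool, enum
	wireBytes  = 2 // string, bytes, nested messages
)

// appendTag writes the field key: (field number << 3) | wire type, as a varint.
func appendTag(b []byte, num, wire uint64) []byte {
	return binary.AppendUvarint(b, num<<3|wire)
}

// appendInt writes a varint field such as `int64 id = 1`.
func appendInt(b []byte, num, v uint64) []byte {
	return binary.AppendUvarint(appendTag(b, num, wireVarint), v)
}

// appendString writes a length-delimited field such as `string name = 2`.
func appendString(b []byte, num uint64, s string) []byte {
	b = appendTag(b, num, wireBytes)
	b = binary.AppendUvarint(b, uint64(len(s)))
	return append(b, s...)
}

// oldUser is what a reader built before the email field existed knows about.
type oldUser struct {
	ID   uint64
	Name string
}

// decodeOld reads fields 1 and 2 and skips every number it does not recognize.
func decodeOld(b []byte) (oldUser, error) {
	var u oldUser
	for len(b) > 0 {
		key, n := binary.Uvarint(b)
		if n <= 0 {
			return u, errors.New("bad tag")
		}
		b = b[n:]
		num, wire := key>>3, key&7
		switch wire {
		case wireVarint:
			v, n := binary.Uvarint(b)
			if n <= 0 {
				return u, errors.New("bad varint")
			}
			b = b[n:]
			if num == 1 {
				u.ID = v
			}
		case wireBytes:
			size, n := binary.Uvarint(b)
			if n <= 0 || uint64(len(b)-n) < size {
				return u, errors.New("truncated field")
			}
			if num == 2 {
				u.Name = string(b[n : n+int(size)])
			}
			b = b[n+int(size):] // unknown numbers are skipped, not rejected
		default:
			return u, fmt.Errorf("unsupported wire type %d", wire)
		}
	}
	return u, nil
}

func main() {
	// A newer writer sends fields 1, 2, and 3.
	msg := appendInt(nil, 1, 150)
	msg = appendString(msg, 2, "Ada")
	msg = appendString(msg, 3, "ada@example.com")

	u, err := decodeOld(msg)
	fmt.Println(u, err) // prints {150 Ada} <nil>
}
```

The field names never appear in the encoded bytes, which is why the old reader can step over field 3 using only its wire type and length.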
The protoc compiler reads the .proto file and generates code. In Go, the protoc-gen-go plugin produces a .pb.go file with structs that match your messages, plus getters and the metadata that the google.golang.org/protobuf/proto package uses to marshal to bytes and unmarshal from bytes. You don't write the serialization logic. The tool writes it. You focus on the schema and the business logic.
Minimal proto file
Here's the smallest useful proto file that defines a message and tells the compiler where to put the Go code.
syntax = "proto3"; // proto3 syntax; proto2 is the older syntax
package user; // namespace for the schema, not the Go package
option go_package = "example.com/myapp/proto"; // maps schema to Go import path
message User {
  int64 id = 1; // unique tag; permanent once deployed
  string name = 2; // tag 2; strings are UTF-8
  string email = 3; // tag 3
}
The syntax declaration must be the first non-empty, non-comment line. The package name organizes the schema but has no effect on Go imports. The option go_package is the bridge between protobuf and Go. It tells protoc what import path the generated code belongs to. The path must match your Go module structure. If you set go_package to example.com/myapp/proto, the generated file will have package proto and you import it as example.com/myapp/proto.
Field numbers start at 1 and must be unique within a message. Fields numbered 1 to 15 need a one-byte tag on the wire. Numbers 16 to 2047 need two bytes. Keep frequently used fields in the 1-15 range for efficiency. If you assign the same number twice, the compiler rejects the file and reports that the field number has already been used.
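The one-byte and two-byte boundaries follow from how the field key is encoded: the field number is shifted left by three bits to make room for the wire type, and the result is written as a varint that carries seven payload bits per byte. A minimal sketch that checks the boundaries with the standard library (tagSize is a hypothetical helper, not part of any protobuf package):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// tagSize returns how many bytes the field key occupies on the wire.
// The key is the varint encoding of (field number << 3) | wire type.
// The wire type only fills the low three bits, so it does not change the size.
func tagSize(fieldNum uint64) int {
	var buf [binary.MaxVarintLen64]byte
	return binary.PutUvarint(buf[:], fieldNum<<3)
}

func main() {
	for _, n := range []uint64{1, 15, 16, 2047, 2048} {
		fmt.Printf("field %d: %d-byte tag\n", n, tagSize(n))
	}
}
```

Field 15 shifts to 120, which still fits in seven bits. Field 16 shifts to 128, which needs a second byte.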
Realistic schema with lists and nesting
Real services need lists, nested messages, and enums. Here's a schema for an order service that uses those features.
syntax = "proto3";
package api;
option go_package = "example.com/myapp/api";
enum Status {
  STATUS_UNKNOWN = 0; // the first enum value must be 0 in proto3
  STATUS_ACTIVE = 1;
  STATUS_INACTIVE = 2;
}
message Item {
  string sku = 1;
  int32 quantity = 2;
}
message Order {
  string id = 1;
  Status status = 2; // enum becomes a named int32 type in Go
  repeated Item items = 3; // repeated creates a slice in Go
  int64 total_cents = 4; // use cents to avoid float precision issues
}
The repeated keyword creates a slice in Go. repeated Item items becomes Items []*Item. The enum generates a named int32 type with constants such as Status_STATUS_ACTIVE. In proto3 the first enum value must be 0, so STATUS_UNKNOWN comes first. It's the default when the field is unset.
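To make the mapping concrete, here is a hand-written approximation of the Go types for this schema. It is a sketch, not real protoc-gen-go output: real generated structs also carry internal bookkeeping fields and getter methods, and totalQuantity is a hypothetical helper added for illustration.

```go
package main

import "fmt"

// Status approximates the generated enum: a named int32 with prefixed constants.
type Status int32

const (
	Status_STATUS_UNKNOWN  Status = 0
	Status_STATUS_ACTIVE   Status = 1
	Status_STATUS_INACTIVE Status = 2
)

// Item keeps only the exported data fields of the message.
type Item struct {
	Sku      string
	Quantity int32
}

// Order mirrors the Order message.
type Order struct {
	Id         string
	Status     Status
	Items      []*Item // from `repeated Item items = 3`
	TotalCents int64
}

// totalQuantity ranges over the repeated field like any other slice.
func totalQuantity(o *Order) int32 {
	var sum int32
	for _, it := range o.Items {
		sum += it.Quantity
	}
	return sum
}

func main() {
	var empty Order // nothing set: the enum is 0 and the slice is nil
	fmt.Println(empty.Status == Status_STATUS_UNKNOWN, len(empty.Items))

	o := &Order{
		Status: Status_STATUS_ACTIVE,
		Items: []*Item{
			{Sku: "widget-a", Quantity: 2},
			{Sku: "widget-b", Quantity: 1},
		},
	}
	fmt.Println(totalQuantity(o))
}
```

An Order that was never populated reports STATUS_UNKNOWN, which is why the zero enum value should mean "not specified" and not a real state.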
Using int64 for money in cents avoids floating-point precision errors. Floats can't represent 0.1 exactly. Integers can. This is a common pattern in financial services. The generated Go code uses int64 for total_cents.
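A short sketch of the precision problem, using only the standard library. The helpers sumFloat and sumCents are hypothetical names for illustration.

```go
package main

import "fmt"

// sumFloat adds prices stored as float64 dollars.
func sumFloat(prices []float64) float64 {
	var total float64
	for _, p := range prices {
		total += p
	}
	return total
}

// sumCents adds prices stored as int64 cents, the way total_cents does.
func sumCents(prices []int64) int64 {
	var total int64
	for _, p := range prices {
		total += p
	}
	return total
}

func main() {
	f := sumFloat([]float64{0.10, 0.20})
	fmt.Println(f == 0.30) // false: the float64 sum is not exactly 0.30

	c := sumCents([]int64{10, 20})
	fmt.Println(c == 30) // true: integer addition is exact
}
```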
Oneof for mutually exclusive fields
Sometimes a message has fields that can't be set at the same time. A payment method might be credit card or PayPal, never both. The oneof keyword enforces this at the schema level, and only the case that is set takes up space in the message.
syntax = "proto3";
package api;
option go_package = "example.com/myapp/api";
message Payment {
  string id = 1;
  oneof method { // only one field can be set at a time
    CreditCard cc = 2;
    PayPal pp = 3;
  }
}
message CreditCard {
  string number = 1;
  int32 exp_month = 2;
  int32 exp_year = 3;
}
message PayPal {
  string email = 1;
}
The oneof block generates an interface-typed field in Go. The Payment struct has a Method field, and each case gets its own wrapper struct (Payment_Cc and Payment_Pp) that holds a pointer to the inner message. You check which case is set using a type switch on Method, or you call the generated GetCc and GetPp getters, which return nil when their case is not set. Assigning one case replaces the other, which prevents invalid states where both methods are populated.
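The type switch can be sketched as follows. The types here are a hand-written approximation of the shape protoc-gen-go produces for a oneof (an unexported interface plus one wrapper struct per case), written out so the example runs without generated code. The describe function is a hypothetical helper.

```go
package main

import "fmt"

type CreditCard struct {
	Number   string
	ExpMonth int32
	ExpYear  int32
}

type PayPal struct{ Email string }

// isPayment_Method is the unexported interface that every oneof case implements.
type isPayment_Method interface{ isPayment_Method() }

// One wrapper struct per case, named Payment_<Field>.
type Payment_Cc struct{ Cc *CreditCard }
type Payment_Pp struct{ Pp *PayPal }

func (*Payment_Cc) isPayment_Method() {}
func (*Payment_Pp) isPayment_Method() {}

type Payment struct {
	Id     string
	Method isPayment_Method // holds at most one case; nil when unset
}

// describe uses a type switch to find out which case is set.
func describe(p *Payment) string {
	switch m := p.Method.(type) {
	case *Payment_Cc:
		return fmt.Sprintf("card expiring %02d/%d", m.Cc.ExpMonth, m.Cc.ExpYear)
	case *Payment_Pp:
		return "paypal " + m.Pp.Email
	default:
		return "no method set"
	}
}

func main() {
	card := &Payment{Id: "pay-1", Method: &Payment_Cc{Cc: &CreditCard{ExpMonth: 9, ExpYear: 2030}}}
	pp := &Payment{Id: "pay-2", Method: &Payment_Pp{Pp: &PayPal{Email: "ada@example.com"}}}

	fmt.Println(describe(card))
	fmt.Println(describe(pp))
	fmt.Println(describe(&Payment{Id: "pay-3"}))
}
```

Because Method is a single interface value, assigning a PayPal case to a Payment that held a card discards the card. There is no way to express "both" in the struct.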
Using the generated code
The generated code behaves like any other Go package. You construct structs, pass them to proto.Marshal from google.golang.org/protobuf/proto to get bytes, and call proto.Unmarshal to read bytes back.
package main

import (
	"log"

	"example.com/myapp/api"
	"google.golang.org/protobuf/proto"
)

func main() {
	// construct the message with a struct literal
	order := &api.Order{
		Id:     "ord-123",
		Status: api.Status_STATUS_ACTIVE,
		Items: []*api.Item{
			{Sku: "widget-a", Quantity: 2},
			{Sku: "widget-b", Quantity: 1},
		},
		TotalCents: 500,
	}
	// proto.Marshal converts the message to binary bytes
	data, err := proto.Marshal(order)
	if err != nil {
		log.Fatal(err) // Marshal rarely fails, but check anyway
	}
	// proto.Unmarshal reads bytes back into a message;
	// keep messages behind pointers, since generated structs should not be copied
	decoded := &api.Order{}
	if err := proto.Unmarshal(data, decoded); err != nil {
		log.Fatal(err) // Unmarshal fails on corrupt data
	}
	log.Printf("decoded order: %v", decoded)
}
The proto.Marshal function returns ([]byte, error). It rarely fails with standard types, though it can, for example when a string field contains invalid UTF-8. Handle it with if err != nil. The proto.Unmarshal function takes a byte slice and a pointer to the message and populates it. It returns an error if the data is corrupt or truncated.
Proto3 scalars default to zero values. If you unmarshal a message that doesn't contain a field, the field gets its zero value. An int64 becomes 0. A string becomes "". A bool becomes false. You can't distinguish between "unset" and "zero" for scalars unless you use the optional keyword, which wraps the value in a pointer. Most services accept the zero-value semantics and avoid optional for simplicity.
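The difference between the two field shapes can be sketched with hand-written structs. ItemPlain and ItemOptional are hypothetical stand-ins for what the generator would emit for int32 quantity = 2 and optional int32 quantity = 2 respectively. They are not real generated code, and the schema above does not declare an optional field.

```go
package main

import "fmt"

// ItemPlain models a plain proto3 scalar: the field is a bare int32.
type ItemPlain struct{ Quantity int32 }

// ItemOptional models an `optional` scalar: the field is a pointer.
type ItemOptional struct{ Quantity *int32 }

// describePlain cannot tell an unset quantity from an explicit zero.
func describePlain(it ItemPlain) string {
	return fmt.Sprintf("quantity=%d", it.Quantity)
}

// describeOptional uses the nil pointer to detect absence.
func describeOptional(it ItemOptional) string {
	if it.Quantity == nil {
		return "quantity unset"
	}
	return fmt.Sprintf("quantity=%d", *it.Quantity)
}

func main() {
	zero := int32(0)
	fmt.Println(describePlain(ItemPlain{}))                      // unset and zero look the same
	fmt.Println(describeOptional(ItemOptional{}))                // absence is visible
	fmt.Println(describeOptional(ItemOptional{Quantity: &zero})) // explicit zero is visible
}
```

The cost of the pointer form is a nil check at every read, which is one reason to reserve optional for fields where absence carries meaning.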
Pitfalls and compiler errors
Field numbers are the most common source of bugs. If you change a field number after deployment, the wire format changes. Old clients read the wrong data. Treat field numbers as immutable. If you need to rename a field, keep the number and update the label. The rename is invisible on the binary wire, although it does change the generated Go identifiers and the field's JSON name.
If you duplicate a field number, the compiler rejects the proto file and reports that the number is already in use. If a field uses a number that the message has reserved, protoc rejects that as well. The reserved keyword (for example, reserved 3; inside the message) marks numbers as unusable, which prevents accidental reuse after you delete a field.
The go_package option must agree with your module path and directory layout. If go.mod declares module example.com/myapp and go_package is example.com/myapp/api, the generated file has to end up in the api directory of that module so that the import example.com/myapp/api resolves. If the file lands somewhere else, the Go build fails because it cannot find a package at that import path. If you import the package but don't use it, you get imported and not used. Fix the go_package value or the output location, or remove the unused import.
Generated code passes gofmt. You don't need to format .pb.go files. Most editors run gofmt on save, but the generated code is already formatted. Don't fight the tool. Trust the compiler.
Early proto3 releases had no optional keyword for scalars. protoc has accepted it by default since version 3.15. If you need to distinguish unset from zero, mark the field optional, which generates a pointer field in Go, or use a wrapper message such as the well-known types in google/protobuf/wrappers.proto. A common convention is to use explicit wrapper messages for complex cases and to accept zero-value semantics for simple flags.
When to use Protocol Buffers
Use Protocol Buffers when you need a compact binary format for high-throughput internal services.
Use Protocol Buffers when you have multiple languages reading the same data and need a strict schema that generates idiomatic code.
Use Protocol Buffers when you want to evolve the schema over time without breaking old clients, as long as you keep field numbers stable.
Use JSON when the data is public-facing and humans need to read or edit it easily.
Use JSON when you need dynamic keys or arbitrary nesting that a schema cannot predict.
Use plain Go structs when the data stays within one service and serialization is just a side effect.
Use encoding/json with structs when you want quick prototyping without the build step overhead.
Binary is fast. JSON is readable. Pick the wire format that matches the audience. Field numbers are the contract. Labels are for humans. Protoc writes the boilerplate. You write the logic.