Complete Guide to the bytes Package in Go

The Go bytes package offers efficient functions for searching, comparing, and transforming byte slices.

When strings aren't enough

You are building a simple protocol handler. You receive a stream of raw data from a network socket. You need to verify a magic number at the start, find a delimiter, extract a payload, and discard the rest. Strings feel natural because you can search and slice them, but binary data is not text. A Go string can hold arbitrary bytes, including null bytes, but it is immutable, converting between string and []byte copies the data, and rune-oriented operations assume UTF-8. Using strings for binary data leads to unnecessary allocations and subtle bugs when the data is not valid UTF-8. The bytes package is the standard library's toolkit for working with []byte slices efficiently.

The bytes package in plain words

Go treats string and []byte as distinct types. A string is immutable. Once created, you cannot change its contents. A []byte is a slice of bytes. It is mutable. You can modify elements, append data, or replace sections. The bytes package provides functions that mirror the strings package but operate on byte slices. It handles searching, comparing, transforming, and building byte sequences.

Think of a string as a printed label on a box. You can read the label, but you cannot erase or rewrite it without printing a new one. A []byte is the contents of the box. You can rearrange items, swap parts, or add new ones. The bytes package is the set of tools you use to manipulate the box contents without taking everything apart.

The package is essential for binary data handling. It works with arbitrary byte values, including null bytes and sequences that are not valid UTF-8. If you have text, strings might be slightly more convenient, but bytes works correctly on text encoded as bytes. The convention is clear: use strings for text, use bytes for binary data.

Minimal example

The bytes package offers functions for common operations. bytes.Contains checks for a subsequence. bytes.ToUpper transforms data. Transforming functions such as bytes.ToUpper return a new slice and leave the input untouched. Slicing functions such as bytes.Trim and bytes.Cut return subslices that share memory with the input.

package main

import (
	"bytes"
	"fmt"
)

// main demonstrates basic byte slice operations.
func main() {
	// Binary data often comes from network packets or files.
	// We use a byte slice to hold it.
	data := []byte("hello world")

	// Check if a subsequence exists without allocating a new string.
	// bytes.Contains scans the slice efficiently.
	if bytes.Contains(data, []byte("world")) {
		fmt.Println("Found 'world'")
	}

	// Transform data. bytes.ToUpper returns a new slice.
	// It does not modify the original slice in place.
	upper := bytes.ToUpper(data)
	fmt.Println(string(upper))

	// bytes.Index returns the position of the first occurrence.
	// It returns -1 if the target is not found.
	idx := bytes.Index(data, []byte("world"))
	fmt.Printf("Index: %d\n", idx)
}

How slicing and memory work

When you slice a []byte, you create a new slice header that points to the same underlying array. This operation is cheap. No data is copied. The new slice shares the backing array with the original. This sharing is powerful for performance, but it introduces a memory retention risk.

If you extract a small header from a large packet and keep the header alive, the entire packet remains in memory. The garbage collector cannot reclaim the backing array as long as any subslice references it. This is a common source of memory leaks in high-throughput services.

// extractHeader returns the first 4 bytes of a packet.
// WARNING: The returned slice shares memory with raw.
// If raw is large, keeping header alive retains the whole packet.
func extractHeader(raw []byte) []byte {
	return raw[:4]
}

// extractHeaderSafe returns a copy of the header.
// This isolates the header from the backing array.
func extractHeaderSafe(raw []byte) []byte {
	// bytes.Clone allocates a new slice and copies the data.
	// Use this when you need to discard the original backing array.
	return bytes.Clone(raw[:4])
}

Use bytes.Clone or manual copying when you need to isolate a subslice. bytes.Clone, added in Go 1.20, exists for this exact purpose: it returns a new slice with its own backing array. On older Go versions, append([]byte(nil), b...) does the same job.
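The difference between a shared view and an isolated copy can be observed directly. The sketch below uses two hypothetical helpers, headerView and headerCopy, and assumes Go 1.20 or later for bytes.Clone.

```go
package main

import (
	"bytes"
	"fmt"
)

// headerView returns a subslice that shares raw's backing array.
func headerView(raw []byte) []byte {
	return raw[:4]
}

// headerCopy returns an independent copy of the first 4 bytes.
func headerCopy(raw []byte) []byte {
	return bytes.Clone(raw[:4])
}

func main() {
	packet := []byte("HEADpayload")

	view := headerView(packet)
	copied := headerCopy(packet)

	// Overwrite the first byte of the original packet.
	packet[0] = 'X'

	fmt.Println(string(view))   // XEAD: the view sees the change
	fmt.Println(string(copied)) // HEAD: the copy does not
}
```

The same sharing that makes the view reflect the write is what keeps the whole packet reachable for the garbage collector.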

Building data with bytes.Buffer

When you need to construct a byte sequence incrementally, bytes.Buffer is the tool. It is a growable buffer that implements io.Writer and io.Reader. You can write to it, read from it, and reset it without reallocating.

bytes.Buffer is efficient for building responses, assembling protocol messages, or collecting output from multiple sources. It manages its own capacity growth. You write to it using methods like Write, WriteString, WriteByte, and WriteRune. When you are done, you can retrieve the contents with Bytes or String.

// buildMessage constructs a binary message using bytes.Buffer.
func buildMessage(id uint16, payload []byte) []byte {
	// Create a buffer. It starts with zero capacity and grows as needed.
	var buf bytes.Buffer

	// Write a 2-byte header.
	// WriteByte writes a single byte.
	buf.WriteByte(0xDE)
	buf.WriteByte(0xAD)

	// Write the ID as big-endian bytes.
	// Manual shifting avoids allocating a temporary slice.
	buf.WriteByte(byte(id >> 8))
	buf.WriteByte(byte(id))

	// Write the payload.
	// Write returns the number of bytes written and an error.
	// Buffer.Write never returns an error, so we can ignore it.
	n, _ := buf.Write(payload)
	
	// Verify we wrote everything.
	if n != len(payload) {
		panic("write error")
	}

	// Return the complete message.
	// buf.Bytes() returns a slice pointing to the buffer's internal array.
	// The caller must not modify this slice if the buffer is reused.
	return buf.Bytes()
}

bytes.Buffer is not safe for concurrent use. If multiple goroutines access the same buffer, you must protect it with a mutex or serialize access through a channel. A sync.Pool does not make a buffer safe to share. It lets each goroutine borrow its own buffer and recycle it afterwards, which avoids both sharing and repeated allocation.
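One way to apply sync.Pool here is to let every goroutine take a private buffer from the pool, so no buffer is ever shared. The following is a minimal sketch, not a tuned implementation; bufPool and frame are hypothetical names, and the 2-byte header is only illustrative.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers. Each caller gets its own buffer,
// so no buffer is used by two goroutines at once.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// frame prefixes payload with an illustrative 0xDE 0xAD header.
func frame(payload []byte) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // drop contents left by a previous user, keep capacity
	defer bufPool.Put(buf)

	buf.WriteByte(0xDE)
	buf.WriteByte(0xAD)
	buf.Write(payload)

	// Copy before returning: buf.Bytes() aliases memory that goes
	// back into the pool and will be overwritten by the next user.
	return bytes.Clone(buf.Bytes())
}

func main() {
	var wg sync.WaitGroup
	results := make([][]byte, 4)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = frame([]byte{byte(i)})
		}(i)
	}
	wg.Wait()
	for _, r := range results {
		fmt.Printf("% x\n", r)
	}
}
```

The copy at the end of frame costs one allocation per message. It is the price of returning the buffer to the pool safely.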

Realistic example: Parsing a protocol

A common task is parsing a binary protocol. You receive raw bytes, validate the structure, and extract fields. The bytes package provides functions for every step.

// parsePacket extracts the payload from a raw byte slice.
// It checks for a magic header, validates the length, and returns the body.
func parsePacket(raw []byte) ([]byte, error) {
	// Check the magic bytes at the start.
	// bytes.HasPrefix is efficient and avoids allocation.
	if !bytes.HasPrefix(raw, []byte{0xDE, 0xAD, 0xBE, 0xEF}) {
		return nil, fmt.Errorf("invalid magic number")
	}

	// Skip the header (4 bytes).
	// Slicing is cheap and shares the backing array.
	body := raw[4:]

	// Extract the length field (next 2 bytes).
	if len(body) < 2 {
		return nil, fmt.Errorf("packet too short")
	}
	length := uint16(body[0])<<8 | uint16(body[1])

	// Validate the payload length.
	// The payload follows the length field.
	if len(body) < int(length)+2 {
		return nil, fmt.Errorf("payload truncated")
	}

	// Extract the payload. The offset is known, so plain slicing is enough.
	// Convert length to int first: 2+length in uint16 arithmetic
	// wraps around for lengths near 65535.
	payload := body[2 : 2+int(length)]

	// Trim trailing padding if present.
	// bytes.TrimRight returns a subslice, no allocation.
	// bytes.Trim would also strip leading zero bytes from the payload.
	return bytes.TrimRight(payload, "\x00"), nil
}

// parsePacketWithCut demonstrates bytes.Cut for delimiter-based parsing.
func parsePacketWithCut(raw []byte) ([]byte, error) {
	// bytes.Cut splits the slice at the first occurrence of sep.
	// It returns the part before sep, the part after sep, and a boolean.
	// This is cleaner than Index + slicing.
	header, rest, found := bytes.Cut(raw, []byte{0x00})
	if !found {
		return nil, fmt.Errorf("missing delimiter")
	}

	// Validate header.
	if !bytes.Equal(header, []byte("PKT")) {
		return nil, fmt.Errorf("invalid header")
	}

	return rest, nil
}

bytes.Cut is available since Go 1.18. It is the modern way to split at a delimiter. It returns three values: the prefix, the suffix, and a boolean indicating success. It avoids the boilerplate of Index and bounds checking.
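To make the comparison concrete, the sketch below writes the Index-based version by hand and runs it next to bytes.Cut on the same input. splitIndex is a hypothetical helper, and the header line is made-up sample data.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitIndex splits s at the first sep using Index and manual slicing.
// It mirrors the return values of bytes.Cut.
func splitIndex(s, sep []byte) (before, after []byte, found bool) {
	i := bytes.Index(s, sep)
	if i < 0 {
		return s, nil, false
	}
	return s[:i], s[i+len(sep):], true
}

func main() {
	line := []byte("content-type: text/plain")
	sep := []byte(": ")

	k1, v1, ok1 := splitIndex(line, sep)
	k2, v2, ok2 := bytes.Cut(line, sep)

	fmt.Printf("%q %q %v\n", k1, v1, ok1) // "content-type" "text/plain" true
	fmt.Printf("%q %q %v\n", k2, v2, ok2) // "content-type" "text/plain" true
}
```

Both versions return subslices of the input, so neither allocates.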

Pitfalls and compiler errors

The bytes package is straightforward, but a few traps exist.

You cannot use == to compare slices. The compiler rejects if a == b with invalid operation: a == b (slice can only be compared to nil). Slices are reference types with length and capacity metadata. The == operator only checks for nil. Use bytes.Equal to compare contents. bytes.Equal returns a boolean and handles length mismatches correctly.
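A short sketch of content comparison with bytes.Equal follows. sameMagic and its magic value are hypothetical and exist only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// sameMagic reports whether got matches an illustrative magic number.
func sameMagic(got []byte) bool {
	want := []byte{0xDE, 0xAD, 0xBE, 0xEF}
	return bytes.Equal(got, want)
}

func main() {
	fmt.Println(sameMagic([]byte{0xDE, 0xAD, 0xBE, 0xEF})) // true
	fmt.Println(sameMagic([]byte{0xDE, 0xAD}))             // false: lengths differ

	// bytes.Equal treats a nil slice and an empty slice as equal.
	var none []byte
	fmt.Println(bytes.Equal(none, []byte{})) // true
	fmt.Println(none == nil)                 // true: the only == a slice allows
}
```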

Converting between string and []byte generally copies the data and allocates memory. []byte(str) copies the string data. string(b) copies the slice data. The compiler can elide the copy in a few special cases, such as a comparison or a map lookup keyed by string(b), but do not count on it elsewhere. In a tight loop, repeated conversions hurt performance. Keep data in the type you need. If you have a []byte and need to search for text, use the bytes functions directly. Avoid string(data) just to call strings.Contains. Use bytes.Contains instead.

Functions like bytes.Index return -1 when the target is not found. This is a convention across the bytes and strings packages. Check for -1 explicitly. Do not assume a non-negative index.
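The check matters because a -1 fed into a slice expression either panics or silently starts at the wrong offset. A minimal sketch with a hypothetical afterDelimiter helper:

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// afterDelimiter returns the bytes following the first delim in data.
func afterDelimiter(data, delim []byte) ([]byte, error) {
	i := bytes.Index(data, delim)
	if i < 0 {
		// Without this check, data[i+len(delim):] would slice
		// from an offset unrelated to any delimiter.
		return nil, errors.New("delimiter not found")
	}
	return data[i+len(delim):], nil
}

func main() {
	rest, err := afterDelimiter([]byte("HEAD\r\nbody"), []byte("\r\n"))
	fmt.Printf("%q %v\n", rest, err) // "body" <nil>

	rest, err = afterDelimiter([]byte("HEADbody"), []byte("\r\n"))
	fmt.Printf("%q %v\n", rest, err) // "" delimiter not found
}
```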

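Not every function in the package is encoding-agnostic. Searching and comparing functions such as bytes.Index, bytes.Contains, and bytes.Equal work on arbitrary binary data. Case-mapping and whitespace functions such as bytes.ToUpper, bytes.ToLower, and bytes.TrimSpace interpret the slice as UTF-8 text. bytes.ToUpper maps all Unicode letters, not only ASCII, so "é" becomes "É" exactly as with strings.ToUpper, and no conversion to string is needed for Unicode-aware transformations. Do not apply these functions to binary data: bytes that are not valid UTF-8 may not come back unchanged.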
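bytes.ToUpper is documented as mapping all Unicode letters, so it treats its input as UTF-8 text. The sketch below shows that behavior and one possible guard for input that might be binary. upperIfText is a hypothetical helper, not part of the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// upperIfText uppercases data only when it is valid UTF-8.
// Anything else is returned untouched.
func upperIfText(data []byte) []byte {
	if !utf8.Valid(data) {
		return data
	}
	return bytes.ToUpper(data)
}

func main() {
	// Multi-byte letters are mapped, not skipped.
	fmt.Println(string(upperIfText([]byte("héllo")))) // HÉLLO

	// 0xff can never appear in valid UTF-8, so the guard leaves it alone.
	fmt.Printf("% x\n", upperIfText([]byte{0xff, 'a'})) // ff 61
}
```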

Decision matrix

Use bytes when you are working with binary data, network packets, or file contents where encoding is not guaranteed.

Use strings when you are handling text, especially UTF-8, and you want functions that respect character boundaries.

Use bytes.Buffer when you need to build a byte sequence incrementally, like constructing a response body or assembling a protocol message.

Use bytes.Reader when you need to wrap a static byte slice as an io.Reader for testing or streaming consumption.

Use bytes.Equal when comparing slices, because the == operator does not work on slices.

Use bytes.Clone when you need to isolate a subslice from its backing array to prevent memory retention.

Use bytes.Cut when you need to split a slice at the first occurrence of a delimiter and handle the result cleanly.
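The bytes.Reader entry in the list above can be sketched as follows. bytes.NewReader wraps an existing slice so that code written against io.Reader can consume it. readHeader and headerOf are hypothetical helpers, and the 4-byte header size is an assumption for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readHeader reads exactly 4 bytes from any io.Reader.
func readHeader(r io.Reader) ([]byte, error) {
	hdr := make([]byte, 4)
	if _, err := io.ReadFull(r, hdr); err != nil {
		return nil, err
	}
	return hdr, nil
}

// headerOf feeds an in-memory slice to readHeader through a bytes.Reader.
func headerOf(data []byte) ([]byte, error) {
	return readHeader(bytes.NewReader(data))
}

func main() {
	// bytes.NewReader wraps the slice without copying it.
	r := bytes.NewReader([]byte{0xDE, 0xAD, 0xBE, 0xEF, 0x01})

	hdr, err := readHeader(r)
	fmt.Printf("% x %v\n", hdr, err) // de ad be ef <nil>
	fmt.Println(r.Len())             // 1: one unread byte remains

	_, err = headerOf([]byte{0xDE, 0xAD})
	fmt.Println(err != nil) // true: the input is too short
}
```

This is convenient in tests, where a function that normally reads from a socket can be driven from a fixed byte slice.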

Bytes are the raw material. Strings are the finished product. Slices share memory. Watch your backing arrays. Binary data gets bytes. Text gets strings.

Where to go next