How Go Strings Work Internally

Go strings are immutable byte sequences described by a two-word header: a pointer and a length. Immutability is what makes it safe to share those bytes.

The string header is a view, not the text

You are processing a 500MB log file. You extract the first 10 bytes of every line to build an index. You expect this to be fast and memory-efficient. Instead, the process holds on to hundreds of megabytes long after parsing has finished. You assumed each 10-byte substring was a small, independent string. Go disagrees. Each substring is a lightweight view into the original data, so the original 500MB file is stuck in memory because tiny slices are still pointing into it. Copying every token avoids that, but costs one allocation per line and gives the garbage collector more work. Understanding the string header is the key to choosing between the two.

A Go string is not the text itself. It is a small structure called a header. The header holds two pieces of information: a pointer to the bytes and the length of those bytes. The bytes typically live on the heap or in the binary's read-only data section. The header is tiny: two machine words, or 16 bytes on a 64-bit platform. You can pass a string to a function without copying the text. You only copy the header.

The bytes are immutable. You cannot change a byte inside a string. This is the contract. Because the bytes never change, Go can safely let multiple string headers point to the same underlying bytes. No holder of a header can modify the bytes, so no holder can break another. Since strings never change, sharing is free.

package main

import "fmt"

func main() {
	// String literal lives in read-only memory embedded in the binary.
	// The variable s is a header: pointer + length.
	s := "hello"

	// Accessing bytes works. Indexing returns a byte value, not a reference.
	// This is O(1) and does not allocate.
	first := s[0]
	fmt.Println(first) // 104 ('h')

	// This fails to compile. Strings are immutable.
	// s[0] = 'x' // Error: cannot assign to s[0]

	// Concatenation creates a new string.
	// Go allocates a new buffer, copies "hello", copies " world",
	// and creates a new header. The original s is untouched.
	greeting := s + " world"
	fmt.Println(greeting) // hello world
	fmt.Println(s)        // hello
}

Strings are cheap headers. The cost is in the bytes.
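The header claim can be checked directly. The following is a minimal sketch, assuming Go 1.20 or later (for unsafe.StringData) and the standard gc toolchain; sameHeader is a hypothetical helper introduced only for this illustration, and unsafe is used purely for inspection.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sameHeader is a hypothetical helper: it reports whether two strings
// have identical headers, i.e. the same data pointer and the same length.
func sameHeader(a, b string) bool {
	return len(a) == len(b) && unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	s := "hello, header"
	t := s // assignment copies the header, not the bytes

	// The header occupies two machine words (pointer + length).
	fmt.Println(unsafe.Sizeof(s) == 2*unsafe.Sizeof(uintptr(0))) // true

	// Both variables describe the same bytes.
	fmt.Println(sameHeader(s, t)) // true
}
```

Passing s to a function behaves like the assignment to t: the callee receives its own header that points at the same bytes.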

Slicing creates a view, not a copy

Slicing a string is one of the cheapest operations in Go. When you write s[start:end], Go does not copy the data. It creates a new header that points into the same underlying bytes: the pointer is advanced by start and the length is set to end-start. This operation is O(1). It allocates nothing.

This behavior is powerful for parsing. You can extract tokens from a large buffer without touching the heap. The downside is reference retention. The slice holds a pointer to the original bytes. As long as the slice exists, the original bytes cannot be garbage collected. If you slice a tiny piece out of a massive buffer and keep the slice, the massive buffer stays in memory.

// ExtractToken returns a view into the original string.
// It does not copy data. As long as the returned token is
// reachable, the entire original buffer stays in memory.
func ExtractToken(s string, start, end int) string {
	// Slicing creates a new header pointing to the same bytes.
	// This is O(1) and allocates nothing.
	return s[start:end]
}

// SafeExtractToken returns a copy of the substring.
// Use this when you need to keep the token but want the
// original large buffer to be garbage collected.
// Requires: import "strings" (strings.Clone was added in Go 1.18).
func SafeExtractToken(s string, start, end int) string {
	// The slice creates a view into s.
	// strings.Clone copies the viewed bytes into a fresh allocation.
	// A plain string(s[start:end]) conversion would not copy:
	// converting a string to a string returns the same view.
	return strings.Clone(s[start:end])
}

Slicing is fast. Keeping the slice is expensive if the source is huge.

Bytes are not characters

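Go strings are sequences of bytes. They are not sequences of characters. Go source code is UTF-8, and string literals and the standard library treat text as UTF-8, although a string may hold arbitrary bytes. In UTF-8, a character (a Unicode code point, which Go calls a rune) takes one to four bytes. ASCII characters use one byte. Characters outside ASCII use multiple bytes.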

This distinction causes bugs when beginners use len() or indexing on text with non-ASCII characters. len(s) returns the number of bytes, not the number of characters. Indexing s[i] returns the byte at position i, which might be the middle of a multi-byte character.

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	// "café" contains 4 characters but 5 bytes.
	// The 'é' is encoded as two bytes in UTF-8.
	s := "café"

	// len returns byte count, not character count.
	fmt.Println(len(s)) // 5

	// Indexing returns bytes. s[4] is the second byte of 'é'.
	// It is not a valid character on its own.
	fmt.Printf("%x\n", s[4]) // a9

	// Use utf8.RuneCountInString for character count.
	fmt.Println(utf8.RuneCountInString(s)) // 4

	// Use a range loop to iterate over characters correctly.
	// The index i is the byte position. The value r is the rune (character).
	for i, r := range s {
		fmt.Printf("byte %d: %c\n", i, r)
	}
	// Output:
	// byte 0: c
	// byte 1: a
	// byte 2: f
	// byte 3: é
}

Bytes are raw data. Runes are characters. Know the difference.
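The same distinction matters when shortening text. Slicing at a byte offset can cut a multi-byte character in half, while cutting at a rune boundary keeps the result valid UTF-8. The following sketch uses a hypothetical helper, truncateRunes, and writes 'é' as the escape \u00e9 so the byte layout is unambiguous.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes is a hypothetical helper: it returns the prefix of s
// containing at most n runes, always cutting on a rune boundary.
// The result is a view into s, not a copy.
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s { // i is the byte offset where each rune starts
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "caf\u00e9" // 4 runes, 5 bytes

	// Cutting at byte 4 splits 'é' and leaves invalid UTF-8.
	fmt.Println(utf8.ValidString(s[:4])) // false

	fmt.Println(truncateRunes(s, 3))                   // caf
	fmt.Println(utf8.ValidString(truncateRunes(s, 4))) // true
}
```

A rune is still not always what a user perceives as one character: combining marks and emoji sequences span several runes, which this sketch does not handle.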

Building strings efficiently

String concatenation with + or fmt.Sprintf allocates a new buffer for every result. If you build a string in a loop, each iteration allocates and copies everything accumulated so far. The total number of bytes copied grows quadratically with the final length. For small strings, this is fine. For large strings or tight loops, it kills performance.

Use strings.Builder when you need to construct a string from many parts. The builder maintains a mutable byte buffer. It grows the buffer as needed. You write to the buffer, then convert to a string once at the end. Because the buffer grows geometrically, the number of allocations grows roughly logarithmically with the final size instead of linearly with the number of writes.

package main

import (
	"fmt"
	"strings"
)

// BuildGreeting constructs a string from multiple parts.
// It uses strings.Builder to avoid repeated allocations.
func BuildGreeting(names []string) string {
	// Builder starts with a zero-length buffer.
	// It grows automatically as you write.
	var b strings.Builder

	for i, name := range names {
		if i > 0 {
			// WriteByte is efficient for single bytes.
			b.WriteByte(',')
			b.WriteByte(' ')
		}
		// WriteString copies the string into the builder's buffer.
		b.WriteString(name)
	}

	// String() returns the accumulated bytes as a string.
	// It does not copy: the result shares the builder's buffer.
	return b.String()
}

func main() {
	names := []string{"Alice", "Bob", "Charlie"}
	fmt.Println(BuildGreeting(names)) // Alice, Bob, Charlie
}

Convention aside: strings.Builder is the standard tool for mutable string construction. The community expects builders for heavy text assembly. Don't roll your own buffer management.

Decision: strings, slices, and runes

Use a string when you need immutable text, especially for keys in maps or arguments to functions. Use a []byte when you need to modify the content in place or perform heavy manipulation without allocating a new string for each change. Use a string slice s[start:end] when you want a zero-allocation view of a substring and the original data can stay alive. Use strings.Clone(s[start:end]) (Go 1.18+), or string([]byte(s[start:end])) on older versions, when you need to keep a substring but want the original large buffer to be garbage collected; string(s[start:end]) alone does not copy. Use a range loop when you need to iterate over Unicode characters in order; it decodes runes without allocating. Use []rune(s) when you need random access by character position, and accept that the conversion allocates four bytes per rune.
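As an illustration of the []rune case, reversing text needs access to characters by position from both ends, which a range loop does not give you. The following is a small sketch; reverseRunes is a hypothetical helper, and it reverses code points, so text with combining marks would need more careful handling.

```go
package main

import "fmt"

// reverseRunes is a hypothetical helper: it reverses s rune by rune.
// The []rune conversion allocates a slice with one element per rune.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r) // re-encodes the runes as UTF-8 in a new string
}

func main() {
	fmt.Println(reverseRunes("caf\u00e9")) // éfac
}
```

Reversing the bytes instead would swap the two bytes of 'é' and produce invalid UTF-8.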

Pitfalls and compiler errors

Try to modify a string and the compiler rejects the program with cannot assign to s[0]. The error is explicit. Strings are read-only. If you need mutation, convert to a byte slice, modify the slice, and convert back. The conversions copy the data.

// MutateString demonstrates the copy-modify-copy pattern.
// This is the standard safe way to change string content.
func MutateString(s string) string {
	// Convert to byte slice. This allocates a copy.
	b := []byte(s)

	// Guard against the empty string: b[0] would panic.
	if len(b) == 0 {
		return s
	}

	// Modify the slice.
	b[0] = 'X'

	// Convert back to string. This allocates another copy.
	return string(b)
}

Passing a *string is usually an anti-pattern. Strings are cheap to pass by value. The header is two words. Passing a pointer adds an indirection without saving memory. The parameter type should be string, not *string. The main valid reason for *string is that the value must be able to be nil to represent absence, for example an optional field that has to be distinguishable from an empty one. When that distinction does not matter, an empty string suffices.
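The one legitimate use of *string, telling "absent" apart from "present but empty", can be sketched as follows. The function displayName and its parameters are hypothetical names chosen for this illustration.

```go
package main

import "fmt"

// displayName is a hypothetical helper. nickname is a *string because
// nil (no nickname set) must differ from "" (nickname set to empty).
// name is a plain string: it is always present and cheap to pass by value.
func displayName(name string, nickname *string) string {
	if nickname == nil {
		return name // absent: fall back to the real name
	}
	return *nickname // present, even if empty
}

func main() {
	empty := ""
	nick := "Bobby"

	fmt.Println(displayName("Robert", nil))           // Robert
	fmt.Printf("%q\n", displayName("Robert", &empty)) // ""
	fmt.Println(displayName("Robert", &nick))         // Bobby
}
```

If the caller never needs to tell nil from "", the signature is simpler and safer as func displayName(name, nickname string) string.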

A string does not have a capacity. Slices have capacity because append can grow them in place. Strings cannot grow. The header has no capacity field, and calling cap(s) on a string does not compile. This simplifies the representation. You cannot "resize" a string in place. You must create a new string.

The worst string bug is the silent memory leak. You slice a large buffer, keep the slice, and the buffer never gets collected. The program runs out of memory. Always check lifetimes. If the source is large and temporary, copy the substring (for example with strings.Clone) before you store it.

Trust the header. Pass strings by value.

Where to go next