How to Use xiter Patterns for Iterator Utilities

The problem with loading everything at once

You open a two-gigabyte application log to find a specific error pattern. In Python, you write a generator. In JavaScript, you use an async iterator. Both stream data one line at a time. Go used to force you into a different pattern. You either built a channel and spawned a goroutine to feed it, or you wrote a struct with a Next() method and a Done() flag. Both approaches worked, but both carried overhead. A channel costs goroutine scheduling and synchronization on every value, and the producer goroutine can leak if the consumer stops reading early. Custom iterator structs require boilerplate methods and explicit state management. Go 1.23 added range-over-function loops and the iter package, whose iter.Seq type captures the yield callback pattern. It gives you lazy iterators that produce values on demand without building intermediate slices or spawning background goroutines.

How the yield callback actually works

Lazy evaluation means you compute values only when the caller asks for them. The iter.Seq[T] type is not a magic container. It is a function type that takes a callback and returns nothing. That callback is the communication line between your iterator and the range loop. You pass a value to the callback. The callback runs the loop body, then returns a boolean. true means keep going. false means the loop body exited early, through break, return, or similar. Your iterator must then stop and return without calling yield again; calling it after it has returned false causes a run-time panic.

Think of it like a factory assembly line where the next station only pulls a part when it is ready to work. The factory does not pile up inventory. It waits for the signal. The yield parameter is just a regular function variable. Go does not use yield as a keyword. You can name it whatever you want, though yield is the community standard because it matches the mental model from other languages. For iter.Seq[T], the callback signature is always func(T) bool. The T is the value you produce. The bool is the control signal from the range loop.
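To make the "regular function variable" point concrete, here is a minimal sketch that drives an iterator by hand, with no range loop at all. Countdown and firstTwo are hypothetical names invented for this illustration; the function literal passed to seq stands in for the loop body.

```go
package main

import (
	"fmt"
	"iter"
)

// Countdown is a hypothetical iterator used only for illustration.
// It yields n, n-1, ..., 1 and stops as soon as yield reports false.
func Countdown(n int) iter.Seq[int] {
	return func(yield func(int) bool) {
		for i := n; i > 0; i-- {
			if !yield(i) {
				return
			}
		}
	}
}

// firstTwo calls the iterator directly instead of ranging over it.
// The function literal plays the role of the loop body.
func firstTwo(seq iter.Seq[int]) []int {
	var got []int
	seq(func(v int) bool {
		got = append(got, v)
		return len(got) < 2 // returning false has the same effect as break
	})
	return got
}

func main() {
	fmt.Println(firstTwo(Countdown(5))) // [5 4]
}
```

Calling seq with your own callback is legal Go, which is a useful reminder that nothing here is special syntax. In everyday code you would still write the range loop.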

A minimal iterator from scratch

Here is the simplest line iterator. It splits a byte slice on newlines and hands each line to the caller one at a time.

import (
	"bytes"
	"iter"
)

func Lines(s []byte) iter.Seq[[]byte] {
	// Return a function that matches the iter.Seq signature.
	// A range loop over the result passes its body in as the yield callback.
	return func(yield func([]byte) bool) {
		// Loop until the remaining slice is empty.
		for len(s) > 0 {
			var line []byte
			// Find the next newline. If found, split before and after it.
			if i := bytes.IndexByte(s, '\n'); i >= 0 {
				line, s = s[:i+1], s[i+1:]
			} else {
				// No more newlines. Take the rest and clear the source.
				line, s = s, nil
			}
			// Cap the slice capacity at its length so that an append by the
			// caller allocates a new array instead of overwriting the next line.
			if !yield(line[:len(line):len(line)]) {
				return
			}
		}
	}
}

Walking through the execution

When you write for line := range Lines(data), the compiler translates that into a direct function call. It wraps the loop body in a function and passes it as the yield argument. Your function starts running. It finds the first newline, slices the data, and calls yield(line). Your function now waits on that call, as it would on any other function call. The wrapped loop body runs with your line. When the body finishes, the callback returns true. Your function resumes exactly where it left off. It loops back, finds the next newline in what remains of the slice, and calls yield again.

This continues until the data runs out. If the loop body calls break, the callback returns false. Your function sees false, returns early, and the iteration ends. The entire process happens in a single goroutine. For simple iterators like this one, the compiler can often inline both the iterator and the callback, so the cost can approach that of a hand-written for loop. Treat that as an optimization rather than a guarantee: a closure that escapes may still be heap-allocated, so measure before you rely on it. There is no channel send and no mutex. The range keyword simply became syntactic sugar for calling a function that accepts a callback.

Real-world data processing

Real code rarely just splits strings. You usually filter, transform, or validate. Here is a log parser that skips empty lines, trims whitespace, and converts each line to a structured record. It demonstrates how the callback pattern composes naturally with standard library functions.

type LogEntry struct {
	Timestamp string
	Message   string
}

func ParseLogs(raw []byte) iter.Seq[LogEntry] {
	return func(yield func(LogEntry) bool) {
		// Reuse the Lines iterator to avoid duplicating split logic.
		for line := range Lines(raw) {
			// Trim leading and trailing whitespace from each line.
			trimmed := bytes.TrimSpace(line)
			// Skip blank lines early to save parsing work.
			if len(trimmed) == 0 {
				continue
			}
			// Split on the first space to separate timestamp from message.
			parts := bytes.SplitN(trimmed, []byte(" "), 2)
			// Skip lines that have no message after the timestamp.
			if len(parts) < 2 {
				continue
			}
			// Hand the structured record to the range loop.
			// Stop immediately if the caller breaks out.
			if !yield(LogEntry{
				Timestamp: string(parts[0]),
				Message:   string(parts[1]),
			}) {
				return
			}
		}
	}
}

The pattern encourages composition. You can range over one iter.Seq inside another iter.Seq function. Each layer is an ordinary function call nested inside the next, so each one does work only when the layer above asks for a value. The caller only sees the final output type. The intermediate steps remain lazy. This matches Go's broader preference for passing behavior as functions rather than defining complex interfaces. You accept a callback, hand it concrete values, and let the compiler handle the wiring.

Where the pattern breaks

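The pattern has strict boundaries. The first one concerns reuse. The iter.Seq type itself does not say whether a sequence can be ranged over more than once; that depends on how the iterator is written. Lines above is single-use, because its closure consumes the captured slice s as it goes. Range over the same value a second time and you get an empty loop, or, if the first loop ended with break, a loop that picks up where the first one stopped. The compiler does not stop you from writing the second range, and you cannot reset the iterator. If you need multiple passes, call the constructor again or materialize the results into a slice first.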

Another trap is concurrency. An iterator that keeps mutable state in its closure, as Lines does, is not safe for concurrent use. Ranging over the same iterator value from two goroutines is a data race. The yield callback is likewise not meant to be called from several goroutines at once. If you need concurrent processing, range over the iterator in one goroutine and hand the values to a channel or a worker pool.

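Memory aliasing also trips people up. When you slice a byte slice, the new slice shares the underlying array. If you yield s[:i+1] without capping the capacity, the line's capacity extends over the rest of the buffer, so an append by the caller silently overwrites the bytes of the following lines. The three-index form line[:len(line):len(line)] sets the capacity equal to the length, which forces any append to copy into a new array. This is a common Go convention for streaming parsers. It does not free memory, though: every yielded line still points into the original buffer and keeps the whole array reachable for the garbage collector. If a few lines must outlive a large buffer, copy them, for example with bytes.Clone.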
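What the capacity cap changes in practice is the behavior of append. The following sketch, built around a hypothetical appendBang helper, appends one byte to the first line of a small buffer, once through an uncapped slice and once through a capped one, and then inspects the original buffer.

```go
package main

import "fmt"

// appendBang appends '!' to a slice covering the first three bytes of a
// fresh buffer and returns the buffer's contents afterwards.
// Hypothetical helper written for this example.
func appendBang(capped bool) string {
	buf := []byte("one\ntwo\n")
	var line []byte
	if capped {
		line = buf[:3:3] // capacity 3: append must allocate a new array
	} else {
		line = buf[:3] // capacity reaches to the end of buf: append writes into buf
	}
	_ = append(line, '!')
	return string(buf)
}

func main() {
	fmt.Printf("%q\n", appendBang(false)) // "one!two\n"
	fmt.Printf("%q\n", appendBang(true))  // "one\ntwo\n"
}
```

In the uncapped case the newline after "one" is overwritten, which would corrupt the next line an iterator hands out. In the capped case the buffer is untouched. In both cases line still refers to the original array, so the cap protects the data but does not release any memory.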

If you try to use iter.Seq on Go 1.22 or earlier, the build fails because the iter package is not in that standard library. The package only exists in 1.23+. The go directive in go.mod must also say 1.23 or later before the compiler accepts a range over a function. If you accidentally return a slice instead of the callback function, you get a type mismatch error along the lines of cannot use line (variable of type []byte) as iter.Seq[[]byte] value in return statement. The language enforces the contract at compile time. If you forget to call yield inside your loop, the function runs to completion without producing any values, and the range loop body never executes. The compiler will not warn you about missing yields. It trusts you to drive the callback.

Choosing the right iteration tool

Use iter.Seq when you need a lazy, single-pass pipeline that processes data without intermediate allocations. Use a slice when the dataset fits comfortably in memory and you need random access or multiple passes. Use a channel when you must bridge independent goroutines or handle backpressure between producers and consumers. Use a traditional Next() struct when you need to maintain complex internal state across multiple independent iteration sessions. Use plain sequential code when you don't need concurrency: the simplest thing that works is usually the right thing.

Where to go next

xiter patterns let you loop over data one piece at a time without loading everything into memory first. Think of it like reading a book page by page instead of photocopying the whole book before starting. Reach for them when you process large datasets or streams where memory efficiency matters.