How to Use bufio for Buffered Reading and Writing

The cost of knocking on the kernel's door

You open a two-gigabyte log file in Go. You loop through it, reading one line at a time, and print a summary. The program takes forty seconds. You swap the standard file read for a buffered version, and it finishes in four. Nothing changed about your parsing logic. The only difference is how often your program knocked on the operating system's door.

Every time you call Read or Write on a file, socket, or standard stream, Go hands control to the kernel. The kernel switches contexts, validates the request, moves data between kernel space and user space, and hands control back. That round trip costs on the order of a microsecond or more. Do it millions of times, once per line or once per byte of a large file, and those microseconds stack into seconds. bufio exists to batch those trips. It keeps a chunk of memory in your program, fills it once, and serves your reads from that chunk until it runs dry.

How buffering actually works

Think of bufio like a water cooler in an office. The unbuffered approach sends an employee to the basement well for every single glass of water. The buffered approach sends one person to fill a large cooler, and everyone else pours from it. When the cooler empties, someone refills it. The well gets disturbed far less often. The office stays productive.

In Go, this pattern wraps the io.Reader and io.Writer interfaces. You do not need to know what sits underneath. A file, an HTTP response body, a network socket, or a gzip stream all implement those interfaces. bufio treats them identically. It sits between your code and the underlying stream, intercepting small operations and batching them into larger system calls.

The standard library follows a common Go idiom here: accept interfaces, return structs. bufio.NewReader takes an io.Reader interface and returns a *bufio.Reader struct. This design lets you swap out the underlying stream without changing your parsing code. You write against the interface, and bufio handles the concrete type.

Minimal example

Here is the simplest way to wrap a reader and a writer with buffers.

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// Wrap stdin to batch small reads into larger OS calls
	reader := bufio.NewReader(os.Stdin)
	// Read until the first newline, pulling from memory first
	line, err := reader.ReadString('\n')
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}

	// Wrap stdout to batch writes and reduce terminal flush overhead
	writer := bufio.NewWriter(os.Stdout)
	// Write to the in-memory buffer, not directly to the terminal
	writer.WriteString("Echo: " + line)
	// Push the buffered data to the underlying writer and report any failure
	if err := writer.Flush(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}

The code above reads a line from standard input and echoes it to standard output. Notice the explicit Flush() call. The writer does not automatically push data to the terminal. It waits until its internal byte slice fills up, or until you tell it to empty. This design gives you control over when data actually leaves your program.

Buffered I/O requires explicit lifecycle management. Flush before you close the stream or exit the program. Go has no destructors, so nothing will flush a bufio.Writer on your behalf.

Walking through the memory flow

When bufio.NewReader runs, it allocates a default 4096-byte slice in user space. The first time you call ReadString, the buffer is empty. bufio calls the underlying os.Stdin.Read, which triggers a kernel system call to pull up to 4096 bytes from the terminal driver. Those bytes land in the buffer. bufio then scans its own slice for the newline character, returns a copy of the bytes up to and including the newline as a string, and updates an internal read pointer.

The next call to ReadString skips the kernel entirely if the buffer already holds another complete line. It scans the existing slice and advances the pointer. Only when the buffered bytes run out before a newline appears does bufio make another system call to refill. Writing works in reverse. WriteString copies bytes into an empty output buffer. The kernel sees nothing until Flush runs or the buffer hits capacity. At that point, bufio calls the underlying Write, which triggers a single system call for all the accumulated data.

This batching is why bufio shines with small, frequent operations. Reading one byte at a time from a network socket without buffering can drop throughput by an order of magnitude. The CPU spends more time waiting for the kernel than processing data.
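The effect is easy to observe without touching the kernel. The sketch below uses a hypothetical countingReader wrapper that counts how often Read reaches the underlying stream; each call stands in for one system call. An in-memory strings.Reader is assumed as the source, so the numbers illustrate call counts only, not real timings.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countingReader is a hypothetical wrapper that counts Read calls
// made against the underlying stream.
type countingReader struct {
	r     io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.r.Read(p)
}

// readBytewise reads size bytes one byte at a time and reports how many
// bytes arrived and how many Read calls reached the underlying stream.
func readBytewise(size int, buffered bool) (int, int) {
	cr := &countingReader{r: strings.NewReader(strings.Repeat("x", size))}
	var r io.Reader = cr
	if buffered {
		// bufio sits in between and serves single-byte reads from memory
		r = bufio.NewReader(cr)
	}
	one := make([]byte, 1)
	total := 0
	for {
		n, err := r.Read(one)
		total += n
		if err != nil {
			break
		}
	}
	return total, cr.calls
}

func main() {
	n1, direct := readBytewise(10000, false)
	n2, viaBufio := readBytewise(10000, true)
	fmt.Printf("unbuffered: %d bytes, %d underlying reads\n", n1, direct)
	fmt.Printf("buffered:   %d bytes, %d underlying reads\n", n2, viaBufio)
}
```

The unbuffered loop reaches the underlying stream at least once per byte, while the buffered loop reaches it only a handful of times.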

Go developers accept the verbose error checking and explicit flush calls because they make the unhappy path visible. A bare defer writer.Flush() discards the error that Flush returns, so a failed write goes unnoticed. When you call Flush yourself and check its result, you know exactly when the data leaves your process and whether it got there.

Realistic pipeline

Real programs rarely just echo input. They parse, transform, and route data. Here is a common pattern: reading a large configuration or log file, filtering lines, and writing the results to a new file.

package main

import (
	"bufio"
	"errors"
	"io"
	"strings"
)

// FilterLines reads from src, keeps lines starting with keyword, writes to dst.
// It accepts interfaces, so *os.File values work as arguments unchanged.
func FilterLines(src io.Reader, dst io.Writer, keyword string) error {
	// Wrap both streams to batch I/O operations
	reader := bufio.NewReader(src)
	writer := bufio.NewWriter(dst)

	for {
		// Read until newline; bufio refills its buffer as needed
		line, err := reader.ReadString('\n')
		if err != nil && !errors.Is(err, io.EOF) {
			return err
		}

		// At io.EOF, line may still hold a final line with no trailing newline
		if line != "" && strings.HasPrefix(line, keyword) {
			// Write to buffer; the actual write to dst happens later
			if _, werr := writer.WriteString(line); werr != nil {
				return werr
			}
		}

		if err != nil {
			// io.EOF means we reached the end of the stream cleanly
			break
		}
	}

	// Ensure all buffered data reaches dst before returning
	return writer.Flush()
}

This function demonstrates the typical lifecycle. You wrap the file handles, loop until io.EOF, process in memory, and flush at the end. The ReadString method handles the buffer refills transparently. You never manage the 4096-byte slice yourself. You just ask for a line, and bufio delivers it, whether it came from memory or a fresh system call.

bufio.Reader also exposes a Peek(n int) method that lets you look ahead without advancing the read pointer. This is invaluable when parsing binary protocols or custom text formats where you need to inspect the next few bytes to decide how to parse the rest. Peek returns a slice pointing directly into the internal buffer, so the bytes are only valid until the next read call. If you request more bytes than are currently loaded, bufio triggers a refill. If the stream ends before fulfilling the request, you get the bytes that were available along with io.EOF. If you request more bytes than the buffer can hold, you get bufio.ErrBufferFull. Peek takes an int, so the compiler rejects a call that passes a string or a slice instead of a length.
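Here is a small sketch of look-ahead parsing with Peek. The helper sniffFormat and its rule (a leading '{' means JSON, anything else means text) are assumptions made up for illustration; the point is that the peeked byte is still there when the line is read afterwards.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// sniffFormat is a hypothetical helper. It peeks at the first byte to
// guess the format, then reads the first line, which still contains
// the peeked byte because Peek does not consume input.
func sniffFormat(input string) (string, string, error) {
	r := bufio.NewReader(strings.NewReader(input))

	head, err := r.Peek(1)
	if err != nil {
		// An empty stream yields io.EOF here
		return "", "", err
	}

	kind := "text"
	if head[0] == '{' {
		kind = "json"
	}

	line, err := r.ReadString('\n')
	if err != nil && err != io.EOF {
		return "", "", err
	}
	return kind, line, nil
}

func main() {
	kind, line, err := sniffFormat("{\"level\":\"info\"}\nplain text\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("kind=%s first line=%q\n", kind, line)
}
```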

Buffer sizing matters less than you think. The default of 4096 bytes matches the page size on many systems. bufio.NewReaderSize and bufio.NewWriterSize let you pick a different size, but a larger buffer often brings only modest gains, because the bottleneck is usually disk latency or network round trips rather than buffer capacity. Stick to the default unless you have benchmarked a specific workload.
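If a benchmark does justify a different size, the constructor to reach for is bufio.NewReaderSize (or bufio.NewWriterSize for writers). The sketch below only shows how to request a size and confirm what you got; the 64 KiB figure is an arbitrary choice for illustration, not a recommendation.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// bufferSizes builds one reader with the default buffer and one with a
// caller-chosen size, then reports the buffer size of each.
func bufferSizes(custom int) (int, int) {
	def := bufio.NewReader(strings.NewReader(""))
	big := bufio.NewReaderSize(strings.NewReader(""), custom)
	return def.Size(), big.Size()
}

func main() {
	def, big := bufferSizes(64 * 1024)
	fmt.Println("default buffer:", def)
	fmt.Println("custom buffer: ", big)
}
```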

Pitfalls and silent failures

Buffered I/O introduces a few traps that trip up beginners. The most common is forgetting to flush a writer. If your program crashes or exits before Flush runs, the buffered data vanishes. The compiler will not stop you. bufio.Writer.Flush returns an error, but the type system allows you to ignore it. If you do, you get silent data loss. Always check the flush error, and call Flush before closing the underlying stream.
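The following sketch shows why the Flush error matters. It assumes a hypothetical failingWriter that rejects every write, standing in for a full disk or a closed connection. The buffered write itself reports success because the bytes only went into memory; the failure surfaces at Flush.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
)

// errDiskFull is a made-up error used only for this demonstration.
var errDiskFull = errors.New("disk full")

// failingWriter is a hypothetical io.Writer that rejects every write.
type failingWriter struct{}

func (failingWriter) Write(p []byte) (int, error) {
	return 0, errDiskFull
}

// writeReport buffers a short message and returns the errors from
// WriteString and Flush separately.
func writeReport() (error, error) {
	w := bufio.NewWriter(failingWriter{})
	// The message fits in the buffer, so the underlying writer is not called yet
	_, writeErr := w.WriteString("important data\n")
	// Flush is the first point where the underlying failure becomes visible
	flushErr := w.Flush()
	return writeErr, flushErr
}

func main() {
	writeErr, flushErr := writeReport()
	fmt.Println("WriteString error:", writeErr)
	fmt.Println("Flush error:", flushErr)
}
```

Code that checks only the WriteString result would conclude that everything was saved.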

Another trap is mixing buffered and unbuffered reads on the same stream. If you wrap a file in bufio.NewReader and then call file.Read directly, you will appear to lose data. The buffer already pulled ahead into memory, which moved the file offset past bytes your code has not yet seen. The direct read starts at that offset and skips everything still sitting in the buffer. The compiler stays silent as long as you keep a reference to the original file. At runtime you just get missing or out-of-order data. Stick to one reader per stream.
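This trap can be reproduced in a few lines. The sketch below assumes a short in-memory input so that bufio swallows the whole stream on its first refill; the helper name mixReads is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// mixReads reads one line through bufio, then reads directly from the
// underlying stream, then goes back to bufio. It returns what each of
// the three reads saw.
func mixReads(input string) (string, string, string, error) {
	src := strings.NewReader(input)
	br := bufio.NewReader(src)

	first, err := br.ReadString('\n')
	if err != nil {
		return "", "", "", err
	}

	// Bypass the buffer: src was already drained by bufio's refill
	direct, err := io.ReadAll(src)
	if err != nil {
		return "", "", "", err
	}

	// The "missing" line is still sitting inside the bufio.Reader
	buffered, err := br.ReadString('\n')
	if err != nil {
		return "", "", "", err
	}
	return first, string(direct), buffered, nil
}

func main() {
	first, direct, buffered, err := mixReads("line one\nline two\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("bufio read:  %q\n", first)
	fmt.Printf("direct read: %q\n", direct)
	fmt.Printf("bufio again: %q\n", buffered)
}
```

The direct read comes back empty even though a full line has not been processed yet.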

You will also encounter io.EOF constantly. It is not an error in the traditional sense. It is a signal that the stream ended cleanly. If you treat it like a fatal error, your program will abort on empty files or closed connections. Compare against io.EOF explicitly, preferably with errors.Is(err, io.EOF), to distinguish it from disk failures or network drops. Keep in mind that ReadString can return data together with io.EOF when the last line has no trailing newline.
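A read loop that treats io.EOF as a normal stop condition might look like the sketch below. The helper readLines is hypothetical and reads from an in-memory string for simplicity.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// readLines collects every line of input, including a final line that
// has no trailing newline. io.EOF ends the loop without being reported
// as a failure.
func readLines(input string) ([]string, error) {
	r := bufio.NewReader(strings.NewReader(input))
	var lines []string
	for {
		line, err := r.ReadString('\n')
		// Keep any data that arrived, even if it came with an error
		if line != "" {
			lines = append(lines, strings.TrimSuffix(line, "\n"))
		}
		if errors.Is(err, io.EOF) {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

func main() {
	lines, err := readLines("a\nb\nc")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(lines), "lines:", lines)
}
```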

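The standard library also provides bufio.Scanner as an alternative to bufio.Reader. Scanner grows its buffer automatically, but only up to a limit: by default a token longer than bufio.MaxScanTokenSize (64 KiB) stops the scan with bufio.ErrTooLong, and Scanner.Buffer lets you raise that limit. Scanner.Text allocates a new string for every line, which creates garbage collection pressure on massive files, while Scanner.Bytes returns a slice into the scanner's buffer that is valid only until the next call to Scan. bufio.Reader offers the same choice: ReadString and ReadBytes allocate a fresh copy on every call, while ReadSlice returns a view into the internal buffer that the next read overwrites. Pick the tool that matches your memory budget.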
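As a rough illustration of Scanner with a raised line limit and allocation-free access, consider the sketch below. The helper longestLine and its maxLine parameter are hypothetical; the initial 4096-byte buffer is an arbitrary starting size.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// longestLine returns the length in bytes of the longest line in input.
// maxLine caps how far the scanner may grow its buffer.
func longestLine(input string, maxLine int) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	// Start with a 4096-byte buffer and allow growth up to maxLine
	sc.Buffer(make([]byte, 0, 4096), maxLine)

	longest := 0
	for sc.Scan() {
		// Bytes returns a view into the scanner's buffer; no string is
		// allocated, but the slice is only valid until the next Scan
		if n := len(sc.Bytes()); n > longest {
			longest = n
		}
	}
	// Err reports bufio.ErrTooLong if a line exceeded the limit
	return longest, sc.Err()
}

func main() {
	n, err := longestLine("ab\ncdef\ngh\n", 1024*1024)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("longest line:", n)
}
```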

Goroutines and buffers do not mix safely without synchronization. bufio.Reader and bufio.Writer are not concurrency-safe. If you share a buffered stream across multiple goroutines, you will get data races and corrupted output. Wrap the stream in a mutex, or give each goroutine its own buffered reader/writer.
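One way to share a buffered writer is a thin mutex-guarded wrapper, sketched below. The type safeWriter and its methods are hypothetical names, and a bytes.Buffer stands in for the real destination.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"sync"
)

// safeWriter is a hypothetical wrapper that serializes access to a
// bufio.Writer so several goroutines can share it.
type safeWriter struct {
	mu sync.Mutex
	w  *bufio.Writer
}

// WriteLine appends one line to the buffer under the lock.
func (s *safeWriter) WriteLine(line string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, err := s.w.WriteString(line + "\n")
	return err
}

// Flush pushes buffered data to the underlying writer under the lock.
func (s *safeWriter) Flush() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.w.Flush()
}

// writeConcurrently starts n goroutines that each write one line, then
// flushes and returns the number of lines that reached the destination.
func writeConcurrently(n int) (int, error) {
	var out bytes.Buffer
	sw := &safeWriter{w: bufio.NewWriter(&out)}

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// A write error is remembered by bufio and returned by Flush
			_ = sw.WriteLine(fmt.Sprintf("worker %d", id))
		}(i)
	}
	wg.Wait()

	if err := sw.Flush(); err != nil {
		return 0, err
	}
	return bytes.Count(out.Bytes(), []byte("\n")), nil
}

func main() {
	lines, err := writeConcurrently(50)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("lines written:", lines)
}
```

The lock keeps each line intact, but the order of lines still depends on goroutine scheduling.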

When to reach for bufio

Use bufio.Reader when you need to read small chunks repeatedly from a file, socket, or pipe. Use bufio.Scanner when you only need line-by-line iteration and want the standard library to handle line splitting and buffer growth. Use direct io.Reader calls when you are moving large blocks of data at once, like copying entire files with io.Copy. Use bufio.Writer when you are generating many small writes and want to batch them into fewer system calls. Use direct io.Writer calls when each write must leave your process immediately, like logging critical errors or writing to a terminal that requires instant feedback.

Buffering is a trade-off between latency and throughput. Pick the one that matches your workload.

Recap

The bufio package puts a buffer between your program and the data source or destination. That cuts the number of slow system calls by reading or writing data in larger chunks. Think of it like carrying water in a bucket instead of a spoon: you make fewer trips to move the same amount.