How to Read a File Line by Line in Go

Read a file line by line in Go using bufio.Scanner to process text efficiently without loading the entire file into memory.

Reading a file line by line

You have a 4 gigabyte log file from a production server. Opening it in your editor makes the fans spin up. Loading it into a slice of strings can kill your program with a fatal out-of-memory error. You need to read one line, process it, and discard it before moving to the next. The file stays on disk. Your RAM stays flat.

Go handles this with bufio.Scanner. Think of the scanner as a conveyor belt feeding a single workstation. The file is a mountain of raw bytes on the floor. The conveyor belt grabs a manageable chunk, slides it to the worker, and waits. The worker processes one item, signals completion, and the belt pulls the next chunk. You never see the whole mountain. The scanner manages the disk reads, splits the bytes at newlines, and hands you clean strings. It abstracts away partial reads and buffer management.

The scanner accepts an io.Reader. Go follows the rule "accept interfaces, return structs." By taking an interface, the scanner works with files, network streams, and in-memory buffers without changing a single line of logic. The community prefers this pattern because it keeps dependencies loose and testing straightforward.

Minimal example

Here is the standard loop: open the file, attach a scanner, iterate, and verify the exit condition.

package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

func main() {
	// Open returns a file handle that implements io.Reader.
	file, err := os.Open("data.txt")
	if err != nil {
		log.Fatal(err)
	}
	// Defer ensures the file descriptor closes even if processing panics.
	defer file.Close()

	// NewScanner wraps the reader. The internal buffer is allocated on the
	// first Scan and grows as needed, up to 64KB by default.
	scanner := bufio.NewScanner(file)

	// Scan advances to the next line. It returns false at EOF or on error.
	for scanner.Scan() {
		// Text returns the current line as a newly allocated string,
		// without the trailing newline.
		fmt.Println(scanner.Text())
	}

	// Err distinguishes a clean end-of-file from a disk failure or read error.
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}

The if err != nil pattern looks verbose. The community keeps it because it forces you to acknowledge failure paths instead of hiding them. Place defer file.Close() immediately after a successful open. It is the standard Go idiom for resource cleanup. Run gofmt on the file before committing. The tool enforces consistent indentation and spacing so you never waste time arguing about whitespace in code reviews.

The scanner buffers for you. You process line by line. Memory stays flat.

How the scanner works

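bufio.NewScanner itself does not allocate a data buffer. In the current standard library implementation, the first call to Scan() allocates a 4KB buffer, and the scanner grows it as needed up to the maximum token size, which defaults to 64KB (bufio.MaxScanTokenSize). This buffer holds raw bytes pulled from the reader. The scanner also stores a split function. The default is bufio.ScanLines, which looks for newline characters and strips the trailing \n or \r\n from each line.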

Calling scanner.Scan() triggers a cycle. The scanner checks if the buffer contains a complete token. If the buffer is empty or lacks a newline, it calls Read on the underlying file to fill the buffer. Once enough bytes arrive, the split function scans the slice, locates the delimiter, and returns the token boundaries. The scanner advances its internal cursor past the token and returns true.

The loop repeats until Scan() returns false. This happens for two reasons: the input ended cleanly, or an error occurred during reading. The loop structure deliberately merges both cases to keep the code tight. You must call scanner.Err() after the loop to check which path actually happened. It returns nil after a clean end-of-file and the underlying error otherwise. Skipping it means you might silently ignore a failed disk read or a line that exceeded the token limit.

Go naming conventions keep the API predictable. Public methods like Scan, Text, and Err start with a capital letter. Internal state like the buffer and cursor position starts lowercase. You interact only with the public surface. The compiler enforces this boundary strictly. Use a name you never declared and you get undefined: scanner. Import a package without using it and the build fails with a message such as "os" imported and not used. These messages are plain text. They do not use numeric codes. They tell you exactly what went wrong.

Public names start with a capital letter. Private names start lowercase. The compiler enforces the boundary.

Realistic usage: handling long lines

Production logs often contain stack traces or JSON payloads that exceed 64 kilobytes. The scanner enforces a default token limit, bufio.MaxScanTokenSize (64KB), to prevent malicious or malformed files from exhausting memory. If a line reaches that limit, Scan() returns false and scanner.Err() returns bufio.ErrTooLong, whose message is bufio.Scanner: token too long.

Here is how you adjust the limits for heavy workloads.

import (
	"bufio"
	"os"
	"strings"
)

// CountMatches scans a file and returns how many lines contain the target.
// It expands the buffer to safely handle lines larger than the default 64KB limit.
func CountMatches(path, target string) (int, error) {
	// Open the file for reading.
	file, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	// Release the file descriptor when the function exits.
	defer file.Close()

	scanner := bufio.NewScanner(file)
	// Allocate a 1MB backing slice and set the maximum token size to 1MB.
	// This prevents the scanner from rejecting long stack traces or JSON blobs.
	scanner.Buffer(make([]byte, 1024*1024), 1024*1024)

	count := 0
	for scanner.Scan() {
		// Text returns the current line without the trailing newline.
		if strings.Contains(scanner.Text(), target) {
			count++
		}
	}
	// Return the count and any error that occurred during scanning.
	return count, scanner.Err()
}

The Buffer method takes two arguments: an initial backing slice and a maximum token size. The slice provides the starting storage. The max size acts as a hard ceiling for a single token. Call Buffer before the first Scan(), because it panics once scanning has started. Setting both to 1MB lets the scanner accept lines of nearly 1MB without reallocating. The scanner reuses the same slice for every read.

If this function lives inside a web handler or a background worker, pass context.Context as the first parameter. The context convention dictates it goes first and is named ctx. Although os.Open does not accept a context, your wrapper should check ctx.Done() periodically or wrap the file operations in a cancellable scope. Context is plumbing. Run it through every long-lived call site.

Set the buffer size before you scan. Long lines stop the default scanner with an error.

Pitfalls and traps

The scanner hides a subtle trap that catches developers who store lines in a slice. scanner.Bytes() returns a byte slice that aliases the scanner's internal buffer, and the next call to Scan() may overwrite that buffer. If you append the result of scanner.Bytes() to a slice and keep looping, the stored elements can end up holding bytes from later lines. Copy the bytes explicitly if you need to keep them. scanner.Text() does not have this problem: it allocates a new string for every line, so its results are safe to keep. That safety costs one allocation per line, which is why hot loops often use Bytes() and copy only what they retain.

Another common mistake is skipping scanner.Err(). The loop exits on false for both success and failure. If you assume a clean exit, you might process partial data from a crashed read. The scanner also struggles with binary data. It expects text delimited by newlines or custom split functions. Binary protocols often contain bytes that mimic delimiters, which breaks the tokenization logic. Use bufio.Reader or encoding/binary for fixed-size records or raw byte streams.

Copy the bytes if you keep them. Check the error after the loop.

Custom split functions

You are not locked into lines. The scanner accepts any function matching the SplitFunc signature. This lets you parse CSV rows, JSON lines, or space-separated values without pulling in a heavy parsing library.

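Here is a split function that isolates words separated by a single space character. For splitting on any whitespace, the standard library already provides bufio.ScanWords.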

import (
	"bufio"
	"bytes"
)

// SplitWords breaks a byte slice into tokens separated by a space character.
// It returns the advance count, the token, and an error.
// Consecutive spaces produce empty tokens.
func SplitWords(data []byte, atEOF bool) (advance int, token []byte, err error) {
	// Nothing left at EOF: return no token so the scanner can stop.
	// Without this guard the scanner would receive empty tokens forever and panic.
	if atEOF && len(data) == 0 {
		return 0, nil, nil
	}
	// Locate the first space character in the buffer.
	i := bytes.IndexByte(data, ' ')
	if i == -1 {
		// No space found. If at EOF, return the remaining data as the final token.
		if atEOF {
			return len(data), data, nil
		}
		// Request more data from the underlying reader.
		return 0, nil, nil
	}
	// Advance past the space and return the word before it.
	return i + 1, data[:i], nil
}

The split function receives the current buffer and a boolean flag indicating whether the scanner reached the end of the stream. Returning 0, nil, nil tells the scanner to read more bytes. Returning a token advances the cursor and yields the result. You attach it with scanner.Split(SplitWords). The loop structure remains identical. The scanner handles buffering and state. You only define the boundaries.

The scanner splits data. You define the split. The loop stays the same.

Decision matrix

Use bufio.Scanner when you need to process text files line by line and the lines fit within a predictable buffer size. Use bufio.Reader with ReadString or ReadBytes when you need to handle arbitrarily long lines without preallocating a massive buffer. Use io.ReadAll when the file is small enough to fit in memory and you need random access to the entire content. Use filepath.WalkDir when you need to traverse a directory tree and process multiple files recursively.

Pick the tool that matches your data size. Scanner for lines. Reader for lines of unbounded length. ReadAll for small files.

Where to go next