How to Work with Memory-Mapped Files in Go

When disk I/O becomes a bottleneck

You are parsing a two-gigabyte log file. You open it, read line by line, and watch the CPU sit idle while the disk grinds. You switch to bufio.Reader, and it helps, but random access still means seeking and buffering. The operating system already has a better idea. It can pretend the file is just a slice of bytes sitting in RAM. You ask for page 42, the OS loads it from disk into the page cache, and hands you a pointer. You ask for page 100, it loads that too. The file becomes a memory slice without you writing a single buffer.

How memory mapping actually works

Memory mapping bridges the gap between disk files and RAM. Instead of copying data from a file descriptor into a Go slice, you ask the kernel to map the file directly into your process's virtual address space. The kernel hands back an address, and Go's syscall.Mmap wraps it in a []byte. When you read an index, the CPU triggers a page fault if that page isn't in RAM yet. The kernel intercepts the fault, reads the corresponding disk block into the page cache, updates the page tables, and resumes your program. You never see the fault. You just get the byte.

Go does not ship a portable, high-level API for this. The standard library's syscall package exposes the raw operating system calls. On Linux and macOS, that means syscall.Mmap and syscall.Munmap. The syscall package is frozen, and golang.org/x/sys/unix is the maintained home for the same calls. Windows uses a different API, which is why cross-platform code usually wraps these calls or leans on a third-party package.

The minimal mapping

Here is the bare minimum to map a file for reading. You open the file, grab its size, and hand the file descriptor to syscall.Mmap.

package main

import (
	"os"
	"syscall"
)

// mmapRead maps a file into memory for read-only access.
func mmapRead(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return nil, err
	}

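	// MAP_PRIVATE creates a copy-on-write mapping, so writes would never reach the file.
	// PROT_READ restricts access to reads only; storing to the slice would crash.
	// Mmap fails with EINVAL when the file is empty.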
	data, err := syscall.Mmap(int(f.Fd()), 0, int(info.Size()), syscall.PROT_READ, syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}

	return data, nil
}

The function returns a []byte backed directly by the kernel's page cache. Indexing data[100] reads from the file. Slicing data[100:200] creates a new slice header pointing to the same underlying memory. The deferred f.Close() releases the file descriptor when mmapRead returns, but the mapping does not depend on it: the mapping stays alive until you explicitly tear it down by passing the slice to syscall.Munmap.
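As a minimal sketch of that lifetime rule, the hypothetical helper below (mapAfterClose is an illustrative name, not part of any library) writes a string to a temporary file, maps it, closes the descriptor, and only then reads through the mapping. It assumes a Unix-like system and non-empty content, since mapping zero bytes fails.

```go
package main

import (
	"errors"
	"os"
	"syscall"
)

// mapAfterClose writes content to a temporary file, maps the file,
// closes the descriptor, and then reads the bytes back through the mapping.
// The name and the temporary-file setup are illustrative only.
func mapAfterClose(content string) (string, error) {
	if content == "" {
		return "", errors.New("cannot map zero bytes")
	}

	f, err := os.CreateTemp("", "mmap-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())

	if _, err := f.WriteString(content); err != nil {
		f.Close()
		return "", err
	}

	data, err := syscall.Mmap(int(f.Fd()), 0, len(content), syscall.PROT_READ, syscall.MAP_PRIVATE)
	if err != nil {
		f.Close()
		return "", err
	}
	// Munmap must receive the slice that Mmap returned.
	defer syscall.Munmap(data)

	// Closing the descriptor does not invalidate the mapping.
	if err := f.Close(); err != nil {
		return "", err
	}

	// string(data) copies the mapped bytes onto the Go heap,
	// so the result stays valid after the deferred Munmap runs.
	return string(data), nil
}
```

Calling mapAfterClose("hello, mmap") should hand back the same string, even though the read happens after the descriptor is gone.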

What happens under the hood

When syscall.Mmap returns, no data has actually moved from the disk yet. The kernel just reserves virtual address space and sets up page table entries marked as not present. Your program touches data[0]. The CPU raises a page fault. The kernel catches it, checks the mapping, reads the first disk block into RAM, marks the page as present, and lets your program continue. This lazy loading is why memory mapping feels instant.
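To make the paging arithmetic concrete, here is a small sketch that counts how many pages a byte range spans. The function name pagesTouched is hypothetical, and the page size is passed in by the caller (os.Getpagesize() would supply it in practice). Treat the result as the number of pages involved, not an exact fault count: kernels may map several neighboring pages per fault, and pages can be evicted and faulted in again.

```go
package main

// pagesTouched reports how many pages the byte range
// [offset, offset+length) spans for the given page size.
// It returns 0 for empty or invalid ranges.
func pagesTouched(offset, length, pageSize int64) int64 {
	if offset < 0 || length <= 0 || pageSize <= 0 {
		return 0
	}
	first := offset / pageSize             // page holding the first byte
	last := (offset + length - 1) / pageSize // page holding the last byte
	return last - first + 1
}
```

With 4096-byte pages, a two-byte read at offset 4095 straddles a page boundary and therefore involves two pages.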

Writing through a mapping requires syscall.PROT_WRITE. To make the changes reach the file you also need syscall.MAP_SHARED, and the file must be opened read-write (os.O_RDWR). MAP_SHARED pushes changes back to the disk file. MAP_PRIVATE keeps modifications in RAM only, which is useful for temporary scratch space backed by a file. When you are finished, you must call syscall.Munmap(data). Forgetting to unmap leaks virtual address space. The kernel reclaims it when the process exits, but a long-running service that keeps leaking mappings can eventually run out of address space or hit the operating system's limit on mappings per process, at which point further Mmap calls fail.
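A writable shared mapping might look like the sketch below. The helper names overwriteFirstByte and sharedWriteDemo are hypothetical, and the code assumes a Unix-like system. It leaves out msync and fsync, so it shows that the change becomes visible in the file, not that it is durable across a power loss.

```go
package main

import (
	"errors"
	"os"
	"syscall"
)

// overwriteFirstByte maps path read-write with MAP_SHARED and replaces
// the first byte of the file in place.
func overwriteFirstByte(path string, b byte) error {
	// A writable shared mapping needs a descriptor opened for reading and writing.
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return err
	}
	if info.Size() == 0 {
		return errors.New("cannot map an empty file")
	}

	data, err := syscall.Mmap(int(f.Fd()), 0, int(info.Size()),
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		return err
	}

	// The store lands in the shared page; the kernel writes it back to the file.
	data[0] = b

	return syscall.Munmap(data)
}

// sharedWriteDemo runs overwriteFirstByte against a temporary file and
// returns the file's contents afterwards.
func sharedWriteDemo() (string, error) {
	f, err := os.CreateTemp("", "mmap-write-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())

	if _, err := f.WriteString("hello"); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	if err := overwriteFirstByte(f.Name(), 'j'); err != nil {
		return "", err
	}
	out, err := os.ReadFile(f.Name())
	return string(out), err
}
```

Swapping MAP_SHARED for MAP_PRIVATE in the same helper would leave the file untouched, because the store would land in a private copy of the page.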

Go's garbage collector does not manage memory-mapped regions. The slice header is an ordinary Go value, but its backing array lives outside the Go heap, in memory the kernel manages. The GC never scans or frees that region, and dropping the last reference to the slice does not unmap it. You control the lifecycle with Munmap.

A production-ready wrapper

Production code needs error handling, bounds checks on the requested range, and safe unmapping. Here is a wrapper that maps a file, copies out a chunk, and unmaps before returning.

package main

import (
	"fmt"
	"os"
	"syscall"
)

// readMappedChunk reads a specific byte range from a memory-mapped file.
func readMappedChunk(path string, offset, length int64) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
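	// The deferred Close runs on return. A mapping holds its own reference
	// to the file, so it would stay valid even after the descriptor is closed.
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return nil, err
	}

	// Reject ranges that start outside the file, then clamp the end to the file size.
	if offset < 0 || length <= 0 || offset >= info.Size() {
		return nil, fmt.Errorf("invalid range: offset %d, length %d, file size %d", offset, length, info.Size())
	}
	if length > info.Size()-offset {
		length = info.Size() - offset
	}

	// mmap requires the file offset to be a multiple of the page size,
	// so map from the preceding page boundary and skip the extra bytes.
	pad := offset % int64(os.Getpagesize())

	// MAP_SHARED makes updates written by other processes visible through the mapping.
	data, err := syscall.Mmap(int(f.Fd()), offset-pad, int(pad+length), syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		return nil, fmt.Errorf("mmap: %w", err)
	}

	// Copy the data into a standard Go slice before unmapping.
	// Mapped memory becomes invalid once Munmap runs.
	result := make([]byte, length)
	copy(result, data[pad:])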

	// Free the virtual memory mapping.
	if err := syscall.Munmap(data); err != nil {
		return nil, fmt.Errorf("munmap: %w", err)
	}

	return result, nil
}

Notice the copy step. Memory-mapped slices are only valid while the mapping exists. If you return data directly and call Munmap later, any remaining reference points at unmapped memory, and touching it crashes the program with a segmentation fault. Copying to a standard []byte gives you a GC-managed slice that lives independently of the mapping. The deferred f.Close() runs when the function returns, after Munmap, but the order does not matter: the kernel keeps its own reference to the file as long as the mapping exists.

Where memory mapping breaks

Memory mapping is fast, but it carries hidden costs. The biggest trap is assuming it beats bufio for sequential reads. If you read a file from start to finish, bufio or io.Copy will often match or outperform mmap: plain reads move data in large chunks per system call, while a mapping can pay for a page fault on each page it touches for the first time. Memory mapping shines for random access, partial reads, or files updated by external processes.

Platform differences break builds. syscall.Mmap exists only on Unix-like systems. Windows requires syscall.CreateFileMapping and syscall.MapViewOfFile. If you compile for Windows without guards, the compiler fails with undefined: syscall.Mmap. Cross-platform projects usually put these calls in files guarded by build constraints or use a library like github.com/edsrzf/mmap-go.
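One way to keep the Unix calls out of a Windows build is a build constraint at the top of the file. The sketch below assumes Go 1.19 or later (where the unix constraint is available) and a hypothetical file name such as mmap_unix.go. The function names mapFD and unmap are illustrative. A sibling file tagged //go:build windows would have to provide the same two functions on top of CreateFileMapping and MapViewOfFile.

```go
//go:build unix

package main

import "syscall"

// mapFD maps the first length bytes of an open descriptor read-only.
func mapFD(fd uintptr, length int) ([]byte, error) {
	return syscall.Mmap(int(fd), 0, length, syscall.PROT_READ, syscall.MAP_PRIVATE)
}

// unmap releases a mapping created by mapFD.
func unmap(data []byte) error {
	return syscall.Munmap(data)
}
```

The rest of the program then calls mapFD(f.Fd(), size) and unmap(data) and never mentions the syscall package directly.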

Alignment matters on some architectures. Reading a 64-bit integer from an odd byte offset in a mapped slice can trigger a bus error on strict-alignment CPUs. Go's binary.Read or binary.LittleEndian.Uint64 is defined in terms of individual bytes, so it is safe at any offset, but casting the address of a slice element to *uint64 through unsafe.Pointer requires that address to be aligned. The compiler will not catch misalignment at compile time. You get a crash or corrupted data at run time.
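A minimal sketch of the safe route: decode through encoding/binary with an explicit bounds check. The helper name readUint64At is hypothetical, and it assumes the data is stored little-endian.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readUint64At decodes a little-endian uint64 starting at any byte offset,
// aligned or not. It works on mapped slices and ordinary slices alike.
func readUint64At(data []byte, offset int) (uint64, error) {
	if offset < 0 || offset > len(data)-8 {
		return 0, fmt.Errorf("offset %d out of range for %d bytes", offset, len(data))
	}
	return binary.LittleEndian.Uint64(data[offset:]), nil
}
```

For a nine-byte slice whose second byte is 0x01 and whose later bytes are zero, readUint64At(buf, 1) returns 1 even though offset 1 is not 8-byte aligned.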

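File size changes break mappings. If another process truncates the file while your mapping is active, touching a page beyond the new end of the file raises SIGBUS. By default the Go runtime treats that as a fatal error and terminates the program, and recover cannot intercept it unless the goroutine has called debug.SetPanicOnFault(true) from runtime/debug. Checking the file size before mapping does not protect against a later truncation, so avoid mapping files that other processes may shrink, or copy out the data you need promptly and guard the access.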
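A guarded access could be sketched as follows. runtime/debug.SetPanicOnFault is a real standard-library function that applies only to the calling goroutine. The helper name readByteGuarded is hypothetical. Note that the recover here also catches ordinary panics such as an out-of-range index, so the returned error does not distinguish the two cases.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// readByteGuarded returns data[i]. If the access faults (for example
// because the mapped file was truncated) or panics for another reason,
// it returns an error instead of crashing the program.
func readByteGuarded(data []byte, i int) (b byte, err error) {
	// Ask the runtime to turn faults at non-nil addresses into panics,
	// and restore the previous setting on the way out.
	old := debug.SetPanicOnFault(true)
	defer debug.SetPanicOnFault(old)

	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("read at index %d failed: %v", i, r)
		}
	}()

	return data[i], nil
}
```

Guarding every single byte this way would be slow. In practice the same pattern would wrap a whole parsing pass over the mapping.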

When to reach for mmap

Use os.ReadFile when the file fits comfortably in RAM and you need the entire contents at once. Use bufio.Reader when you stream a file sequentially from start to finish. Use io.Copy when you move data between readers and writers without caring about the contents. Use syscall.Mmap when you need fast random access to specific byte ranges in a large file. Use MAP_SHARED with mmap when another process updates the file and you want to see changes without reopening the descriptor. Use a standard []byte copy when you need to hold data after the file mapping is torn down.

Convention aside

Go favors explicit resource management. The defer f.Close() pattern is standard, but memory mappings require a second cleanup step. The community convention is to pair every Mmap with a corresponding Munmap in the same scope, usually via a defer or a custom type with a Close() method. Don't fight the boilerplate. Explicit cleanup prevents address space leaks in long-running services.
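The custom-type variant of that convention could look like this sketch. MappedFile and OpenMapped are hypothetical names, and the code assumes a Unix-like system and a non-empty file.

```go
package main

import (
	"errors"
	"os"
	"syscall"
)

// MappedFile owns a read-only mapping and releases it in Close.
type MappedFile struct {
	data []byte
}

// OpenMapped maps the whole file at path for reading.
func OpenMapped(path string) (*MappedFile, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	// The mapping outlives the descriptor, so it can be closed on return.
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return nil, err
	}
	if info.Size() == 0 {
		return nil, errors.New("cannot map an empty file")
	}

	data, err := syscall.Mmap(int(f.Fd()), 0, int(info.Size()), syscall.PROT_READ, syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	return &MappedFile{data: data}, nil
}

// Bytes returns the mapped bytes. The slice must not be used after Close.
func (m *MappedFile) Bytes() []byte { return m.data }

// Close unmaps the file. Calling it a second time is a no-op.
func (m *MappedFile) Close() error {
	if m.data == nil {
		return nil
	}
	err := syscall.Munmap(m.data)
	m.data = nil
	return err
}
```

A caller opens the mapping with OpenMapped, checks the error, and writes defer m.Close() on the next line, the same shape as the familiar defer f.Close().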

Memory mapping hands you the kernel's page cache directly. Treat it like a loan, not a gift.

Where to go next

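Memory-mapped files let your program address a file's contents as a byte slice backed by the kernel's page cache, without first copying the data into a buffer of your own. This is like opening a book and reading it directly instead of photocopying every page first. Reach for the technique when you need fast random access to very large files without loading them whole. From here, try wrapping a mapping in a type with a Close() method, and look at github.com/edsrzf/mmap-go if the code has to run on Windows too.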