The memory bomb
You write a backup script. It works perfectly on your test files. You point it at a 40GB database dump. Your machine freezes, the fan screams, and the process vanishes. The OOM killer terminated you. You used os.ReadFile to load the entire file into RAM before writing it out. Go tried to allocate 40GB of memory, failed, and gave up.
Copying files requires streaming data, not loading it all at once. Go provides io.Copy for this exact purpose. It reads chunks from a source, writes chunks to a destination, and repeats until the stream ends. Memory usage stays flat regardless of file size.
Streaming with interfaces
io.Copy works because Go defines two fundamental interfaces: io.Reader and io.Writer.
A Reader has one method: Read(p []byte) (n int, err error). It fills the byte slice with data and returns how many bytes it read into it. A Writer has one method: Write(p []byte) (n int, err error). It takes bytes, sends them somewhere, and returns how many it managed to write.
Files implement both. Network connections implement both. A string becomes a Reader once you wrap it with strings.NewReader. Gzip covers both sides with separate types: gzip.Reader for decompressing and gzip.Writer for compressing. io.Copy accepts any Writer and any Reader. It doesn't care where the data comes from or where it goes. It just moves bytes.
Think of a bucket brigade. You don't carry the whole river to the other side. You pass buckets down the line. io.Copy manages the buckets. It allocates a buffer, fills it from the source, empties it into the destination, and repeats. The buffer size determines how much water moves per trip.
Minimal copy
Here's the standard pattern. Open the source for reading, create the destination for writing, stream the data, and close the files.
package main

import (
	"io"
	"os"
)

// copyFile streams data from src to dst with constant memory usage.
func copyFile(src, dst string) error {
	// Open source for reading. os.Open returns *os.File which implements io.Reader.
	source, err := os.Open(src)
	if err != nil {
		return err
	}
	// Defer close releases the file descriptor when the function returns.
	defer source.Close()

	// Create destination for writing. os.Create truncates any existing file at this path.
	dest, err := os.Create(dst)
	if err != nil {
		return err
	}
	// Defer close ensures the destination file descriptor releases.
	defer dest.Close()

	// io.Copy streams from source to dest; its generic loop uses a 32KB buffer.
	// It returns the number of bytes copied and any error encountered.
	_, err = io.Copy(dest, source)
	if err != nil {
		return err
	}
	return nil
}
The defer statements are idiomatic Go. They guarantee cleanup even if an error occurs mid-function. The community accepts the boilerplate of if err != nil { return err } because it makes the unhappy path visible at every step. You can't accidentally swallow an error with a silent return.
io.Copy is the hose. Don't carry the ocean.
What happens under the hood
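When you call io.Copy(dest, source), it first looks for a shortcut: if the source implements io.WriterTo or the destination implements io.ReaderFrom, it delegates to that method. *os.File takes this route, which lets some platforms do the copy inside the kernel. Otherwise io.Copy allocates a 32KB byte slice once and enters a loop:
- Call source.Read(buf). The OS fills the buffer with data from the file. Read returns the number of bytes filled and an error. When the end of the file is reached, Read returns io.EOF.
- Call dest.Write(buf[:n]). The OS takes the bytes and writes them to the destination. Write returns the number of bytes written and an error.
- Repeat until Read returns io.EOF or an error occurs. io.Copy treats io.EOF as a normal end of stream and returns a nil error for it.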
The 32KB buffer is a sweet spot. Small buffers cause too many system calls, which adds overhead. Large buffers waste memory and can cause latency spikes. 32KB balances throughput and resource usage for most workloads.
If you pass the wrong types, the compiler catches it immediately. If you try to pass a string literal as the source, the compiler rejects the program with an error along the lines of cannot use "filename" (untyped string constant) as io.Reader value in argument to io.Copy: string does not implement io.Reader (missing method Read). Go's type system forces you to work with the right abstractions.
Production-ready copy
Real code needs more than basic streaming. You want error wrapping for debugging, disk synchronization for durability, and control over buffer size for performance.
Here's a robust implementation that wraps errors, uses a larger buffer, and forces data to disk.
package main

import (
	"fmt"
	"io"
	"os"
)

// RobustCopy copies a file with error wrapping, custom buffer, and disk sync.
func RobustCopy(src, dst string) error {
	// Open source with read-only access.
	source, err := os.Open(src)
	if err != nil {
		// Wrap error to include the filename for easier debugging.
		return fmt.Errorf("open source %s: %w", src, err)
	}
	// Defer close ensures cleanup.
	defer source.Close()

	// Create destination. os.Create truncates the file if it already exists.
	dest, err := os.Create(dst)
	if err != nil {
		return fmt.Errorf("create dest %s: %w", dst, err)
	}
	// Defer close handles destination cleanup.
	defer dest.Close()

	// Allocate a 1MB buffer for high-throughput disk-to-disk copies.
	// Larger buffers reduce syscall overhead on fast storage.
	buf := make([]byte, 1024*1024)

	// io.CopyBuffer streams data just like io.Copy but uses your buffer
	// on the generic read/write path. If dest implements io.ReaderFrom or
	// source implements io.WriterTo (as *os.File does), buf may go unused.
	_, err = io.CopyBuffer(dest, source, buf)
	if err != nil {
		return fmt.Errorf("copy data: %w", err)
	}

	// Sync asks the OS to commit the file's cached writes to the physical disk.
	// Without this, a crash could lose the last few megabytes of data.
	if err := dest.Sync(); err != nil {
		return fmt.Errorf("sync dest %s: %w", dst, err)
	}
	return nil
}
The fmt.Errorf calls use %w to wrap the underlying error. This preserves the error chain so you can inspect the root cause later with errors.Is or errors.As. The Sync call is crucial for backups. The OS caches writes in RAM for performance. If the power fails before the cache flushes, you lose data. Sync forces the kernel to write everything to the disk platter or flash.
io.Copy blocks until the copy finishes. It does not accept a context.Context. If you need to cancel a long-running copy, you must close the source file or implement a custom loop that checks for cancellation. Context is plumbing. Run it through every long-lived call site, or write your own streaming loop.
Pitfalls and edge cases
Memory spikes with os.ReadFile. The convenience of os.ReadFile and os.WriteFile is a trap for large files. ReadFile loads the entire content into a []byte. If the file is larger than available RAM, the program crashes. Use io.Copy for anything that isn't trivially small.
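Truncation with os.Create. os.Create opens the file with O_RDWR|O_CREATE|O_TRUNC. If the destination file already exists, it gets wiped immediately. If the copy fails halfway, you lose the original destination. Checking for existence with os.Stat first leaves a window in which another process can create the file between the check and the create. Using os.OpenFile with O_EXCL closes that window, because the create itself fails when the file already exists.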
Missing Sync. If you copy critical data and skip Sync, the data might sit in the OS cache. A crash or reboot can result in a corrupted or incomplete file. The worst goroutine bug is the one that never logs; the worst copy bug is the one that looks successful but loses data on crash.
Compiler errors on arguments. If you try to pass permissions to os.Create, the compiler rejects it with too many arguments in call to os.Create. os.Create always uses mode 0666, masked by the umask. Use os.OpenFile with explicit flags and mode bits if you need control.
Buffer reuse. io.Copy allocates a buffer internally. io.CopyBuffer lets you provide one. If you are copying many streams in a loop, allocate one buffer outside the loop and reuse it with io.CopyBuffer. This reduces allocation pressure on the garbage collector. One caveat: when the source implements io.WriterTo or the destination implements io.ReaderFrom, as *os.File does, io.CopyBuffer skips your buffer entirely, so the tuning only matters on the generic read/write path.
Decision matrix
Use io.Copy when you need to stream data between any reader and writer with constant memory usage.
Use os.ReadFile and os.WriteFile when the file is small and you need the content in memory for processing.
Use io.CopyBuffer when you are copying large files and want to tune the buffer size for performance.
Use os.OpenFile with flags when you need to append to a file or control permissions explicitly.
Use plain sequential code when you don't need concurrency: the simplest thing that works is usually the right thing.