How to Bulk Insert Rows in Go

The mailbox problem

You're running a migration script. Ten thousand user records sit in a CSV. You write a loop, call db.Exec for each row, and hit run. The terminal stares back. Three minutes later, you've inserted four hundred rows. Your coffee is cold. The database isn't broken. Your approach is fighting the network round-trips.

Every call to db.Exec sends a SQL command over the network, waits for the database to parse it, execute it, and send back a result. Ten thousand calls mean ten thousand round-trips. The latency adds up. The database spends more time managing the connection protocol than writing data to disk. Bulk insertion reduces round-trips and batches the work. Think of mailing letters. Sending one letter per row means you walk to the mailbox ten thousand times. Bulk insertion means you stuff all the letters into one envelope and walk to the mailbox once. The database driver handles the envelope.

Go's database/sql package doesn't enforce a specific bulk method because every database speaks a different dialect. PostgreSQL has COPY. MySQL has LOAD DATA. SQLite has transactions. You pick the tool that matches your database's native bulk mechanism. The pattern in Go is always the same: group the work, reuse the plan, and stream the data.

The baseline: transaction and prepared statement

The simplest improvement over a raw loop is wrapping inserts in a transaction and reusing a prepared statement. A transaction groups multiple writes into a single atomic unit. The database skips writing to disk after every row, which saves I/O. A prepared statement lets the database parse the SQL query once and reuse the execution plan for every row.

Here's the baseline pattern for any database driver that supports database/sql: wrap inserts in a transaction and reuse a prepared statement.

package main

import (
    "database/sql"
)

// Row holds the data for a single user record.
type Row struct {
    Name  string
    Email string
}

// InsertBatch demonstrates transaction and prepared statement usage.
func InsertBatch(db *sql.DB, rows []Row) error {
    tx, err := db.Begin() // Begin transaction to group writes and reduce disk syncs.
    if err != nil {
        return err
    }
    defer tx.Rollback() // Rollback ensures abort on early return; safe after Commit.

    stmt, err := tx.Prepare("INSERT INTO users (name, email) VALUES ($1, $2)") // Prepare once to parse SQL once.
    if err != nil {
        return err
    }
    defer stmt.Close() // Close statement to free server resources.

    for _, r := range rows {
        if _, err := stmt.Exec(r.Name, r.Email); err != nil { // Execute prepared statement for each row.
            return err
        }
    }

    return tx.Commit() // Commit writes all changes to disk in one operation.
}

Transactions group work. Prepared statements save parsing. Together they beat a naked loop.

What happens under the hood

When you call tx.Begin, the database opens a workspace. Changes stay in memory until you commit. The Prepare call sends the SQL text to the server. The server checks the syntax and builds an execution plan. It returns a statement handle. Inside the loop, stmt.Exec sends only the parameter values. The server plugs the values into the cached plan and runs the insert. No parsing overhead per row.

When the loop finishes, tx.Commit tells the database to flush the workspace to the write-ahead log and update the tables. The write-ahead log ensures durability. The database writes to the log before updating the table. This means a crash during commit doesn't lose data, but it does add I/O cost. Batching reduces the number of log flushes. If any row fails, tx.Rollback discards the entire batch. The database convention here is strict: always defer the rollback. If you forget and a panic occurs, the transaction hangs open, locking rows and consuming memory until the connection times out.

The defer tx.Rollback() pattern is a Go idiom. Calling Rollback after Commit isn't wasteful. The database/sql package handles this gracefully. If the transaction is already committed, Rollback returns sql.ErrTxDone, which is harmless. The defer ensures cleanup happens no matter how the function exits. This is why the community accepts the boilerplate: it makes the unhappy path visible and safe.

Defer the rollback. Panic or not, the transaction closes.

High-throughput: the COPY protocol

For PostgreSQL, the transaction loop is faster than a raw loop, but it still sends one INSERT command per row. The network stack has to serialize and deserialize each command. The database has to lock the table, check constraints, and update indexes for every single row. PostgreSQL offers a COPY protocol that streams raw data directly into the table. It bypasses the SQL parser and constraint checks during the stream, then validates everything at the end.

The lib/pq driver is unmaintained. The community has moved to pgx for PostgreSQL. pgx offers CopyFrom with better performance and context support. If you are starting a new project, reach for pgx. The pattern remains the same: use the driver's copy interface to stream data.

Here's how pgx handles the COPY protocol: acquire a connection, start the copy session, stream rows, and close to commit.

package main

import (
    "context"

    "github.com/jackc/pgx/v5"
    "github.com/jackc/pgx/v5/pgxpool"
)

// BulkCopy demonstrates the PostgreSQL COPY protocol via pgx.
func BulkCopy(ctx context.Context, pool *pgxpool.Pool, rows []Row) error {
    conn, err := pool.Acquire(ctx) // Acquire connection from pool for copy session.
    if err != nil {
        return err
    }
    defer conn.Release() // Release connection back to pool when done.

    writer, err := conn.Conn().CopyFrom(ctx, pgx.Identifier{"users"}, []string{"name", "email"}) // CopyFrom initiates COPY FROM STDIN.
    if err != nil {
        return err
    }

    for _, r := range rows {
        _, err := writer.Write(ctx, []interface{}{r.Name, r.Email}) // Write rows as tab-separated values.
        if err != nil {
            return err
        }
    }

    return writer.Close(ctx) // Close ends stream and triggers database validation.
}

The context.Context always goes as the first parameter, conventionally named ctx. Functions that take a context should respect cancellation and deadlines. In this function, ctx allows the caller to cancel the copy operation if it takes too long.

The COPY protocol trades flexibility for speed. You lose per-row error granularity, but you gain orders of magnitude in throughput.

Chunking for stability

Bulk inserts scale, but they don't scale infinitely. A single transaction holding millions of rows can lock the table for too long, blocking reads and writes from other users. It also consumes undo logs and WAL space. The safe pattern is chunking. Split the data into batches of five thousand or ten thousand rows. Insert each chunk in its own transaction. This keeps locks short and allows the database to reclaim resources between batches. It also lets you report progress. You can log "Inserted 5000 rows" after each commit.

Here's a chunking helper that wraps the batch insert to limit transaction size and lock duration.

package main

import (
    "database/sql"
)

// InsertChunks processes rows in batches to limit transaction size and lock duration.
func InsertChunks(db *sql.DB, rows []Row, batchSize int) error {
    for i := 0; i < len(rows); i += batchSize { // Iterate with step size equal to batch size.
        end := i + batchSize
        if end > len(rows) {
            end = len(rows) // Cap end index at slice length.
        }

        batch := rows[i:end]
        if err := InsertBatch(db, batch); err != nil { // Slice batch and insert using transaction strategy.
            return err
        }
    }

    return nil // Return nil only after all chunks commit successfully.
}

Chunking turns a monolithic operation into a manageable stream. You trade a tiny bit of overhead for stability and observability.

Pitfalls and errors

The compiler rejects the program with undefined: pq if you forget to import the driver. If you try to use CopyIn on a driver that doesn't support it, you get a method-not-found error. Runtime panics often come from memory pressure. If you load ten million rows into a Go slice before inserting, you'll exhaust heap memory. Stream the data from the source instead of buffering it all. Use channels to feed rows into the insert function. This keeps memory usage constant regardless of dataset size.

Another trap is the transaction timeout. Databases have a statement_timeout or idle_in_transaction_session_timeout. If your bulk insert takes longer than the limit, the database kills the connection. The driver returns an error like pq: canceling statement due to statement timeout. Always set a context deadline for long-running bulk operations.

If you double-commit a transaction, the driver returns sql: transaction has already been committed. This usually happens when you forget the defer tx.Rollback() and manually commit in multiple branches. Stick to the defer pattern.

Goroutine leaks happen when the goroutine waits on a channel that never gets closed. If you spawn goroutines to fetch data for bulk inserts, ensure the channels are closed when the source is exhausted. The worst goroutine bug is the one that never logs.

Most editors run gofmt on save. Don't argue about indentation; let the tool decide. Trust gofmt. Argue logic, not formatting.

The worst goroutine bug is the one that never logs.

Decision matrix

Use a raw loop with db.Exec when you are inserting fewer than ten rows and code simplicity matters more than performance.

Use a transaction with a prepared statement when you need to insert hundreds to thousands of rows and want a balance of speed and portability across database drivers.

Use the database-specific COPY protocol when you are loading tens of thousands of rows and need maximum throughput for that specific database.

Use a streaming approach with channels when the source data is too large to fit in memory and you must process rows as they arrive.

Use a batch library like pgx.Batch when you need to mix inserts with other commands in a single network round-trip.

Where to go next

Bulk inserting means adding many rows to a database at once instead of one by one. In Go, you usually do this by opening a transaction and running a loop of insert commands, or by using a special "copy" command provided by your database driver. This is like mailing a whole stack of letters in one envelope rather than sending them individually.