The offset trap
You build a list of items. The first page loads instantly. You add a "Load More" button. It works fine for a few thousand rows. Then the dataset grows to a million rows. Users scroll down. Page 10 takes two seconds. Page 50 hangs. The database CPU spikes. The problem isn't the network. The problem is the query.
Most developers start with OFFSET. The query looks like SELECT * FROM items ORDER BY id LIMIT 20 OFFSET 480. The database engine reads the index, walks to row 480, discards those rows, and then collects the next 20. As the offset grows, the database does more work to throw away more data. The cost scales linearly with the offset. Page 1000 forces the database to scan 20,000 rows just to return 20.
Keyset pagination eliminates the scan. You send the ID of the last item you saw. The database jumps straight to that ID and reads forward. The cost stays constant regardless of how deep the user scrolls. The query becomes SELECT * FROM items WHERE id > 4821 ORDER BY id LIMIT 20. The database uses the index to seek to 4821 and stops after 20 rows. The work depends only on the page size, not the position.
The bookmark approach
Think of OFFSET like asking a librarian to hand you page 1, then page 2, then page 3. When you ask for page 1000, the librarian flips through 999 pages every single time. Keyset pagination is like using a bookmark. You tell the librarian, "Give me the next 10 pages after my bookmark." The librarian jumps to the bookmark and hands you the pages. The time it takes doesn't change whether the bookmark is on page 10 or page 1000.
The cursor is the bookmark. It's a value from the last row of the previous page. Usually, it's the primary key ID. Sometimes it's a timestamp or a composite of multiple columns. The client stores the cursor and sends it back when fetching the next page. The server uses the cursor in a WHERE clause to filter rows.
Cursors are stateless. The server doesn't need to track user sessions or open cursors. The client holds the position. This makes the system easier to scale. Any server instance can handle the request as long as it has access to the database.
Minimal query
Here's the core function. It takes a context, a database handle, the last cursor value, and a limit. It returns the next batch of items.
// GetItemsAfter fetches items with IDs strictly greater than the cursor.
func GetItemsAfter(ctx context.Context, db *sql.DB, lastID int64, limit int) ([]Item, error) {
// The WHERE clause filters rows after the cursor. ORDER BY ensures deterministic results.
// LIMIT bounds the result set to avoid returning the entire table.
rows, err := db.QueryContext(ctx, "SELECT id, name FROM items WHERE id > ? ORDER BY id ASC LIMIT ?", lastID, limit)
if err != nil {
return nil, fmt.Errorf("query items: %w", err)
}
defer rows.Close()
var items []Item
for rows.Next() {
var item Item
// Scan maps columns to struct fields.
if err := rows.Scan(&item.ID, &item.Name); err != nil {
return nil, fmt.Errorf("scan item: %w", err)
}
items = append(items, item)
}
// Check for errors that occurred during iteration.
if err := rows.Err(); err != nil {
return nil, fmt.Errorf("iterate rows: %w", err)
}
return items, nil
}
The WHERE id > ? clause is the key. It tells the database to start after lastID. The ORDER BY id ASC is mandatory. Without it, the database can return rows in any order, and the cursor logic breaks. The client might see duplicates or miss items.
context.Context goes as the first parameter. This is a Go convention. Functions that perform I/O or long-running work should accept a context to support cancellation and deadlines. The database driver respects context cancellation. If the client disconnects, the query can be aborted.
The if err != nil checks are verbose by design. The Go community accepts this boilerplate because it makes error handling explicit. You can't accidentally ignore an error. Every failure path is visible in the code.
Index seeks are fast. Index scans are slow.
How the database executes this
Databases use indexes to speed up lookups. A B-tree index stores keys in sorted order. When you query WHERE id > 4821, the database engine traverses the B-tree to find the leaf node containing 4821. It then walks forward through the leaf nodes, collecting rows until it hits the limit. This is an index seek followed by a range scan. The complexity is proportional to the number of rows returned, not the total rows in the table.
With OFFSET, the engine still seeks to the start, but it must walk through OFFSET rows and discard them before collecting the result. The complexity is proportional to OFFSET + LIMIT. As OFFSET grows, the cost grows.
Keyset pagination keeps the cost flat. The engine seeks to the cursor and walks LIMIT steps. Whether the cursor is 10 or 10,000,000, the work is the same. The database doesn't care how far back the cursor is. It just jumps there.
This performance gain assumes the cursor column is indexed. If you paginate by a column without an index, the database performs a full table scan. The query becomes slow regardless of the pagination strategy. Always index the columns used in the cursor.
Real API handler
In a real application, the cursor comes from the client. You need to parse it, validate it, and return the next cursor in the response. Here's a handler that ties it together.
// Item represents a row in the items table.
type Item struct {
ID int64 `json:"id"`
Name string `json:"name"`
}
// PageResponse wraps the items and the cursor for the next page.
type PageResponse struct {
Items []Item `json:"items"`
Next string `json:"next,omitempty"`
}
Structs define the data shapes. Public fields start with a capital letter. The json tags map the fields to lowercase keys in the JSON response. This matches common API conventions.
// HandleItems serves paginated items.
func HandleItems(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
// Parse cursor from query param.
cursorStr := r.URL.Query().Get("cursor")
var lastID int64
if cursorStr != "" {
// Sscanf parses the ID. Production code should sign cursors.
if _, err := fmt.Sscanf(cursorStr, "%d", &lastID); err != nil {
http.Error(w, "invalid cursor", http.StatusBadRequest)
return
}
}
// Fetch items.
items, err := GetItemsAfter(r.Context(), db, lastID, 20)
if err != nil {
http.Error(w, "server error", http.StatusInternalServerError)
return
}
// Build response.
resp := PageResponse{Items: items}
if len(items) == 20 {
resp.Next = fmt.Sprintf("%d", items[len(items)-1].ID)
}
json.NewEncoder(w).Encode(resp)
}
}
The handler extracts the cursor from the query string. If the cursor is present, it parses it into an int64. If parsing fails, it returns a 400 error. The Sscanf function handles the conversion. In production, you should sign the cursor to prevent tampering. A user could modify the cursor to access data they shouldn't see. Signing ensures the cursor hasn't been altered.
The handler fetches items with a limit of 20. If the result contains 20 items, there might be more data. The handler sets the Next field to the ID of the last item. The client uses this ID as the cursor for the next request. The omitempty tag hides the Next field if it's empty, which happens on the last page.
Cursors are stateless. The client holds the position, not the server.
Pitfalls and edge cases
Keyset pagination isn't magic. It has quirks you need to handle.
Duplicate sort keys. If multiple items share the same ID, WHERE id > lastID skips all items with that ID. You might miss rows or see duplicates. This happens when the cursor column isn't unique. The fix is a composite cursor. Include a secondary column in the sort and the filter. The query becomes WHERE (id, created_at) > (lastID, lastTime) ORDER BY id, created_at. The database compares tuples lexicographically. This ensures unique ordering even if IDs repeat.
Gaps from deletions. If items are deleted, the cursor jumps over the gap. The client never sees the deleted items. This is usually fine for feeds and timelines. Users expect the feed to change as data updates. If you need stable pages where the same page always returns the same items, keyset pagination is the wrong tool. Use offset pagination or a materialized snapshot instead.
Cursor tampering. If the cursor is a plain integer, users can guess IDs or access arbitrary data. Sign the cursor with a secret key. Encode the cursor and signature as a token. The server verifies the signature before using the cursor. This prevents users from manipulating the pagination state.
Type mismatches. If you forget to parse the cursor and pass a string to the query, the database driver rejects it. You get an error like driver: converting argument $1 type: unsupported type string, want int64. Always validate and parse input. The compiler won't catch runtime type errors.
Ordering drift. If the ORDER BY clause doesn't match the cursor column, results become non-deterministic. The client might see items out of order. Always order by the cursor column. If you sort by multiple columns, the cursor must include all of them.
Order by the cursor. Always.
Decision matrix
Pick the pagination strategy based on the data size and the user interaction pattern.
Use keyset pagination when the dataset is large and users scroll indefinitely. The performance stays constant as the user goes deeper.
Use offset pagination when the dataset is small or you need random access to page numbers. Offset supports "Go to page 50" directly. Keyset requires sequential traversal.
Use composite cursors when your sort order relies on multiple columns. A single column isn't enough to guarantee unique ordering.
Use signed tokens when the cursor contains sensitive data or you need to prevent tampering. Plain cursors expose internal IDs and allow manipulation.
Use a simple ID cursor when the primary key is monotonic and unique. This is the simplest and fastest approach.
Use a timestamp cursor when items are ordered by time and IDs aren't monotonic. Timestamps handle out-of-order inserts better than IDs.
Pick the pagination strategy that matches the data size, not the UI habit.