How to Implement Full-Text Search with PostgreSQL in Go

PostgreSQL full-text search in Go requires enabling the `tsvector` and `tsquery` types in your schema, then querying them using the `@@` operator via a standard SQL driver like `lib/pq`.

When `LIKE` isn't enough

You built a blog. Users type "Go concurrency" into the search bar. The database returns posts about "Goats" and "Going to the store". Worse, the query takes two seconds on ten thousand rows. You need a search that understands words, handles plurals, ignores "the", and runs fast. That's full-text search.

PostgreSQL has built-in full-text search that rivals external engines for most applications. It handles tokenization, stemming, stop-word removal, and relevance ranking. The database does the heavy lifting. Your Go code just sends the query and reads the results.

The search index lives in the row

PostgreSQL full-text search relies on two types. A tsvector stores the processed words from a document. A tsquery stores the processed words from the user's search. The @@ operator checks if the query matches the vector.

Think of a library card catalog. You don't read every book to find "Go". You look up the card that lists every book mentioning "Go", sorted by relevance. PostgreSQL builds this card automatically. The card is the tsvector. The lookup is the tsquery. The @@ operator is the librarian checking the card.

The to_tsvector function turns raw text into a vector. It lowercases words, removes stop words like "the" and "and", and reduces words to stems. "Running" becomes "run". "Database" becomes "databas". The plainto_tsquery function does the same processing on the user's input. This ensures "Running" matches "run" because both reduce to the same stem.

A GIN index on the tsvector column makes the lookup instant. Without the index, PostgreSQL scans every row. With the index, it jumps straight to matching documents.

Schema setup with generated columns

Here's the schema setup and a basic query using generated columns. The GENERATED ALWAYS clause keeps the index in sync without triggers.

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/jackc/pgx/v5/stdlib" // stdlib adapter for database/sql
)

// setupDB creates the table with a generated search vector.
// GENERATED ALWAYS computes the vector on write.
// This avoids the bug where the index drifts from the content.
func setupDB(db *sql.DB) error {
	_, err := db.Exec(`
		CREATE TABLE IF NOT EXISTS documents (
			id SERIAL PRIMARY KEY,
			title TEXT,
			content TEXT,
			// to_tsvector reduces words to stems and strips stop words.
			// The 'english' config handles stemming and stop lists.
			// GENERATED ALWAYS ensures the vector updates automatically.
			search_vector TSVECTOR GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || content)) STORED
		);
		// GIN indexes are optimized for composite types like tsvector.
		// They turn full scans into logarithmic lookups.
		CREATE INDEX IF NOT EXISTS idx_search_vector ON documents USING GIN(search_vector);
	`)
	return err
}

func main() {
	db, err := sql.Open("pgx", "postgres://user:pass@localhost/dbname?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := setupDB(db); err != nil {
		log.Fatal(err)
	}

	// Insert a sample document.
	// The search_vector column is computed automatically.
	_, err = db.Exec(`INSERT INTO documents (title, content) VALUES ($1, $2)`,
		"Go Concurrency", "Goroutines run concurrently. Channels coordinate them.")
	if err != nil {
		log.Fatal(err)
	}

	// Query using the @@ operator.
	// plainto_tsquery processes the input to match the vector format.
	var id int
	err = db.QueryRow(`
		SELECT id FROM documents 
		WHERE search_vector @@ plainto_tsquery('english', $1)
	`, "concurrency").Scan(&id)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("Found document ID: %d\n", id)
}

The GENERATED ALWAYS clause is the modern way to manage search vectors. Older tutorials recommend database triggers. Triggers are error-prone. If the trigger fails, the index silently drifts from the content. Generated columns enforce the invariant at the storage engine level. The column is read-only for application code. PostgreSQL calculates the value whenever title or content changes.

Convention aside: gofmt formats this code. Run gofmt -w main.go or configure your editor to format on save. The Go community trusts the tool. Don't argue about indentation.

Querying with relevance ranking

Basic matching finds rows. Real search ranks them. The ts_rank function calculates a score based on term frequency and proximity. Higher scores mean better matches.

Here's a query that returns ranked results. The LIMIT clause protects the database from returning millions of rows.

// searchRanked finds documents and orders them by relevance.
// ts_rank computes a score based on term frequency and position.
func searchRanked(db *sql.DB, query string) ([]Document, error) {
	rows, err := db.Query(`
		SELECT id, title, ts_rank(search_vector, plainto_tsquery('english', $1)) AS rank
		FROM documents
		WHERE search_vector @@ plainto_tsquery('english', $1)
		ORDER BY rank DESC
		LIMIT 20
	`, query)
	if err != nil {
		return nil, fmt.Errorf("search query: %w", err)
	}
	defer rows.Close()

	var results []Document
	for rows.Next() {
		var doc Document
		var rank float64 // rank is used for sorting, discarded in the struct
		if err := rows.Scan(&doc.ID, &doc.Title, &rank); err != nil {
			return nil, fmt.Errorf("scan row: %w", err)
		}
		results = append(results, doc)
	}
	// Check for errors that occurred during iteration.
	// rows.Err() catches network drops or constraint violations.
	if err := rows.Err(); err != nil {
		return nil, fmt.Errorf("row iteration: %w", err)
	}
	return results, nil
}

type Document struct {
	ID    int
	Title string
}

The ts_rank function takes the vector and the query. It returns a float. ORDER BY rank DESC puts the best matches first. The LIMIT clause is essential. Without it, a popular query could return the entire table.

Convention aside: if err != nil is verbose by design. The community accepts the boilerplate because it makes the unhappy path visible. Always check the error. Wrap it with fmt.Errorf and %w to preserve the chain.

Realistic service with context

Production code needs timeouts and cancellation. HTTP requests can hang. Long-running queries block connections. context.Context solves this.

Here's a search service that wraps the database logic and respects context cancellation. The driver aborts the query if the context expires.

import (
	"context"
	"database/sql"
	"fmt"
)

// SearchService handles document retrieval.
type SearchService struct {
	db *sql.DB
}

// Search finds documents matching the query.
// Context controls timeout and cancellation.
func (s *SearchService) Search(ctx context.Context, q string) ([]Document, error) {
	// Use QueryContext to pass cancellation to the driver.
	// The driver sends a cancel signal to PostgreSQL.
	rows, err := s.db.QueryContext(ctx, `
		SELECT id, title, ts_rank(search_vector, plainto_tsquery('english', $1)) AS rank
		FROM documents
		WHERE search_vector @@ plainto_tsquery('english', $1)
		ORDER BY rank DESC
		LIMIT 20
	`, q)
	if err != nil {
		return nil, fmt.Errorf("search query: %w", err)
	}
	defer rows.Close()

	var results []Document
	for rows.Next() {
		var doc Document
		var rank float64
		if err := rows.Scan(&doc.ID, &doc.Title, &rank); err != nil {
			return nil, fmt.Errorf("scan row: %w", err)
		}
		results = append(results, doc)
	}
	if err := rows.Err(); err != nil {
		return nil, fmt.Errorf("row iteration: %w", err)
	}
	return results, nil
}

The QueryContext method accepts a context. If the context has a deadline, the driver sets a timeout. If the context is cancelled, the driver aborts the query. This prevents goroutine leaks and connection exhaustion.

Convention aside: context.Context always goes as the first parameter. Name it ctx. Functions that take a context should respect cancellation and deadlines. The receiver name is usually one or two letters matching the type: (s *SearchService), not (this *SearchService).

Pitfalls and compiler errors

Full-text search has traps. Missing one returns empty results or slow queries.

Language mismatch kills relevance. If you build the vector with to_tsvector('english') but query with plainto_tsquery('simple'), the stems won't match. "Running" becomes "run" in English but stays "running" in simple. The query returns nothing. Always use the same language config for both sides.

The compiler rejects invalid operators. If you write search_vector @@ 'query', the compiler rejects the program with operator does not exist: tsvector @@ text. You must wrap the text in plainto_tsquery. The database needs a tsquery, not a string.

Forgetting rows.Err() causes silent data loss. Network errors can happen during iteration. If you only check the error from Query, you miss errors that occur while reading rows. Always call rows.Err() after the loop.

Convention aside: Use _ to discard values intentionally. If you only need IDs, write rows.Scan(&id, &_, &_). This tells the reader you considered the other columns and chose to drop them. Use it sparingly with errors. Never discard an error with _.

Advanced query parsing

plainto_tsquery treats the input as a bag of words. It joins terms with AND. Searching for "Go Python" finds documents containing both "Go" and "Python". Users often want more control. They want exact phrases or exclusions.

websearch_to_tsquery parses input like a search engine. It understands quotes and minus signs. "exact phrase" -exclude matches the phrase exactly and excludes documents with "exclude". This function is surprisingly powerful. It handles the syntax users expect from Google.

Ah-ha reveal: websearch_to_tsquery handles boolean logic automatically. You don't need to write custom SQL for OR or NOT. The function parses the user's input and builds the correct query tree. Use it when you want to expose advanced search syntax to users.

Decision matrix

Pick the right tool for the job. Search requirements vary by use case.

Use LIKE '%term%' when you need exact substring matching on small tables under a thousand rows.

Use full-text search with tsvector and GIN when you need relevance ranking, stemming, and fast lookups on large datasets.

Use an external search engine like Elasticsearch when you need faceted search, fuzzy matching beyond typos, or distributed scaling across multiple nodes.

Use pg_trgm extension when you need prefix matching or similarity search for autocomplete fields.

PostgreSQL handles the index. You handle the query. Rank matters. Sort by relevance, not ID. Context is plumbing. Pass it to every query. The worst search bug is the one that returns results in the wrong order.

Where to go next

Full-Text Search with PostgreSQL in Go turns your database into a smart search engine that understands word variations and ignores common words like 'the'. Instead of just matching exact text strings, it breaks your content down into searchable keywords and ranks results by relevance. Think of it like a library catalog that finds books about 'running' even if the title says 'jogging'.