How to Use Vector Databases from Go (Pinecone, Weaviate, Milvus)

Connect to Pinecone, Weaviate, or Milvus in Go using their official client libraries to store and query vector embeddings for similarity search.

The problem with exact matches

You are building a search feature for a developer tool. Users paste a snippet of code or a vague description like handle timeout gracefully and expect the system to return relevant documentation. A traditional database struggles here. Exact string matching fails when synonyms appear. Wildcard queries slow down as the table grows. The system needs to understand meaning, not just characters. That shift from keyword matching to semantic search is where vector databases enter the picture.

What a vector database actually does

A vector database does not store text. It stores numbers. An embedding model converts your document into a long list of floating-point values. Each number captures a subtle aspect of the meaning. Two documents with similar concepts produce similar number sequences. The database indexes these sequences and calculates distances between them. When you search, you convert your query into the same number format and ask the database for the closest matches.

Think of it like a map. Instead of finding a street by name, you drop a pin and ask for the nearest landmarks. The database is just a highly optimized index for measuring that distance in hundreds of dimensions. Most providers use algorithms like HNSW or IVF to avoid comparing every single vector in the dataset. They build a graph or a grid that lets the server jump directly to the relevant neighborhood. The math stays the same. The indexing strategy just changes how fast you get the answer.

Vector databases are not magic. They are specialized storage engines that trade exact recall for speed and semantic flexibility. You still need to clean your data, choose the right embedding model, and tune your thresholds. The database handles the heavy lifting of distance calculation and index maintenance.

Stop treating vectors like strings. Measure distance, not equality.

Your first connection

Go does not have a built-in vector client. You rely on official SDKs from providers like Pinecone, Weaviate, or Milvus. They all follow the same pattern. Initialize a client, pass a context, send data, handle errors. The SDKs wrap HTTP calls and serialize Go structs into the format the server expects.

Here is a minimal Weaviate setup that connects, inserts one document, and runs a similarity search.

package main

import (
	"context"
	"log"

	"github.com/weaviate/weaviate-go-client/v4/weaviate"
	"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)

func main() {
	// Context carries deadlines and cancellation signals
	ctx := context.Background()
	// Client wraps the HTTP transport and API version
	client, err := weaviate.New(weaviate.Config{
		Scheme: "http",
		Host:   "localhost:8080",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Batch insert avoids round-trip overhead per document
	err = client.Batch().ObjectBatcher().WithObject(
		&weaviate.Object{
			Class: "Document",
			Properties: map[string]any{
				"content": "Go is a programming language.",
			},
		},
	).Do(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Query returns objects closest to the input vector
	res, err := client.GraphQL().Get().WithClassName("Document").
		WithNearVector(&graphql.NearVector{
			Vector:    []float32{0.1, 0.2, 0.3},
			Certainty: 0.7,
		}).
		WithFields("content").
		Do(ctx)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Results: %v", res)
}

The code follows standard Go conventions. Public names start with a capital letter. Private names start lowercase. The context parameter goes first. Error handling uses the explicit if err != nil pattern. The community accepts the boilerplate because it makes the unhappy path visible. You cannot accidentally swallow a network failure.

Trust the error. Handle it immediately.

How the request travels

The program starts by creating a background context. Every long-lived call in Go should accept a context as its first parameter. The convention exists so cancellation and timeouts propagate cleanly through your call stack. The client initialization configures the HTTP transport. The SDK creates a connection pool under the hood. It reuses TCP connections instead of opening a new one for every request.

When you call Batch().ObjectBatcher().WithObject(), the SDK serializes the struct into JSON and sends a POST request. The server receives the payload, validates the schema, and writes it to disk. The query step follows the same path. You provide a raw vector and a certainty threshold. The SDK builds a GraphQL query, ships it to the server, and deserializes the response back into Go structs.

If the network drops or the server returns a non-200 status, the error bubbles up immediately. Go forces you to handle it on the spot. The compiler rejects the program with error returned but not handled if you ignore it. You cannot compile code that leaves an error dangling. This design choice prevents silent failures in production.

The runtime scheduler handles the blocking network call efficiently. When the HTTP client waits for a response, the goroutine parks. The scheduler moves another goroutine to the CPU. When the data arrives, the original goroutine wakes up and continues. You do not need callbacks or promises. The blocking call looks synchronous but behaves asynchronously under the hood.

Context is plumbing. Run it through every long-lived call site.

Realistic indexing and querying

Production code rarely runs a single insert and query in main. You wrap the client in a service struct. This keeps your HTTP logic separate from your business rules and makes testing straightforward. The receiver name follows Go convention. One or two letters matching the type, not this or self.

// VectorStore handles semantic indexing and retrieval
type VectorStore struct {
	client *weaviate.Client
}

// NewVectorStore initializes the connection and validates reachability
func NewVectorStore(endpoint string) (*VectorStore, error) {
	c, err := weaviate.New(weaviate.Config{
		Scheme: "http",
		Host:   endpoint,
	})
	if err != nil {
		return nil, err
	}
	return &VectorStore{client: c}, nil
}

The constructor returns a pointer to the struct. Go passes structs by value, so returning a pointer avoids copying the client reference. The error return follows the standard pattern. Callers check the error before using the store.

Batch operations reduce latency and server load. You accumulate objects in memory before sending them. The SDK handles chunking if you exceed the server limit, but you should still respect reasonable batch sizes. Sending ten thousand records in one call will blow up your memory and trigger connection timeouts. Keep batches between fifty and five hundred items.

// UpsertBatch sends multiple documents in a single network hop
func (v *VectorStore) UpsertBatch(ctx context.Context, docs []Document) error {
	// Batch operations reduce latency and server load
	batcher := v.client.Batch().ObjectBatcher()
	for _, d := range docs {
		batcher.WithObject(&weaviate.Object{
			Class: "Document",
			Properties: map[string]any{
				"content": d.Text,
			},
		})
	}
	return batcher.Do(ctx)
}

The context parameter lets the caller cancel the operation if the user navigates away or if a parent timeout fires. Always pass context down. Never create a new background context inside a method that already received one. If you need to add a deadline, wrap the incoming context instead of replacing it.

Querying requires mapping the response back to your domain types. The SDK returns generic maps or SDK-specific structs. You extract the fields you need and discard the rest. The underscore discards a value intentionally. result, _ := ... says you considered the second return value and chose to drop it. Use it sparingly with errors, but freely for metadata you do not need.

// Search finds documents closest to the query vector
func (v *VectorStore) Search(ctx context.Context, vec []float32, limit int) ([]Document, error) {
	// Limit prevents returning thousands of irrelevant results
	res, err := v.client.GraphQL().Get().WithClassName("Document").
		WithNearVector(&graphql.NearVector{
			Vector:    vec,
			Certainty: 0.6,
		}).
		WithLimit(limit).
		WithFields("content", "id").
		Do(ctx)
	if err != nil {
		return nil, err
	}
	// Map SDK response to domain structs
	// Omitted for brevity: iterate res.Data.Get.Document and extract fields
	return nil, nil
}

The certainty threshold controls how strict the match is. A high threshold returns fewer, more relevant results. A low threshold returns more results with higher noise. You tune this against your actual data. The compiler cannot validate semantic relevance. You run integration tests with known queries and adjust until the results match your expectations.

Don't fight the type system. Wrap the value or change the design.

Where things break

Vector databases introduce new failure modes. The most common is ignoring the dimension mismatch. If your embedding model outputs 768 dimensions but your database schema expects 1536, the server rejects the request. You will see an error like invalid vector dimension: expected 1536, got 768. The compiler cannot catch this. It is a runtime validation error. You must align your model configuration with your database schema before writing data.

Another trap is treating vector search like exact matching. Similarity scores are probabilistic. A certainty threshold of 0.9 might return zero results for a niche query. Lowering it to 0.5 floods the response with irrelevant noise. You tune the threshold against your actual data, not a theoretical ideal. Network timeouts also bite developers who forget to attach deadlines. A stuck query will hang the goroutine indefinitely. Wrap your context with a timeout. Call cancel in a defer. The goroutine releases when the deadline hits.

Goroutine leaks happen when you spawn background workers to stream results but forget to close the channel. The worker waits for more data that never arrives. The program exits, but the OS reclaims the resources anyway. In long-running services, leaked goroutines accumulate until the process crashes. Always provide a cancellation path. If a channel feeds a goroutine, close it when the producer finishes. If the producer fails, cancel the context and let the worker exit.

The worst goroutine bug is the one that never logs.

Picking the right tool

Picking a vector database depends on your infrastructure and team size. The underlying math is identical. The differences live in deployment, pricing, and query language.

Use Pinecone when you want a fully managed service with zero infrastructure maintenance. Use Weaviate when you need a self-hosted option that supports hybrid search and GraphQL out of the box. Use Milvus when you are processing millions of vectors and require distributed scaling with Kubernetes. Use PostgreSQL with pgvector when you already run a relational database and want to avoid adding a second data store. Use plain sequential code when your dataset fits in memory and you can compute distances locally.

Trust gofmt. Argue logic, not formatting.

Where to go next