How to Build an AI Chat Application Backend in Go

You have a frontend that sends a user's message to your server. The server needs to forward that text to an AI model, get the result back, and send it to the client. It feels like a simple relay race, but the backend has to parse the request, handle network errors, format the response, and keep the connection alive if the AI takes a moment to think. Go handles this well because the standard library gives you everything you need without pulling in heavy frameworks. You get a fast HTTP server, a reliable JSON encoder, and concurrency primitives that scale.

Think of your Go backend as a translator at a diplomatic summit. The frontend speaks JSON over HTTP. The AI model speaks its own protocol, usually JSON over HTTP as well. Your job is to listen to the frontend, decode the message, pass it to the AI, decode the AI's reply, and send it back. Go's net/http package is the room where this happens. encoding/json is the dictionary. You write a handler function that acts as the translator. The handler receives the request, does the work, and writes the response.

Go's standard library is the foundation. Build on it before adding layers.

The minimal chat handler

Here's the skeleton: a handler that reads JSON and echoes it back, plus an HTTP server to serve it.

package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// Request holds the user message.
type Request struct {
	Text string `json:"text"`
}

// handleChat decodes input and returns a simulated AI response.
func handleChat(w http.ResponseWriter, r *http.Request) {
	var req Request
	// Decode JSON; return 400 if the body is malformed.
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	// Set header before body to ensure correct content type.
	w.Header().Set("Content-Type", "application/json")
	// Encode response directly to the writer.
	json.NewEncoder(w).Encode(map[string]string{"reply": "Echo: " + req.Text})
}

func main() {
	http.HandleFunc("/chat", handleChat)
	// Listen on 8080. ListenAndServe blocks and only returns on error,
	// so log that error instead of dropping it.
	log.Fatal(http.ListenAndServe(":8080", nil))
}

The handler is the heart. Keep it focused on decoding, processing, and encoding.

What happens under the hood

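When you run go run main.go, the compiler type-checks the program first. If you pass a string where an int is expected, the build stops. At runtime, main calls ListenAndServe, which starts a listener on port 8080. When a client hits /chat, the default router (http.DefaultServeMux) calls handleChat. The server runs each connection in its own goroutine, so handlers for different clients execute concurrently. The handler gets a ResponseWriter and a Request. You decode the body. If decoding fails, you write an error and return. If it succeeds, you set headers and encode the response. The response is complete when the handler returns; with HTTP/1.1 keep-alive, the underlying connection usually stays open for the next request.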
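One way to watch this lifecycle without opening a port is to call the handler in-process with net/http/httptest. The sketch below repeats the echo handler so it stands alone; probe is a hypothetical helper name, not part of the article's server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handleChat mirrors the echo handler from the article.
func handleChat(w http.ResponseWriter, r *http.Request) {
	var req struct {
		Text string `json:"text"`
	}
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"reply": "Echo: " + req.Text})
}

// probe (hypothetical helper) sends body to handleChat in-process and
// returns the status code and the trimmed response body.
func probe(body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/chat", strings.NewReader(body))
	rec := httptest.NewRecorder()
	handleChat(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	fmt.Println(probe(`{"text":"hi"}`)) // success path: 200 and a JSON reply
	fmt.Println(probe(`not json`))      // failure path: 400 and "bad request"
}
```

The recorder captures exactly what the handler wrote, which makes both the happy path and the decode failure easy to check in a unit test.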

JSON tags control serialization. The struct field Text string has a tag `json:"text"`. This tells the encoder to use the key "text" in the JSON output. If you omit the tag, the encoder uses the field name Text. If the field is unexported (lowercase), encoding/json skips it whether or not it has a tag. The compiler won't complain, and the field vanishes from the JSON. You get silent data loss. Export every field that belongs in the payload, and tag it so the wire name is explicit.
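A small sketch of how tags and export rules interact. The Message type and its fields are hypothetical, chosen only to show the three cases side by side.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical type used only to illustrate tag behavior.
type Message struct {
	Text   string `json:"text"` // tagged: encoded under the key "text"
	Author string // untagged but exported: encoded under "Author"
	secret string // unexported: ignored by encoding/json
}

// encodeMessage marshals a sample Message and returns the JSON as a string.
func encodeMessage() string {
	b, err := json.Marshal(Message{Text: "hi", Author: "ada", secret: "x"})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encodeMessage()) // prints {"text":"hi","Author":"ada"}
}
```

The secret field is set in Go but never reaches the output, and nothing in the build warns about it.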

Go makes error handling explicit. You see if err != nil everywhere. This is by design. The boilerplate forces you to acknowledge failure modes instead of hiding them in exceptions. The community accepts the verbosity because it makes the unhappy path visible.

Calling an AI service with context

Real backends call external services. Here's a handler that forwards the request to an AI provider and respects client cancellation.

// handleChatWithContext decodes the request and delegates to the AI service.
func handleChatWithContext(w http.ResponseWriter, r *http.Request) {
	// Use request context to handle client disconnects.
	ctx := r.Context()

	var req struct{ Text string `json:"text"` }
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid json", http.StatusBadRequest)
		return
	}

	w.Header().Set("Content-Type", "application/json")
	// Pass context to the service call so it can be cancelled.
	callAI(ctx, req.Text, w)
}

Functions that perform work should accept context.Context as the first parameter, named ctx. This lets callers cancel long-running operations. The convention is strict: context always goes first. If you skip it, you can't propagate cancellation.

The service call runs in a goroutine so the handler can react to context changes.

// callAI waits for the simulated AI result or for ctx to be cancelled,
// whichever comes first. It needs the "context" and "time" imports.
func callAI(ctx context.Context, text string, w http.ResponseWriter) {
	// Buffered channel lets the goroutine send and exit even if nobody receives.
	resultCh := make(chan string, 1)

	go func() {
		// Simulate network delay; replace with an actual HTTP client call
		// that takes ctx, so the request itself is aborted on cancellation.
		time.Sleep(2 * time.Second)
		resultCh <- "AI response for: " + text
	}()

	select {
	case <-ctx.Done():
		// Client went away; stop waiting for the response.
		return
	case result := <-resultCh:
		json.NewEncoder(w).Encode(map[string]string{"reply": result})
	}
}

Context is plumbing. Run it through every long-lived call site.

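The select statement waits on multiple channels. If the context is cancelled, the channel returned by ctx.Done() is closed, that case becomes ready, and the handler returns immediately. If the result arrives first, the handler writes the response. The buffered channel ensures the goroutine doesn't block on its send when the ctx.Done() branch is taken. Without the buffer, the goroutine would block on the send forever, and leak, whenever the context is cancelled first. Note that cancellation here only stops the waiting. The simulated work still runs to completion, which is why a real client call should take ctx as well.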
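When the simulated sleep is replaced by a real outbound request, the context can travel with the request itself via http.NewRequestWithContext. The following is a minimal sketch under stated assumptions: fetchReply, callWithTimeout, and demo are hypothetical names, and a local httptest server stands in for the AI provider.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchReply GETs url and returns the body. The request carries ctx, so
// cancelling ctx aborts the in-flight call instead of merely ignoring it.
func fetchReply(ctx context.Context, url string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

// callWithTimeout calls a stand-in server that takes delay to answer,
// allowing at most timeout for the whole call.
func callWithTimeout(delay, timeout time.Duration) string {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
			fmt.Fprint(w, "pong")
		case <-r.Context().Done(): // caller gave up; stop working
		}
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	reply, err := fetchReply(ctx, srv.URL)
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return "timed out"
	}
	if err != nil {
		return "error: " + err.Error()
	}
	return reply
}

// demo runs one call that finishes in time and one that does not.
func demo() (fast, slow string) {
	fast = callWithTimeout(0, 2*time.Second)
	slow = callWithTimeout(300*time.Millisecond, 30*time.Millisecond)
	return fast, slow
}

func main() {
	fmt.Println(demo()) // prints: pong timed out
}
```

In a handler, the parent context would be r.Context() rather than context.Background(), so a client disconnect and the timeout both cancel the outbound call.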

Streaming responses for long AI replies

AI models often generate text token by token. If the server holds the reply until generation finishes, the user stares at an empty screen. Go supports streaming via http.Flusher. You can write chunks and flush them immediately.

// streamChat writes tokens as they arrive and flushes to the client.
func streamChat(w http.ResponseWriter, r *http.Request) {
	// Assert the response writer supports flushing.
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming not supported", http.StatusInternalServerError)
		return
	}

	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	// Write header and flush to establish the stream.
	w.WriteHeader(http.StatusOK)
	flusher.Flush()

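	// Simulate tokens arriving over time.
	tokens := []string{"Hello", " ", "world", "!"}
	for _, token := range tokens {
		// Stop streaming if the client has disconnected.
		if r.Context().Err() != nil {
			return
		}
		// Write each token as a Server-Sent Events message: a "data:" line
		// followed by a blank line.
		fmt.Fprintf(w, "data: %s\n\n", token)
		// Push data to the client immediately.
		flusher.Flush()
		time.Sleep(500 * time.Millisecond)
	}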
}

Flush early. Flush often. The client waits for nothing.

The http.Flusher interface lets you push data to the client without waiting for the handler to return. You assert the interface on the ResponseWriter. If the assertion fails, this ResponseWriter doesn't support flushing (for example, because middleware wrapped it in a type that hides Flush), and you return an error. You set headers for Server-Sent Events, write the status code, and flush. Then you loop through tokens, writing and flushing each one. The browser receives data as it arrives.
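Because the handler declares text/event-stream, browsers using EventSource expect each message in the Server-Sent Events wire format: one or more "data:" lines followed by a blank line. A minimal sketch of that framing follows; sseEvent is a hypothetical helper name, and it covers only the data field, not event names, ids, or retry hints.

```go
package main

import (
	"fmt"
	"strings"
)

// sseEvent formats data as one Server-Sent Events message: every line of
// the payload gets its own "data: " prefix, and a blank line ends the event.
func sseEvent(data string) string {
	var b strings.Builder
	for _, line := range strings.Split(data, "\n") {
		b.WriteString("data: ")
		b.WriteString(line)
		b.WriteString("\n")
	}
	b.WriteString("\n")
	return b.String()
}

func main() {
	fmt.Printf("%q\n", sseEvent("Hello"))              // "data: Hello\n\n"
	fmt.Printf("%q\n", sseEvent("line one\nline two")) // "data: line one\ndata: line two\n\n"
}
```

Inside a streaming handler, the result would be written to the ResponseWriter and followed by a Flush call for each token.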

Pitfalls and compiler errors

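A missing JSON tag never produces a compiler error. An exported field without a tag is encoded under its Go name, such as "Text" instead of "text", and an unexported field is dropped from the output. Unknown keys in the input are silently ignored by default. You get json: unknown field only if you call DisallowUnknownFields on the decoder and the input contains a key the struct lacks. If you call w.WriteHeader after writing body content, the status has already gone out as 200, so the server ignores the call and logs http: superfluous response.WriteHeader call. It does not panic, which makes the mistake easy to miss. Always set headers and the status code before writing the body.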
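Strict decoding is opt-in on the decoder. The sketch below, with hypothetical helper names decodeStrict and strictError, shows a misspelled key being rejected once DisallowUnknownFields is set; without that call the same input would decode into an empty Text with no error.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Request mirrors the article's request type.
type Request struct {
	Text string `json:"text"`
}

// decodeStrict decodes body into a Request and rejects keys the struct lacks.
func decodeStrict(body string) error {
	dec := json.NewDecoder(strings.NewReader(body))
	dec.DisallowUnknownFields()
	var req Request
	return dec.Decode(&req)
}

// strictError returns the decode error as text, or "" when decoding succeeds.
func strictError(body string) string {
	if err := decodeStrict(body); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", strictError(`{"text":"hi"}`)) // known key: empty string
	fmt.Printf("%q\n", strictError(`{"txt":"hi"}`))  // misspelled key: an error message
}
```

Whether to enable strict mode on a public endpoint is a design choice: it catches client typos early, but it also rejects requests from newer clients that send fields the server does not know yet.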

If you forget to import a package, the compiler rejects the program with undefined: pkg. If you import a package and don't use it, you get "pkg" imported and not used. Go is strict about unused imports. Remove them, or write import _ "pkg" when you need a package only for its side effects. The same blank identifier _ discards a value intentionally: result, _ := ... says "I considered the second return value and chose to drop it". Use it sparingly with errors.

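Goroutine leaks happen when a goroutine blocks on a channel operation that never completes: a send that nobody receives, or a receive from a channel that is never written to or closed. Always have a cancellation path. The worst goroutine bug is the one that never logs.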
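A cancellation path usually means pairing every blocking channel operation with a ctx.Done() case. The sketch below uses hypothetical names (produce, demo) to show a producer that exits cleanly once its reader stops listening.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// produce sends tokens on out until it runs out or ctx is cancelled.
// The select gives every send a cancellation path, so the goroutine
// cannot block forever when the receiver stops reading.
func produce(ctx context.Context, tokens []string, out chan<- string) (sent int) {
	for _, t := range tokens {
		select {
		case out <- t:
			sent++
		case <-ctx.Done():
			return sent
		}
	}
	return sent
}

// demo reads one token, cancels, and reports whether the producer exited.
func demo() string {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan string) // unbuffered: each send blocks until received
	exited := make(chan int, 1)

	go func() {
		exited <- produce(ctx, []string{"a", "b", "c"}, out)
	}()

	<-out    // receive one token, then stop reading
	cancel() // without this, the producer would block on its second send

	select {
	case n := <-exited:
		return fmt.Sprintf("producer exited after %d send(s)", n)
	case <-time.After(time.Second):
		return "producer leaked"
	}
}

func main() {
	fmt.Println(demo()) // prints: producer exited after 1 send(s)
}
```

Removing the ctx.Done() case from produce turns the same program into a leak: the goroutine stays parked on the second send for the life of the process, and nothing is logged.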

Run gofmt on your code. The community expects consistent formatting. Most editors do this on save. Don't argue about indentation; let the tool decide. Exported names start with a capital letter. Unexported names start lowercase. There are no keywords like public or private. Accept interfaces, return structs. This mantra keeps your code flexible.

When to use what

Use the standard library net/http when you want zero dependencies and full control over the request lifecycle. Since Go 1.22, its ServeMux also matches HTTP methods and path wildcards. Use a framework like Gin or Echo when you want built-in middleware chains, request binding, and route grouping out of the box. Use gRPC when both ends are services, in any language gRPC supports, communicating over a private network and you need strict schema enforcement; browsers cannot speak gRPC directly without a proxy layer such as gRPC-Web. Use WebSockets when you need persistent bidirectional streams for real-time chat without per-message HTTP request overhead; raw TCP is an option only for non-browser clients. Use a reverse proxy like Nginx in front of your Go server when you need TLS termination, rate limiting, or static file serving.

Start simple. Add complexity only when the standard library forces you to.

Where to go next

This article set up a basic web server that accepts text messages and sends back a response. It works like a digital receptionist that listens for incoming calls, reads the message, and replies. Use this foundation to plug in real AI logic: replace the simulated delay and the hard-coded tokens with calls to your AI provider.