How to Write Custom AST-Based Code Generators in Go

Write custom AST-based code generators in Go by parsing source files with go/parser, inspecting the AST with ast.Inspect, and printing generated code.

The problem with copy-paste

You have a project with fifty structs. Each one needs a String method, a Validate method, and a set of JSON tags. You write the first one. You copy-paste it for the second. By the tenth, you realize you're maintaining code that maintains code. You change a field name in the struct, and now you have to update the method, the validation, and the tags in three places. One slip and the code compiles but produces wrong output at runtime.

You want a tool that reads your source files, finds every struct, and emits the missing methods automatically. That's AST-based code generation. You build a program that parses Go source code into a tree, walks that tree to find patterns, and prints new code based on what it finds. The compiler never runs your generator for you. You run it explicitly, usually through go generate, the generator writes a file, and go build compiles that file along with the rest of the package.

The tree behind the text

Go source code is text. The compiler doesn't work on that text directly. It parses it into a tree first. The Abstract Syntax Tree, or AST, breaks your code into nodes. A function declaration is a node. A variable assignment is a node. A string literal is a node. The tree captures the structure without the noise of indentation or whitespace, and the parser keeps comments only if you pass parser.ParseComments.

Think of the AST like a skeleton. The text file is the skin and muscle. The AST is the bone structure. Code generation is walking the skeleton, finding the bones you care about, and building new tissue around them. You don't care about the indentation in the source file. You care that there is a struct with three fields.

The go/ast package gives you the tree. The go/parser package builds it from text. The go/token package tracks where every node came from in the source file. These are the standard library tools. You don't need third-party libraries to read and write Go code. The language provides the machinery to inspect itself.
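To see the three packages cooperate, here is a minimal sketch that parses a source string and lists each top-level declaration with its line number. The demoSrc input and the declSummary helper are made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// demoSrc is a hypothetical input; any valid Go file would do.
const demoSrc = `package demo

type User struct {
	Name string
}

func Greet() string { return "hi" }
`

// declSummary parses src and describes each top-level declaration
// together with the line it starts on.
func declSummary(src string) ([]string, error) {
	fset := token.NewFileSet()
	// Passing a string as the third argument parses it instead of reading the file.
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, decl := range file.Decls {
		// go/token turns the node's opaque Pos into a line number.
		line := fset.Position(decl.Pos()).Line
		switch d := decl.(type) {
		case *ast.FuncDecl:
			out = append(out, fmt.Sprintf("line %d: func %s", line, d.Name.Name))
		case *ast.GenDecl:
			// GenDecl covers import, const, type, and var declarations.
			out = append(out, fmt.Sprintf("line %d: %s declaration", line, d.Tok))
		}
	}
	return out, nil
}

func main() {
	lines, err := declSummary(demoSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

go/parser builds the tree, go/ast supplies the node types you switch on, and go/token converts positions into something a human can read.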

The AST is syntax. The go/types package is semantics.

Minimal example: finding functions

Here's the skeleton of a generator: parse a file, walk the tree, print a wrapper for every function.

package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
)

func main() {
	// FileSet tracks position information for source locations
	fset := token.NewFileSet()
	// ParseFile reads the source and builds the AST node
	node, err := parser.ParseFile(fset, "main.go", nil, parser.ParseComments)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Inspect walks every node in the tree recursively
	ast.Inspect(node, func(n ast.Node) bool {
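		// Type assert to check if this node is a function declaration
		// The wrapper is deliberately naive: it ignores receivers,
		// parameters, and results, so it only suits plain func f()
		if fn, ok := n.(*ast.FuncDecl); ok {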
			fmt.Printf("// Wrapper for %s\n", fn.Name.Name)
			fmt.Printf("func wrap%s() { %s() }\n", fn.Name.Name, fn.Name.Name)
		}
		// Return true to continue walking children nodes
		return true
	})
}

The token.FileSet is the map. It connects AST nodes back to line numbers and file paths. You need it for error reporting. If the parser fails, the error message uses the file set to tell you exactly where the problem is.

parser.ParseFile reads the text and returns an *ast.File. This is the root of the tree. If the file has syntax errors, the parser returns a non-nil error along with whatever partial tree it managed to build. The message looks like expected '}', found 'EOF', prefixed with the file, line, and column. Don't generate code from a partial tree. Fix the source first.
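A generator usually wants to report every syntax error with its position and then stop. The sketch below assumes a hypothetical brokenSrc input and a parseErrors helper. It relies on the parser reporting syntax errors as a scanner.ErrorList, and it falls back to the plain error text if that unwrapping fails.

```go
package main

import (
	"errors"
	"fmt"
	"go/parser"
	"go/scanner"
	"go/token"
)

// brokenSrc is a hypothetical input with a missing closing brace.
const brokenSrc = `package demo

func Broken() {
`

// parseErrors returns each syntax error as "file:line:col: message",
// or nil when the source parses cleanly.
func parseErrors(filename, src string) []string {
	fset := token.NewFileSet()
	// AllErrors asks the parser to report every error, not just the first few.
	_, err := parser.ParseFile(fset, filename, src, parser.AllErrors)
	if err == nil {
		return nil
	}
	var list scanner.ErrorList
	if errors.As(err, &list) {
		var out []string
		for _, e := range list {
			// e.Pos already carries the filename, line, and column.
			out = append(out, fmt.Sprintf("%s: %s", e.Pos, e.Msg))
		}
		return out
	}
	return []string{err.Error()}
}

func main() {
	for _, msg := range parseErrors("demo.go", brokenSrc) {
		fmt.Println(msg)
	}
}
```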

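ast.Inspect is the walker. It visits every node in the tree, depth first. You pass it a function that receives each node. The function returns a boolean. true means "keep walking into this node's children." false means "skip this node's children." The walk then continues with the node's siblings, so returning false prunes one subtree rather than ending the traversal. This lets you skip entire blocks if you find something you don't want to process. After a node's children have been visited, Inspect calls your function once more with nil, which the type assertion below quietly ignores.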
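The effect of the return value is easiest to see by counting. This sketch uses a hypothetical pruneSrc input and a countFuncs helper. It counts function declarations and function literals, once walking everything and once returning false at each declaration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// pruneSrc is a hypothetical input with one function literal nested in a body.
const pruneSrc = `package demo

func Outer() {
	inner := func() {}
	inner()
}

func Other() {}
`

// countFuncs counts function declarations and function literals.
// With prune set, the callback returns false at each FuncDecl, so
// literals nested inside function bodies are never visited.
func countFuncs(src string, prune bool) (decls, lits int, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return 0, 0, err
	}
	ast.Inspect(file, func(n ast.Node) bool {
		switch n.(type) {
		case *ast.FuncDecl:
			decls++
			return !prune // false skips this declaration's subtree only
		case *ast.FuncLit:
			lits++
		}
		return true
	})
	return decls, lits, nil
}

func main() {
	for _, prune := range []bool{false, true} {
		decls, lits, err := countFuncs(pruneSrc, prune)
		if err != nil {
			fmt.Println("parse error:", err)
			return
		}
		fmt.Printf("prune=%v: %d declarations, %d literals\n", prune, decls, lits)
	}
}
```

Both runs find two declarations, because Other is a sibling of Outer and is still visited. Only the pruned run misses the literal inside Outer.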

Inside the walker, you use a type assertion: n.(*ast.FuncDecl). The tree mixes many different node types. Every node implements the ast.Node interface. You check if the node is the type you want. If it is, you extract the data. fn.Name.Name gets the function name as a string.
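Once the assertion succeeds, the *ast.FuncDecl carries more than a name. A rough sketch, with a hypothetical funcSrc input and describeFuncs helper, that separates methods from plain functions and counts parameters:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// funcSrc is a hypothetical input with one method and one function.
const funcSrc = `package demo

type User struct{}

func (u User) Name() string { return "" }

func Add(a, b int) int { return a + b }
`

// describeFuncs returns one line per function declaration in src.
func describeFuncs(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(file, func(n ast.Node) bool {
		fn, ok := n.(*ast.FuncDecl)
		if !ok {
			return true
		}
		kind := "func"
		if fn.Recv != nil { // a receiver list means this is a method
			kind = "method"
		}
		out = append(out, fmt.Sprintf("%s %s with %d parameter(s)",
			kind, fn.Name.Name, fn.Type.Params.NumFields()))
		return false // declarations never nest, so skip the body
	})
	return out, nil
}

func main() {
	lines, err := describeFuncs(funcSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

A wrapper generator that checks fn.Recv and fn.Type.Params can decide which functions it is able to wrap instead of emitting calls that won't compile.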

Parse once. Walk once. Generate once.

Realistic example: generating methods with formatting

Generating code with fmt.Printf gets messy fast. Indentation breaks. Semicolons go missing. The output looks ugly. Go developers trust gofmt to handle formatting. If your generator outputs code, format it with go/format. The community expects generated code to look identical to hand-written code.

Here's a generator that finds every struct in a file and emits a String method. It builds the output string, formats it, and writes it to a new file.

package main

import (
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
	"os"
	"strings"
)

func main() {
	// FileSet tracks positions for error reporting
	fset := token.NewFileSet()
	// Parse the file containing structs
	node, err := parser.ParseFile(fset, "models.go", nil, 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	var buf strings.Builder
	// Emit a package clause so generated.go compiles alongside models.go
	fmt.Fprintf(&buf, "package %s\n\n", node.Name.Name)
	// Walk the AST to find struct definitions
	ast.Inspect(node, func(n ast.Node) bool {
		// Look for type specifications
		if ts, ok := n.(*ast.TypeSpec); ok {
			// Check if the type is a struct; the struct node itself is unused
			if _, ok := ts.Type.(*ast.StructType); ok {
				// Generate a String method for the struct
				// The receiver is a fixed short name to keep the generator simple
				fmt.Fprintf(&buf, "func (s %s) String() string { return %q }\n", ts.Name.Name, ts.Name.Name+"{}")
			}
		}
		// Continue walking
		return true
	})

	// format.Source applies gofmt rules to the generated code
	src := []byte(buf.String())
	formatted, err := format.Source(src)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Write the formatted code to a new file
	err = os.WriteFile("generated.go", formatted, 0644)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

The go/format package uses the same formatting engine as gofmt. You pass it a byte slice of Go source code. It returns the formatted code. If the code does not parse, it returns an error. This is a safety net. If your generator produces broken syntax, format.Source catches it before you write the file. It does not type-check, so output that parses but references undefined names still passes.
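Both behaviors fit in a few lines. In this sketch the tidy helper and the two input strings are made up for illustration. One input is ugly but valid, the other has an unbalanced parenthesis.

```go
package main

import (
	"fmt"
	"go/format"
)

// tidy runs src through format.Source and returns the result as a string.
func tidy(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Valid but badly spaced: the formatter normalizes it.
	messy := "package demo\nfunc  Hello( )string{return \"hi\"}\n"
	out, err := tidy(messy)
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Print(out)

	// Invalid syntax: the formatter refuses and reports where it gave up.
	broken := "package demo\nfunc Hello( {\n"
	if _, err := tidy(broken); err != nil {
		fmt.Println("rejected:", err)
	}
}
```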

The receiver name in the generated method is s, a fixed choice that keeps the generator simple. Convention is one or two letters derived from the type name, so hand-written code would use (u User). (this User) or (self User) is not idiomatic. The community reads code faster when receiver names are short.

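The output file is generated.go. By convention, generated files carry a comment of the form // Code generated by ... DO NOT EDIT. on a line of its own before the package clause, and often a recognizable file name suffix as well. Go tooling recognizes any line matching ^// Code generated .* DO NOT EDIT\.$ as the marker. This tells other developers not to touch the file by hand. Changes get overwritten the next time the generator runs. The example above omits the header for brevity; a real generator should write that line first.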
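Here is one way a generator might add and check that marker. The helper names and the tool name structgen are hypothetical, and isGenerated is a simplification that assumes nothing but comments and blank lines precedes the package clause.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the pattern documented by "go help generate".
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// withHeader prefixes body with the marker comment and a package clause.
func withHeader(toolName, pkg, body string) string {
	return fmt.Sprintf("// Code generated by %s. DO NOT EDIT.\n\npackage %s\n\n%s",
		toolName, pkg, body)
}

// isGenerated reports whether a marker line appears before the package clause.
// Simplification: it scans line by line and stops at the first "package " line.
func isGenerated(src string) bool {
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(line, "package ") {
			return false
		}
		if generatedRE.MatchString(line) {
			return true
		}
	}
	return false
}

func main() {
	// "structgen" is a hypothetical tool name.
	src := withHeader("structgen", "models",
		"func (s User) String() string { return \"User{}\" }\n")
	fmt.Print(src)
	fmt.Println("generated:", isGenerated(src))
}
```

A check like isGenerated is also useful in the other direction: a generator can refuse to overwrite a file that lacks the marker, since that file was probably written by hand.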

Format the output. Trust the tool.

Pitfalls and type information

The AST gives you structure, not types. If a field or expression mentions an identifier named User, the AST doesn't know if User is a struct, an interface, or a type alias. You only know it's an identifier. If you need to know the fields of User, or if you need to check if a method satisfies an interface, you need the go/types package.

go/types runs type checking on the AST. It resolves imports, checks signatures, and builds a complete type map. It's more complex to set up. You need to configure an importer so the type checker can find standard library packages and your own modules.
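For a single file with no imports the setup is small. The sketch below assumes a hypothetical typedSrc input and a structFields helper. It type-checks the file and asks whether a name's underlying type is a struct. importer.Default() is configured but never consulted here because the input imports nothing. Real projects with module dependencies need more setup than this.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/importer"
	"go/parser"
	"go/token"
	"go/types"
)

// typedSrc is a hypothetical input: a struct, a non-struct, and an alias.
const typedSrc = `package demo

type User struct {
	Name string
	Age  int
}

type ID int

type Account = User
`

// structFields type-checks src and returns "name type" pairs for typeName.
// ok is false when the underlying type is not a struct.
func structFields(src, typeName string) (fields []string, ok bool, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, false, err
	}
	conf := types.Config{Importer: importer.Default()}
	pkg, err := conf.Check("demo", fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, false, err
	}
	obj := pkg.Scope().Lookup(typeName)
	if obj == nil {
		return nil, false, fmt.Errorf("type %s not found", typeName)
	}
	// Underlying looks through named types and aliases.
	st, isStruct := obj.Type().Underlying().(*types.Struct)
	if !isStruct {
		return nil, false, nil
	}
	for i := 0; i < st.NumFields(); i++ {
		f := st.Field(i)
		fields = append(fields, f.Name()+" "+f.Type().String())
	}
	return fields, true, nil
}

func main() {
	for _, name := range []string{"User", "ID", "Account"} {
		fields, ok, err := structFields(typedSrc, name)
		fmt.Println(name, fields, ok, err)
	}
}
```

Account has no struct literal of its own in the AST, yet the type checker still reports User's fields for it. That is the information a purely syntactic walk cannot provide.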

If you skip type checking and assume an identifier names a struct, your generator can mishandle a type alias or a named type defined in terms of another type. It might panic on a failed type assertion. Or worse, it generates code that fails to compile later, and the user sees a compiler error in a file they never wrote. You want the generator to fail fast, not the user's build.

Another pitfall is loop variable capture. If you generate code that uses a loop variable inside a closure, you need to be careful. Go 1.22 gave each loop iteration its own variable, but only for modules whose go.mod declares go 1.22 or later, and generated code often lands in older modules. If you generate a loop with a closure, make a copy of the variable. The compiler never rejects the uncopied form. go vet can report it as loop variable i captured by func literal, and under the older semantics the closures silently share one variable.
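The shape worth emitting looks like this. makeGetters is a hypothetical stand-in for whatever loop your generator produces. The explicit copy makes the result independent of the language version the target module declares.

```go
package main

import "fmt"

// makeGetters returns one closure per index. The explicit copy keeps the
// behavior identical with or without Go 1.22 per-iteration loop variables.
func makeGetters(n int) []func() int {
	var fns []func() int
	for i := 0; i < n; i++ {
		i := i // per-iteration copy; redundant but harmless on Go 1.22+
		fns = append(fns, func() int { return i })
	}
	return fns
}

func main() {
	for _, f := range makeGetters(3) {
		fmt.Println(f()) // prints 0, 1, 2 on separate lines
	}
}
```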

Goroutine leaks happen when the goroutine waits on a channel that never gets closed. Always have a cancellation path. This applies to generated code too. If your generator spawns background tasks, make sure they can stop.

Invalid syntax breaks the formatter. Validate before you emit.

When to generate and when to write

Code generation adds complexity. You have a tool that writes code, and then you have to debug the tool and the code it writes. Use it when the benefit outweighs the cost.

Use an AST-based generator when you need to analyze code structure and emit code that depends on types, fields, or function signatures. Use text/template, without an AST pass, when you have static boilerplate that repeats with minor variations, like SQL queries or config structs. Either kind of generator is typically invoked through a //go:generate directive. Use a dedicated tool like stringer when you need String methods for integer constants used as enums. Write the code by hand when the logic is unique and the boilerplate is minimal. Use reflection at runtime when you need flexibility and can accept the performance cost.
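For comparison, the template route can be sketched as follows. The tmplData shape, the accessorTmpl template, and the Config receiver are all invented for the example. The field list is supplied by hand here rather than discovered from an AST.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"text/template"
)

// field describes one accessor to generate (hypothetical shape).
type field struct {
	Method string // exported accessor name
	Name   string // unexported struct field it returns
	Type   string // Go type of the field, as source text
}

// tmplData is the input handed to the template.
type tmplData struct {
	Package  string
	Receiver string
	Fields   []field
}

// accessorTmpl emits one getter per field. $ refers to the root tmplData.
var accessorTmpl = template.Must(template.New("accessors").Parse(`package {{.Package}}
{{range .Fields}}
func (r *{{$.Receiver}}) {{.Method}}() {{.Type}} { return r.{{.Name}} }
{{end}}`))

// render executes the template and runs the result through format.Source.
func render(d tmplData) (string, error) {
	var buf bytes.Buffer
	if err := accessorTmpl.Execute(&buf, d); err != nil {
		return "", err
	}
	out, err := format.Source(buf.Bytes())
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := render(tmplData{
		Package:  "config",
		Receiver: "Config",
		Fields: []field{
			{Method: "Host", Name: "host", Type: "string"},
			{Method: "Port", Name: "port", Type: "int"},
		},
	})
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Print(out)
}
```

The two approaches combine well: an AST walk can build the field list, and a template can own the shape of the output.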

Generate when the pattern repeats. Write by hand when the logic is unique.

Where to go next

Writing custom AST-based code generators in Go lets you automatically write Go code by reading other Go code. You tell the computer to look at the structure of your existing files, find specific patterns like function definitions, and then print out new files based on what it found. It is like having a robot read your codebook and write a summary for every chapter.