When regex isn't enough
You have a directory of Go files. You need to find every function that returns an error but never wraps it. Regex feels fast. Regex also matches the word error inside a comment about error handling. Regex breaks on multi-line function signatures. You need a parser's understanding of your code. Go provides that through the standard library.
The structure behind the text
An Abstract Syntax Tree captures the structure of your code. Think of a sentence. The words "cat", "sat", "on", "mat" are just text. The structure tells you the cat is doing the sitting and the mat is the location. The AST strips away formatting and whitespace, and comments too unless you ask the parser to keep them, leaving only the logical relationships.
go/parser turns source text into this tree. go/ast defines the node types and provides tools to walk the tree. go/printer turns the tree back into valid Go source code. Together, they let you read, analyze, and generate code programmatically.
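To make the round trip concrete, here is a minimal sketch that parses a string and prints the tree straight back out. The helper name roundTrip, the "demo.go" label, and the sample source are all made up for illustration, and the printer.Config settings are an assumption chosen to give gofmt-like spacing.

```go
package main

import (
	"bytes"
	"fmt"
	"go/parser"
	"go/printer"
	"go/token"
)

// roundTrip is a hypothetical helper: it parses src, prints the tree back
// to text, and reports how many top-level declarations it found.
func roundTrip(src string) (string, int, error) {
	fset := token.NewFileSet()
	// src is supplied in memory, so "demo.go" is only a label for positions.
	file, err := parser.ParseFile(fset, "demo.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}

	// Assumed settings: spaces for alignment, tabs for indentation.
	cfg := printer.Config{Mode: printer.UseSpaces | printer.TabIndent, Tabwidth: 8}

	var buf bytes.Buffer
	// Print with the same FileSet that was used for parsing.
	if err := cfg.Fprint(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), len(file.Decls), nil
}

func main() {
	// Deliberately messy spacing; the printer normalizes it.
	out, n, err := roundTrip("package demo\n\nfunc   Hello( )  { }\n")
	if err != nil {
		panic(err)
	}
	fmt.Println("declarations:", n)
	fmt.Print(out)
}
```

The text that comes back is not the text that went in: the tree carries structure, and the printer decides the layout.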
Minimal example
Here's the skeleton. You create a FileSet to track positions, parse a file, and walk the tree to find function names.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

func main() {
	// FileSet tracks source positions for error messages and formatting.
	fset := token.NewFileSet()

	// ParseFile reads source and builds the AST.
	// ParseComments flag keeps comments in the tree for inspection.
	node, err := parser.ParseFile(fset, "example.go", nil, parser.ParseComments)
	if err != nil {
		panic(err)
	}

	// Inspect walks the tree depth-first.
	// Returning true continues the traversal.
	ast.Inspect(node, func(n ast.Node) bool {
		// Type assertion checks if the node is a function declaration.
		if fn, ok := n.(*ast.FuncDecl); ok {
			fmt.Println("Function:", fn.Name.Name)
		}
		return true
	})
}
How the pieces fit
The FileSet is the anchor. It maps every node back to a line number and column. Without it, error messages are useless and the printer cannot format output. You must reuse the same FileSet for parsing and printing. Mixing FileSet instances causes position mismatches.
ParseFile reads the source and returns an *ast.File. The ParseComments flag tells the parser to keep comments in the tree. If you skip this flag, comments vanish and every Doc field is nil. The third argument is the source content. Passing nil tells the parser to read the named file from disk. You can pass a string, a []byte, or an io.Reader to parse code in memory, in which case the filename is only used as a label in positions.
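A small sketch of both points: the source is passed as an in-memory string, and the same text is parsed with and without ParseComments. The sample source and the countComments helper are invented for this example.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// src is a made-up file used only for this sketch.
const src = `package demo

// Greet says hello.
func Greet() {} // trailing note
`

// countComments parses src and reports how many comment groups ended up
// in the tree. keep selects whether parser.ParseComments is set.
func countComments(keep bool) (int, error) {
	var mode parser.Mode
	if keep {
		mode = parser.ParseComments
	}
	fset := token.NewFileSet()
	// Passing src (not nil) means nothing is read from disk.
	file, err := parser.ParseFile(fset, "demo.go", src, mode)
	if err != nil {
		return 0, err
	}
	return len(file.Comments), nil
}

func main() {
	without, err := countComments(false)
	if err != nil {
		panic(err)
	}
	with, err := countComments(true)
	if err != nil {
		panic(err)
	}
	fmt.Println("comment groups without ParseComments:", without)
	fmt.Println("comment groups with ParseComments:", with)
}
```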
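Inspect walks the tree depth-first. The callback receives every node, followed by a call with nil once a node's children have been visited, so check for nil before calling methods on n directly. Returning true continues the walk into the node's children. Returning false skips that node's subtree, and the walk carries on with its siblings. This is useful for skipping subtrees you don't care about. ast.Inspect is a convenience wrapper around ast.Walk: Walk expects a value implementing the ast.Visitor interface, while Inspect accepts a plain function.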
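Here is one way the return value can be used to prune the walk. This sketch collects the names of plainly named calls but refuses to descend into function literals; the sample source and the helper name callsOutsideClosures are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up file: one direct call and one call inside a closure.
const src = `package demo

func run() {
	outer()
	go func() {
		inner()
	}()
}
`

// callsOutsideClosures returns the names of calls to plain identifiers,
// ignoring anything inside a function literal.
func callsOutsideClosures(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}

	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		switch x := n.(type) {
		case *ast.FuncLit:
			// Returning false skips the closure's body entirely.
			return false
		case *ast.CallExpr:
			if id, ok := x.Fun.(*ast.Ident); ok {
				names = append(names, id.Name)
			}
		}
		// The nil call after each subtree falls through to here as well.
		return true
	})
	return names, nil
}

func main() {
	names, err := callsOutsideClosures(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

The call to inner never reaches the callback, because the walk stops at the enclosing function literal.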
Type assertions identify node types. *ast.FuncDecl is a function or method. *ast.CallExpr is a function call. *ast.Ident is a name. *ast.BasicLit is a literal value like a string or number. The go/ast package defines dozens of node types. The documentation lists them all.
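When you care about several node types at once, a type switch reads better than a chain of assertions. The sketch below tallies the four node types named above for a tiny made-up source file; countKinds and the map keys are illustrative names, not part of go/ast.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up file with one function, one call, and two literals.
const src = `package demo

func greet() {
	println("hi", 42)
}
`

// countKinds walks the tree and counts a handful of node types.
func countKinds(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}

	counts := map[string]int{}
	ast.Inspect(file, func(n ast.Node) bool {
		switch n.(type) {
		case *ast.FuncDecl:
			counts["FuncDecl"]++
		case *ast.CallExpr:
			counts["CallExpr"]++
		case *ast.Ident:
			// Includes the package name, the function name, and println.
			counts["Ident"]++
		case *ast.BasicLit:
			counts["BasicLit"]++
		}
		return true
	})
	return counts, nil
}

func main() {
	counts, err := countKinds(src)
	if err != nil {
		panic(err)
	}
	for _, kind := range []string{"FuncDecl", "CallExpr", "Ident", "BasicLit"} {
		fmt.Println(kind, counts[kind])
	}
}
```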
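gofmt is built on these packages. When you format code, the tool parses the file with go/parser and prints the tree back out with go/printer. You are using the same engine that formats your code. The compiler itself uses its own internal parser, so this is the tooling ecosystem's view of your source rather than the compiler's, but it is the machinery that gofmt and go vet rely on every day.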
Realistic example
This example finds functions missing documentation. It iterates the Decls slice directly instead of using Inspect, which is often faster when you only care about top-level items.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

func main() {
	// FileSet tracks positions for accurate line numbers.
	fset := token.NewFileSet()

	// ParseComments populates the Doc field on declarations.
	file, err := parser.ParseFile(fset, "main.go", nil, parser.ParseComments)
	if err != nil {
		panic(err)
	}

	// Decls contains all top-level declarations: functions, types, vars.
	for _, decl := range file.Decls {
		// Type assertion filters for function declarations.
		if fn, ok := decl.(*ast.FuncDecl); ok {
			// Doc is nil if no comment group precedes the function.
			if fn.Doc == nil {
				pos := fset.Position(fn.Pos())
				fmt.Printf("Line %d: missing doc for %s\n", pos.Line, fn.Name.Name)
			}
		}
	}
}
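The Decls slice holds all top-level declarations. Iterating it avoids the overhead of walking the entire tree. The Doc field on FuncDecl holds the comment group immediately preceding the function. If no comment exists, Doc is nil. The same idea extends to types and variables, with one difference: those are *ast.GenDecl values, and the comment may sit on the GenDecl's Doc field or, inside a parenthesized group, on the individual TypeSpec or ValueSpec.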
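A hedged sketch of the same check for types and variables follows. It treats a name as documented if either the enclosing GenDecl or the individual spec carries a Doc comment; that policy, the sample source, and the helper name undocumented are choices made for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up file: one documented type, one bare type, one bare var.
const src = `package demo

// Config is documented.
type Config struct{}

type Options struct{}

var Debug bool
`

// undocumented lists top-level type and variable names with no doc comment.
func undocumented(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}

	var names []string
	for _, decl := range file.Decls {
		gen, ok := decl.(*ast.GenDecl)
		if !ok || gen.Tok == token.IMPORT {
			continue // functions and import blocks are out of scope here
		}
		for _, spec := range gen.Specs {
			switch s := spec.(type) {
			case *ast.TypeSpec:
				if gen.Doc == nil && s.Doc == nil {
					names = append(names, s.Name.Name)
				}
			case *ast.ValueSpec:
				if gen.Doc == nil && s.Doc == nil {
					for _, id := range s.Names {
						names = append(names, id.Name)
					}
				}
			}
		}
	}
	return names, nil
}

func main() {
	names, err := undocumented(src)
	if err != nil {
		panic(err)
	}
	fmt.Println("missing docs:", names)
}
```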
Pitfalls and errors
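Modifying the AST requires care. Nodes carry position information. Pos() is a method computed from position fields such as NamePos or Lparen, so you cannot assign to it; a node you build by hand starts out with token.NoPos in those fields. The printer generally accepts such nodes, but it uses positions to decide where to place newlines and comments. Mixing parsed nodes with position-less ones can leave comments and line breaks in unexpected places.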
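Forgetting the FileSet is a common mistake. The printer resolves every position through the FileSet, so passing nil alongside a parsed tree does not degrade gracefully: expect a runtime failure, typically a nil pointer dereference, rather than formatted output. Pass the same FileSet you gave the parser.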
Comments can shift if you modify the tree. Apart from the Doc and Comment fields on declarations, comments are not children of the nodes they describe. They live in the File's Comments list, and the printer interleaves them by position. If you reorder or remove nodes, comments might follow the wrong declaration. Treat the AST as a read-only structure when possible. Build a new tree for complex transformations. Direct mutation risks position corruption and comment drift.
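One tool the standard library offers for this is ast.CommentMap. The sketch below removes a function and uses a comment map to drop the comments that belonged to it, so they do not drift onto a neighbor. The sample source and the removeFunc helper are hypothetical, and the printer settings are an assumption chosen for gofmt-like spacing.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

// src is a made-up file with two documented functions.
const src = `package demo

// keep is still needed.
func keep() {}

// drop is obsolete.
func drop() {}
`

// removeFunc deletes the named function and returns the printed result
// plus the number of comment groups that remain in the file.
func removeFunc(src, name string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}

	// Record which comments belong to which nodes before changing anything.
	cmap := ast.NewCommentMap(fset, file, file.Comments)

	// Build a new declaration list without the named function.
	var kept []ast.Decl
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && fn.Name.Name == name {
			continue
		}
		kept = append(kept, decl)
	}
	file.Decls = kept

	// Keep only comments associated with nodes still present in the tree.
	file.Comments = cmap.Filter(file).Comments()

	cfg := printer.Config{Mode: printer.UseSpaces | printer.TabIndent, Tabwidth: 8}
	var buf bytes.Buffer
	if err := cfg.Fprint(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), len(file.Comments), nil
}

func main() {
	out, remaining, err := removeFunc(src, "drop")
	if err != nil {
		panic(err)
	}
	fmt.Println("comment groups left:", remaining)
	fmt.Print(out)
}
```

Without the Filter step, the doc comment for drop would still be in the file's comment list and the printer would have to put it somewhere.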
Type assertions can fail. If you assert a node to the wrong type using the comma-ok form, the assertion returns the zero value and false. That form never panics. If you use a bare assertion like n.(*ast.FuncDecl) and the node is not a function, the program panics with interface conversion: ast.Node is *ast.CallExpr, not *ast.FuncDecl. Always check the boolean result.
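The opening problem makes a good test of the idiom. This sketch finds functions whose result list includes the identifier error, which is only the first half of the "returns an error but never wraps it" check. It is purely syntactic, so a shadowed or aliased error type would fool it; the sample source and helper names are invented for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up file. Note the *int result, which is not an *ast.Ident.
const src = `package demo

func load() error { return nil }

func parse() (*int, error) { return nil, nil }

func name() string { return "" }
`

// returnsError reports whether fn declares a result written as "error".
func returnsError(fn *ast.FuncDecl) bool {
	if fn.Type.Results == nil {
		return false
	}
	for _, field := range fn.Type.Results.List {
		// Result types may be *ast.StarExpr, *ast.SelectorExpr, and so on.
		// A bare field.Type.(*ast.Ident) would panic on "*int".
		if id, ok := field.Type.(*ast.Ident); ok && id.Name == "error" {
			return true
		}
	}
	return false
}

// errorFuncs returns the names of top-level functions that return error.
func errorFuncs(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && returnsError(fn) {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := errorFuncs(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```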
FileSet is the map. Lose it and you're lost.
Decision matrix
Use the AST packages when you need to understand code structure, generate boilerplate, or build a linter. Use regular expressions when you are searching for literal text patterns in comments or strings and performance matters more than accuracy. Use go/types when you need to check types, resolve interfaces, or verify that a variable is actually a string. Use a simple string split when you are parsing configuration files or data formats, not Go source code.
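To show where the AST stops and go/types takes over, here is a rough sketch that answers "is this variable actually a string?" for package-level variables. It runs the type checker with no importer configured, which only works because the made-up source has no imports; real packages need an importer. The helper name stringVars is hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a made-up file. Syntax alone cannot tell that b is string-based.
const src = `package demo

type Name string

var a = "hello"
var b Name = "world"
var c = 42
`

// stringVars type-checks src and returns the package-level variables
// whose underlying type is string.
func stringVars(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}

	// Assumption: no Importer is needed because src imports nothing.
	conf := types.Config{}
	pkg, err := conf.Check("demo", fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, err
	}

	var names []string
	scope := pkg.Scope()
	for _, name := range scope.Names() { // Names are returned sorted
		v, ok := scope.Lookup(name).(*types.Var)
		if !ok {
			continue // skip type names such as Name
		}
		basic, ok := v.Type().Underlying().(*types.Basic)
		if ok && basic.Kind() == types.String {
			names = append(names, name)
		}
	}
	return names, nil
}

func main() {
	names, err := stringVars(src)
	if err != nil {
		panic(err)
	}
	fmt.Println("string-typed variables:", names)
}
```

The AST sees three var declarations with literal initializers. Only the type checker knows that a defaults to string and that b is a named type built on string.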
AST is the parser's view. Regex is a text game.