How to Replace with Regex in Go

Use regexp.MustCompile and ReplaceAllString to find patterns and replace them in Go strings.

The problem with finding and swapping text

You are processing a batch of log lines. Each line contains a timestamp, a request ID, and a user email. The emails are messy. Some have extra spaces, some use underscores instead of dots, and some contain tracking parameters you want to strip. You could write a series of string splits and manual index calculations, but that code becomes fragile the moment the format shifts. You need a pattern matcher that can find irregular structures and swap them out in one pass.

Regular expressions solve this by describing what you want to match rather than where it sits. Go treats regex as a first-class tool, but it does not attach regex methods to the string type. Instead, it isolates pattern matching in its own package and forces you to think about compilation, allocation, and replacement strategy upfront.

How Go handles regular expressions

Go's regexp package compiles a pattern string into a finite state machine. Think of it like designing a sorting conveyor belt. You spend time building the rollers and sensors once, then feed raw material through it repeatedly. As long as you keep the compiled value around, the package does not re-parse your pattern on every call. It translates the human-readable syntax into a compact instruction program that the package's matching engine runs against each input.

This design has two consequences. Compilation costs CPU cycles, so you should never compile the same pattern inside a tight loop. The state machine also scans left to right: it reports the leftmost match, then resumes scanning after the end of that match. Go uses RE2 syntax, which guarantees execution time linear in the size of the input. You will not hit catastrophic backtracking, but you also lose some Perl-style features like lookaheads and backreferences. The tradeoff buys predictability and safety.
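The RE2 restriction is easy to check for yourself. The sketch below feeds one ordinary pattern and two Perl-style patterns to regexp.Compile; the helper name compiles is made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper that reports whether the regexp
// package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`\d+`,        // plain RE2 syntax
		`(\w+) \1`,   // backreference: rejected by RE2
		`foo(?=bar)`, // lookahead: rejected by RE2
	}
	for _, p := range patterns {
		fmt.Printf("%-12s accepted: %v\n", p, compiles(p))
	}
}
```

Only the first pattern compiles. The other two return an error from regexp.Compile rather than matching slowly.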

Strings in Go are immutable. Every replacement operation allocates a new string and copies the unchanged parts into it. The original input stays untouched. This prevents accidental mutation in concurrent code, but it means heavy replacement workloads produce many short-lived strings and add pressure on the garbage collector. Keep that cost in mind when you process large inputs in a loop.

Compile once. Run many times.

The simplest replacement

Here is the baseline pattern: compile a constant regex, call the replacement method, and capture the result.

package main

import (
	"fmt"
	"regexp"
)

// main demonstrates basic regex replacement.
func main() {
	// Compile once at startup. MustCompile panics if the syntax is invalid.
	re := regexp.MustCompile(`\d+`)
	
	// The input string remains unchanged throughout the operation.
	text := "Room 101, Floor 2"
	
	// ReplaceAllString scans left to right and builds a new string.
	result := re.ReplaceAllString(text, "X")
	
	fmt.Println(result)
}

The MustCompile function parses the pattern, validates the syntax, and returns a *regexp.Regexp pointer. If the pattern contains a typo, the program panics as soon as that line runs, which means at startup when the pattern is compiled at package scope or at the top of main, rather than failing silently later. This is a deliberate Go convention: Must-prefixed functions are meant for constant, developer-controlled patterns where a syntax error is a programming mistake, not a runtime condition.

ReplaceAllString takes the target string and a replacement literal. It walks the input, finds every sequence of digits, and substitutes it with X. The method returns a new string. The original text variable still holds Room 101, Floor 2.

Precompile your patterns. Never pay the parsing tax at runtime.

Walking through the execution

When the program starts, regexp.MustCompile reads \d+. The compiler recognizes \d as a shorthand for digits and + as one-or-more repetition. It builds a compact state machine that tracks whether it is currently inside a digit sequence or outside it. The resulting *regexp.Regexp value caches this machine.

Calling ReplaceAllString hands the input string to the cached machine. The scanner advances through the input one character at a time. When it hits 1, it enters the matching state. It continues through 0 and 1, then stops at the comma. The scanner records the match boundaries, copies the preceding text into a buffer, writes the replacement string X, and continues scanning from the comma. It repeats this for 2. Finally, it returns the assembled buffer as a new string.

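Because Go strings are UTF-8 encoded, the scanner handles multi-byte characters correctly. It will not split a rune in half. The replacement string can also contain special syntax. $1 and $2 refer to capture groups defined in the pattern. $0 refers to the entire match. Write ${1} when the reference is followed by a letter, digit, or underscore, because $1x is read as a reference to a group named 1x, which expands to an empty string. The replacement engine substitutes these tokens before writing to the buffer. If the replacement should be inserted verbatim, dollar signs included, use ReplaceAllLiteralString instead.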
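A short sketch of the replacement tokens in action. The patterns, sample inputs, and helper names (reorderDate, joinNaive, joinBraced) are invented for illustration. The second and third helpers differ only in whether the group reference is wrapped in braces.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	datePattern = regexp.MustCompile(`(\d{4})-(\d{2})-(\d{2})`)
	addrPattern = regexp.MustCompile(`(\w+)@(\w+)`)
	digits      = regexp.MustCompile(`\d+`)
)

// reorderDate rewrites YYYY-MM-DD as DD/MM/YYYY using numbered groups.
func reorderDate(s string) string {
	return datePattern.ReplaceAllString(s, "$3/$2/$1")
}

// joinNaive uses $1 directly before an underscore, so the engine looks
// for a group named "1_at_", finds none, and substitutes nothing.
func joinNaive(s string) string {
	return addrPattern.ReplaceAllString(s, "$1_at_$2")
}

// joinBraced delimits the group number explicitly with ${1}.
func joinBraced(s string) string {
	return addrPattern.ReplaceAllString(s, "${1}_at_${2}")
}

func main() {
	fmt.Println(reorderDate("2024-01-15")) // 15/01/2024
	fmt.Println(joinNaive("alice@corp"))   // corp
	fmt.Println(joinBraced("alice@corp"))  // alice_at_corp

	text := "Room 101, Floor 2"
	fmt.Println(digits.ReplaceAllString(text, "[$0]"))      // Room [101], Floor [2]
	fmt.Println(digits.ReplaceAllLiteralString(text, "$0")) // Room $0, Floor $0
}
```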

The machine does not modify the input. It reads, it builds, it returns.

Replacing in real applications

Static replacement strings cover simple cases. Real data usually requires dynamic transformation. You might want to normalize phone numbers, mask credit card digits, or convert markdown links to HTML. For that, ReplaceAllStringFunc passes each matched substring to a callback function.

package main

import (
	"fmt"
	"regexp"
	"strings"
)

// main demonstrates dynamic replacement with a callback.
func main() {
	// Capture groups mark the local part, domain, and TLD of each address.
	re := regexp.MustCompile(`(\w+)@(\w+)\.(\w+)`)
	
	input := "Contact alice@corp.io or bob@test.net for updates."
	
	// ReplaceAllStringFunc calls the anonymous function for every match.
	result := re.ReplaceAllStringFunc(input, func(match string) string {
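		// The callback receives only the full match, so re-run the pattern
		// on it to recover the capture groups.
		submatches := re.FindStringSubmatch(match)
		local := submatches[1]
		domain := submatches[2]

		// Keep at most the first two characters. A one-character local part
		// would make local[:2] panic. \w is ASCII-only, so byte slicing is safe.
		keep := 2
		if len(local) < keep {
			keep = len(local)
		}
		masked := local[:keep] + "***@" + domain + "." + submatches[3]
		return masked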
	})
	
	fmt.Println(result)
}

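The callback receives the full matched string and nothing else. It does not receive the capture groups, which is why the example re-runs the pattern with FindStringSubmatch, and $1-style references in its return value are not expanded. You can parse the match further, look it up in a map, or apply formatting rules. The function returns the exact text that replaces the match. If you return an empty string, the match gets deleted. If you return the original match, the input stays unchanged for that segment.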

This pattern shines when the replacement depends on external state. You might validate a matched URL against a blocklist, translate a matched keyword using a dictionary, or apply different casing rules depending on the matched text. The callback sees only the matched substring, not the text around it, so any context you need must be part of the match itself. The callback runs sequentially for each match, so it is safe to use shared read-only data. Avoid mutating global state inside the callback unless you synchronize it explicitly.
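As a minimal sketch of a dictionary-driven replacement, the example below swaps a few British spellings for American ones. The word list, the map, and the function name americanize are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// spellingPattern matches any of the dictionary keys as whole words.
var spellingPattern = regexp.MustCompile(`\b(colour|centre|grey)\b`)

// americanSpelling is read-only lookup data shared by every callback call.
var americanSpelling = map[string]string{
	"colour": "color",
	"centre": "center",
	"grey":   "gray",
}

// americanize replaces each matched word with its dictionary entry.
func americanize(s string) string {
	return spellingPattern.ReplaceAllStringFunc(s, func(match string) string {
		if repl, ok := americanSpelling[match]; ok {
			return repl
		}
		// Returning the match unchanged leaves that segment as it was.
		return match
	})
}

func main() {
	fmt.Println(americanize("The colour of the centre is grey."))
	// The color of the center is gray.
}
```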

Capture groups give you structure. The callback gives you logic.

Where things go wrong

Regex replacement fails in predictable ways. The most common trap is compiling inside a loop. Every call to Compile or MustCompile rebuilds the state machine. Doing this thousands of times turns a fast scan into a CPU bottleneck. Move compilation to package scope or initialize it once in init.

// Package-level variables compile exactly once when the package loads.
var emailPattern = regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`)
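A package-level pattern like this one can then be reused inside any loop without recompiling. The sketch below assumes log lines shaped like the ones described at the start of this article. The function name redactEmails and the [redacted] placeholder are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once when the package is initialized.
var emailPattern = regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`)

// redactEmails reuses the precompiled pattern for every line.
func redactEmails(lines []string) []string {
	out := make([]string, 0, len(lines))
	for _, line := range lines {
		out = append(out, emailPattern.ReplaceAllString(line, "[redacted]"))
	}
	return out
}

func main() {
	lines := []string{
		"2024-01-15 req=42 user=alice@corp.io",
		"2024-01-15 req=43 no email here",
	}
	for _, line := range redactEmails(lines) {
		fmt.Println(line)
	}
	// 2024-01-15 req=42 user=[redacted]
	// 2024-01-15 req=43 no email here
}
```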

Invalid syntax triggers a panic when using MustCompile. For an unclosed group, the runtime aborts with a message of the form panic: regexp: Compile(`PATTERN`): error parsing regexp: missing closing ): `PATTERN`, where PATTERN is the offending pattern text. If the pattern comes from user input, configuration files, or external APIs, switch to regexp.Compile. It returns an error instead of crashing. Handle it with the standard if err != nil check. The boilerplate is verbose by design. It forces you to acknowledge failure paths rather than hiding them behind a panic.
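A minimal sketch of the error-returning path, assuming the pattern arrives from an untrusted source. The function name replaceWithUserPattern is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceWithUserPattern compiles an untrusted pattern and reports a
// syntax error to the caller instead of panicking.
func replaceWithUserPattern(pattern, input, repl string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return re.ReplaceAllString(input, repl), nil
}

func main() {
	// A valid pattern behaves exactly like the MustCompile version.
	out, err := replaceWithUserPattern(`\d+`, "Room 101", "X")
	fmt.Println(out, err) // Room X <nil>

	// An unclosed group yields an error the program can handle.
	_, err = replaceWithUserPattern(`(\d+`, "Room 101", "X")
	if err != nil {
		fmt.Println("rejected:", err)
	}
}
```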

Greedy matching catches beginners. The pattern .* consumes as much text as possible. If you try to strip HTML tags with <.*>, it will match from the first < to the last > on the entire line, swallowing everything in between. Use non-greedy quantifiers like .*? or restrict the character class to [^<>]* to match only what you intend.
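The difference shows up on even a tiny input. This sketch compares the three tag patterns mentioned above on one made-up HTML fragment. The variable and function names are chosen for this example only.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	greedyTag = regexp.MustCompile(`<.*>`)     // first < to last >
	lazyTag   = regexp.MustCompile(`<.*?>`)    // shortest run ending in >
	classTag  = regexp.MustCompile(`<[^<>]*>`) // no angle brackets inside
)

// strip removes every match of re from s.
func strip(re *regexp.Regexp, s string) string {
	return re.ReplaceAllString(s, "")
}

func main() {
	html := "<b>bold</b> text"
	fmt.Printf("%q\n", strip(greedyTag, html)) // " text"
	fmt.Printf("%q\n", strip(lazyTag, html))   // "bold text"
	fmt.Printf("%q\n", strip(classTag, html))  // "bold text"
}
```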

A quieter bug appears when you forget that ReplaceAllString returns a new string. Nothing panics. You must assign the result, either back to the original variable or to a new one. Forgetting the assignment leaves the input untouched and creates silent logic bugs. The compiler will not warn you, because calling a method and discarding its return value is legal Go. You have to read the code carefully.
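The discarded-result mistake can be reproduced in a few lines. The two function names are hypothetical and exist only to contrast the buggy and corrected versions.

```go
package main

import (
	"fmt"
	"regexp"
)

var digits = regexp.MustCompile(`\d+`)

// forgetful calls ReplaceAllString but throws the result away.
// This compiles without complaint and changes nothing.
func forgetful(text string) string {
	digits.ReplaceAllString(text, "X")
	return text
}

// correct assigns the returned string before using it.
func correct(text string) string {
	text = digits.ReplaceAllString(text, "X")
	return text
}

func main() {
	fmt.Println(forgetful("Room 101")) // Room 101
	fmt.Println(correct("Room 101"))   // Room X
}
```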

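Goroutine leaks are not a regex concern: a compiled *regexp.Regexp is safe for concurrent use by multiple goroutines. Channel deadlocks are still possible if you pipe regex results into concurrent workers without closing the output channel, because a consumer ranging over that channel blocks forever. Always close channels when the producer finishes. Trust the compiler for syntax. Trust your tests for logic.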
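One way to follow the close-when-done rule is to have the producer goroutine own the channel and defer the close. This is a sketch under that assumption. The function name cleanLines is made up, and a real pipeline would likely add cancellation.

```go
package main

import (
	"fmt"
	"regexp"
)

var digits = regexp.MustCompile(`\d+`)

// cleanLines sends each cleaned line on the returned channel and closes
// it when the input is exhausted, so a range loop on the consumer side
// terminates instead of blocking forever.
func cleanLines(lines []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out) // the producer closes the channel it owns
		for _, line := range lines {
			out <- digits.ReplaceAllString(line, "X")
		}
	}()
	return out
}

func main() {
	for line := range cleanLines([]string{"Room 101", "Floor 2"}) {
		fmt.Println(line)
	}
	// Room X
	// Floor X
}
```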

Picking the right replacement tool

Use strings.ReplaceAll (or strings.Replace with a count) when you know the exact substring and want to avoid regex compilation and matching overhead entirely. Use the ReplaceAllString method of a compiled *regexp.Regexp when you need pattern matching with a fixed replacement string. Use ReplaceAllStringFunc when the replacement depends on the matched text or external data. Use regexp.Compile over MustCompile when the pattern is dynamic or user-supplied. Use a package-level variable when the pattern is constant and shared across multiple functions. When you don't need pattern matching at all, skip the regexp package: the simplest thing that works is usually the right thing.
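To make the first two choices concrete, here is a small sketch that handles two of the cleanup tasks from the opening scenario. The function names and sample input are invented for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var whitespace = regexp.MustCompile(`\s+`)

// underscoresToDots swaps a known literal, so no regex is needed.
func underscoresToDots(s string) string {
	return strings.ReplaceAll(s, "_", ".")
}

// collapseSpaces needs a pattern because the run length varies.
func collapseSpaces(s string) string {
	return whitespace.ReplaceAllString(s, " ")
}

func main() {
	fmt.Println(underscoresToDots("alice_smith@corp.io")) // alice.smith@corp.io
	fmt.Println(collapseSpaces("user:   alice   smith"))  // user: alice smith
}
```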

Where to go next