The cost of compiling
You're building a signup form. The backend needs to verify that the user's phone number follows a specific format. You grab a regex pattern, run it, and it works. You ship it. Then traffic spikes. Your CPU usage climbs to 100%. The regex is compiling on every single request.
Go's regexp package separates compilation from matching by design. The regexp.MatchString function is a convenience wrapper that compiles the pattern and tests the string in one call. It's perfect for a script or a one-off check. It's terrible for a hot loop. If you call MatchString inside a loop or an HTTP handler, you rebuild the regex engine's internal state machine on every iteration. The compilation step costs CPU cycles. Matching against a pre-compiled pattern is fast.
The solution is to compile the pattern once and reuse the result. The regexp.Compile function returns a *regexp.Regexp object. This object holds the compiled state. You call .MatchString on this object to test strings. The object is immutable after compilation and safe to use from multiple goroutines simultaneously.
Compile once. Match many. Share the object.
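As a minimal sketch of that rule, the helper below compiles a pattern once and reuses the result for every input. The name countMatches and its signature are made up for illustration; only regexp.Compile and MatchString come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

// countMatches is a hypothetical helper: it compiles the pattern once,
// then reuses the compiled *regexp.Regexp for every input.
func countMatches(pattern string, inputs []string) (int, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		// The pattern itself is malformed; no input was tested.
		return 0, err
	}
	count := 0
	for _, s := range inputs {
		// No compilation happens inside the loop.
		if re.MatchString(s) {
			count++
		}
	}
	return count, nil
}

func main() {
	inputs := []string{"123-45-6789", "not-a-match", "987-65-4321"}
	n, err := countMatches(`^\d{3}-\d{2}-\d{4}$`, inputs)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 2
}
```

Calling regexp.MatchString inside the same loop would give the same count, but it would parse and compile the pattern once per input.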
RE2: Safety over features
Go's regexp package accepts RE2 syntax and, like the RE2 engine, matches with automata instead of backtracking. That design guarantees linear time complexity: the runtime is proportional to the length of the input string. There is no exponential blowup. This prevents Regular Expression Denial of Service (ReDoS) attacks. A malicious user cannot hang your server by sending a crafted input that triggers catastrophic backtracking.
The trade-off is that RE2 does not support some advanced features found in other engines. Lookaheads, lookbehinds, and backreferences are missing. If you need those features, Go's standard library won't help. For most validation and parsing tasks the trade is worth it: RE2 is safer than a backtracking engine, even though a backtracking engine can be faster on friendly inputs. The engine builds a non-deterministic finite automaton (NFA) during compilation. Matching runs the input through this automaton. The process is predictable and bounded.
RE2 guarantees linear time. No exponential blowups.
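You can see both sides of the trade-off in a few lines. This sketch assumes a small helper named compiles (not a library function) that reports whether the standard library accepts a pattern. It then runs a nested-quantifier pattern, the classic shape behind catastrophic backtracking, against a long hostile input.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compiles is a hypothetical helper that reports whether Go's regexp
// package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`foo(?=bar)`)) // false: lookahead is rejected
	fmt.Println(compiles(`(a)\1`))      // false: backreference is rejected
	fmt.Println(compiles(`^(a+)+$`))    // true: nested quantifiers are fine

	// A long run of "a" followed by "!" is the kind of input that makes
	// a backtracking engine explore an exponential number of paths.
	re := regexp.MustCompile(`^(a+)+$`)
	input := strings.Repeat("a", 10000) + "!"
	fmt.Println(re.MatchString(input)) // false
}
```

The unsupported patterns fail at compile time with an ordinary error, so you find out when the pattern is built, not when a request arrives.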
Minimal example
Here's the simplest way to check if a string matches a pattern. This code uses MatchString for a single check.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// MatchString compiles the pattern and tests the string in one step.
	// It returns true if the pattern matches anywhere in the string.
	// The second return value is an error if the pattern syntax is invalid.
	matched, err := regexp.MatchString(`^\d{3}-\d{2}-\d{4}$`, "123-45-6789")
	if err != nil {
		// An error here means the regex syntax is malformed.
		// This usually indicates a bug in the code, not bad input.
		panic(err)
	}
	fmt.Println(matched) // true
}
The pattern uses raw string literals with backticks. This is the Go convention for regex. Raw strings avoid double-escaping backslashes. The pattern `\d{3}` is cleaner than "\\d{3}". The community expects raw strings for regex patterns.
Walkthrough
When you call regexp.MatchString, the runtime parses the pattern string, validates the syntax, and builds an internal state machine. This compilation step allocates memory and performs calculations. If the pattern is invalid, the function returns an error. You might see error parsing regexp: missing closing ): ... if you forget a parenthesis. The compiler cannot check regex syntax at compile time. Regex is a string, and strings are data. Validation happens at runtime.
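To see that runtime validation in action, here is a small sketch. The helper patternError is a name invented for this example; it returns the error text for a bad pattern and an empty string for a good one.

```go
package main

import (
	"fmt"
	"regexp"
)

// patternError is a hypothetical helper: it returns the parse error
// message for an invalid pattern, or "" when the pattern compiles.
func patternError(pattern string) string {
	if _, err := regexp.Compile(pattern); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// The closing parenthesis is missing, so parsing fails at runtime.
	// The message starts with "error parsing regexp: missing closing )".
	fmt.Println(patternError(`(\d+`))

	// The corrected pattern compiles without an error.
	fmt.Println(patternError(`(\d+)`) == "") // true
}
```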
If the pattern is valid, the function runs the input string through the state machine. If the machine reaches an accepting state, the function returns true. Otherwise, it returns false.
If you call this function repeatedly with the same pattern, you repeat the compilation every time. The *regexp.Regexp object caches the compiled state. Creating the object costs CPU. Matching against it is cheap. The object is safe for concurrent use. Multiple goroutines can call .MatchString on the same *regexp.Regexp without locks. The one exception is configuration methods such as Longest, which modify the object and should be called before you share it.
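The following sketch shares one compiled regex across several goroutines. The names ssnRegex and countConcurrently are illustrative. The atomic counter protects the shared count; the regex itself needs no lock.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
	"sync/atomic"
)

// ssnRegex is compiled once and shared by every goroutine below.
var ssnRegex = regexp.MustCompile(`^\d{3}-\d{2}-\d{4}$`)

// countConcurrently is a hypothetical helper that checks each input in
// its own goroutine and returns how many inputs matched.
func countConcurrently(inputs []string) int64 {
	var wg sync.WaitGroup
	var count int64
	for _, s := range inputs {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			// Concurrent calls on the same *regexp.Regexp are safe.
			if ssnRegex.MatchString(s) {
				atomic.AddInt64(&count, 1)
			}
		}(s)
	}
	wg.Wait()
	return count
}

func main() {
	inputs := []string{"123-45-6789", "bad", "000-00-0000", "12-345-6789"}
	fmt.Println(countConcurrently(inputs)) // 2
}
```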
Realistic usage
In production code, you compile patterns once and match many times. The standard approach is to define a package-level variable with regexp.MustCompile. This function compiles the pattern and panics if the syntax is invalid. Panicking at startup is the desired behavior. It fails fast if the pattern is wrong. The program never starts with a broken regex.
Here's how a validator package looks in practice.
package validator

import (
	"regexp"
)

// phoneRegex is compiled once when the package initializes.
// It's safe to use from multiple goroutines simultaneously.
// MustCompile panics if the pattern is invalid, catching bugs early.
var phoneRegex = regexp.MustCompile(`^\+?[1-9]\d{1,14}$`)

// IsValidPhone checks if the input matches the phone number pattern.
// It reuses the pre-compiled regex to avoid repeated compilation costs.
func IsValidPhone(input string) bool {
	// MatchString returns a boolean.
	// No error handling needed because the pattern was validated at startup.
	return phoneRegex.MatchString(input)
}
The var declaration runs during package initialization. The regex is ready before main starts. The IsValidPhone function is fast. It just runs the input through the existing state machine. You can call this function from thousands of goroutines. The *regexp.Regexp object handles the concurrency.
You don't need to thread a *regexp.Regexp through every function call. Passing the pointer is cheap and perfectly legal, but the convention is to keep a compiled regex where it is owned: a package-level variable for hardcoded patterns, or a field in a struct for patterns that come from configuration.
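When the pattern is not hardcoded, a struct that owns the compiled regex is one reasonable shape. The Validator type and NewValidator constructor below are hypothetical names for this sketch. The constructor uses regexp.Compile so that a bad pattern becomes an error the caller can handle, not a panic.

```go
package main

import (
	"fmt"
	"regexp"
)

// Validator is a hypothetical type that owns its compiled pattern.
type Validator struct {
	phone *regexp.Regexp
}

// NewValidator compiles the pattern once and returns an error if the
// pattern is malformed, for example when it comes from a config file.
func NewValidator(phonePattern string) (*Validator, error) {
	re, err := regexp.Compile(phonePattern)
	if err != nil {
		return nil, fmt.Errorf("invalid phone pattern: %w", err)
	}
	return &Validator{phone: re}, nil
}

// IsValidPhone reuses the regex compiled by the constructor.
func (v *Validator) IsValidPhone(input string) bool {
	return v.phone.MatchString(input)
}

func main() {
	v, err := NewValidator(`^\+?[1-9]\d{1,14}$`)
	if err != nil {
		panic(err)
	}
	fmt.Println(v.IsValidPhone("+14155552671")) // true
	fmt.Println(v.IsValidPhone("0123"))         // false
}
```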
Beyond true or false
Sometimes you need more than a boolean. You might want to extract the matched text or capture groups. The *regexp.Regexp object provides methods for this. FindString returns the first match. FindAllString returns all matches. FindStringSubmatch returns the full match plus captured groups.
Here's how to extract data from a log line.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Compile a pattern with capture groups.
	// The parentheses define groups. Group 0 is the full match.
	// Group 1 is the first parenthesis, Group 2 is the second.
	logPattern := regexp.MustCompile(`\[([A-Z]+)\] (\d{4}-\d{2}-\d{2}): (.+)`)
	line := "[ERROR] 2023-10-25: disk full"

	// FindStringSubmatch returns a slice of strings.
	// Index 0 is the entire match.
	// Subsequent indices are the captured groups.
	// Returns nil if there is no match.
	matches := logPattern.FindStringSubmatch(line)
	if matches != nil {
		// matches[0] is the full string.
		// matches[1] is the level, matches[2] is the date, matches[3] is the message.
		fmt.Println("Level:", matches[1])
		fmt.Println("Date:", matches[2])
		fmt.Println("Msg:", matches[3])
	}
}
Capture groups return a slice of strings. Index zero is always the full match. Index one is the first capture group. If a group didn't match, the slice contains an empty string for that index. If the pattern has no capture groups, FindStringSubmatch behaves like FindString but returns a slice with one element.
Capture groups return slices. Index zero is the full match.
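Two of the behaviors described above are easy to miss, so here is a short sketch of each. The patterns and the helpers extractDates and parseSetting are invented for this example. The first uses FindAllString with -1 to return every match. The second shows an optional group that did not participate coming back as an empty string.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	datePattern = regexp.MustCompile(`\d{4}-\d{2}-\d{2}`)
	// The value part is optional: "debug" and "level=3" both match.
	settingPattern = regexp.MustCompile(`^(\w+)(?:=(\w+))?$`)
)

// extractDates returns every date-shaped substring; -1 means no limit.
func extractDates(text string) []string {
	return datePattern.FindAllString(text, -1)
}

// parseSetting splits "key" or "key=value". When the optional group
// does not match, its slot in the slice is the empty string.
func parseSetting(s string) (key, value string, ok bool) {
	m := settingPattern.FindStringSubmatch(s)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	fmt.Println(extractDates("start 2023-10-25, end 2023-10-27")) // [2023-10-25 2023-10-27]

	key, value, ok := parseSetting("debug")
	fmt.Printf("%q %q %v\n", key, value, ok) // "debug" "" true

	key, value, ok = parseSetting("level=3")
	fmt.Printf("%q %q %v\n", key, value, ok) // "level" "3" true
}
```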
Pitfalls
MatchString returns true if the pattern matches anywhere in the input. It does not require a full string match. The pattern "foo" matches "foobar". If you want to validate that the entire string conforms to the pattern, you must use anchors. ^ matches the start of the string. $ matches the end. The pattern ^foo$ only matches "foo".
Forgetting anchors is a common bug. You think you're validating a format, but you're actually checking for a substring. Use ^ and $ for validation. Use unanchored patterns for searching.
Another pitfall is confusing Match with MatchString. The Match function takes a []byte. The MatchString function takes a string. If you have a string, use MatchString. The compiler rejects a string passed to Match, so the mistake shows up as an explicit []byte(s) conversion, which typically allocates and copies the data just to run the match. If you already hold a []byte, use Match and skip the conversion in the other direction.
If you use regexp.Compile instead of MustCompile, you must handle the error. The community accepts the if err != nil boilerplate. It makes the error path visible. Don't ignore the error. If the pattern is invalid, Compile returns a nil *regexp.Regexp, and calling a method on it panics with a nil pointer dereference. Discarding the error from the package-level regexp.MatchString is quieter and worse: the call simply returns false, so the check fails on every input with no sign that the pattern is broken.
Anchors matter. MatchString finds substrings by default.
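Here is the anchor bug in miniature, as a sketch with made-up names. The two patterns differ only in their anchors, yet they answer different questions about the same input.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Unanchored: true if the ID appears anywhere in the input.
	loosePattern = regexp.MustCompile(`\d{3}-\d{2}-\d{4}`)
	// Anchored: true only if the entire input is the ID.
	strictPattern = regexp.MustCompile(`^\d{3}-\d{2}-\d{4}$`)
)

// containsID is a search: it looks for the format inside the input.
func containsID(s string) bool { return loosePattern.MatchString(s) }

// isID is a validation: the whole input must have the format.
func isID(s string) bool { return strictPattern.MatchString(s) }

func main() {
	input := "id=123-45-6789; extra junk"
	fmt.Println(containsID(input)) // true
	fmt.Println(isID(input))       // false
	fmt.Println(isID("123-45-6789")) // true
}
```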
Decision matrix
Use regexp.MatchString when you check a pattern once in a script or a low-frequency function.
Use regexp.Compile when the pattern comes from user input or a config file and might be invalid.
Use regexp.MustCompile when the pattern is hardcoded and you want the program to fail fast if the syntax is wrong.
Use a package-level var with MustCompile when you need to match the same pattern across many requests or goroutines.
Use strings.Contains or strings.HasPrefix when you don't need regex power. Regex is slower than simple string functions for literal checks.
Simple string functions beat regex for literal matches.
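For the last row of the matrix, this sketch puts a regex next to its plain-string equivalent. The function names and the /api/ prefix are assumptions chosen for the example. Both prefix checks return the same answers, and the strings version skips the regex machinery entirely.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// apiPrefixRegex does with a regex what strings.HasPrefix does directly.
var apiPrefixRegex = regexp.MustCompile(`^/api/`)

// hasAPIPrefixRegex is the regex version of the check.
func hasAPIPrefixRegex(path string) bool {
	return apiPrefixRegex.MatchString(path)
}

// hasAPIPrefix is the literal version. No pattern, no compilation.
func hasAPIPrefix(path string) bool {
	return strings.HasPrefix(path, "/api/")
}

// mentionsError is a literal substring check.
func mentionsError(line string) bool {
	return strings.Contains(line, "ERROR")
}

func main() {
	for _, p := range []string{"/api/users", "/static/app.js", "/v2/api/"} {
		fmt.Println(p, hasAPIPrefixRegex(p), hasAPIPrefix(p))
	}
	fmt.Println(mentionsError("[ERROR] 2023-10-25: disk full")) // true
}
```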