How to Count Occurrences of a Substring in Go

You need a count, not a search

You're processing a log stream and need to flag entries where the word "timeout" appears more than three times. Or you're building a simple text analyzer and want to report how often a specific phrase occurs in a document. You could write a loop to scan the string character by character, track indices, and handle edge cases manually. Go gives you a faster, safer tool.

The standard library function strings.Count finds non-overlapping occurrences of a substring. It scans the text, counts matches, and returns an integer. It handles UTF-8 correctly, avoids unnecessary allocations, and uses optimized search algorithms under the hood.

The standard tool: strings.Count

Here's the basic call: pass the text and the pattern.

package main

import (
	"fmt"
	"strings"
)

func main() {
	text := "go gopher go fast"
	// Count returns the number of non-overlapping instances of substr in s.
	count := strings.Count(text, "go")
	fmt.Println(count) // prints: 2
}

strings.Count takes two string arguments and returns an int. The first argument is the text to search. The second is the substring to find. The function returns the number of times the substring appears.

The search is case-sensitive. "Go" and "go" are different patterns. The function does not perform normalization or case folding. If you need case-insensitive counting, you must transform the strings first.
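One way to do that transformation is to lowercase both the text and the pattern before counting. The sketch below assumes simple lowercasing is good enough for your data; countFold is a hypothetical helper name, not part of the standard library, and strings.ToLower allocates a lowercased copy of each argument.

```go
package main

import (
	"fmt"
	"strings"
)

// countFold is a hypothetical helper: it lowercases both arguments
// before counting, so "Go", "GO", and "go" are treated as the same pattern.
func countFold(text, pattern string) int {
	return strings.Count(strings.ToLower(text), strings.ToLower(pattern))
}

func main() {
	text := "Go is fun. GO fast, go far."
	fmt.Println(strings.Count(text, "go")) // case-sensitive; prints: 1
	fmt.Println(countFold(text, "go"))     // case-insensitive; prints: 3
}
```

Lowercasing is not full Unicode case folding, so treat this as a starting point for mostly-ASCII text.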

strings.Count is safe for UTF-8. It compares bytes, but UTF-8 is self-synchronizing: in valid UTF-8 text, the encoding of one character never appears inside or across the encodings of other characters. A pattern like "é" therefore matches only the complete multi-byte sequence, never a fragment of a different character. Count does not normalize Unicode, though, so a precomposed "é" and an "e" followed by a combining accent are different patterns.
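A small sketch of the multi-byte case, using made-up sample text. countAccent is a hypothetical helper; the pattern is written as the escape \u00e9 so there is no doubt about which encoding of "é" is being searched for.

```go
package main

import (
	"fmt"
	"strings"
)

// countAccent is a hypothetical helper that counts the precomposed
// character U+00E9 (é), which is two bytes long in UTF-8.
func countAccent(text string) int {
	return strings.Count(text, "\u00e9")
}

func main() {
	// è and û share a leading byte with é but are different characters,
	// so only the two real é characters are counted.
	fmt.Println(countAccent("caf\u00e9 cr\u00e8me br\u00fbl\u00e9e")) // prints: 2

	// "e" followed by a combining acute accent (U+0301) renders like é
	// but is a different byte sequence, so it is not counted.
	fmt.Println(countAccent("cafe\u0301")) // prints: 0
}
```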

Convention aside: gofmt decides formatting. Run it on save. Most editors integrate gofmt automatically. Don't waste time arguing about indentation or brace placement; the tool enforces a single style so the team stays focused on logic.

strings.Count handles the hard work. Trust the standard library.

How it counts: non-overlapping and fast

The function counts non-overlapping matches. It scans from left to right. When it finds a match, it increments the counter and jumps past the matched substring to continue searching. It does not look for matches that start inside a previous match.

Consider the text "banana" and the pattern "ana".

package main

import (
	"fmt"
	"strings"
)

func main() {
	text := "banana"
	// Count finds the first "ana" and skips past it.
	// The second potential "ana" starting at index 3 is ignored.
	count := strings.Count(text, "ana")
	fmt.Println(count) // prints: 1
}

The result is 1. The first "ana" occupies indices 1 through 3. The scanner resumes at index 4. The remaining text is "na", which does not contain "ana". The second potential match, which would start at index 3 and overlap with the first, is never considered.

This behavior is intentional. Most counting tasks care about distinct occurrences. Overlapping matches are rare and usually indicate a specific algorithmic requirement, like finding repeated motifs in DNA sequences. For general text processing, non-overlapping is the correct default.

Under the hood, strings.Count uses efficient search algorithms. It does not perform a naive character-by-character comparison at every position. Depending on the architecture and the pattern length, it can use vectorized instructions to compare multiple bytes at once. This makes it fast, even on large strings. For typical inputs the running time grows linearly with the length of the text. Counting has to read the text, so you cannot expect better than O(N), and strings.Count gets there with a small constant factor.

Non-overlapping is the rule. Overlapping requires a loop.

Edge case: the empty pattern

The empty string is a valid pattern. strings.Count handles it with a specific mathematical definition.

package main

import (
	"fmt"
	"strings"
)

func main() {
	text := "abc"
	// Empty pattern matches at every boundary between runes.
	// There are boundaries before 'a', between 'a' and 'b',
	// between 'b' and 'c', and after 'c'.
	count := strings.Count(text, "")
	fmt.Println(count) // prints: 4
}

The result is 4. An empty string matches at every position between runes, including the start and end of the text. For a string with N runes, the count is N + 1.

This follows from treating the empty string as a match at every rune boundary: N runes have N + 1 boundaries. Do not expect strings.Split to mirror it. Splitting "abc" on an empty separator yields three parts, "a", "b", and "c", so the parts-minus-one rule that holds for non-empty separators would give 2 here, not 4.
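To see that the empty-pattern count tracks runes rather than bytes, here is a short sketch that prints the byte length, the rune count, and the empty-pattern count side by side. emptyCount is a hypothetical wrapper added only for the illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// emptyCount is a hypothetical wrapper around strings.Count with an
// empty pattern.
func emptyCount(s string) int {
	return strings.Count(s, "")
}

func main() {
	// Columns: byte length, rune count, empty-pattern count.
	for _, s := range []string{"", "abc", "\u65e5\u672c\u8a9e"} {
		fmt.Println(len(s), utf8.RuneCountInString(s), emptyCount(s))
	}
	// prints:
	// 0 0 1
	// 3 3 4
	// 9 3 4
}
```

The last sample is three runes encoded in nine bytes, and the count is 4, not 10.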

If your code accepts user input as the pattern, check for empty strings before calling Count. An empty pattern returns the rune count plus one, which is almost never what the caller meant.

func safeCount(text, pattern string) int {
	if len(pattern) == 0 {
		return 0 // Or handle as an error, depending on requirements
	}
	return strings.Count(text, pattern)
}

Empty patterns count boundaries. Check for empty input if that matters.

When you need overlapping matches

If your use case requires counting overlapping occurrences, strings.Count won't help. You need to write a loop that advances by one byte after each match, rather than jumping past the match.

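Use strings.Index to find the next occurrence. Index returns the byte index of the first occurrence of the pattern in the string you pass it, or -1 if the pattern is not found. It has no start-position parameter, so to search from a given offset you slice the text and add the offset back.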

func countOverlapping(text, pattern string) int {
	if len(pattern) == 0 {
		return 0 // Avoid infinite loop or boundary math
	}
	count := 0
	start := 0
	// Index finds the first occurrence in the slice text[start:].
	// The returned index is relative to the slice, so add start to get absolute index.
	for {
		idx := strings.Index(text[start:], pattern)
		if idx == -1 {
			break
		}
		count++
		// Advance by one to allow the next match to overlap with the current one.
		start += idx + 1
	}
	return count
}

This loop slices the string on each iteration. Slicing creates a new string header but shares the underlying byte array, so it avoids copying data. The loop continues until Index returns -1.

For "banana" and "ana", this function returns 2. The first match is at index 1, so the loop advances start to 2. The next search runs on "nana" and finds "ana" at index 1 relative to the slice, which is index 3 in the original text. The loop advances start to 4. The next search, on "na", finds nothing. The count is 2.
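To check that trace, here is a self-contained sketch that repeats the loop and prints both counts for two sample inputs. The test cases are illustrative; the function body is copied so the program compiles on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// countOverlapping repeats the loop shown earlier so this example
// compiles on its own.
func countOverlapping(text, pattern string) int {
	if len(pattern) == 0 {
		return 0
	}
	count := 0
	start := 0
	for {
		idx := strings.Index(text[start:], pattern)
		if idx == -1 {
			break
		}
		count++
		// Step one byte past the start of this match, not past its end.
		start += idx + 1
	}
	return count
}

func main() {
	cases := []struct{ text, pattern string }{
		{"banana", "ana"},
		{"aaaa", "aa"},
	}
	for _, c := range cases {
		fmt.Printf("%s/%s: Count=%d overlapping=%d\n",
			c.text, c.pattern,
			strings.Count(c.text, c.pattern),
			countOverlapping(c.text, c.pattern))
	}
	// prints:
	// banana/ana: Count=1 overlapping=2
	// aaaa/aa: Count=2 overlapping=3
}
```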

Overlapping matches require manual stepping.

Pitfall: splitting to count

Some developers reach for strings.Split and subtract one from the length of the resulting slice. This is an anti-pattern.

// BAD: Allocates a slice of strings.
count := len(strings.Split(text, pattern)) - 1

strings.Split builds a slice with one entry for every piece between delimiters. The pieces share the original text's bytes, but each entry is a string header (a pointer and a length), so the slice grows with the number of matches. On text with many matches you can allocate megabytes just to get a single integer. strings.Count allocates nothing. It scans and counts without building intermediate structures.
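If you want to see the difference yourself, one rough way is to measure allocations per call with testing.AllocsPerRun, which can be called from an ordinary program. This is a sketch with hypothetical helper names (countViaSplit, countDirect) and a made-up input; exact allocation numbers depend on the Go version, so none are shown.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// countViaSplit is the splitting approach: it builds a []string only to
// measure its length.
func countViaSplit(text, pattern string) int {
	return len(strings.Split(text, pattern)) - 1
}

// countDirect is the direct approach.
func countDirect(text, pattern string) int {
	return strings.Count(text, pattern)
}

// sink keeps the results observable so the calls are not optimized away.
var sink int

func main() {
	text := strings.Repeat("timeout,", 1000)

	// Both approaches agree on the answer for a non-empty pattern.
	fmt.Println(countViaSplit(text, ","), countDirect(text, ",")) // prints: 1000 1000

	splitAllocs := testing.AllocsPerRun(100, func() { sink = countViaSplit(text, ",") })
	countAllocs := testing.AllocsPerRun(100, func() { sink = countDirect(text, ",") })
	fmt.Println("allocations per call, Split:", splitAllocs, "Count:", countAllocs)
}
```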

If you pass a []byte, say a variable named data, to strings.Count, the compiler rejects the program with "cannot use data (variable of type []byte) as string value in argument to strings.Count". The function signature requires strings. If you have bytes, use bytes.Count.

Count without splitting. Save memory.

Bytes vs strings: avoid allocation

When your data is already a byte slice, use bytes.Count. Converting a []byte to a string just to count typically copies the data into a newly allocated string. If you're processing network packets, file contents, or binary data, staying in the byte domain avoids that allocation.

package main

import (
	"bytes"
	"fmt"
)

func main() {
	data := []byte("hello world, hello universe")
	pattern := []byte("hello")
	// bytes.Count works on slices without allocation.
	count := bytes.Count(data, pattern)
	fmt.Println(count) // prints: 2
}

bytes.Count mirrors strings.Count. It takes two []byte arguments and returns an int. It handles the same logic: non-overlapping matches, efficient search, and empty pattern behavior.

If you're reading from an io.Reader and buffering into a []byte, keep the data as bytes until you need string semantics. Converting back and forth adds overhead.

Convention aside: Don't pass a *string. Strings are already cheap to pass by value. A string is just a pointer and a length. Passing a pointer to a string adds an indirection without saving memory. Pass the string directly.

Stay in bytes when you have bytes. Converting allocates.

Decision matrix

Use strings.Count when you need non-overlapping matches in a UTF-8 string. Use bytes.Count when your data is already a byte slice and you want to avoid allocation. Use a loop with strings.Index when you need to count overlapping occurrences. Use strings.ToLower followed by Count when you need case-insensitive matching. Use regexp when the pattern is complex or includes wildcards.
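For the regexp case, one common approach is to count the matches returned by FindAllStringIndex. The pattern and the helper name countMatches below are illustrative assumptions; like strings.Count, the regexp package reports non-overlapping matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// timeoutRe is an illustrative pattern: "timeout" or "timed out",
// in any letter case.
var timeoutRe = regexp.MustCompile(`(?i)time(out|d out)`)

// countMatches is a hypothetical helper that counts non-overlapping
// matches of re in text.
func countMatches(re *regexp.Regexp, text string) int {
	// With n = -1, FindAllStringIndex returns every match; it returns
	// nil when there are none, and len(nil) is 0.
	return len(re.FindAllStringIndex(text, -1))
}

func main() {
	text := "Timeout on read; request timed out; TIMEOUT again"
	fmt.Println(countMatches(timeoutRe, text)) // prints: 3
}
```

Compile the pattern once and reuse it; compiling inside a hot loop costs far more than the search itself.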

Pick the tool that matches your data type and match semantics.

Where to go next

Counting occurrences of a substring tells you how many times a specific word or phrase appears inside a larger piece of text. It's like using the "Find" feature in a document editor to see how many times a word is used. Reach for it when you need to analyze text frequency or validate data patterns.