How to Implement a Hash Map from Scratch in Go

Implement a Go hash map by defining a struct with a bucket slice, using maphash for key hashing, and handling collisions via chaining.

You need a map that remembers order

You're writing a frequency counter for words in a log file. You reach for map[string]int and it just works. Fast. Simple. But what if you need a map that keeps insertion order? Or a map that shrinks when memory gets tight? Or you're in an interview and the interviewer asks, "Show me how a map works under the hood." The standard library map is a black box. Building one from scratch reveals the mechanics: hashing, buckets, collisions, and resizing.

The mechanics of hashing

A hash map is a giant array of buckets. Each bucket holds key-value pairs. The magic is the hash function. You feed it a key, it spits out a number. That number becomes the index into the array. If two keys hash to the same index, you have a collision. You handle collisions by chaining: each bucket holds a linked list of entries. Go's built-in map uses a hybrid approach with open addressing and overflow buckets, but chaining is easier to implement and understand. The core invariant: index = hash(key) % len(buckets).

Think of a hash map like a post office with thousands of mailboxes. The hash function is the clerk who looks at your address and tells you which mailbox number to use. If two people have addresses that the clerk maps to the same box, they share the box. The map handles the sharing so you can still find your letter. The hash function must be fast and distribute keys evenly. If the clerk sends everyone to mailbox 42, the map degrades into a slow linked list.

Minimal implementation

Here's the core structure and the Put/Get logic. We use chaining: each bucket points to a linked list of entries. The code uses maphash for fast, randomized hashing and constrains keys to string to keep the example runnable without unsafe tricks.

package main

import (
	"crypto/rand"
	"hash/maphash"
)

// HashMap stores string keys and any value type using chaining for collisions.
type HashMap[V any] struct {
	buckets []*bucket
	size    int
	hasher  maphash.Hash
}

// bucket holds a key-value pair and links to the next entry in the chain.
type bucket struct {
	key  string
	val  V
	next *bucket
}

// NewHashMap initializes a map with 16 buckets and a random seed.
func NewHashMap[V any]() *HashMap[V] {
	var seed [32]byte
	_, _ = rand.Read(seed[:]) // fill seed with random bytes; discard error for brevity
	return &HashMap[V]{
		buckets: make([]*bucket, 16),
		hasher:  maphash.New(seed),
	}
}

// Put inserts or updates a value for the given key.
func (m *HashMap[V]) Put(key string, val V) {
	m.hasher.Reset()
	m.hasher.WriteString(key)
	hash := m.hasher.Sum64()
	idx := hash % uint64(len(m.buckets))

	// check if key already exists in the chain
	current := m.buckets[idx]
	for current != nil {
		if current.key == key {
			current.val = val
			return
		}
		current = current.next
	}

	// key not found, add new bucket at head of chain
	m.buckets[idx] = &bucket{
		key:  key,
		val:  val,
		next: m.buckets[idx],
	}
	m.size++
}

// Get retrieves the value for the key and reports whether it was found.
func (m *HashMap[V]) Get(key string) (V, bool) {
	m.hasher.Reset()
	m.hasher.WriteString(key)
	hash := m.hasher.Sum64()
	idx := hash % uint64(len(m.buckets))

	current := m.buckets[idx]
	for current != nil {
		if current.key == key {
			return current.val, true
		}
		current = current.next
	}
	var zero V
	return zero, false
}

How the lookup works

The Put method starts by resetting the hasher. maphash.Hash maintains internal state, so you must call Reset before hashing a new key. If you skip this, the hasher appends to the previous state and produces garbage results. The compiler won't catch this mistake. You'll get wrong indices and silent data corruption.

Next, WriteString feeds the key into the hasher, and Sum64 produces the final hash. The modulo operator % maps the hash to a valid bucket index. The modulo introduces a slight bias if the number of buckets isn't a power of two, but for a simple implementation, the bias is negligible. Go's built-in map uses bitwise masking for power-of-two sizes to avoid division entirely.

The chain traversal checks each entry for a key match. If found, the value updates in place. If not, a new bucket node prepends to the chain. Prepending is O(1) and avoids walking to the tail. The size field tracks total entries, which helps decide when to resize.

The Get method follows the same path. It hashes the key, finds the bucket, and walks the chain. If the key exists, it returns the value and true. Otherwise, it returns the zero value of V and false. The zero value is safe to return because the boolean flag tells the caller whether the key was present.

Convention aside: receiver names should be short and match the type. (m *HashMap[V]) uses m for map. Avoid (this *HashMap[V]) or (self *HashMap[V]). Go convention favors one or two letters. Also, maphash requires a random seed to prevent hash-flooding attacks. An attacker who knows your seed can craft keys that all collide, turning your map into O(N) lookups. Randomizing the seed per process stops this.

Hashing turns keys into indices. Chaining handles the collisions. Random seeds stop attackers from breaking your map.

Adding resizing

A map that never grows turns into a linked list under load. Here's how to resize when the load factor gets too high. The load factor is size / len(buckets). When it exceeds a threshold, the map doubles the number of buckets and rehashes all entries.

const loadFactorThreshold = 0.75

// PutWithResize inserts a value and resizes the map if the load factor exceeds the threshold.
func (m *HashMap[V]) PutWithResize(key string, val V) {
	if float64(m.size)/float64(len(m.buckets)) > loadFactorThreshold {
		m.resize()
	}
	m.Put(key, val)
}

// resize doubles the number of buckets and rehashes all existing entries.
func (m *HashMap[V]) resize() {
	newBuckets := make([]*bucket, len(m.buckets)*2)
	for _, b := range m.buckets {
		for current := b; current != nil; {
			next := current.next // save pointer before rewiring
			m.hasher.Reset()
			m.hasher.WriteString(current.key)
			hash := m.hasher.Sum64()
			idx := hash % uint64(len(newBuckets))
			current.next = newBuckets[idx] // link to new chain
			newBuckets[idx] = current      // prepend to new chain
			current = next                 // advance in old chain
		}
	}
	m.buckets = newBuckets
}

The resize method creates a new bucket slice twice the size. It iterates over every entry in the old buckets, rehashes the key against the new size, and moves the node to the new chain. The next := current.next save is critical. Rewiring current.next would break the traversal of the old chain. By saving the pointer first, we can safely update the link and advance.

Resizing is expensive. It touches every entry and allocates a new slice. The threshold of 0.75 balances space and time. A lower threshold resizes more often and wastes memory. A higher threshold increases chain length and slows lookups. Go's built-in map uses a similar threshold and performs incremental resizing to avoid pauses, but that adds significant complexity.

Resizing keeps the map fast. Rehashing is expensive, so only do it when the load factor demands it.

Pitfalls and compiler checks

Generics add constraints. The type parameter K must be comparable because the map needs to check key equality. If you try to use a slice as a key, the compiler rejects it with invalid map key type []int. Slices are not comparable because their contents can change. Maps, functions, and slices cannot be map keys.

Hashing generic types requires care. maphash works on []byte. Converting an arbitrary generic K to bytes requires unsafe or reflection. Using unsafe breaks memory safety guarantees and can cause panics if the type contains pointers. Reflection is slow. For a production generic map, you'd often require the caller to provide a hash function or constrain K to types that can be safely converted to bytes.

The maphash package is fast but not cryptographically secure. It's designed for hash maps and sets. If you need collision resistance against adversarial inputs, use crypto/sha256 or crypto/hmac. The trade-off is speed. maphash is orders of magnitude faster.

Goroutine safety is another pitfall. This implementation is not safe for concurrent access. Multiple goroutines calling Put can corrupt the chains or cause panics. Add a sync.Mutex or sync.RWMutex if you need concurrency. Or use sync.Map for read-heavy workloads with distinct key sets.

The compiler catches unhashable keys. You catch the logic bugs. Always reset the hasher.

When to build vs borrow

Use the built-in map[K]V when you need a standard dictionary with O(1) average access and don't care about iteration order.

Use a custom hash map when you need to preserve insertion order, implement LRU eviction, or control the hashing strategy for specific performance characteristics.

Use sync.Map when multiple goroutines read the same keys frequently and writes happen to distinct keys, avoiding lock contention.

Use a sorted slice when the dataset is small, you need range queries, or memory overhead must be minimal.

Use a trie when keys share long common prefixes and you need prefix matching.

Where to go next