Parsing URLs without regex
You are building a tool that fetches data from an API. The base URL comes from a configuration file. Sometimes you need to switch from http to https. Sometimes you need to append a query parameter like ?debug=true. Sometimes the user pastes a malformed link and your program crashes because you split the string on / and missed the edge case where the path is empty. String concatenation for URLs is a trap. Trailing slashes multiply. Percent-encoding breaks. Query parameters overwrite each other. Go's net/url package turns a URL into a struct you can read and modify without guessing where the question mark goes.
The url.URL struct
A URL is a string with a strict grammar. The net/url package parses that string into a url.URL struct. This struct holds the scheme, host, path, query parameters, and fragment as separate fields. You can inspect any part without regex. You can modify parts and call String() to get a valid URL back. The parser handles percent-encoding, user info, and port numbers automatically.
The struct fields map directly to URL components. Scheme is the part before the first colon, such as http or https, without the ://. Host is the domain and port. Path is the hierarchical part after the host. RawQuery holds the query string. Fragment is the anchor after #. The User field stores credentials as a *url.Userinfo pointer and is nil when the URL has none. Every field is exported, which means you can modify them directly. Go trusts you to keep the struct consistent.
Here is the simplest parsing workflow: parse a string, read the parts, modify a field, and reconstruct the string.
```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// Parse splits the string into a url.URL struct.
	// The error return is ignored here for brevity.
	u, _ := url.Parse("https://user:pass@example.com:8080/path?query=1#section")

	// Host includes the port. Use Hostname() to get just the domain.
	fmt.Println(u.Host) // example.com:8080

	// Path is the hierarchical part after the host.
	fmt.Println(u.Path) // /path

	// RawQuery holds the query string without the leading '?'.
	fmt.Println(u.RawQuery) // query=1

	// Fragment is the anchor part after '#'.
	fmt.Println(u.Fragment) // section

	// Modify fields directly. String() reconstructs the URL.
	u.Scheme = "http"
	u.Host = "newhost.com"
	fmt.Println(u.String()) // http://user:pass@newhost.com/path?query=1#section
}
```
The underscore discards the error intentionally. In production code, check the error. Here we drop it to focus on the structure. The url.Parse function returns a pointer to url.URL. You can mutate the fields on that pointer. The String() method reconstructs the URL with correct formatting, including re-adding the scheme, host, path, query, and fragment in the right order.
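Because Host bundles the domain and the port, the accessor methods are usually more convenient than splitting it yourself. The following is a minimal sketch with the error checked; hostParts is a hypothetical helper name, not part of net/url, and the URLs are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostParts is a hypothetical helper that returns the user name, host name,
// and port of raw using the accessor methods on url.URL.
func hostParts(raw string) (user, hostname, port string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", fmt.Errorf("hostParts: %w", err)
	}
	// Username tolerates a nil *url.Userinfo, so no nil check is needed here.
	return u.User.Username(), u.Hostname(), u.Port(), nil
}

func main() {
	inputs := []string{
		"https://user:pass@example.com:8080/path",
		"https://[::1]:443/",
		"https://example.com",
	}
	for _, raw := range inputs {
		user, hostname, port, err := hostParts(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("user=%q hostname=%q port=%q\n", user, hostname, port)
	}
}
```

Hostname() also strips the square brackets from an IPv6 literal, and Port() returns an empty string when the URL has no explicit port.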
How parsing and reconstruction work
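url.Parse reports syntax errors at runtime through its error return; the compiler cannot catch a malformed URL string. A failure looks like parse "1http://example.com": first path segment in URL cannot contain colon. The parser is lenient, though. It accepts relative references, so a string such as just/a/path parses without error and simply has an empty Scheme and Host, and a misspelled scheme such as htp://example.com is accepted as a scheme named htp. If you need an absolute URL, check the error first and then check that Scheme and Host hold what you expect. The community convention is to wrap the error with context using fmt.Errorf and the %w verb so callers can unwrap it later.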
The parser distinguishes between decoded and encoded representations. The Path field contains the decoded path. If the URL had %20, Path has a space. RawPath is an optional hint that preserves the original encoding; Parse fills it in only when the original differs from the default encoding of Path, for example when a segment contains %2F. String() goes through EscapedPath(), which uses RawPath only if it is a valid encoding of Path and otherwise ignores it and encodes Path itself. This prevents double-encoding, and it means a stale RawPath cannot leak into the output: if you overwrite Path, the old RawPath no longer matches and is dropped. The simplest approach is to modify Path and leave RawPath empty. Set both fields together only when you need a specific encoding.
Query parameters follow a similar pattern. RawQuery holds the raw query string. The Query() method decodes RawQuery into a url.Values map. Each call parses RawQuery again and returns a fresh map, so changes to it do not affect the URL until you encode the map and assign the result to RawQuery. The url.Values type is a map[string][]string. This design supports query parameters that appear multiple times, like ?tag=go&tag=web. The Set method replaces all values for a key. The Add method appends a value to the slice. The Get method returns the first value, or an empty string if the key is absent.
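The difference between Set, Add, and Del is easiest to see on a URL that already has a repeated key. This is a minimal sketch; retag is a hypothetical helper and the parameter names are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// retag is a hypothetical helper showing Add, Set, and Del on the map
// returned by Query(), then writing the result back to RawQuery.
func retag(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	// q is a copy; u is unchanged until RawQuery is reassigned below.
	q := u.Query()
	// Add appends, so tag ends up with two values.
	q.Add("tag", "web")
	// Set replaces every existing value for page.
	q.Set("page", "2")
	// Del removes the key entirely.
	q.Del("debug")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	out, err := retag("https://example.com/search?tag=go&page=1&debug=true")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // https://example.com/search?page=2&tag=go&tag=web
}
```

Encode() sorts by key, which is why page moves ahead of tag in the output; values under the same key keep their order.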
Working with query parameters
Manipulating query parameters safely requires using url.Values. Direct string manipulation of RawQuery is error-prone. You might forget to encode special characters or mess up the ampersand separators. The Query() method handles decoding. The Encode() method handles encoding.
```go
package main

import (
	"fmt"
	"net/url"
)

// BuildAPIURL adds query parameters to a base URL.
func BuildAPIURL(base string, params map[string]string) (string, error) {
	// Parse checks the syntax of the base URL.
	u, err := url.Parse(base)
	if err != nil {
		return "", fmt.Errorf("invalid base URL: %w", err)
	}

	// Query() returns a decoded copy of the existing parameters.
	q := u.Query()
	for k, v := range params {
		q.Set(k, v)
	}

	// Encode() creates the percent-encoded string, sorted by key.
	// Assign it to RawQuery so String() picks up the changes.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	result, err := BuildAPIURL("https://api.example.com/v1/users?limit=10", map[string]string{"limit": "50"})
	if err != nil {
		panic(err)
	}
	fmt.Println(result)
	// https://api.example.com/v1/users?limit=50
}
```
The BuildAPIURL function demonstrates the standard pattern. Parse the base URL. Call Query() to get a mutable map. Modify the map. Encode the map and assign to RawQuery. Return String(). The Set method overwrites existing values. If you need to append a value instead, use Add. The Encode() method sorts the parameters by key, so adding filter=active to the example produces filter=active&limit=50, not the insertion order. It percent-encodes special characters and writes spaces as +.
Joining paths safely
Appending path segments to a base URL is another common task. String concatenation fails when the base URL has a trailing slash or the segment has a leading slash. The url.JoinPath function, available since Go 1.19, handles this. It collapses duplicate slashes between segments and cleans up . and .. segments.
```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// JoinPath appends segments to the base URL.
	// It handles slashes and path resolution automatically.
	u, err := url.JoinPath("https://example.com/", "api", "v1", "../users")
	if err != nil {
		panic(err)
	}

	// The result has normalized slashes and resolved '..'.
	fmt.Println(u) // https://example.com/api/users
}
```
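url.JoinPath returns a string, not a struct. It is useful when you just need the final URL. If you already hold a *url.URL or need to modify the result further, use the JoinPath method on url.URL, which returns a new *url.URL. Both clean relative components the way path.Join does: ../users cancels the preceding v1 segment. This is not how a browser resolves a relative link. For that behavior, use the ResolveReference method, which replaces the last segment of the base path when the base has no trailing slash.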
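The two approaches give different answers for the same inputs, which a small comparison shows. joinVsResolve is a hypothetical helper, and the JoinPath method assumes Go 1.19 or later.

```go
package main

import (
	"fmt"
	"net/url"
)

// joinVsResolve is a hypothetical helper that applies both strategies to
// the same base and reference and returns the two resulting URLs.
func joinVsResolve(base, ref string) (joined, resolved string, err error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", "", fmt.Errorf("parse base: %w", err)
	}
	r, err := url.Parse(ref)
	if err != nil {
		return "", "", fmt.Errorf("parse reference: %w", err)
	}
	// JoinPath appends ref as a further path segment.
	joined = b.JoinPath(ref).String()
	// ResolveReference follows RFC 3986 relative resolution.
	resolved = b.ResolveReference(r).String()
	return joined, resolved, nil
}

func main() {
	joined, resolved, err := joinVsResolve("https://example.com/api/v1", "users")
	if err != nil {
		panic(err)
	}
	fmt.Println(joined)   // https://example.com/api/v1/users
	fmt.Println(resolved) // https://example.com/api/users
}
```

JoinPath treats v1 as a directory to append to, while ResolveReference treats it as a document whose sibling is being requested.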
Pitfalls and edge cases
The url package is robust, but a few details trip up developers. The User field is a *url.Userinfo, and it is nil when the URL carries no user info at all. The Password() method returns a string and a bool. The boolean indicates whether a password was present. If the URL has user@host without a colon, the password is absent. If you ignore the boolean, you cannot tell user@host (no password) from user:@host (an explicitly empty password), because both yield an empty string.
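The three cases can be told apart with a nil check and the boolean. This is a sketch only; describeCredentials is a hypothetical helper and the returned labels are arbitrary.

```go
package main

import (
	"fmt"
	"net/url"
)

// describeCredentials is a hypothetical helper that classifies the user
// info part of raw without ever printing the password itself.
func describeCredentials(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.User == nil {
		return "no user info", nil
	}
	pw, ok := u.User.Password()
	switch {
	case !ok:
		return "user without password", nil
	case pw == "":
		return "user with empty password", nil
	default:
		return "user with password", nil
	}
}

func main() {
	for _, raw := range []string{
		"https://example.com/",
		"https://alice@example.com/",
		"https://alice:@example.com/",
		"https://alice:secret@example.com/",
	} {
		desc, err := describeCredentials(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(desc)
	}
}
```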
Percent-encoding rules differ for paths and query strings. url.PathEscape encodes spaces as %20. url.QueryEscape encodes spaces as +. This difference exists because query strings follow the application/x-www-form-urlencoded convention. Use PathEscape for path segments. Use QueryEscape for query values if you are building the query string manually. If you use url.Values, the Encode() method handles the correct encoding automatically.
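The Opaque field holds the remainder of URLs that have no slash after the scheme, such as mailto:user@example.com. For that URL, Scheme is mailto, Opaque is user@example.com, and Host and Path are empty. When Opaque is set, String() writes the scheme followed by Opaque in place of the user info, host, and path; the query and fragment are still appended. This field is rarely needed for HTTP URLs. Leave it empty unless you are implementing a custom scheme.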
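Parsing a mailto link shows where each piece lands. inspectOpaque is a hypothetical helper and the address is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
)

// inspectOpaque is a hypothetical helper that returns the Opaque and Host
// fields of raw together with the string rebuilt by String().
func inspectOpaque(raw string) (opaque, host, rebuilt string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Opaque, u.Host, u.String(), nil
}

func main() {
	opaque, host, rebuilt, err := inspectOpaque("mailto:user@example.com?subject=hi")
	if err != nil {
		panic(err)
	}
	fmt.Printf("Opaque=%q Host=%q\n", opaque, host) // Opaque="user@example.com" Host=""
	fmt.Println(rebuilt)                            // mailto:user@example.com?subject=hi
}
```

Code that reads u.Host to decide where to connect will see an empty string for such URLs, so check Opaque when you accept arbitrary schemes.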
The ForceQuery field forces a ? in the output even if the query string is empty. This is useful when the server requires a query delimiter. Set ForceQuery to true to get https://example.com/path?. Without it, String() omits the question mark.
When to use url package features
Use url.Parse when you need to validate a URL string or extract components like host and path. Use the url.URL struct when you need to modify parts of a URL and reconstruct it safely. Use url.Values when you need to add, remove, or update query parameters. Use url.JoinPath when you need to append path segments to a base URL without worrying about slashes. Use url.PathEscape when you need to encode a single path segment manually. Use url.QueryEscape when you need to encode a query parameter value manually. Use string concatenation only when you are constructing a static URL that never changes and you have verified the format manually. Avoid regex for URL parsing; the grammar is too complex and edge cases like percent-encoding will break your pattern.
Trust the parser. Reconstruct with String(). Keep Path and RawPath consistent. Encode query values through url.Values.