URLs can only safely carry a small set of ASCII characters, yet we routinely stuff names, emails, search terms, and JSON into them. Percent-encoding is the mechanism that bridges that gap, and getting it wrong produces some of the most persistent, hard-to-spot bugs in web development.

What percent-encoding actually is

A URL is defined by RFC 3986, and that spec only permits a limited alphabet of characters to appear directly in a URL. Anything outside that set, or any character that has a special structural meaning in the wrong place, must be escaped.

Percent-encoding works on bytes, not characters. The rule is simple:

  1. Take the character and encode it as one or more bytes using UTF-8.
  2. Replace each byte with a `%` followed by its two-digit uppercase hexadecimal value.

A space (byte `0x20`) becomes `%20`. An ampersand (`0x26`) becomes `%26`. A character outside ASCII, like `é`, is two UTF-8 bytes (`0xC3 0xA9`) and therefore encodes to `%C3%A9`. An emoji such as 😀 is four bytes and becomes four percent-sequences.

```text
space -> %20
&     -> %26
é     -> %C3%A9
```

This byte-level detail matters: percent-encoding is not "replace weird characters with codes," it is "serialize to UTF-8, then escape the bytes." Any encoder that doesn't go through UTF-8 first will mangle non-ASCII input.
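The two steps above can be sketched in a few lines of JavaScript. This is a minimal illustration rather than a replacement for the built-ins: `percentEncode` is a hypothetical helper name, it relies on the standard `TextEncoder` API for the UTF-8 step, and it assumes every byte outside the unreserved set (letters, digits, `-`, `_`, `.`, `~`) should be escaped.

```javascript
// Hypothetical helper illustrating "serialize to UTF-8, then escape the bytes".
const UNRESERVED = /[A-Za-z0-9\-._~]/;

function percentEncode(str) {
  // Step 1: serialize the string to UTF-8 bytes.
  const bytes = new TextEncoder().encode(str);
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      // Unreserved ASCII characters pass through untouched.
      out += ch;
    } else {
      // Step 2: escape the byte as % plus two uppercase hex digits.
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode(' '));  // %20
console.log(percentEncode('&'));  // %26
console.log(percentEncode('é'));  // %C3%A9
console.log(percentEncode('😀')); // %F0%9F%98%80
```

Because the loop walks bytes rather than characters, multi-byte input needs no special casing: each UTF-8 byte simply becomes its own percent-sequence.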

Reserved vs unreserved characters

RFC 3986 splits characters into two groups that decide what needs escaping.

Unreserved characters

These are always safe and never need encoding:

```text
A-Z a-z 0-9 - _ . ~
```

That's the entire unreserved set: letters, digits, hyphen, underscore, period, and tilde. A correct encoder leaves these untouched. (Some older encoders escaped `~` because the obsolete RFC 1738 classed it as unsafe; modern ones should not.)

Reserved characters

Reserved characters have structural meaning in a URL. They are the delimiters that separate one part of the URL from another:

```text
: / ? # [ ] @            (general delimiters)
! $ & ' ( ) * + , ; =    (sub-delimiters)
```

The key insight is that reserved characters are only special in context. A `/` is meaningful when it separates path segments, but if a single path segment genuinely contains a slash as data, that slash must be encoded as `%2F` so it isn't read as a separator. The same goes for `?` introducing a query, `#` introducing a fragment, and `&` and `=` separating query parameters.

This is why "do I need to encode this character?" has no universal answer. It depends entirely on where the character is going and whether it's acting as a delimiter or as literal data.
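To make the delimiter-versus-data distinction concrete, here is a small sketch. The `/artists/.../albums` route and the `AC/DC` value are made up for illustration; the point is only how the slash behaves once it sits inside a path.

```javascript
// A path segment whose data happens to contain a slash (illustrative value).
const segment = 'AC/DC';

// Left raw, the slash is indistinguishable from a path separator.
const naive = `/artists/${segment}/albums`;
// Encoded, the slash travels as data inside a single segment.
const safe = `/artists/${encodeURIComponent(segment)}/albums`;

console.log(naive);                   // /artists/AC/DC/albums
console.log(safe);                    // /artists/AC%2FDC/albums
console.log(naive.split('/').length); // 5 (one segment too many)
console.log(safe.split('/').length);  // 4

// Decoding the segment on the receiving side restores the original data.
console.log(decodeURIComponent(safe.split('/')[2])); // AC/DC
```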

encodeURI vs encodeURIComponent

JavaScript ships two global functions, and the difference between them is the single most common source of URL bugs.

encodeURI: for a whole URL

`encodeURI` assumes you are handing it a complete, already-structured URL and you just want to make it valid. It therefore leaves reserved delimiters alone, because escaping them would break the URL's structure.

```js
encodeURI('https://example.com/search?q=a b&x=1');
// 'https://example.com/search?q=a%20b&x=1'
```

Notice that `:`, `/`, `?`, `&`, and `=` all survived. Only the space got encoded. That's correct for its job, but it means `encodeURI` is the wrong tool for encoding a piece of a URL.

encodeURIComponent: for one piece

`encodeURIComponent` assumes you are encoding a single component that will be dropped into a larger URL, so it also escapes the reserved delimiters that matter for URL structure. Only the unreserved characters and a handful of sub-delimiters (covered below) pass through untouched.

```js
encodeURIComponent('a b&x=1');
// 'a%20b%26x%3D1'
```

Here `&` became `%26` and `=` became `%3D`, which is exactly what you want when those characters are data inside a value rather than structure.

The rule of thumb

  • Building a URL from parts? Encode each part with `encodeURIComponent`, then assemble.
  • Already have a full URL and just want to clean it up? `encodeURI` (rarely needed in practice).

Never use `encodeURI` to encode a query-parameter value. If a user's search term contains `&`, `encodeURI` will leave it intact and your query string will silently split into extra parameters.
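The silent split is easy to reproduce. In this sketch the search term and the `/search` route are invented, `parseQuery` is a hypothetical helper, and `URLSearchParams` stands in for whatever query parser the server uses.

```javascript
const term = 'salt & pepper';

// Wrong tool: encodeURI leaves the ampersand in place.
const bad = `/search?q=${encodeURI(term)}`;
// Right tool: encodeURIComponent escapes it as %26.
const good = `/search?q=${encodeURIComponent(term)}`;

// Hypothetical helper: parse the query string into [key, value] pairs.
const parseQuery = (url) => [...new URLSearchParams(url.split('?')[1])];

console.log(bad);              // /search?q=salt%20&%20pepper
console.log(parseQuery(bad));  // [ [ 'q', 'salt ' ], [ ' pepper', '' ] ]
console.log(good);             // /search?q=salt%20%26%20pepper
console.log(parseQuery(good)); // [ [ 'q', 'salt & pepper' ] ]
```

The first URL parses into two parameters, one of them a key the user never intended, and nothing in the request signals that anything went wrong.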

Both functions also leave a few sub-delimiters unescaped that a strict component encoder would escape: `!`, `'`, `(`, `)`, and `*`. Most servers tolerate them, but if you need strict RFC 3986 compliance, post-process them yourself:

```js
function strictEncode(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    c => '%' + c.charCodeAt(0).toString(16).toUpperCase()
  );
}
```
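For comparison, here is how the two behave on the same input. The function is repeated so the snippet runs on its own, and the sample string is arbitrary.

```javascript
// Same strictEncode as above, repeated so this snippet is self-contained.
function strictEncode(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    c => '%' + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

const sample = "it's (fun)!*";

console.log(encodeURIComponent(sample)); // it's%20(fun)!*
console.log(strictEncode(sample));       // it%27s%20%28fun%29%21%2A

// Both forms decode back to the same string.
console.log(decodeURIComponent(strictEncode(sample)) === sample); // true
```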

Encoding query parameters correctly

Query strings are where encoding goes wrong most often, because both the keys and the values can contain reserved characters.

The wrong way is to concatenate raw strings:

```js
// BROKEN: breaks if name or note contains & = # or space
const url = `/save?name=${name}&note=${note}`;
```

If `note` is `cost = $5 & up`, the resulting URL contains a raw `&` that the server reads as the start of another parameter, so `note` is cut short at `cost = $5 ` and a stray ` up` parameter appears.

Encode each key and value independently:

```js
const url =
  `/save?name=${encodeURIComponent(name)}` +
  `&note=${encodeURIComponent(note)}`;
```
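If you do assemble query strings by hand, the same rule generalizes to any number of parameters. The following is a rough sketch: `buildQuery` is a hypothetical helper, and the parameter names are invented to show a key that itself contains a reserved character.

```javascript
// Hypothetical helper: encode every key and every value independently,
// then join the pairs with the structural & and = characters.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) =>
      `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join('&');
}

const query = buildQuery({ name: 'Ann', 'note&tag': 'cost = $5 & up' });
console.log(query);
// name=Ann&note%26tag=cost%20%3D%20%245%20%26%20up
```

Only the `&` and `=` added by the helper remain unescaped, so they are the only ones a parser will treat as structure.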

Better still, let `URLSearchParams` handle it. It encodes for you and removes the chance of forgetting one value:

```js
const params = new URLSearchParams({
  name: name,
  note: 'cost = $5 & up',
});
const url = `/save?${params.toString()}`;
// /save?name=...&note=cost+%3D+%245+%26+up
```

The `URL` API composes nicely with it:

```js
const u = new URL('https://example.com/save');
u.searchParams.set('q', 'rock & roll');
u.toString();
// https://example.com/save?q=rock+%26+roll
```

Using these built-ins is almost always safer than hand-rolling string concatenation, and it sidesteps the next problem entirely.
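Reading values back is the mirror image. As a brief sketch (the URL and parameter names are illustrative), `searchParams.get` returns the decoded value regardless of whether a space arrived as `+` or `%20`:

```javascript
const incoming = new URL('https://example.com/save?q=rock+%26+roll&tag=a%20b');

const q = incoming.searchParams.get('q');          // 'rock & roll'
const tag = incoming.searchParams.get('tag');      // 'a b'
const missing = incoming.searchParams.get('nope'); // null when the key is absent

console.log(q, '|', tag, '|', missing);
```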

The space and plus-sign trap

Here is the bug that surprises nearly everyone. There are two conventions for encoding a space in a URL, and they live in different parts of the URL.

  • In the path, a space is `%20`. A `+` is a literal plus.
  • In a query string using `application/x-www-form-urlencoded` (the format browsers use for form submissions), a space is `+`, and a literal plus is `%2B`.

This split exists for historical reasons: HTML form encoding predates and diverges from the generic URL spec, and that legacy convention is baked into how servers parse query strings.

The practical consequences:

  • `URLSearchParams` encodes spaces as `+` (form-encoding rules). So `new URLSearchParams({q: 'a b'}).toString()` gives `q=a+b`.
  • `encodeURIComponent` encodes spaces as `%20` and a `+` as `%2B`.
  • Both `a+b` and `a%20b` decode to `a b` on a well-behaved server reading a query string, but a server reading a path will treat `+` as a literal plus.
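These consequences can be checked directly in a console. In this sketch, `URLSearchParams` plays the role of a form-aware query parser and `decodeURIComponent` plays the role of generic percent-decoding as applied to a path; that mapping is an assumption about how a given server behaves.

```javascript
// Encoding: two conventions for the same space.
const formEncoded = new URLSearchParams({ q: 'a b' }).toString(); // 'q=a+b'
const componentSpace = encodeURIComponent('a b');                 // 'a%20b'
const componentPlus = encodeURIComponent('a+b');                  // 'a%2Bb'

// Decoding a query string: a form-aware parser maps both spellings to a space.
const fromPlus = new URLSearchParams('q=a+b').get('q');      // 'a b'
const fromPercent = new URLSearchParams('q=a%20b').get('q'); // 'a b'

// Decoding a path: generic percent-decoding keeps the plus as a plus.
const pathStyle = decodeURIComponent('a+b'); // 'a+b'

console.log(formEncoded, componentSpace, componentPlus);
console.log(JSON.stringify([fromPlus, fromPercent, pathStyle]));
```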

The classic failure: a user searches for `C++`. You encode it with `encodeURIComponent`, getting `C%2B%2B`, which is correct. But if instead you only escaped spaces and left the pluses, the server sees `C++`, treats both pluses as spaces, and searches for `C` followed by two spaces. The query silently returns the wrong results with no error anywhere.
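Here is the `C++` failure reproduced as a sketch. `spacesOnly` is a hypothetical stand-in for a hand-rolled encoder that escapes nothing but spaces, and `URLSearchParams` stands in for the server's form decoder.

```javascript
const term = 'C++';

// Hypothetical naive encoder: escapes spaces and nothing else.
const spacesOnly = (s) => s.replace(/ /g, '%20');

const naiveQuery = `q=${spacesOnly(term)}`;        // 'q=C++'
const safeQuery = `q=${encodeURIComponent(term)}`; // 'q=C%2B%2B'

// What a form-decoding server reads back from each query string.
const naiveResult = new URLSearchParams(naiveQuery).get('q'); // 'C' plus two spaces
const safeResult = new URLSearchParams(safeQuery).get('q');   // 'C++'

console.log(JSON.stringify(naiveResult)); // "C  "
console.log(JSON.stringify(safeResult));  // "C++"
```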

The reverse failure: you store a value like `2 + 2` and round-trip it through a query string. If your encoder wrote `%20` for the spaces but left the `+` unescaped, and you then decode with form-decoding rules, the literal `+` gets turned into a space, and you read back `2`, three spaces, and `2`, with the plus gone.

How to stay safe

  • Pick one tool for a given context and let it own both encoding and decoding. If you build query strings with `URLSearchParams`, parse them with `URLSearchParams` too, and the `+`/space round-trip stays consistent.
  • Never mix conventions: if you encode with form rules (spaces as `+`) and decode with `decodeURIComponent` alone, every `+` comes back as a literal plus; if you form-decode a string in which a literal `+` was left unescaped, it comes back as a space.
  • When debugging, remember that a `+` in a URL is ambiguous on sight. You cannot tell whether it means "space" or "literal plus" without knowing which part of the URL it's in and which convention produced it.
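A short round-trip sketch shows why matching encoder and decoder matters. The `expr` parameter name is made up, and the two mismatch cases are deliberately wrong pairings chosen for illustration.

```javascript
const original = '2 + 2';

// Matched pair: URLSearchParams on both sides.
const encoded = new URLSearchParams({ expr: original }).toString(); // 'expr=2+%2B+2'
const roundTrip = new URLSearchParams(encoded).get('expr');         // '2 + 2'

// Mismatch 1: form-encoded value decoded with generic percent-decoding.
// The pluses that meant "space" survive as literal pluses.
const rawValue = encoded.slice('expr='.length);    // '2+%2B+2'
const plusesKept = decodeURIComponent(rawValue);   // '2+++2'

// Mismatch 2: spaces escaped by hand, plus left raw, then form-decoded.
// The literal plus is read as a space.
const handEncoded = 'expr=' + original.replace(/ /g, '%20');       // 'expr=2%20+%202'
const plusLost = new URLSearchParams(handEncoded).get('expr');     // '2', three spaces, '2'

console.log(JSON.stringify([roundTrip, plusesKept, plusLost]));
// ["2 + 2","2+++2","2   2"]
```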

If you ever need to eyeball what a particular string encodes to, or decode a captured URL to see what's really in it, our URL encoder and decoder runs entirely in your browser so nothing you paste leaves your machine. It's a quick way to confirm whether that mysterious `%2B` in a log is a literal plus or a botched space.

A short checklist

When you're encoding a URL, run through this:

  1. Am I encoding a whole URL or a piece of one? Whole URL is rare; you almost always want to encode pieces with `encodeURIComponent` and assemble them.
  2. Are reserved characters acting as structure or data? If `&`, `=`, `/`, `?`, or `#` are data inside a value, they must be escaped.
  3. Path or query? Decide before you pick how spaces are handled. When in doubt, `%20` is safe in both contexts.
  4. Same tool in and out. Encode and decode with matching conventions so `+` and `%2B` round-trip correctly.
  5. Prefer the built-ins. `URLSearchParams` and the `URL` API eliminate most hand-encoding mistakes.
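As one possible way to put the checklist into practice, the sketch below builds a URL from parts. `buildUrl` is a hypothetical helper, not a standard API, and the host, path segments, and parameters are invented. It encodes path segments with `encodeURIComponent` (so spaces become `%20`) and delegates the query string to `searchParams` (so spaces become `+`).

```javascript
// Hypothetical helper combining the checklist items.
function buildUrl(base, segments, query) {
  const u = new URL(base);
  // Path: encode each segment on its own, then join with structural slashes.
  u.pathname = '/' + segments.map((s) => encodeURIComponent(s)).join('/');
  // Query: let the built-in apply form-encoding rules to keys and values.
  for (const [key, value] of Object.entries(query)) {
    u.searchParams.set(key, value);
  }
  return u.toString();
}

const built = buildUrl(
  'https://example.com',
  ['artists', 'AC/DC', 'top hits'],
  { q: 'rock & roll', lang: 'C++' }
);
console.log(built);
// https://example.com/artists/AC%2FDC/top%20hits?q=rock+%26+roll&lang=C%2B%2B

// Reading the query back with the same API restores the original values.
const parsed = new URL(built);
console.log(parsed.searchParams.get('lang')); // C++
```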

Percent-encoding looks fiddly, but it reduces to one idea: characters that have meaning to the URL must be escaped when used as data, and there are two slightly different conventions for spaces depending on where you are. Internalize those two facts, lean on `encodeURIComponent` and `URLSearchParams`, and the whole class of encoding bugs mostly disappears.