URL Encoding Demystified: Percent-Encoding in Practice

URLs can only safely carry a small set of ASCII characters, yet we routinely stuff names, emails, search terms, and JSON into them. Percent-encoding is the mechanism that bridges that gap, and getting it wrong produces some of the most persistent, hard-to-spot bugs in web development.

What percent-encoding actually is

A URL is defined by RFC 3986, and that spec only permits a limited alphabet of characters to appear directly in a URL. Anything outside that set, or any character that has a special structural meaning in the wrong place, must be escaped.

Percent-encoding works on bytes, not characters. The rule is simple:

Take the character and encode it as one or more bytes using UTF-8.
Replace each byte with a `%` followed by its two-digit uppercase hexadecimal value.

A space (byte `0x20`) becomes `%20`. An ampersand (`0x26`) becomes `%26`. A character outside ASCII, like `é`, is two UTF-8 bytes (`0xC3 0xA9`) and therefore encodes to `%C3%A9`. A typical single-code-point emoji such as 😀 is four bytes and becomes four percent-sequences.

text
space  -> %20
&      -> %26
é      -> %C3%A9

This byte-level detail matters: percent-encoding is not "replace weird characters with codes," it is "serialize to UTF-8, then escape the bytes." Any encoder that doesn't go through UTF-8 first will mangle non-ASCII input.
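The two steps above can be sketched in a few lines. This is an illustration rather than how any particular library does it: `percentEncode` is a hypothetical helper name, and the sketch assumes a runtime with the standard `TextEncoder` global (browsers and Node.js both provide one). It escapes every byte except the unreserved characters covered in the next section.

```javascript
// Hypothetical helper (not a built-in): serialize to UTF-8, then escape bytes.
function percentEncode(str) {
  // Step 1: turn the string into UTF-8 bytes.
  const bytes = new TextEncoder().encode(str);
  let out = '';
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    if (b < 0x80 && /[A-Za-z0-9\-_.~]/.test(ch)) {
      // Unreserved ASCII passes through unchanged.
      out += ch;
    } else {
      // Step 2: "%" plus the byte as two uppercase hex digits.
      out += '%' + b.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode('a b&c')); // 'a%20b%26c'
console.log(percentEncode('é'));     // '%C3%A9'
console.log(percentEncode('😀'));    // '%F0%9F%98%80'
```

Because the loop runs over bytes rather than characters, the non-ASCII cases need no special handling.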

Reserved vs unreserved characters

RFC 3986 splits characters into two groups that decide what needs escaping.

Unreserved characters

These are always safe and never need encoding:

text
A-Z  a-z  0-9  -  _  .  ~

That's the entire unreserved set: letters, digits, hyphen, underscore, period, and tilde. A correct encoder leaves these untouched. (Some older encoders escaped `~` because the obsolete RFC 1738 classed it as unsafe; modern ones should not.)
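A quick way to check that an encoder respects the unreserved set is to feed it those characters and confirm they come back unchanged. The sample string below is an arbitrary illustration, not an exhaustive test.

```javascript
// Arbitrary sample drawn from the unreserved set.
const unreservedSample = 'AZaz09-_.~';

// A correct encoder returns unreserved characters untouched.
const encodedSample = encodeURIComponent(unreservedSample);
console.log(encodedSample === unreservedSample); // true
console.log(encodeURIComponent('~user'));        // '~user'
```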

Reserved characters

Reserved characters have structural meaning in a URL. They are the delimiters that separate one part of the URL from another:

text
:  /  ?  #  [  ]  @        (general delimiters)
!  $  &  '  (  )  *  +  ,  ;  =   (sub-delimiters)

The key insight is that reserved characters are only special in context. A `/` is meaningful when it separates path segments, but if a single path segment genuinely contains a slash as data, that slash must be encoded as `%2F` so it isn't read as a separator. The same goes for `?` introducing a query, `#` introducing a fragment, and `&` and `=` separating query parameters.

This is why "do I need to encode this character?" has no universal answer. It depends entirely on where the character is going and whether it's acting as a delimiter or as literal data.
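As a small illustration of "delimiter versus data", consider a file name that happens to contain a slash. The file name and the `/files/` prefix are made up for the example; the point is only that the data slash is escaped while the structural ones are not.

```javascript
// Hypothetical data: one file name that contains a slash and a space.
const fileName = 'reports/2024 Q1.txt';

// Encode the segment as data, then attach it to the structural prefix.
const path = '/files/' + encodeURIComponent(fileName);
console.log(path); // '/files/reports%2F2024%20Q1.txt'

// A reader splits on the structural "/" first, then decodes each segment.
const segments = path
  .split('/')
  .slice(1) // drop the empty string before the leading "/"
  .map((segment) => decodeURIComponent(segment));
console.log(segments); // [ 'files', 'reports/2024 Q1.txt' ]
```

Had the slash been left raw, the same split would have produced three segments instead of two.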

encodeURI vs encodeURIComponent

JavaScript ships two global functions, and the difference between them is the single most common source of URL bugs.

encodeURI: for a whole URL

`encodeURI` assumes you are handing it a complete, already-structured URL and you just want to make it valid. It therefore leaves reserved delimiters alone, because escaping them would break the URL's structure.

js
encodeURI('https://example.com/search?q=a b&x=1');
// 'https://example.com/search?q=a%20b&x=1'

Notice that `:`, `/`, `?`, `&`, and `=` all survived. Only the space got encoded. That's correct for its job, but it means `encodeURI` is the wrong tool for encoding a piece of a URL.

encodeURIComponent: for one piece

`encodeURIComponent` assumes you are encoding a single component that will be dropped into a larger URL, so it escapes almost everything that isn't unreserved, including the reserved delimiters that give a URL its structure.

js
encodeURIComponent('a b&x=1');
// 'a%20b%26x%3D1'

Here `&` became `%26` and `=` became `%3D`, which is exactly what you want when those characters are data inside a value rather than structure.

The rule of thumb

Building a URL from parts? Encode each part with `encodeURIComponent`, then assemble.
Already have a full URL and just want to clean it up? Use `encodeURI` (rarely needed in practice).

Never use `encodeURI` to encode a query-parameter value. If a user's search term contains `&`, `encodeURI` will leave it intact and your query string will silently split into extra parameters.
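The silent split is easy to reproduce. In this sketch the search term and the `/search` path are invented for illustration, and `parseQuery` is a hypothetical helper that reads the query back the way a form-decoding server would.

```javascript
// Hypothetical user input containing a reserved character.
const term = 'salt & pepper';

const broken = '/search?q=' + encodeURI(term);
const fixed = '/search?q=' + encodeURIComponent(term);
console.log(broken); // '/search?q=salt%20&%20pepper'
console.log(fixed);  // '/search?q=salt%20%26%20pepper'

// Hypothetical helper: parse everything after "?" into a plain object.
function parseQuery(url) {
  return Object.fromEntries(new URLSearchParams(url.split('?')[1]));
}

console.log(parseQuery(broken)); // { q: 'salt ', ' pepper': '' }
console.log(parseQuery(fixed));  // { q: 'salt & pepper' }
```

The broken URL parses without any error; it simply carries a truncated `q` and an extra, empty parameter.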

Both functions deliberately leave a few characters unescaped that `encodeURIComponent` arguably should escape: `!`, `'`, `(`, `)`, and `*`. Most servers tolerate them, but if you need strict RFC 3986 compliance, post-process them yourself:

js
function strictEncode(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    c => '%' + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

Encoding query parameters correctly

Query strings are where encoding goes wrong most often, because both the keys and the values can contain reserved characters.

The wrong way is to concatenate raw strings:

js
// BROKEN: breaks if name or note contains & = # or space
const url = `/save?name=${name}&note=${note}`;

If `note` is `cost = $5 & up`, the resulting URL has stray `=` and `&` characters that the server will parse as additional parameters.

Encode each key and value independently:

js
const url = `/save?name=${encodeURIComponent(name)}` +
            `&note=${encodeURIComponent(note)}`;

Better still, let `URLSearchParams` handle it. It encodes for you and removes the chance of forgetting one value:

js
const params = new URLSearchParams({
  name: name,
  note: 'cost = $5 & up',
});
const url = `/save?${params.toString()}`;
// /save?name=...&note=cost+%3D+%245+%26+up

The `URL` API composes nicely with it:

js
const u = new URL('https://example.com/save');
u.searchParams.set('q', 'rock & roll');
u.toString(); // https://example.com/save?q=rock+%26+roll

Using these built-ins is almost always safer than hand-rolling string concatenation, and it sidesteps the next problem entirely.
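One way to package the built-ins is a small wrapper that takes a base URL and an object of parameters. `buildUrl` is a hypothetical helper name, and the base URL and parameters are invented; the sketch assumes every value can sensibly be converted with `String()`. Note that keys are encoded as well as values.

```javascript
// Hypothetical helper: append every key/value pair in `params` to `base`.
function buildUrl(base, params) {
  const u = new URL(base);
  for (const [key, value] of Object.entries(params)) {
    // searchParams.set encodes both the key and the value.
    u.searchParams.set(key, String(value));
  }
  return u.toString();
}

const savedUrl = buildUrl('https://example.com/save', {
  'a&b': '1=2',
  note: 'cost = $5 & up',
});
console.log(savedUrl);
// https://example.com/save?a%26b=1%3D2&note=cost+%3D+%245+%26+up

// Reading it back with the same API recovers the original data.
console.log(new URL(savedUrl).searchParams.get('a&b'));  // '1=2'
console.log(new URL(savedUrl).searchParams.get('note')); // 'cost = $5 & up'
```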

The space and plus-sign trap

Here is the bug that surprises nearly everyone. There are two conventions for encoding a space in a URL, and they live in different parts of the URL.

In the path, a space is `%20`. A `+` is a literal plus.
In a query string using `application/x-www-form-urlencoded` (the format browsers use for form submissions), a space is `+`, and a literal plus is `%2B`.

This split exists for historical reasons: HTML form encoding predates and diverges from the generic URL spec, and that legacy convention is baked into how servers parse query strings.

The practical consequences:

`URLSearchParams` encodes spaces as `+` (form-encoding rules). So `new URLSearchParams({q: 'a b'}).toString()` gives `q=a+b`.
`encodeURIComponent` encodes spaces as `%20` and a `+` as `%2B`.
Both `a+b` and `a%20b` decode to `a b` on a well-behaved server reading a query string, but a server reading a path will treat `+` as a literal plus.
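These consequences can be checked directly in a console. In the sketch below, `URLSearchParams` stands in for a server applying form-decoding rules to a query string, and `decodeURIComponent` stands in for path-style decoding; real servers vary, so treat it as an approximation.

```javascript
// Encoding: the two tools disagree about spaces, but both escape a literal plus.
console.log(new URLSearchParams({ q: 'a b' }).toString()); // 'q=a+b'
console.log(encodeURIComponent('a b'));                    // 'a%20b'
console.log(new URLSearchParams({ q: 'a+b' }).toString()); // 'q=a%2Bb'
console.log(encodeURIComponent('a+b'));                    // 'a%2Bb'

// Form decoding (query string): both spellings of a space are accepted.
console.log(new URLSearchParams('q=a+b').get('q'));   // 'a b'
console.log(new URLSearchParams('q=a%20b').get('q')); // 'a b'

// Path-style decoding: "+" stays a literal plus.
console.log(decodeURIComponent('a+b')); // 'a+b'
```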

The classic failure: a user searches for `C++`. You encode it with `encodeURIComponent`, getting `C%2B%2B`, which is correct. But if instead you only escaped spaces and left the pluses, the server sees `C++`, treats both pluses as spaces, and searches for `C  ` (C followed by two spaces). The query silently returns the wrong results with no error anywhere.

The reverse failure: you store a value like `2 + 2` and round-trip it through a query string. If your encoder wrote `%20` for the spaces but left the `+` unescaped (as `encodeURI` does), and you then decode with form-decoding rules, the literal `+` in `2 + 2` gets turned into a space, and you read back `2   2`.
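Both failures come from the same mismatch, and a short sketch reproduces them. `spacesOnly` is a deliberately naive, hypothetical encoder that escapes spaces and nothing else, and `formDecode` is a hypothetical stand-in for a server that applies form-decoding rules.

```javascript
// Hypothetical naive encoder: escapes spaces, leaves "+" untouched.
const spacesOnly = (s) => s.replace(/ /g, '%20');

// Hypothetical stand-in for a form-decoding server.
const formDecode = (queryString, key) => new URLSearchParams(queryString).get(key);

// Classic failure: the pluses are read as spaces.
console.log(JSON.stringify(formDecode('q=' + spacesOnly('C++'), 'q')));         // "C  "
console.log(JSON.stringify(formDecode('q=' + encodeURIComponent('C++'), 'q'))); // "C++"

// Reverse failure: the literal plus in the middle is lost.
console.log(JSON.stringify(formDecode('v=' + spacesOnly('2 + 2'), 'v')));         // "2   2"
console.log(JSON.stringify(formDecode('v=' + encodeURIComponent('2 + 2'), 'v'))); // "2 + 2"
```

`JSON.stringify` is used only to make the trailing and repeated spaces visible in the output.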

How to stay safe

Pick one tool for a given context and let it own both encoding and decoding. If you build query strings with `URLSearchParams`, parse them with `URLSearchParams` too, and the `+`/space round-trip stays consistent.
Never mix: don't encode with `encodeURIComponent` and decode by manually replacing `+` with a space, or vice versa.
When debugging, remember that a `+` in a URL is ambiguous on sight. You cannot tell whether it means "space" or "literal plus" without knowing which part of the URL it's in and which convention produced it.
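A minimal round-trip sketch of the "same tool in and out" advice, using the two troublesome values from the previous section as sample data:

```javascript
// Sample values that contain literal pluses and spaces.
const original = { q: 'C++', expr: '2 + 2' };

// Encode with URLSearchParams...
const queryString = new URLSearchParams(original).toString();
console.log(queryString); // 'q=C%2B%2B&expr=2+%2B+2'

// ...and decode with URLSearchParams, so "+" and "%2B" are read consistently.
const roundTripped = Object.fromEntries(new URLSearchParams(queryString));
console.log(roundTripped); // { q: 'C++', expr: '2 + 2' }
```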

If you ever need to eyeball what a particular string encodes to, or decode a captured URL to see what's really in it, our URL encoder and decoder runs entirely in your browser so nothing you paste leaves your machine. It's a quick way to confirm whether that mysterious `%2B` in a log is a literal plus or a botched space.

A short checklist

When you're encoding a URL, run through this:

Am I encoding a whole URL or a piece of one? Whole URL is rare; you almost always want to encode pieces with `encodeURIComponent` and assemble them.
Are reserved characters acting as structure or data? If `&`, `=`, `/`, `?`, or `#` are data inside a value, they must be escaped.
Path or query? Decide before you pick how spaces are handled. When in doubt, `%20` is safe in both contexts.
Same tool in and out. Encode and decode with matching conventions so `+` and `%2B` round-trip correctly.
Prefer the built-ins. `URLSearchParams` and the `URL` API eliminate most hand-encoding mistakes.

Percent-encoding looks fiddly, but it reduces to one idea: characters that have meaning to the URL must be escaped when used as data, and there are two slightly different conventions for spaces depending on where you are. Internalize those two facts, lean on `encodeURIComponent` and `URLSearchParams`, and the whole class of encoding bugs mostly disappears.
