Base64 Explained: What It Is and When to Use It

Base64 is one of those things every developer copies, pastes, and uses for years before ever stopping to ask what it actually does. It is a way to represent arbitrary binary data using only printable ASCII characters, and understanding it properly saves you from a surprising number of bugs.

What Base64 Actually Is

Base64 is an encoding, not a compression or an encryption scheme. It takes raw bytes and re-expresses them using a 64-character alphabet that is safe to put almost anywhere text is allowed: A-Z, a-z, 0-9, plus + and /, with = reserved for padding.

The core idea is a clean numeric trick. A byte is 8 bits, and 64 is 2 to the 6th power, so each Base64 character carries exactly 6 bits. The least common multiple of 8 and 6 is 24, which means three input bytes (24 bits) map perfectly onto four output characters (also 24 bits). Base64 always works in these 3-byte to 4-character groups.

Here is the transformation for the three bytes that spell Man:

text
Text:      M           a           n
ASCII:     77          97          110
Binary:    01001101    01100001    01101110
Regroup:   010011  010110  000101  101110
Decimal:   19      22      5       46
Base64:    T       W       F       u

Man becomes TWFu. The encoder ignores byte boundaries entirely and just slices the bit stream into 6-bit chunks, looking each one up in the alphabet.
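As a rough illustration, the same regrouping can be sketched in a few lines of Python. The helper name encode_group is made up for this example and only handles one full 3-byte group; real code should use the standard library's base64 module, which the last line uses as a cross-check.

```python
import base64

# The standard Base64 alphabet: index 0 is "A", index 63 is "/".
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters (illustrative only)."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles a full 3-byte group")
    # Join the three bytes into a single 24-bit integer.
    bits = int.from_bytes(three_bytes, "big")
    # Slice out four 6-bit chunks, most significant first.
    indexes = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indexes)

print(encode_group(b"Man"))                # TWFu
print(base64.b64encode(b"Man").decode())   # TWFu, from the standard library
```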

What the padding is for

Input rarely divides evenly into groups of three. When the last group has one or two leftover bytes, the encoder zero-pads the bits to fill the next 6-bit chunk and appends = characters so the output length stays a multiple of four.

3 input bytes encode to 4 characters, no padding
2 input bytes encode to 3 characters plus one = character
1 input byte encodes to 2 characters plus two = characters (==)

That is why you so often see Base64 strings ending in = or ==. The padding is purely structural; it carries no data of its own.
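To see the three padding cases side by side, here is a small sketch using Python's standard base64 module. The encode helper is just a convenience wrapper introduced for this example.

```python
import base64

def encode(data: bytes) -> str:
    """Return the standard Base64 text for data (convenience wrapper for this example)."""
    return base64.b64encode(data).decode("ascii")

# Drop one byte at a time from "Man" and watch the padding grow.
for data in (b"Man", b"Ma", b"M"):
    print(f"{len(data)} byte(s): {data!r} -> {encode(data)}")
# 3 byte(s): b'Man' -> TWFu
# 2 byte(s): b'Ma' -> TWE=
# 1 byte(s): b'M' -> TQ==
```

In every case the output is four characters long, and the number of = characters tells the decoder how many bytes the final group really held.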

Why Binary-to-Text Encoding Exists at All

The obvious question is why we would inflate our data just to move it around. The answer is historical and still relevant: large parts of the internet were designed to carry text, not arbitrary bytes.

Email is the classic example. The original SMTP and the message format behind it were built for 7-bit ASCII. A raw byte with the high bit set, a null byte, or a stray carriage return could be silently mangled, stripped, or interpreted as a control signal by some intermediate server. If you want to send a JPEG or a PDF through that pipe intact, you have to first turn it into something that survives a text-only channel. Base64 is that something.

The same problem shows up anywhere a transport is text-shaped:

Putting binary inside JSON or XML, which have no native byte type
Embedding a small image directly in HTML or CSS
Stuffing a cryptographic key or certificate into a config file or environment variable
Passing binary through a URL or an HTTP header

In each case the rule is the same: when the channel only guarantees safe passage for printable characters, encode your bytes into printable characters first. Base64 is the most common answer because it is simple, reversible, and supported everywhere.
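The JSON case is probably the easiest to try yourself. The sketch below wraps a few awkward bytes (a null, a high-bit byte, a CR/LF pair) in a JSON document and gets them back unchanged. The function names and the "name" and "data" field names are assumptions made up for this example, not part of any standard.

```python
import base64
import json

def to_json_payload(name: str, raw: bytes) -> str:
    """Wrap raw bytes in a JSON document by Base64-encoding them first."""
    encoded = base64.b64encode(raw).decode("ascii")
    return json.dumps({"name": name, "data": encoded})

def from_json_payload(payload: str) -> bytes:
    """Recover the original bytes from a document made by to_json_payload."""
    return base64.b64decode(json.loads(payload)["data"])

# Bytes that a text-only channel could easily mangle.
raw = bytes([0x00, 0xFF, 0x0D, 0x0A, 0x80])
payload = to_json_payload("blob.bin", raw)
print(payload)
print(from_json_payload(payload) == raw)  # True
```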

The ~33% Size Overhead

Base64 is not free. You are spending 4 output characters to represent every 3 input bytes, which is a 4/3 ratio, or roughly a 33% increase in size before you even count padding and any line breaks the format adds.

text
3 bytes    -> 4 chars     (+33%)
300 bytes  -> 400 chars   (+33%)
3 MB image -> ~4 MB of Base64 text

This matters more than people expect. A 3 MB image becomes about 4 MB of text. If that text also gets gzipped in transit you recover some of the loss, because Base64 output is still fairly compressible, but you never fully break even versus shipping the raw bytes. The overhead is the price of compatibility, and you should only pay it when you actually need a text-safe representation.

A frequent mistake is treating Base64 as if it shrinks data. It never does. If your goal is smaller payloads, you want compression; if your goal is safe transport through a text channel, you want Base64. They solve different problems.
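If you want to check the arithmetic, padded Base64 output is always 4 characters per started group of 3 input bytes. A minimal sketch, assuming standard padded output with no line breaks (base64_length is a name chosen for this example):

```python
import base64
import os

def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input, without line breaks."""
    return 4 * ((n_bytes + 2) // 3)  # integer form of 4 * ceil(n / 3)

for n in (3, 300, 1000):
    actual = len(base64.b64encode(os.urandom(n)))
    assert actual == base64_length(n)
    print(f"{n} bytes -> {actual} chars (+{(actual - n) / n:.1%})")
# 3 bytes -> 4 chars (+33.3%)
# 300 bytes -> 400 chars (+33.3%)
# 1000 bytes -> 1336 chars (+33.6%)
```

The last line shows the padding at work: 1000 is not a multiple of three, so the final partial group nudges the overhead slightly above one third.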

Data URLs: Base64 You See Every Day

One of the most visible uses is the data URL, which lets you inline a resource directly instead of linking to a separate file:

html
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." />

The shape is data:[mediatype][;base64],<data>. The MIME type tells the browser how to interpret the bytes, and the ;base64 flag says the payload that follows is Base64-encoded.

Data URLs are genuinely useful for small assets. A tiny icon or an SVG embedded as a data URL saves an HTTP round trip, which can be worth it for above-the-fold content. The tradeoffs:

They inflate the host document by ~33%, and a data URL cannot be cached independently of the page or stylesheet that contains it.
They are best for small, rarely-changing assets. Inlining a large hero image bloats your HTML and forces the browser to re-download it on every page load.

As a rough rule, reach for a data URL when the asset is a few kilobytes and saving a request matters; otherwise serve it as a normal cacheable file. When you need to generate or sanity-check one, a quick Base64 encoder and decoder is faster than wiring up a script.
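If you do end up scripting it, building a data URL is mostly string concatenation. The sketch below uses a made-up helper, make_data_url, and feeds it only the 8-byte PNG file signature rather than a real image, which is enough to show where the familiar iVBORw0KGgo prefix comes from.

```python
import base64

def make_data_url(raw: bytes, media_type: str) -> str:
    """Build a Base64 data URL for raw bytes of the given MIME type."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

# The fixed 8-byte signature that starts every PNG file (not a complete image).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
print(make_data_url(PNG_SIGNATURE, "image/png"))
# data:image/png;base64,iVBORw0KGgo=
```

For a real asset you would read the file's bytes and pass them in along with the matching MIME type.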

JWTs and the Base64url Variant

If you have worked with authentication, you have stared at a JSON Web Token: three chunks separated by dots.

text
eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0In0.dQw4w9WgXcQ...

The first two segments are JSON objects, the header and the payload, each encoded with a URL-safe flavor of Base64. Because standard Base64 uses + and /, which have special meaning in URLs, and =, which often needs escaping, the base64url variant swaps + for - and / for _, and, as used in JWTs, drops the padding entirely. Same algorithm, friendlier alphabet for query strings and headers.
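The difference between the two alphabets is easiest to see on bytes chosen to hit the last two alphabet positions. A small sketch with Python's standard library; note that urlsafe_b64encode still emits padding, so stripping it (as JWTs do) is a separate step:

```python
import base64

# Two bytes whose 6-bit chunks land on alphabet indexes 62 and 63.
raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
unpadded = url_safe.rstrip("=")  # JWT-style: padding removed

print(standard)  # +/8=
print(url_safe)  # -_8=
print(unpadded)  # -_8
```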

The single most important thing to understand here: the header and payload of a JWT are encoded, not encrypted. Anyone holding the token can decode those segments and read every claim inside in plain text. The third segment is a signature that proves the token was not tampered with and was issued by someone holding the secret, but it does nothing to hide the contents.

This trips up developers constantly. Never put anything you would not want the client to see inside a standard JWT payload. To inspect what a token actually contains, decode it with a JWT decoder and you will see the claims in the clear, which is exactly the point being made.
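You do not even need a dedicated tool to prove it. The sketch below decodes the first two segments of the sample token shown earlier using only the standard library; no key or secret is involved. The decode_segment helper is a name invented for this example, and it restores the padding that JWTs strip before decoding.

```python
import base64
import json

def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment into a dict. Does NOT verify the signature."""
    # Restore the stripped padding so the length is a multiple of four.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

header_b64 = "eyJhbGciOiJIUzI1NiJ9"
payload_b64 = "eyJzdWIiOiIxMjM0In0"

print(decode_segment(header_b64))   # {'alg': 'HS256'}
print(decode_segment(payload_b64))  # {'sub': '1234'}
```

Reading the claims this way tells you nothing about whether the token is genuine; checking the signature is a separate step that does require the key.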

When NOT to Use Base64

The most common abuse of Base64 comes from confusing "unreadable to me" with "secure."

It is not encryption

Base64 has no key and no secret. The transformation is fully public and trivially reversible by anyone, including with a one-line command:

bash
echo "cGFzc3dvcmQxMjM=" | base64 --decode
# password123

Encoding a password, an API key, or a token in Base64 and storing or transmitting it provides exactly zero confidentiality. It only makes the value slightly less obvious at a glance. If you need secrecy, use real encryption with a managed key. If you need to store passwords, use a purpose-built password hash. Base64 belongs in neither workflow.

It is not a checksum or integrity guarantee

Base64 will happily encode corrupted bytes and decode them right back to the same corrupted bytes. It tells you nothing about whether the data is intact. For integrity you want a hash or a signature.
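A quick way to convince yourself, sketched with the standard library: Base64 round-trips a damaged message without complaint, while a SHA-256 digest of the original no longer matches. The sample messages and the same_digest helper are invented for this example.

```python
import base64
import hashlib

def same_digest(a: bytes, b: bytes) -> bool:
    """Compare two byte strings by their SHA-256 digests."""
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()

original = b"important payload"
corrupted = b"importent payload"  # one byte changed somewhere upstream

# Base64 encodes and decodes the damaged bytes with no error at all.
round_tripped = base64.b64decode(base64.b64encode(corrupted))
print(round_tripped == corrupted)            # True
print(same_digest(original, round_tripped))  # False: the hash catches the change
```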

It is wasteful when the channel is already binary-safe

If you are writing bytes to a file, a binary database column, or a protocol that handles raw bytes cleanly, encoding to Base64 first just adds 33% and a pointless round trip. Use it at the boundary where text-only transport forces your hand, and not a layer earlier.

A Practical Mental Model

Keep three questions in mind:

Am I moving bytes through a text-only channel? If yes, Base64 is the right tool. If no, you probably do not need it.
Do I need this to be secret? Base64 never helps here. Reach for encryption.
Do I care about size? Then remember the 33% tax and consider compressing the raw bytes instead.

Base64 is a small, honest tool that does exactly one job well: making binary survive a journey designed for text. Use it for that, keep it away from anything security-shaped, and it will serve you reliably for the rest of your career.
