MD5, SHA-1, SHA-256: Which Hash Should You Use?

"Should I use MD5 or SHA-256?" is the wrong question if you don't first ask what the hash is for. The same algorithm can be a perfectly reasonable choice for one job and a security disaster for another.

This guide walks through the three hashes developers reach for most, where each is still acceptable, where they're dangerous, and what you should actually use for the two cases people get wrong: integrity verification and password storage.

What a hash function actually does

A cryptographic hash takes input of any size and produces a fixed-length digest. Feed it one byte or a 4 GB file, you get back the same-length string. The same input always produces the same output, and (in theory) you can't run the function backwards to recover the input.
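As a quick illustration of the fixed-length and deterministic behaviour described above, here is a minimal sketch using Python's standard `hashlib` module. The helper name `digest_bits` is made up for this example.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits for a hashlib algorithm name."""
    return len(hashlib.new(algorithm, data).digest()) * 8


tiny = b"x"                # one byte
large = b"x" * 10_000_000  # roughly 10 MB

for algorithm in ("md5", "sha1", "sha256", "sha512"):
    # Input size changes, digest length does not.
    same_length = digest_bits(algorithm, tiny) == digest_bits(algorithm, large)
    print(algorithm, digest_bits(algorithm, tiny), "bits, same for both inputs:", same_length)

# Determinism: hashing the same bytes twice gives the same digest.
first = hashlib.sha256(b"same input").hexdigest()
second = hashlib.sha256(b"same input").hexdigest()
print("deterministic:", first == second)
```

The digest length depends only on the algorithm, never on the input.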

Three properties matter for security:

Preimage resistance: given a digest, it is computationally infeasible to find an input that produces it.
Second-preimage resistance: given an input, it is computationally infeasible to find a different input with the same digest.
Collision resistance: it is computationally infeasible to find any two distinct inputs that hash to the same value.

When people say an algorithm is "broken," they almost always mean collision resistance has fallen. That distinction is the whole story of MD5 and SHA-1, so keep it in mind.
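To get a feel for why collision resistance tends to fall first, the sketch below uses a deliberately weakened toy hash: SHA-256 truncated to 16 bits. This is an assumption made purely for demonstration, not how MD5 or SHA-1 were actually attacked, and the function names are invented for this example. Finding any colliding pair takes far fewer attempts than matching one fixed input, because of the birthday effect.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Toy hash for demonstration only: SHA-256 truncated to nbytes."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_collision():
    """Birthday search: find ANY two distinct inputs with the same tiny_hash."""
    seen = {}
    for i in count():
        msg = f"msg-{i}".encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1  # the pair, plus attempts used
        seen[h] = msg


def find_second_preimage(target: bytes):
    """Find a different input matching ONE fixed input's tiny_hash."""
    goal = tiny_hash(target)
    for i in count():
        msg = f"try-{i}".encode()
        if msg != target and tiny_hash(msg) == goal:
            return msg, i + 1  # the match, plus attempts used


a, b, collision_tries = find_collision()
match, preimage_tries = find_second_preimage(b"invoice.pdf")
print("collision found after", collision_tries, "attempts")
print("second preimage found after", preimage_tries, "attempts")
```

With a 16-bit digest, a collision is expected after a few hundred attempts, while a second preimage needs on the order of 65,536. Real attacks on MD5 and SHA-1 exploit structural weaknesses rather than brute force, but the same asymmetry explains why a hash can lose collision resistance long before preimage resistance.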

MD5: fast, ubiquitous, and broken

MD5 produces a 128-bit (16-byte) digest. It's everywhere — config files, legacy databases, ETags, old download pages — because it's fast and was the default for years.

It is also thoroughly broken for collision resistance. Practical collision attacks have existed since the mid-2000s, and they're not academic. In 2008, researchers used an MD5 chosen-prefix collision to obtain a rogue certificate authority certificate, and the Flame malware (discovered in 2012) abused an MD5 collision to forge a Microsoft code-signing certificate. You can generate two different files with the same MD5 on a laptop in seconds.

What MD5 collisions let an attacker do is craft two files with the same digest. So MD5 fails any time a digest is meant to prove "this is the exact content I signed off on":

Digital signatures
Certificate fingerprints
Deduplication where a malicious party controls the inputs
Any "this file matches the trusted one" check against an adversary

text
# Illustrative only: two different files crafted to share one MD5 digest
md5sum legit.bin malicious.bin
<same 32-hex-character digest>  legit.bin
<same 32-hex-character digest>  malicious.bin

Note what is not on the broken list: preimage resistance. Nobody has demonstrated a practical way to take an arbitrary MD5 digest and reverse it to the original input. That's why MD5 still has a narrow, legitimate use — covered below — but never trust it where an attacker gets to pick the inputs.

SHA-1: deprecated, and don't start new projects with it

SHA-1 produces a 160-bit digest and was the workhorse for Git, TLS certificates, and signing for a long time. It's stronger than MD5, but it's also fallen.

The 2017 SHAttered attack produced the first public SHA-1 collision (two PDFs with the same digest). A 2020 follow-up demonstrated a practical chosen-prefix collision, which is the more dangerous kind: it lets an attacker control meaningful structure in both colliding files, the property needed to forge certificates and signatures. The cost has only dropped since.

The industry response was decisive. Major browsers stopped trusting SHA-1 certificates in 2017, and publicly trusted certificate authorities no longer issue them. SHA-1 is deprecated for all signature use.

Git is the nuance people cite. Git uses SHA-1 for object identity, and a SHA-1 collision is a theoretical concern for it — but Git ships collision-detection hardening that rejects known attack patterns, and the project is migrating to SHA-256. Git's use is content-addressing among (mostly) trusted collaborators, not a signature against an adversary, which is why it hasn't been an active disaster. Still: do not pick SHA-1 for anything new.
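For readers curious what "object identity" means in practice, here is a small sketch of how Git derives a blob ID in a SHA-1 repository: it hashes a short header plus the file content. The function name `git_blob_id` is made up for this example, and SHA-256 repositories use a different hash.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob in a SHA-1 repository."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID that shows up in many repositories.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

You can compare the result with `git hash-object <file>` on a real file. Because the ID is derived purely from content, two files that collide under SHA-1 would be indistinguishable to Git, which is why the hardening and the SHA-256 migration exist.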

Where MD5 and SHA-1 are still fine

Here's the part security checklists tend to flatten into "never use MD5," which is too blunt.

When the threat model has no adversary choosing the input — you're guarding against accidental corruption, not malicious substitution — a fast hash is fine. Examples:

Detecting bit rot in your own backups
A non-cryptographic cache key or shard key
Verifying a file survived a flaky network copy you control end to end
Spotting accidental duplicate files in a personal photo library

The distinction is integrity against accidents vs. integrity against attackers. MD5 catches a flipped bit just as well as SHA-256, because random corruption isn't going to engineer a collision. The moment an attacker can supply or tamper with a file, MD5 and SHA-1 are off the table — switch to a SHA-2 family hash.
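The accidental-corruption case can be sketched in a few lines. The example below simulates bit rot by flipping a single bit in a buffer and shows that an MD5 checksum notices it. The helpers `checksum` and `flip_bit` are hypothetical names, and the `usedforsecurity=False` flag assumes Python 3.9 or newer.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Non-security checksum: fine for accidents, not for adversaries."""
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate bit rot by flipping one bit of the input."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


original = b"backup block " * 1000
recorded = checksum(original)        # stored alongside the backup

damaged = flip_bit(original, 4242)   # one bit changes somewhere in storage
print("intact copy passes: ", checksum(original) == recorded)
print("damaged copy passes:", checksum(damaged) == recorded)
```

Random damage like this has no realistic chance of producing a matching digest. A deliberately crafted file is a different matter, which is exactly where MD5 stops being acceptable.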

If you just need to fingerprint a string or file and check it later, you can compute any of these digests with our hash generator — it runs entirely in your browser, so the data never leaves your machine.

SHA-256 and SHA-512: the current default

For anything security-relevant where you need a plain cryptographic hash, reach for the SHA-2 family — almost always SHA-256.

SHA-256 produces a 256-bit digest and has no known practical collision or preimage attacks. It's the default for:

File and release integrity against tampering (sha256sum)
Certificate fingerprints and modern signatures
Content addressing where inputs may be hostile
HMACs for message authentication (HMAC-SHA256)
Blockchain and Merkle-tree constructions

text
sha256sum app-release.aab
9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08  app-release.aab
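Two of the uses listed above, checking a release against a published digest and authenticating a message with HMAC-SHA256, can be sketched with Python's standard library. The function names here are hypothetical, and a real release check would also need the published digest to arrive over a channel the attacker cannot modify.

```python
import hashlib
import hmac


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Compare the SHA-256 of downloaded bytes with a digest from a trusted source."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, published_hex.lower())


def sign(key: bytes, message: bytes) -> str:
    """Produce an HMAC-SHA256 tag for a message under a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag in constant time to avoid leaking how many characters matched."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared-secret-for-demo-only"
tag = sign(key, b"amount=100&to=alice")
print(verify(key, b"amount=100&to=alice", tag))    # untouched message
print(verify(key, b"amount=900&to=mallory", tag))  # tampered message
```

A plain SHA-256 digest only proves integrity if the digest itself is trustworthy. HMAC adds a secret key, so someone who can alter the message cannot recompute a valid tag.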

When to choose SHA-512 instead: it produces a 512-bit digest and, perhaps counterintuitively, is often faster than SHA-256 on 64-bit hardware because its internal operations are 64-bit native. The main exception is CPUs with dedicated SHA-256 instructions (such as the Intel SHA extensions or the ARMv8 cryptography extensions), where SHA-256 usually wins, so measure on your own hardware. If you're hashing large volumes on a modern server and want more digest length, SHA-512 (or its truncated variant SHA-512/256) is a fine pick. On 32-bit or constrained devices, SHA-256 is usually the better performer. For most applications the security difference is irrelevant; pick based on platform and whether you need the longer digest.
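Since the relative speed depends on the CPU, it may be worth measuring rather than guessing. The following is a rough benchmarking sketch, not a rigorous one: the helper name `seconds_per_run` and the 8 MB payload size are arbitrary choices for illustration.

```python
import hashlib
import timeit


def seconds_per_run(algorithm: str, data: bytes, repeats: int = 5) -> float:
    """Best-of-N wall-clock time to hash data once with the named algorithm."""
    hasher = getattr(hashlib, algorithm)
    return min(timeit.repeat(lambda: hasher(data).digest(), number=1, repeat=repeats))


if __name__ == "__main__":
    payload = bytes(8 * 1024 * 1024)  # 8 MB of zero bytes
    for algorithm in ("sha256", "sha512"):
        elapsed = seconds_per_run(algorithm, payload)
        print(f"{algorithm}: {len(payload) / elapsed / 1e6:.0f} MB/s")
```

Results vary between machines and Python builds, so treat the numbers as a hint about your platform rather than a general ranking.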

SHA-3 (Keccak) exists and is standardized. It's a structurally different design that was selected as a hedge in case SHA-2 ever falls. It's a perfectly good choice, but SHA-2 remains unbroken and far more widely supported, so there's no urgency to migrate. Use SHA-3 if you have a specific reason; otherwise SHA-2 is the safe default.

The big one: never hash passwords with any of these

Here's the mistake that shows up in breach after breach. SHA-256 is the wrong tool for storing passwords — not because it's weak, but because it's fast. That speed is exactly the problem.

When an attacker dumps your user table, they want to brute-force the original passwords. A fast hash helps them. Modern GPUs compute billions of SHA-256 hashes per second. Salt your SHA-256 (you must) and you defeat precomputed rainbow tables — but a salt does nothing against an attacker hashing a wordlist against each stolen digest. Against a fast hash, common passwords fall in seconds.
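To make the point about salts concrete, here is a toy sketch of the attack described above. It assumes the attacker has the stolen digest and its salt (both sit in the same dumped table) and simply hashes a wordlist against them. The function names and the five-word list are made up for illustration.

```python
import hashlib
import os


def salted_sha256(password: str, salt: bytes) -> str:
    """The WRONG way to store a password: one fast hash, even with a salt."""
    return hashlib.sha256(salt + password.encode()).hexdigest()


def crack(stolen_digest: str, salt: bytes, wordlist):
    """Try each candidate password against one stolen (salt, digest) pair."""
    for guess in wordlist:
        if salted_sha256(guess, salt) == stolen_digest:
            return guess
    return None


# What the attacker finds in the dumped user table.
salt = os.urandom(16)
stolen = salted_sha256("letmein", salt)

wordlist = ["123456", "password", "qwerty", "letmein", "dragon"]
print("recovered:", crack(stolen, salt, wordlist))
```

The salt forces the attacker to repeat the work per user, but each guess still costs a single SHA-256 call. Real wordlists hold millions of entries and dedicated cracking hardware runs through them very quickly.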

The fix is a password hashing function designed to be deliberately slow and tunable:

Argon2 (specifically Argon2id): the modern recommendation. It's memory-hard, meaning it requires a large amount of RAM per hash, which blunts the GPU and ASIC advantage attackers rely on. Tune memory, iterations, and parallelism.
scrypt: also memory-hard, a solid choice and widely available.
bcrypt: older, battle-tested, and still acceptable. Its main caveat is a 72-byte input limit; pre-hash long inputs if needed, and encode the pre-hash (for example as base64) so it contains no null bytes. Tune the cost factor so a single hash takes a meaningful fraction of a second.

The key idea: you pick a work factor so that one hash takes, say, 100 to 250 ms on your server. Users never notice the delay on a single login. An attacker who needs to try billions of guesses now faces a wall of compute time and memory that makes large-scale cracking economically impractical.
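Picking the work factor is usually done by measuring on the target hardware. The sketch below shows one way to do that with the standard library's `hashlib.scrypt`, assuming your Python build exposes it (it needs OpenSSL with scrypt support). The helper names, starting cost, and memory cap are illustrative assumptions, not recommended production values.

```python
import hashlib
import os
import time


def time_scrypt(n: int, r: int = 8, p: int = 1) -> float:
    """Time a single scrypt hash at cost n (memory use is roughly 128 * n * r bytes)."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.scrypt(b"benchmark password", salt=salt, n=n, r=r, p=p,
                   maxmem=2**29, dklen=32)
    return time.perf_counter() - start


def calibrate(target_seconds: float = 0.1, start_n: int = 2**12, max_n: int = 2**18) -> int:
    """Double the cost until one hash takes at least target_seconds (or max_n is reached)."""
    n = start_n
    while n < max_n and time_scrypt(n) < target_seconds:
        n *= 2
    return n


if __name__ == "__main__":
    n = calibrate(target_seconds=0.1)
    print(f"chosen n = {n}, one hash takes about {time_scrypt(n) * 1000:.0f} ms")
```

Run the calibration on hardware that matches production, and revisit it periodically, since the same parameters get cheaper to attack as hardware improves.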

python
import hashlib

from argon2 import PasswordHasher  # third-party package: argon2-cffi

# Right tool for passwords (Argon2id)
ph = PasswordHasher()  # memory-hard, tunable, salt handled for you
digest = ph.hash("correct horse battery staple")
ph.verify(digest, "correct horse battery staple")  # returns True, raises VerifyMismatchError on a wrong password

# WRONG for passwords: fast, GPU-friendly, crackable at scale
hashlib.sha256(b"correct horse battery staple").hexdigest()

A good password library generates a random salt per user and embeds it (plus the parameters) in the output string, so you store one value and don't manage salts manually. Don't roll your own salting and stretching on top of SHA-256 — use a purpose-built password hasher.

Quick decision guide

Checking a file you control didn't get corrupted in transit → any hash works; MD5/SHA-1 are fine.
Verifying a download or release against tampering → SHA-256.
Signatures, certificates, content addressing against adversaries → SHA-256 (or SHA-512).
Message authentication → HMAC-SHA256.
Storing passwords → Argon2id, scrypt, or bcrypt. Never a plain hash.
Anything new where you're unsure → SHA-256 for hashing, Argon2id for passwords. You'll rarely be wrong.

The one principle that covers all of it: match the algorithm to the threat model, not to habit. MD5's collision break doesn't matter when there's no attacker, and SHA-256's strength doesn't save you when speed is the vulnerability. Ask what you're defending against first, and the right hash becomes obvious.

Need to compute a digest right now to verify a file or compare a fingerprint? The Cosmovex hash generator does MD5, SHA-1, SHA-256, and SHA-512 locally in your browser — nothing you paste in is ever uploaded.
