Regular expressions look like line noise until the moment they save you an afternoon of fragile string-slicing code. The good news is that the syntax is small and mostly composable, so once you understand the handful of building blocks you can read and write almost any pattern. This guide builds regex up piece by piece using the JavaScript flavor, then assembles a few patterns you'd actually use.

What a regex actually is

A regular expression is a small program that describes a set of strings. You hand it some input text, and it answers questions: does this match? where? what did it capture? In JavaScript you write one with slashes:

```js
const re = /cat/;
re.test("the cat sat"); // true
re.test("dog");         // false
```

The pattern `/cat/` matches the literal three characters `c`, `a`, `t` appearing in sequence anywhere in the string. That's the simplest possible regex, and it's worth internalizing: by default a regex looks for its pattern somewhere inside the input, not for the whole string to equal it.

You can also build one from a string with the `RegExp` constructor, which matters when the pattern is dynamic:

```js
const word = "cat";
const re = new RegExp(word); // same as /cat/
```

With the constructor you have to double-escape backslashes (`"\\d"` instead of `\d`), so prefer the literal `/.../` form whenever the pattern is known at write time.
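A minimal sketch of the dynamic case, assuming the pattern text comes from user input. The helper name `escapeRegExp` and the sample strings are illustrative assumptions, not built-ins; the helper simply puts a backslash in front of every character that has a special meaning in a pattern.

```js
// Hypothetical helper (not a built-in): backslash-escape regex
// metacharacters so arbitrary text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// The same pattern written both ways; note the doubled backslash.
const fromLiteral = /\d+/;
const fromString = new RegExp("\\d+");
console.log(fromLiteral.source === fromString.source); // true

// Assumed sample input that happens to contain a metacharacter.
const userInput = "1+1";
const naive = new RegExp(userInput);                 // "+" acts as a quantifier
const literal = new RegExp(escapeRegExp(userInput)); // "+" is a plain plus sign

console.log(naive.test("1+1"));   // false
console.log(naive.test("11"));    // true
console.log(literal.test("1+1")); // true
console.log(literal.test("11"));  // false
```

Without the escaping step, input such as `"C++"` would not even compile as a pattern, so it is worth escaping any text you did not write yourself.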

Literals and metacharacters

Most characters in a pattern match themselves. The exceptions are metacharacters, which have special meaning: `. ^ $ * + ? ( ) [ ] { } | \`. When you want a literal version of one of these, escape it with a backslash. A literal dot is `\.`, a literal plus is `\+`.

Forgetting to escape is the single most common beginner bug. The pattern `/3.14/` does not match only `3.14`: the `.` matches any character, so it also matches `3x14` and `3 14`. To match a literal period you need `/3\.14/`.
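A quick way to see the difference is to run both patterns over the same inputs. This is only a sketch; the variable names and the sample list are made up for illustration.

```js
const samples = ["3.14", "3x14", "3 14"];
const unescaped = /3.14/; // "." matches any character
const escaped = /3\.14/;  // "\." matches only a literal period

for (const s of samples) {
  console.log(JSON.stringify(s), unescaped.test(s), escaped.test(s));
}
// "3.14" true true
// "3x14" true false
// "3 14" true false
```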

Character classes

A character class, written with square brackets, matches exactly one character from a set:

```js
/[aeiou]/  // any single vowel
/[0-9]/    // any single digit (range)
/[a-fA-F]/ // a hex letter, either case
```

Ranges use a hyphen. You can combine ranges and individual characters in one class: `[a-z0-9_]` matches one lowercase letter, digit, or underscore.

A caret as the first character inside the brackets negates the class:

```js
/[^0-9]/ // any single character that is NOT a digit
```

Because some classes are so common, regex provides shorthands:

  • `\d`: a digit, same as `[0-9]`
  • `\w`: a "word" character, `[A-Za-z0-9_]`
  • `\s`: whitespace (space, tab, newline, and more)
  • `\D`, `\W`, `\S`: the negated versions of each

And `.` is the broadest of all: any character except a line terminator such as a newline (unless you turn on the `s` flag, covered below).

A subtle point: inside a character class, most metacharacters lose their power. `[.+*]` matches a literal dot, plus, or asterisk, with no escaping needed. The characters you still need to be careful with inside a class are `]`, `\`, `^` (at the start), and `-` (between two characters).
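The class rules above can be checked with a handful of one-character inputs. This is a small sketch with made-up test strings, not an exhaustive list of the escaping rules.

```js
// Inside a class, ".", "+", and "*" are literal characters.
console.log(/[.+*]/.test("+")); // true
console.log(/[.+*]/.test("a")); // false

// A hyphen between two characters forms a range; escaped, it is literal.
console.log(/[a-z]/.test("m"));  // true
console.log(/[a\-z]/.test("m")); // false
console.log(/[a\-z]/.test("-")); // true

// A leading "^", a "]", and a "\" are taken literally once escaped.
const special = /[\^\]\\]/;
console.log(special.test("^")); // true
console.log(special.test("]")); // true
console.log(special.test("a")); // false

// Negated shorthands match the complement of their lowercase versions.
console.log(/\D/.test("7")); // false
console.log(/\S/.test(" ")); // false
```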

Quantifiers

Quantifiers say how many of the preceding element to match:

  • `*`: zero or more
  • `+`: one or more
  • `?`: zero or one (i.e. optional)
  • `{n}`: exactly n
  • `{n,}`: n or more
  • `{n,m}`: between n and m, inclusive

Examples:

```js
/colou?r/     // matches "color" and "colour"
/\d{3}-\d{4}/ // 123-4567
/a{2,4}/      // "aa", "aaa", or "aaaa"
/\w+/         // one or more word characters
```

Quantifiers attach to whatever immediately precedes them: a single character, a character class, or a group.
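To make the "immediately precedes" rule concrete, here is a short sketch that applies `+` to a single character, a group, and a character class. The input strings and variable names are arbitrary examples.

```js
// "+" binds only to the "b" that precedes it.
const singleChar = "abbb".match(/ab+/)[0];
const stopsEarly = "ababab".match(/ab+/)[0];

// Wrapped in a group, the whole "ab" sequence repeats.
const wholeGroup = "ababab".match(/(ab)+/)[0];

// Applied to a class, any mix of the listed characters repeats.
const anyOfClass = "abba".match(/[ab]+/)[0];

console.log(singleChar); // abbb
console.log(stopsEarly); // ab
console.log(wholeGroup); // ababab
console.log(anyOfClass); // abba
```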

Greedy vs. lazy

By default quantifiers are greedy — they grab as much as they can while still allowing the overall pattern to match. This trips people up constantly. Consider extracting the contents of an HTML tag:

```js
"<b>one</b><b>two</b>".match(/<b>(.*)<\/b>/)[1];
// "one</b><b>two"
```

The `.*` ate everything up to the last `</b>`. Add a `?` after a quantifier to make it lazy, matching as little as possible:

```js
"<b>one</b><b>two</b>".match(/<b>(.*?)<\/b>/)[1];
// "one"
```

(For real HTML, use a DOM parser — but lazy quantifiers are exactly the right tool for many small text-extraction jobs.)

Anchors and boundaries

Anchors don't match characters; they match positions.

  • `^`: start of the string (or start of a line, with the `m` flag)
  • `$`: end of the string (or end of a line, with `m`)
  • `\b`: a word boundary, the edge between a `\w` and a non-`\w` character

Anchors are how you require a pattern to span the whole input rather than just appearing within it:

```js
/^\d+$/.test("42");   // true: the whole string is digits
/^\d+$/.test("42px"); // false
```

Word boundaries let you match whole words. `/\bcat\b/` matches `cat` in "the cat sat" but not in "category" or "scatter":

```js
/\bcat\b/.test("category"); // false
/\bcat\b/.test("the cat");  // true
```
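Because anchors match positions, it can help to look at where a match lands and how the `m` flag moves `^` and `$`. A small sketch with invented sample text:

```js
// Three lines of text in one string.
const text = "alpha\nbeta\ngamma";
console.log(/^beta$/.test(text));  // false: ^ and $ mean the string's ends
console.log(/^beta$/m.test(text)); // true: with m they also match at line breaks

// search() returns the index where the first match starts.
const sentence = "scatter cat";
console.log(sentence.search(/cat/));     // 1 (inside "scatter")
console.log(sentence.search(/\bcat\b/)); // 8 (the standalone word)
```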

Groups and capturing

Parentheses do two jobs at once: they group a sub-pattern so a quantifier can apply to the whole thing, and they capture the matched text for later use.

```js
/(ab)+/ // one or more repetitions of "ab"
```

When a regex with capturing groups matches, you get the captured substrings back:

```js
const m = "2026-06-02".match(/(\d{4})-(\d{2})-(\d{2})/);
m[1]; // "2026" (first group)
m[2]; // "06"
m[3]; // "02"
```
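One thing the snippet above glosses over is that `match` returns `null` when nothing matches, so indexing the result directly can throw. The sketch below wraps the same pattern in a hypothetical `parseDate` helper (the name and return shape are assumptions for illustration) and also shows captures being reused in a replacement string.

```js
const DATE = /(\d{4})-(\d{2})-(\d{2})/;

// Hypothetical helper: returns null instead of throwing on non-matching input.
function parseDate(input) {
  const match = input.match(DATE);
  if (match === null) return null;
  const [, year, month, day] = match; // index 0 is the whole match, so skip it
  return { year, month, day };
}

console.log(parseDate("2026-06-02")); // { year: '2026', month: '06', day: '02' }
console.log(parseDate("tomorrow"));   // null

// Captures can also be referenced as $1, $2, $3 in a replacement string.
console.log("2026-06-02".replace(DATE, "$3/$2/$1")); // 02/06/2026
```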

Named groups

Numbered groups get unreadable fast. Name them with `(?<name>...)` and read them off the `groups` object:

```js
const m = "2026-06-02".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
m.groups.year;  // "2026"
m.groups.month; // "06"
```

Non-capturing groups

If you only need grouping for a quantifier and don't care about capturing, use `(?:...)`. The engine has nothing to store for it, which can make it marginally cheaper, and it keeps your capture numbering clean:

```js
/(?:https?:\/\/)?example\.com/ // the protocol is optional, but not captured
```
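The effect on numbering is easiest to see side by side. In this sketch an extra `(example)` capture is added purely for illustration, so you can watch its index shift.

```js
// With a capturing protocol group, "example" is group 2.
const capturing = "https://example.com".match(/(https?:\/\/)?(example)\.com/);
console.log(capturing[1]); // https://
console.log(capturing[2]); // example

// With a non-capturing protocol group, "example" becomes group 1.
const nonCapturing = "https://example.com".match(/(?:https?:\/\/)?(example)\.com/);
console.log(nonCapturing[1]); // example

// An optional capturing group that did not take part in the match is undefined.
const bare = "example.com".match(/(https?:\/\/)?(example)\.com/);
console.log(bare[1]); // undefined
console.log(bare[2]); // example
```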

Alternation

The pipe `|` means "or" and has very low precedence, so it splits the entire pattern unless you contain it in a group. This distinction matters:

```js
/^cat|dog$/   // ^cat OR dog$, probably not what you meant
/^(cat|dog)$/ // exactly "cat" or exactly "dog"
```

Almost always you want alternation wrapped in a group so the surrounding anchors and quantifiers apply to both branches.
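Running both forms over a few inputs shows what the ungrouped version lets through. The word list and variable names here are just sample data.

```js
const ungrouped = /^cat|dog$/;
const grouped = /^(cat|dog)$/;

for (const word of ["cat", "dog", "catalog", "hotdog"]) {
  console.log(word, ungrouped.test(word), grouped.test(word));
}
// cat true true
// dog true true
// catalog true false   (starts with "cat")
// hotdog true false    (ends with "dog")
```

If you don't need the captured branch, `(?:cat|dog)` gives the same grouping without adding a capture.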

Flags

Flags go after the closing slash and change how the whole pattern behaves:

  • `g` (global): find all matches, not just the first
  • `i`: case-insensitive
  • `m` (multiline): `^` and `$` match at line breaks, not just string ends
  • `s` (dotall): `.` also matches newlines
  • `u` (unicode): correct handling of code points beyond the basic range
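A short sketch of what several of these flags change in practice. The sample strings are arbitrary, and the emoji is written as an escape so the example does not depend on file encoding.

```js
// i: letter case is ignored.
console.log(/hello/i.test("HeLLo")); // true

// g: match() returns every match instead of only the first.
console.log("a1b2".match(/\d/)[0]); // 1
console.log("a1b2".match(/\d/g));   // [ '1', '2' ]

// s: the dot is allowed to match a newline.
console.log(/a.b/.test("a\nb"));  // false
console.log(/a.b/s.test("a\nb")); // true

// u: a character outside the basic range is treated as one unit.
const emoji = "\u{1F600}";
console.log(emoji.length);        // 2 (two UTF-16 code units)
console.log(/^.$/.test(emoji));   // false
console.log(/^.$/u.test(emoji));  // true
```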

The `g` flag changes which methods are useful. To collect every match with its groups, `matchAll` is the clean modern option:

```js
const text = "a1 b2 c3";
for (const m of text.matchAll(/([a-z])(\d)/g)) {
  console.log(m[1], m[2]); // a 1, then b 2, then c 3
}
```

One gotcha worth knowing: a `RegExp` object with the `g` (or `y`) flag is stateful. It remembers a `lastIndex` between calls to `.test()` and `.exec()`. Reusing the same global regex across calls can give surprising alternating true/false results. If you don't need that statefulness, don't add `g`, or create a fresh regex each time.
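The alternating behavior is easy to reproduce. In this sketch the variable names are illustrative; the point is only how `lastIndex` moves between calls.

```js
const globalRe = /a/g;
const results = [];

// Each call resumes from lastIndex; a failed search resets it to 0.
for (let i = 0; i < 3; i++) {
  const matched = globalRe.test("a");
  results.push([matched, globalRe.lastIndex]);
}
console.log(results); // [ [ true, 1 ], [ false, 0 ], [ true, 1 ] ]

// Without the g flag there is no carried-over state.
const plainRe = /a/;
console.log(plainRe.test("a"), plainRe.test("a"), plainRe.test("a")); // true true true
```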

Putting it together

With the pieces in hand, here are a few patterns built from what we've covered.

A hex color, a `#` followed by exactly three or six hex digits:

```js
/^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/
```
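As a sanity check, here is the pattern run against inputs that should and should not pass. The constant name `HEX_COLOR` and the sample values are assumptions for the example.

```js
const HEX_COLOR = /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/;

const accepted = ["#fff", "#A1B2C3"];
const rejected = ["#ffff", "fff", "#ggg"]; // four digits, missing "#", non-hex letters

console.log(accepted.map(s => HEX_COLOR.test(s))); // [ true, true ]
console.log(rejected.map(s => HEX_COLOR.test(s))); // [ false, false, false ]
```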

A simple time in 24-hour form: hours `00` to `23`, minutes `00` to `59`. This is where regex's character-by-character thinking shows; you constrain each digit position rather than parsing a number:

```js
/^([01]\d|2[0-3]):[0-5]\d$/
```
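Once the regex has validated the shape, ordinary code can do the arithmetic. The sketch below uses a variant of the pattern with named groups; the `toMinutes` helper and the group names are hypothetical, added only to show the hand-off.

```js
// Variant of the pattern above with named groups for both parts.
const TIME = /^(?<hours>[01]\d|2[0-3]):(?<minutes>[0-5]\d)$/;

// Hypothetical helper: minutes since midnight, or null for invalid input.
function toMinutes(input) {
  const match = TIME.exec(input);
  if (match === null) return null;
  return Number(match.groups.hours) * 60 + Number(match.groups.minutes);
}

console.log(toMinutes("09:30")); // 570
console.log(toMinutes("23:59")); // 1439
console.log(toMinutes("24:00")); // null (hour out of range)
console.log(toMinutes("9:30"));  // null (hour needs two digits)
console.log(toMinutes("12:60")); // null (minute out of range)
```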

Pulling key-value pairs from a query string fragment:

```js
const q = "name=ada&lang=js&year=2026";
const pairs = [...q.matchAll(/(?<key>\w+)=(?<val>\w+)/g)]
  .map(m => [m.groups.key, m.groups.val]);
// [["name","ada"],["lang","js"],["year","2026"]]
```

A word of honesty about email: any regex that tries to cover the full address grammar is enormous, and the practical ones still reject some valid addresses. For real validation, check for a single `@` with something on each side and confirm by sending a verification message. A pragmatic sanity check is fine; just don't pretend a regex proves an address exists:

```js
/^[^\s@]+@[^\s@]+\.[^\s@]+$/
```
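To see what this check does and does not catch, here is a small sketch. The constant name `EMAIL_SHAPE` and the addresses are invented for the example; the last one uses the reserved `.invalid` domain, so it can never be delivered even though it has the right shape.

```js
const EMAIL_SHAPE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

console.log(EMAIL_SHAPE.test("ada@example.com"));          // true
console.log(EMAIL_SHAPE.test("ada@example"));              // false (no dot after the @)
console.log(EMAIL_SHAPE.test("ada lovelace@example.com")); // false (contains a space)
console.log(EMAIL_SHAPE.test("ada@@example.com"));         // false (two @ signs)

// Well-formed is not the same as deliverable.
console.log(EMAIL_SHAPE.test("nobody@nowhere.invalid"));   // true
```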

Habits that keep regex maintainable

Regex rewards a few disciplines. Anchor with `^` and `$` whenever you mean "the entire string," or you'll match unexpected substrings. Reach for lazy quantifiers when you're extracting between delimiters. Name your groups once a pattern has more than one. And when a pattern grows past a line or two, that's usually a signal to split the work between a simpler regex and ordinary code: readable beats clever.

Finally, never ship a pattern you haven't run against real inputs, including the messy edge cases and the strings that should not match. Building the pattern incrementally and testing as you go beats staring at a wall of metacharacters. You can prototype against your own sample text in the free Cosmovex regex tester, which highlights matches and capture groups live so you can see exactly what each part of the pattern is doing before it reaches production. Start with the literal you know matches, add one construct at a time, and confirm each step does what you expect.