Regular Expressions

Regular expressions (regex) are patterns used to match character combinations in strings. JavaScript supports them both as literals (/pattern/flags) and via the RegExp constructor. They drive string operations such as validation, extraction, and replacement.

Key Points

  • Two creation forms: /pattern/flags literal and new RegExp("pattern", "flags")
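  • Flags: g (global), i (case-insensitive), m (multiline: ^ and $ match at line breaks), s (dotAll: . also matches newlines), u (unicode), y (sticky)
  • Character classes: \d (digit), \w (word character), \s (whitespace), . (any character except line terminators, unless the s flag is set)
  • Quantifiers: + (one or more), * (zero or more), ? (zero or one), {n,m} (between n and m repetitions)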
  • Capturing groups () and named groups (?<name>...) extract matched substrings
  • Lookahead (?=...) and lookbehind (?<=...) match without consuming characters
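
The m and s flags are the easiest to confuse, so here is a minimal sketch of what each one changes. The sample string and variable names are made up for illustration.

```javascript
// Hypothetical two-line sample text
const lines = 'first line\nsecond line';

// Without m, ^ matches only at the start of the whole string
const withoutM = lines.match(/^\w+/g);
console.log(withoutM); // ['first']

// With m, ^ also matches right after each line break
const withM = lines.match(/^\w+/gm);
console.log(withM); // ['first', 'second']

// Without s, . stops at the newline, so the pattern cannot span both lines
const withoutS = /first.+second/.test(lines);
console.log(withoutS); // false

// With s (dotAll), . matches the newline as well
const withS = /first.+second/s.test(lines);
console.log(withS); // true
```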

Code Examples

Creating Regex and Basic Matching

const pattern = /hello/i;
console.log(pattern.test('Hello World')); // true

// Constructor syntax for dynamic patterns
const search = 'world';
const dynamic = new RegExp(search, 'gi');
console.log('Hello World'.match(dynamic)); // ['World']

// match() returns array of matches
const str = 'cat bat hat';
console.log(str.match(/[cbh]at/g)); // ['cat', 'bat', 'hat']

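Use the literal /.../ for static patterns and the RegExp constructor for patterns built at runtime. test() returns a boolean; match() with the g flag returns an array of all matches, or null if there are none.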
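
When the dynamic part comes from user input, any regex metacharacters in it are interpreted as pattern syntax. One common way to handle this is to escape the input first. The sketch below assumes a small helper named escapeRegExp, which is a hypothetical name and not a built-in here.

```javascript
// escapeRegExp is a hypothetical helper: it prefixes every regex
// metacharacter with a backslash so it is matched literally
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = '1.5'; // assumed sample input

// Unescaped: the dot means "any character", so '125' matches too
const unescaped = new RegExp(userInput);
console.log(unescaped.test('125')); // true

// Escaped: the dot must be a literal dot
const escaped = new RegExp(escapeRegExp(userInput));
console.log(escaped.test('125')); // false
console.log(escaped.test('1.5')); // true
```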

Character Classes and Quantifiers

const phone = '555-123-4567';
console.log(/\d{3}-\d{3}-\d{4}/.test(phone)); // true

// Lazy vs greedy
const html = '<b>bold</b> and <i>italic</i>';
console.log(html.match(/<.+>/)[0]);   // '<b>bold</b> and <i>italic</i>' (greedy)
console.log(html.match(/<.+?>/)[0]);  // '<b>' (lazy)

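Character classes match categories of characters. Quantifiers are greedy by default and match as much as possible; add ? after a quantifier for lazy (non-greedy) matching, which matches as little as possible.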
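
A short sketch of the remaining quantifiers, ? and {n,m}, plus the lazy form combined with the g flag. The patterns and sample strings are illustrative only.

```javascript
// ? makes the preceding 'u' optional
const colorPattern = /colou?r/;
console.log(colorPattern.test('color'));  // true
console.log(colorPattern.test('colour')); // true

// {4,6} allows between 4 and 6 digits; ^ and $ make it cover the whole string
const pinPattern = /^\d{4,6}$/;
console.log(pinPattern.test('123'));     // false
console.log(pinPattern.test('1234'));    // true
console.log(pinPattern.test('1234567')); // false

// Lazy quantifier with g: each tag is matched separately
const tags = '<b>bold</b> and <i>italic</i>'.match(/<.+?>/g);
console.log(tags); // ['<b>', '</b>', '<i>', '</i>']
```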

Capturing Groups and Named Groups

const date = '2024-03-15';
const match = date.match(/(\d{4})-(\d{2})-(\d{2})/);
console.log(match[1]); // '2024'

// Named groups
const named = date.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
console.log(named.groups.year);  // '2024'

// matchAll for multiple matches
const text = 'John: 30, Jane: 25';
const entries = [...text.matchAll(/(?<name>\w+): (?<age>\d+)/g)];
entries.forEach(m => console.log(m.groups.name, m.groups.age));

Groups capture substrings, and named groups make the code more readable. matchAll() returns an iterator over every match, including its groups; the regex must have the g flag.
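
Because matchAll() returns an iterator, the matches can be converted straight into plain objects. This is a sketch under assumed names (roster, people); each match also exposes the index where it was found.

```javascript
const roster = 'John: 30, Jane: 25'; // assumed sample data

// Array.from accepts an iterator plus a mapping function
const people = Array.from(
  roster.matchAll(/(?<name>\w+): (?<age>\d+)/g),
  (m) => ({
    name: m.groups.name,
    age: Number(m.groups.age), // captured text is always a string
    index: m.index,            // position of the match in the input
  })
);

console.log(people);
// [ { name: 'John', age: 30, index: 0 }, { name: 'Jane', age: 25, index: 10 } ]
```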

Lookahead, Lookbehind, and Replace

// Positive lookahead (?=...)
console.log('100px 200em'.match(/\d+(?=px)/g)); // ['100']

// Positive lookbehind (?<=...)
console.log('$100 €200'.match(/(?<=\$)\d+/g)); // ['100']

// replace() with capture group references
const name = 'Smith, John';
console.log(name.replace(/(\w+), (\w+)/, '$2 $1')); // 'John Smith'

// split() with regex
console.log('one1two2three'.split(/\d/)); // ['one', 'two', 'three']

Lookahead and lookbehind assert a condition without consuming characters. In replace(), $1, $2, and so on in the replacement string refer to the captured groups.
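
replace() also accepts named group references ($<name>) and a replacer function in place of a replacement string. A minimal sketch, with made-up sample values:

```javascript
// Named group references in the replacement string
const isoDate = '2024-03-15';
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const usDate = isoDate.replace(datePattern, '$<month>/$<day>/$<year>');
console.log(usDate); // '03/15/2024'

// Replacer function: receives the matched text and returns its replacement.
// The lookahead keeps 'px' out of the match, so only the digits are replaced.
const doubled = '10px 20px'.replace(
  /\d+(?=px)/g,
  (digits) => String(Number(digits) * 2)
);
console.log(doubled); // '20px 40px'
```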

Common Mistakes

  • Forgetting to double backslashes in the RegExp constructor: the pattern is a string, so \d must be written as new RegExp("\\d+")
  • Using a greedy quantifier (.+) where a lazy one (.+?) is needed, so the match runs to the last possible delimiter
  • Forgetting the g flag and only getting the first match
  • Not anchoring patterns with ^ and $ when validating entire strings
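
Three of these mistakes are easy to reproduce. The sketch below uses illustrative patterns and inputs to show the wrong and the corrected form side by side.

```javascript
// Anchoring: without ^ and $, a match anywhere in the string passes
const zipLoose = /\d{5}/;
const zipStrict = /^\d{5}$/;
console.log(zipLoose.test('abc12345xyz'));  // true
console.log(zipStrict.test('abc12345xyz')); // false
console.log(zipStrict.test('12345'));       // true

// g flag: without it, match() stops at the first match
const firstOnly = 'a1b2c3'.match(/\d/);
const allDigits = 'a1b2c3'.match(/\d/g);
console.log(firstOnly[0]); // '1'
console.log(allDigits);    // ['1', '2', '3']

// Constructor escaping: '\d' in a string literal is just 'd'
const wrong = new RegExp('\d+');  // equivalent to /d+/
const right = new RegExp('\\d+'); // equivalent to /\d+/
console.log(wrong.test('123')); // false
console.log(right.test('123')); // true
```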

Interview Tips

  • Know common patterns: email, digits, URL validation
  • Explain the difference between test() (returns a boolean) and match() (returns an array, or null when nothing matches)
  • Be able to write regex for basic string parsing on the spot
  • matchAll() is the modern way to iterate all matches with groups; it requires the g flag
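
The test() versus match() distinction matters most when nothing matches. A small sketch of the no-match cases, using assumed variable names:

```javascript
// test() always returns a boolean
const hasDigit = /\d/.test('abc');
console.log(hasDigit); // false

// match() returns null, not an empty array, when nothing matches
const noMatch = 'abc'.match(/\d/);
console.log(noMatch); // null

// Guard against null before reading length
const digitCount = ('abc'.match(/\d/g) ?? []).length;
console.log(digitCount); // 0

// matchAll() throws a TypeError if the regex lacks the g flag
let matchAllError = null;
try {
  'a1b2'.matchAll(/\d/);
} catch (err) {
  matchAllError = err;
}
console.log(matchAllError instanceof TypeError); // true
```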
