
Lookahead Lookbehind

Lookahead and lookbehind assertions for matching patterns based on surrounding context without consuming characters

Quick Summary
You are an expert in zero-width assertions for context-dependent pattern matching with regular expressions.

## Key Points

- Lookbehind length rules differ by engine: Python's built-in `re` requires fixed-width lookbehind, while .NET, the third-party Python `regex` module, and JavaScript (ES2018 and later) support variable-length lookbehind.
- Use lookarounds to keep matches clean. When you only need the data and not the surrounding delimiters, lookarounds avoid post-processing to strip context.
- Stack multiple lookaheads at the start of a pattern for multi-condition validation (as in the password example).
- Prefer lookaround over complex alternation when the logic is "match X only in context Y."
- Keep lookbehind patterns short and fixed-length for maximum engine compatibility.
- Test lookaround patterns with edge cases at the start and end of the input string.
- Remember that lookarounds are zero-width: they do not consume characters, so several lookarounds can test the same position.
- Avoid variable-length lookbehind in engines that do not support it; the pattern is rejected with an error.
- Do not confuse `(?!...)` (negative lookahead, checks what comes next) with `(?<!...)` (negative lookbehind, checks what came before).
- Check that a lookahead and the main pattern do not make contradictory demands at the same position, which results in no match.

## Quick Example

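```
Pattern: \d+(?= dollars)
Input:  "Price is 50 dollars"
Match:  "50"
```

Password validation (at least one uppercase letter, one lowercase letter, one digit, minimum 8 characters):

```regex
^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$
```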

Lookahead & Lookbehind — Regular Expressions

Core Philosophy

Overview

Lookahead and lookbehind (collectively called "lookaround") let you assert that a certain pattern exists before or after the current position without including it in the match. They are zero-width, meaning they check a condition but do not advance the regex engine's cursor.

Core Concepts

The Four Lookaround Types

| Syntax | Name | Meaning |
|---|---|---|
| `(?=...)` | Positive lookahead | What follows must match `...` |
| `(?!...)` | Negative lookahead | What follows must NOT match `...` |
| `(?<=...)` | Positive lookbehind | What precedes must match `...` |
| `(?<!...)` | Negative lookbehind | What precedes must NOT match `...` |

How Zero-Width Works

In the pattern `\d+(?= dollars)`, the engine matches one or more digits, then checks that the text immediately after them is " dollars" (a space followed by the word). That text is inspected but is not part of the returned match.

Input:  "Price is 50 dollars"
Match:  "50"
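
To make the zero-width behaviour concrete, here is a minimal sketch using Python's built-in `re` module (the variable names are illustrative only). It shows that the match ends right after the digits, while the asserted text is still left in the input.

```python
import re

text = "Price is 50 dollars"

# The lookahead checks for " dollars" but does not consume it.
m = re.search(r"\d+(?= dollars)", text)

matched = m.group()          # only the digits
end_pos = m.end()            # index just past the digits
remainder = text[end_pos:]   # the asserted text is still unconsumed

print(matched, m.span(), repr(remainder))
```

Because the engine's position stops at `end_pos`, a following pattern (or the next search) can still see and match " dollars".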

Lookbehind Length Restrictions

Many engines restrict what a lookbehind may contain. Python's built-in `re` requires a fixed-width lookbehind, and Java generally requires one with a bounded maximum length. Variable-length lookbehinds are supported in:

  • .NET (fully variable-length)
  • Python `regex` module (third-party)
  • JavaScript (as of ES2018, where lookbehind may be variable-length; supported in V8/Chrome)
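
As a quick check of the fixed-width restriction, the following sketch probes Python's built-in `re` module. The helper name `compiles` is hypothetical; the behaviour shown applies to `re` only and says nothing about other engines.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's built-in re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # re rejects lookbehinds whose width it cannot fix in advance
        return False

fixed_ok = compiles(r"(?<=USD )\d+")       # lookbehind is exactly 4 characters
variable_ok = compiles(r"(?<=USD\s+)\d+")  # \s+ makes the width variable

print(fixed_ok, variable_ok)
```

Running a probe like this against the target engine is a cheap way to find out which lookbehind forms it accepts before relying on them.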

Implementation Patterns

Password strength validation

Require at least one uppercase letter, one lowercase letter, one digit, and minimum 8 characters:

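`^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$`

Each `(?=.*X)` is a lookahead that runs from the start of the string (the position fixed by `^`) and succeeds if the required character class appears anywhere ahead. Because lookaheads consume nothing, all three are tested at the same position before `.{8,}$` checks the length.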
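
A minimal sketch of the same check in Python, assuming the built-in `re` module; `PASSWORD_RE` and `is_strong` are illustrative names, not part of any library.

```python
import re

# Three stacked lookaheads, each tested at the start of the string,
# followed by the length requirement.
PASSWORD_RE = re.compile(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$")

def is_strong(password: str) -> bool:
    """True if the password satisfies every stacked condition."""
    return PASSWORD_RE.search(password) is not None

for candidate in ["Passw0rdX", "password1", "PASSWORD1", "Password", "Pw0rd"]:
    print(candidate, is_strong(candidate))
```

Only the first candidate passes; each of the others fails exactly one condition (no uppercase, no lowercase, no digit, too short).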

Match a word NOT followed by a specific word

Match "foo" only when it is not followed by "bar":

foo(?!bar)

Matches "foo" in "fooBAZ" and "foo qux" but not in "foobar".
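
The cases above can be reproduced with a short Python sketch (built-in `re`; the sample strings beyond those in the text are made up for illustration):

```python
import re

pattern = re.compile(r"foo(?!bar)")

samples = ["fooBAZ", "foo qux", "foobar", "foobar foo"]
results = {s: pattern.search(s) is not None for s in samples}

# In the last sample the first "foo" is rejected, but the second one matches.
second = pattern.search("foobar foo")

print(results, second.start())
```

Note that the search keeps scanning after a rejected position, so a string containing "foobar" can still produce a match elsewhere.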

Match a number preceded by a currency symbol

(?<=\$)\d+(\.\d{2})?

Matches "99" in "$99" and "12.50" in "$12.50", but not "99" in "99 items".
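
A sketch of this pattern in Python (built-in `re`; the sample sentence is invented). It uses `finditer` and `group()` rather than `findall`, because the pattern contains a capturing group and `findall` would return the group contents instead of the whole match.

```python
import re

price_re = re.compile(r"(?<=\$)\d+(\.\d{2})?")

text = "Pay $99 or $12.50 for 99 items"

# group() returns the full match, regardless of the capturing group.
prices = [m.group() for m in price_re.finditer(text)]

print(prices)
```

The "99" in "99 items" is skipped because the character before it is a space, not a dollar sign.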

Extract words not preceded by a hashtag

(?<!#)\b\w+\b

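Matches "hello" and "world" in "hello #tag world" but not "tag". The leading `\b` matters: without it, the engine would skip the "t" and match "ag", since "a" is not preceded by "#".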
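
The following Python sketch (built-in `re`) runs the pattern with and without the leading word boundary, to show why the boundary is part of the pattern:

```python
import re

text = "hello #tag world"

# With the leading \b, a match can only start at the beginning of a word.
with_boundary = re.findall(r"(?<!#)\b\w+\b", text)

# Without it, the engine can start one character into "tag".
without_boundary = re.findall(r"(?<!#)\w+\b", text)

print(with_boundary, without_boundary)
```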

Comma-format a number (insert thousands separators)

Use lookahead and lookbehind together in a search-and-replace:

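`(?<=\d)(?=(\d{3})+(?!\d))`

The pattern matches the empty position that has a digit before it and one or more complete groups of three digits after it, up to the end of the number. Replace each match with `,` to turn `1234567` into `1,234,567`.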
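
A minimal sketch of the replacement in Python (built-in `re`; `add_commas` is an illustrative name). It assumes the input contains integers only; digits after a decimal point would also be grouped.

```python
import re

# Zero-width match: a digit behind, whole groups of three digits ahead.
THOUSANDS_RE = re.compile(r"(?<=\d)(?=(\d{3})+(?!\d))")

def add_commas(text: str) -> str:
    """Insert a comma at every thousands boundary found in text."""
    return THOUSANDS_RE.sub(",", text)

print(add_commas("1234567"))
print(add_commas("Total: 1234567 units"))
```

For a single number in Python, the built-in format specifier `f"{1234567:,}"` produces the same result without a regex; the lookaround version is useful when the numbers are embedded in larger text.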

Match content between delimiters without capturing delimiters

Match text inside parentheses without including them:

(?<=\()[^)]+(?=\))

Input:  "call(arg1, arg2)"
Match:  "arg1, arg2"
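
In Python (built-in `re`), the delimiter example can be sketched as follows. The second input is an added edge case: empty parentheses produce no match, because `[^)]+` needs at least one character.

```python
import re

inner_re = re.compile(r"(?<=\()[^)]+(?=\))")

m = inner_re.search("call(arg1, arg2)")
inner = m.group()                 # text between the parentheses

empty = inner_re.search("f()")    # nothing between the delimiters

print(inner, empty)
```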

Negative lookbehind to avoid escaped characters

Match double quotes that are NOT preceded by a backslash:

(?<!\\)"
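
A short Python sketch (built-in `re`, invented sample strings) that locates the unescaped quotes. It also illustrates a limitation of this simple pattern: a quote preceded by an escaped backslash (`\\"`) is skipped as well, even though the quote itself is not escaped.

```python
import re

unescaped_quote = re.compile(r'(?<!\\)"')

# Raw string: the \" sequences are a literal backslash followed by a quote.
text = r'say "hi \"there\"" now'
positions = [m.start() for m in unescaped_quote.finditer(text)]

# Limitation: here the backslash is itself escaped, so the quote is real,
# but the one-character lookbehind cannot tell the difference.
tricky = r'a\\"'
tricky_positions = [m.start() for m in unescaped_quote.finditer(tricky)]

print(positions, tricky_positions)
```

If inputs can contain escaped backslashes, a small tokenizer or a pattern that consumes escape pairs is usually a safer choice than a single lookbehind.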

Best Practices

  • Use lookarounds to keep matches clean. When you only need the data and not the surrounding delimiters, lookarounds avoid post-processing to strip context.
  • Stack multiple lookaheads at the start of a pattern for multi-condition validation (as in the password example).
  • Prefer lookaround over complex alternation when the logic is "match X only in context Y."
  • Keep lookbehind patterns short and fixed-length for maximum engine compatibility.
  • Test lookaround patterns with edge cases at the start and end of the input string.
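
As an illustration of the last point, this Python sketch (built-in `re`) shows how positive and negative lookbehind behave differently at the very start of the input:

```python
import re

text = "hello world"

# Positive lookbehind needs an actual character before the match,
# so the word at index 0 is never found.
after_space = re.findall(r"(?<=\s)\w+", text)

# Negative lookbehind succeeds at the start of the string,
# because there is no preceding character to violate it.
not_after_nonspace = re.findall(r"(?<!\S)\w+", text)

print(after_space, not_after_nonspace)
```

Expressing "preceded by whitespace or nothing" as a negative lookbehind on `\S` is one way to cover the start-of-input case without extra alternation.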

Common Pitfalls

  • Forgetting that lookarounds are zero-width. They do not consume characters, so the same position can be tested by multiple lookarounds.
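  • Using variable-length lookbehind in engines that do not support it. The engine typically rejects the pattern with an error when it is compiled (Python's built-in `re`, for example, raises `re.error`).
  • Confusing `(?!...)` (negative lookahead, checks what comes next) with `(?<!...)` (negative lookbehind, checks what came before).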
  • Accidentally creating patterns where a lookahead and the main pattern conflict, resulting in no match.
  • Performance traps: a lookahead containing .* inside a repeated group can cause excessive backtracking. Keep lookaround sub-patterns as specific as possible.
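
The conflict pitfall is easy to reproduce. In this Python sketch (built-in `re`, invented patterns), the first pattern can never match because the lookahead and the main pattern demand different characters at the same position; the second expresses a compatible condition.

```python
import re

text = "abc123"

# The lookahead requires a digit next, but [a-z]+ must start with a letter
# at that same position, so no position can satisfy both.
conflicting = re.search(r"(?=\d)[a-z]+", text)

# Here the lookahead only requires a digit somewhere later in the word.
compatible = re.search(r"(?=\w*\d)[a-z]+", text)

print(conflicting, compatible.group())
```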

Anti-Patterns

Over-engineering for hypothetical scale. Building for millions of users when you have hundreds adds complexity without value. Solve today's problems first.

Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide well wastes time and introduces unnecessary risk.

Premature abstraction. Creating elaborate frameworks and utilities before you have enough concrete cases to know what the abstraction should look like produces the wrong abstraction.

Neglecting error handling at boundaries. Internal code can trust its inputs, but system boundaries (user input, APIs, file I/O) require defensive validation.

Skipping documentation for obvious code. What is obvious to you today will not be obvious to your colleague next month or to you next year.
