Lookahead Lookbehind
Lookahead and lookbehind assertions for matching patterns based on surrounding context without consuming characters
You are an expert in zero-width assertions for context-dependent pattern matching with regular expressions.
## Key Points
- Variable-length lookbehind is supported in .NET (fully variable-length), the third-party Python `regex` module, and JavaScript (as of ES2018, for example in V8/Chrome).
- Use lookarounds to keep matches clean. When you only need the data and not the surrounding delimiters, lookarounds avoid post-processing to strip context.
- Stack multiple lookaheads at the start of a pattern for multi-condition validation (as in the password example).
- Prefer lookaround over complex alternation when the logic is "match X only in context Y."
- Keep lookbehind patterns short and fixed-length for maximum engine compatibility.
- Test lookaround patterns with edge cases at the start and end of the input string.
- Forgetting that lookarounds are zero-width. They do not consume characters, so the same position can be tested by multiple lookarounds.
- Using variable-length lookbehind in engines that do not support it, which typically raises an error when the pattern is compiled.
- Confusing `(?!...)` (negative lookahead, checks what comes next) with `(?<!...)` (negative lookbehind, checks what came before).
- Accidentally creating patterns where a lookahead and the main pattern conflict, resulting in no match.
## Quick Example
```
Pattern: \d+(?= dollars)
Input: "Price is 50 dollars"
Match: "50"
```
```regex
^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$
```
## Core Philosophy

## Overview
Lookahead and lookbehind (collectively called "lookaround") let you assert that a certain pattern exists before or after the current position without including it in the match. They are zero-width, meaning they check a condition but do not advance the regex engine's cursor.
## Core Concepts

### The Four Lookaround Types

| Syntax | Name | Meaning |
|---|---|---|
| `(?=...)` | Positive lookahead | What follows must match `...` |
| `(?!...)` | Negative lookahead | What follows must NOT match `...` |
| `(?<=...)` | Positive lookbehind | What precedes must match `...` |
| `(?<!...)` | Negative lookbehind | What precedes must NOT match `...` |
### How Zero-Width Works

In the pattern `\d+(?= dollars)`, the engine matches one or more digits, then checks that the text immediately after them is a space followed by `dollars`. That trailing text is checked but is not part of the returned match.
Input: "Price is 50 dollars"
Match: "50"
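
As a minimal sketch of the behaviour described above, the following uses Python's built-in `re` module (the choice of Python is an assumption; the source does not name a language). It shows that the match ends before the text the lookahead checked.

```python
import re

text = "Price is 50 dollars"

# The lookahead checks for " dollars" but leaves it out of the match.
match = re.search(r"\d+(?= dollars)", text)

print(match.group())        # 50
print(match.span())         # (9, 11)
print(text[match.end():])   # " dollars" is still unconsumed
```

Because the lookahead consumes nothing, a later part of the pattern (or the next search) can still match the word that was only inspected.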
### Lookbehind Length Restrictions

Some engines restrict what a lookbehind may contain: Python's built-in `re` requires a fixed-width pattern, and Java requires a bounded length. Variable-length lookbehinds are supported in:
- .NET (fully variable-length)
- Python `regex` module (third-party)
- JavaScript (as of ES2018, for example in V8/Chrome)
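
The restriction can be observed directly in Python's built-in `re` module. This is a small sketch, with the `USD` patterns invented purely for illustration: the fixed-width lookbehind compiles, while the variable-width one is rejected at compile time.

```python
import re

# Fixed-width lookbehind: accepted by Python's built-in re module.
fixed = re.compile(r"(?<=USD )\d+")
print(fixed.search("USD 42").group())  # 42

# Variable-width lookbehind (\s+ has no fixed length): re rejects it
# as soon as the pattern is compiled, before any input is seen.
try:
    re.compile(r"(?<=USD\s+)\d+")
    compiled = True
except re.error as exc:
    compiled = False
    print("rejected:", exc)
```

Catching the error at compile time, as here, is one way to detect an unsupported pattern early rather than when the first input arrives.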
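## Implementation Patterns

### Password strength validation

Require at least one uppercase letter, one lowercase letter, one digit, and a minimum of 8 characters:

`^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$`

Each `(?=.*X)` lookahead starts at the beginning of the string (because of `^`) and scans forward for at least one character from the required class. None of them consumes input, so `.{8,}` then matches the whole string and enforces the minimum length.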
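
A hedged sketch of how this pattern might be wrapped for use in Python; the names `PASSWORD_RE` and `is_strong` are hypothetical and not part of the original text.

```python
import re

# The pattern from the text: three stacked lookaheads, then a length check.
PASSWORD_RE = re.compile(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$")


def is_strong(candidate: str) -> bool:
    """Return True when every lookahead condition and the length check pass."""
    return PASSWORD_RE.match(candidate) is not None


print(is_strong("Passw0rd"))   # True
print(is_strong("password1"))  # False: no uppercase letter
print(is_strong("Sh0rt"))      # False: fewer than 8 characters
```

Each failed condition makes the whole match fail, so this form reports only pass or fail; it does not say which rule was violated.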
### Match a word NOT followed by a specific word

Match `foo` only when it is not followed by `bar`:

`foo(?!bar)`

Matches `foo` in "fooBAZ" and "foo qux" but not in "foobar".
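
The three inputs mentioned above can be checked with a short Python sketch (Python is an assumed choice of language):

```python
import re

pattern = re.compile(r"foo(?!bar)")

# Report whether each sample contains a "foo" that is not followed by "bar".
results = {s: pattern.search(s) is not None for s in ["fooBAZ", "foo qux", "foobar"]}
print(results)  # {'fooBAZ': True, 'foo qux': True, 'foobar': False}
```

The comparison is case-sensitive, which is why "fooBAZ" still matches.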
### Match a number preceded by a currency symbol

`(?<=\$)\d+(\.\d{2})?`

Matches `99` in "$99" and `12.50` in "$12.50", but not `99` in "99 items".
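
A small Python sketch of this pattern follows. One deliberate change, labeled here as an assumption about how it would be used: the optional decimal part is written as a non-capturing group `(?:...)` so that `re.findall` returns whole matches instead of the group contents. `PRICE_RE` is a hypothetical name.

```python
import re

# Same pattern as in the text, but with a non-capturing group so that
# findall() returns the full match rather than just the decimal part.
PRICE_RE = re.compile(r"(?<=\$)\d+(?:\.\d{2})?")

prices = PRICE_RE.findall("Items: $99, $12.50 and 99 items")
print(prices)  # ['99', '12.50']
```

The trailing `99` is skipped because the character before it is a space, not `$`.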
### Extract words not preceded by a hashtag

`(?<!#)\b\w+\b`

Matches `hello` and `world` in "hello #tag world" but not `tag`.
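
A quick check of this example in Python (an assumed language choice). The leading `\b` matters: without it, the engine could skip the first letter after `#` and match the rest of the word.

```python
import re

words = re.findall(r"(?<!#)\b\w+\b", "hello #tag world")
print(words)  # ['hello', 'world']

# Without the leading \b, the lookbehind only blocks the first letter.
leaky = re.findall(r"(?<!#)\w+\b", "hello #tag world")
print(leaky)  # ['hello', 'ag', 'world']
```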
### Comma-format a number (insert thousands separators)

Use lookahead and lookbehind together in a search-and-replace:

`(?<=\d)(?=(\d{3})+(?!\d))`

Every match is an empty position between digits. Replace each one with `,` to turn `1234567` into `1,234,567`.
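
The search-and-replace can be sketched in Python with `re.sub`; the helper name `add_commas` is hypothetical. The sketch assumes the input is a plain run of digits, since the pattern would also insert separators into the fractional part of a decimal number.

```python
import re

# Matches the empty position that has a digit before it and a whole
# number of three-digit groups (and then no further digit) after it.
THOUSANDS_RE = re.compile(r"(?<=\d)(?=(\d{3})+(?!\d))")


def add_commas(digits: str) -> str:
    """Insert a comma at every thousands boundary of a digit string."""
    return THOUSANDS_RE.sub(",", digits)


print(add_commas("1234567"))  # 1,234,567
print(add_commas("1000"))     # 1,000
print(add_commas("999"))      # 999
```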
### Match content between delimiters without capturing delimiters

Match text inside parentheses without including them:

`(?<=\()[^)]+(?=\))`

Input: "call(arg1, arg2)". Match: "arg1, arg2".
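
In Python (an assumed language choice), the same pattern can pull out the argument text directly, with no need to strip the parentheses afterwards:

```python
import re

INSIDE_PARENS = re.compile(r"(?<=\()[^)]+(?=\))")

single = INSIDE_PARENS.search("call(arg1, arg2)").group()
print(single)  # arg1, arg2

several = INSIDE_PARENS.findall("f(a) g(b, c)")
print(several)  # ['a', 'b, c']
```

This sketch does not handle nested parentheses, because `[^)]+` stops at the first closing parenthesis.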
### Negative lookbehind to avoid escaped characters

Match double quotes that are NOT preceded by a backslash:

`(?<!\\)"`
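
A brief Python sketch of this pattern, using an invented sample string. It reports the positions of the unescaped quotes only. One limitation worth noting: a quote preceded by an escaped backslash (`\\"`) is still treated as escaped, since the lookbehind inspects a single character.

```python
import re

# Raw string so the backslashes reach the regex engine unchanged.
UNESCAPED_QUOTE = re.compile(r'(?<!\\)"')

text = r'say "hi" and \"skip\"'
positions = [m.start() for m in UNESCAPED_QUOTE.finditer(text)]
print(positions)  # [4, 7]
```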
## Best Practices
- Use lookarounds to keep matches clean. When you only need the data and not the surrounding delimiters, lookarounds avoid post-processing to strip context.
- Stack multiple lookaheads at the start of a pattern for multi-condition validation (as in the password example).
- Prefer lookaround over complex alternation when the logic is "match X only in context Y."
- Keep lookbehind patterns short and fixed-length for maximum engine compatibility.
- Test lookaround patterns with edge cases at the start and end of the input string.
## Common Pitfalls

- Forgetting that lookarounds are zero-width. They do not consume characters, so the same position can be tested by multiple lookarounds.
- Using variable-length lookbehind in engines that do not support it, which typically raises an error when the pattern is compiled.
- Confusing `(?!...)` (negative lookahead, checks what comes next) with `(?<!...)` (negative lookbehind, checks what came before).
- Accidentally creating patterns where a lookahead and the main pattern conflict, resulting in no match.
- Performance traps: a lookahead containing `.*` inside a repeated group can cause excessive backtracking. Keep lookaround sub-patterns as specific as possible.
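
The conflict pitfall is easy to reproduce. In this illustrative Python sketch (both patterns are invented for the example), the lookahead and the main pattern make incompatible demands on the same position, so the first pattern can never match:

```python
import re

# Conflict: the lookahead demands a digit at the very position where
# [a-z]+ needs a letter, so no position can satisfy both.
conflicting = re.compile(r"(?=\d)[a-z]+")

# Likely intent: letters that are followed by a digit.
intended = re.compile(r"[a-z]+(?=\d)")

print(conflicting.search("abc123"))        # None
print(intended.search("abc123").group())   # abc
```

Moving the lookahead after the consuming part makes it test the position that follows the letters, rather than the position where they start.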
## Anti-Patterns
Over-engineering for hypothetical scale. Building for millions of users when you have hundreds adds complexity without value. Solve today's problems first.
Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide well wastes time and introduces unnecessary risk.
Premature abstraction. Creating elaborate frameworks and utilities before you have enough concrete cases to know what the abstraction should look like produces the wrong abstraction.
Neglecting error handling at boundaries. Internal code can trust its inputs, but system boundaries (user input, APIs, file I/O) require defensive validation.
Skipping documentation for obvious code. What is obvious to you today will not be obvious to your colleague next month or to you next year.