Lunr
Lunr is a small, fast JavaScript search library for browsers and Node.js. It allows you to build a search index directly within your application, providing full-text search capabilities without a backend API or external service. It's ideal for static sites, documentation, or client-side applications requiring offline-capable search.
You are an expert in client-side search implementations, particularly adept at leveraging `lunr.js` to deliver fast, offline-capable search experiences directly within web applications. You understand its strengths in performance, minimal infrastructure, and customizability for in-browser search.
Key Points
- Manage Index Size: Be mindful of the total size of your index. Extremely large indices can consume significant browser memory and impact performance, especially on mobile devices.
- Debounce Search Input: When integrating with a UI, debounce user input to avoid firing a search query on every keystroke. This prevents excessive re-rendering and CPU spikes.
- Provide Clear Feedback: Inform users when a search is ongoing (e.g., with a spinner) or when no results are found. This improves the user experience.
- Asynchronous Index Loading: For very large pre-built indices, consider loading and deserializing the index in a Web Worker to prevent blocking the main UI thread.
Quick Example
npm install lunr
Core Philosophy
Lunr operates on the principle of bringing the search engine directly to the client. Unlike server-side search solutions, Lunr builds and maintains its index entirely in memory within the browser or Node.js environment. This design eliminates the need for a dedicated search server or external API calls for every query, resulting in extremely fast search responses, especially for users with unreliable network access or in offline-first applications.
The core advantage of Lunr is its "zero-infrastructure" footprint. You don't manage servers, databases, or external services; the search capability is bundled with your application's JavaScript. This makes it a perfect fit for static site generators, documentation portals, small e-commerce sites, or any web application where the dataset is manageable on the client side and you want to avoid backend complexity for search. Its flexible indexing pipeline also allows for significant customization to tailor search relevance to your specific content.
Choose Lunr when your dataset is not excessively large (typically up to a few thousand documents, depending on document size and browser memory constraints), and you prioritize speed, offline capability, and infrastructure simplicity. If you need real-time indexing of frequently changing data, advanced analytics, or search over millions of documents, a server-side solution like Elasticsearch or Algolia would be more appropriate.
Setup
To integrate Lunr into your project, first install it. It is available as an npm package and works in both browser and Node.js environments.
Install Lunr:
npm install lunr
After installation, you can import and initialize Lunr in your JavaScript code. The most common pattern involves defining your index fields and then adding your documents.
// main.js
import lunr from 'lunr'; // default import: the package's export is the callable builder function
// Example data: an array of documents you want to make searchable
const articles = [
{
id: '1',
title: 'Getting Started with Lunr.js',
body: 'Lunr.js is a client-side search library. It allows you to build a full-text search index in the browser.',
tags: ['javascript', 'search', 'frontend']
},
{
id: '2',
title: 'Advanced Lunr.js Techniques',
body: 'Learn about custom pipelines, boosting fields, and serializing your index for faster loads.',
tags: ['javascript', 'search', 'advanced']
},
{
id: '3',
title: 'Building a Static Site with Eleventy',
body: 'Eleventy is a popular static site generator. Combine it with Lunr for powerful client-side search.',
tags: ['eleventy', 'static-site', 'javascript']
},
];
// Initialize the Lunr index
// The 'this' context inside the function refers to the Lunr builder
const idx = lunr(function () {
this.ref('id'); // 'id' field is used as the unique reference for each document
this.field('title', { boost: 10 }); // 'title' field, with a higher boost for relevance
this.field('body'); // 'body' field
this.field('tags'); // 'tags' field
// Add each document to the index
articles.forEach(function (doc) {
this.add(doc);
}, this); // Pass 'this' (the builder) as the context for forEach callback
});
console.log("Lunr index created successfully.");
// Now you can perform searches
const searchResults = idx.search("lunr javascript");
console.log("Search results for 'lunr javascript':", searchResults);
/*
Example searchResults structure (values are illustrative; actual refs, order, and scores depend on the index):
[
{ ref: '1', score: 1.234, matchData: { ... } },
{ ref: '2', score: 0.876, matchData: { ... } }
]
*/
Key Techniques
1. Basic Indexing and Searching
This is the fundamental pattern for using Lunr. You define which fields of your documents should be indexed and then add the documents. Queries return an array of result objects, each containing the document's reference (ref) and a relevance score.
import lunr from 'lunr';
const books = [
{ id: 'book-1', title: 'The Great Gatsby', author: 'F. Scott Fitzgerald', description: 'A novel about the American Dream.' },
{ id: 'book-2', title: '1984', author: 'George Orwell', description: 'A dystopian social science fiction novel.' },
{ id: 'book-3', title: 'To Kill a Mockingbird', author: 'Harper Lee', description: 'A novel about the injustices of the American South.' },
];
const bookIndex = lunr(function () {
this.ref('id');
this.field('title');
this.field('author');
this.field('description');
books.forEach(book => this.add(book));
});
// Perform a simple search
function searchBooks(query) {
const results = bookIndex.search(query);
return results.map(result => {
// Find the original document using the ref
return books.find(book => book.id === result.ref);
});
}
console.log("Search for 'Gatsby':", searchBooks('Gatsby'));
// Output: [ { id: 'book-1', title: 'The Great Gatsby', author: 'F. Scott Fitzgerald', ... } ]
console.log("Search for 'novel':", searchBooks('novel'));
// Output: all three books (each description contains 'novel'), ordered by relevance score
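The `books.find(...)` lookup above scans the whole array for every result. For larger collections, one possible refinement is to build a ref-to-document map once and reuse it. The sketch below is only an illustration: `createHydrator` and `hydrate` are hypothetical names, not part of Lunr, and the sample results are stand-ins shaped like Lunr's output so the snippet runs without an index.

```javascript
// Hypothetical helper: build a ref -> document lookup once, then reuse it per search.
function createHydrator(docs, refField = 'id') {
  // Normalize keys to strings so numeric ids still match the refs in search results
  const byRef = new Map(docs.map(doc => [String(doc[refField]), doc]));

  // `results` is expected to look like Lunr's output: [{ ref, score, ... }, ...]
  return function hydrate(results) {
    return results
      .filter(result => byRef.has(result.ref)) // skip refs with no known document
      .map(result => ({ ...byRef.get(result.ref), score: result.score }));
  };
}

// Usage with stand-in data shaped like Lunr results
const sampleDocs = [
  { id: 'book-1', title: 'The Great Gatsby' },
  { id: 'book-2', title: '1984' },
];
const hydrate = createHydrator(sampleDocs);
const hydrated = hydrate([
  { ref: 'book-2', score: 0.9 },
  { ref: 'missing', score: 0.1 }, // no matching document, so it is dropped
]);
console.log(hydrated); // [ { id: 'book-2', title: '1984', score: 0.9 } ]
```

With a real index, the call would look like `hydrate(bookIndex.search(query))`. Carrying the score along makes it easy to display or threshold relevance later.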
2. Serializing and Deserializing Indices
For larger datasets, rebuilding the index on every page load can be slow and resource-intensive. Lunr allows you to serialize your index to JSON and then deserialize it, either to save it to local storage or to pre-build it at compile time (e.g., during a static site build) and load it directly.
import lunr from 'lunr';
const documents = [
{ id: 'a', text: 'Apple pie is delicious.' },
{ id: 'b', text: 'Bananas are yellow.' },
{ id: 'c', text: 'Carrots are orange.' },
];
// --- Step 1: Build and serialize the index ---
const initialIndex = lunr(function () {
this.ref('id');
this.field('text');
documents.forEach(doc => this.add(doc));
});
const serializedIndex = JSON.stringify(initialIndex);
console.log("Serialized Index:", serializedIndex.substring(0, 100) + '...'); // Truncated for display
// In a real application, you might save this to localStorage:
// localStorage.setItem('myLunrIndex', serializedIndex);
// --- Step 2: Deserialize and use the index ---
// Imagine this code runs on a subsequent page load or after fetching the pre-built index
// const storedIndex = localStorage.getItem('myLunrIndex');
const loadedIndex = lunr.Index.load(JSON.parse(serializedIndex));
const results = loadedIndex.search('yellow');
console.log("Search results from loaded index for 'yellow':", results.map(r => r.ref));
// Output: Search results from loaded index for 'yellow': [ 'b' ]
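The serialize/deserialize steps above can be wrapped in a small load-or-rebuild helper. This is a sketch under assumptions, not a prescribed pattern: `getOrBuildIndex` is a hypothetical name, `storage` is anything with `getItem`/`setItem` (such as `localStorage`), `build` is a function returning a freshly built index, and `load` is expected to be `lunr.Index.load` in real use. Stand-ins replace Lunr and the browser here so the snippet runs on its own.

```javascript
// Hypothetical helper: reuse a cached serialized index when possible, rebuild otherwise.
function getOrBuildIndex({ storage, key, build, load }) {
  const cached = storage.getItem(key);
  if (cached !== null && cached !== undefined) {
    try {
      return { index: load(JSON.parse(cached)), fromCache: true };
    } catch (err) {
      // Corrupt or incompatible cache entry: fall through and rebuild
    }
  }
  const index = build();
  try {
    storage.setItem(key, JSON.stringify(index));
  } catch (err) {
    // Storage may be full or unavailable; the in-memory index still works
  }
  return { index, fromCache: false };
}

// Stand-ins so the sketch runs without a browser or Lunr
const memoryStorage = {
  data: new Map(),
  getItem(k) { return this.data.has(k) ? this.data.get(k) : null; },
  setItem(k, v) { this.data.set(k, String(v)); },
};
const fakeBuild = () => ({ version: 'demo', terms: ['yellow'] });
const fakeLoad = json => json;

const first = getOrBuildIndex({ storage: memoryStorage, key: 'idx', build: fakeBuild, load: fakeLoad });
const second = getOrBuildIndex({ storage: memoryStorage, key: 'idx', build: fakeBuild, load: fakeLoad });
console.log(first.fromCache, second.fromCache); // false true
```

In practice the storage key would likely include a content version or hash, so a stale index is not reused after the documents change.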
3. Advanced Querying and Field Boosting
Lunr supports powerful query syntax for more precise searches, including requiring terms, excluding terms, wildcard searches, and fuzzy matching. You can also boost specific fields during index creation to make matches in those fields more relevant.
import lunr from 'lunr';
const articles = [
{ id: '1', title: 'Quick Start Guide', body: 'This guide helps you set up quickly.' },
{ id: '2', title: 'Advanced Configuration', body: 'Explore advanced settings and features.' },
{ id: '3', title: 'Troubleshooting Common Issues', body: 'Solutions for frequently encountered problems.' },
{ id: '4', title: 'Using the CLI Tool', body: 'Command line interface for common tasks.' },
];
const boostedIndex = lunr(function () {
this.ref('id');
this.field('title', { boost: 10 }); // Titles are 10 times more important
this.field('body');
articles.forEach(doc => this.add(doc));
});
// Search for "guide": only article '1' matches, in both its boosted title and its body
console.log("Search for 'guide':", boostedIndex.search('guide').map(r => ({ ref: r.ref, score: r.score })));
// Output: a single result with ref '1'; the title match contributes most of its score
// Query for "advanced" AND "features"
console.log("Search for '+advanced +features':", boostedIndex.search('+advanced +features').map(r => r.ref));
// Output: [ '2' ]
// Query for "guide" but NOT "start"
console.log("Search for 'guide -start':", boostedIndex.search('guide -start').map(r => r.ref));
// Output: [] (article '1' is the only match for 'guide', and it is excluded because it also contains 'start')
// The refined data set below adds a second 'guide' match so the exclusion is visible:
const refinedArticles = [
{ id: '1', title: 'Quick Start Guide', body: 'This guide helps you set up quickly.' },
{ id: '2', title: 'Advanced Configuration', body: 'Explore advanced settings and features.' },
{ id: '3', title: 'Troubleshooting', body: 'Common issues and guides for solutions.' }, // 'guides' in body
];
const refinedIndex = lunr(function () {
this.ref('id');
this.field('title', { boost: 10 });
this.field('body');
refinedArticles.forEach(doc => this.add(doc));
});
console.log("\nRefined search for 'guide -start':", refinedIndex.search('guide -start').map(r => r.ref));
// Output: [ '3' ] (because '1' contains 'start')
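// Fuzzy search for "comand" (misspelling of "command")
// Fuzzy matching compares against the stemmed terms stored in the index,
// so the misspelling must be within the edit distance of the stored term.
console.log("Fuzzy search for 'comand~1':", boostedIndex.search('comand~1').map(r => r.ref));
// Output: [ '4' ] (matches 'command' in the body at an edit distance of 1)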
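Because the query syntax above gives characters such as `+`, `-`, `~`, `*`, `^`, and `:` special meaning, raw user input can change how a query is parsed or, to my understanding, cause `search` to throw a `QueryParseError` (for example on an unknown field name). The following is a minimal sketch of one way to guard against that. `buildQuery` and `safeSearch` are hypothetical helpers, and `stubIndex` stands in for a real Lunr index so the snippet runs without the library.

```javascript
// Hypothetical helper: turn raw user input into a Lunr query string.
// Characters with special meaning in the query syntax are replaced with spaces.
function buildQuery(input, { prefix = false, fuzzy = 0 } = {}) {
  return input
    .replace(/[:~^*+\-]/g, ' ')
    .split(/\s+/)
    .filter(Boolean)
    .map(term => {
      if (prefix) return `${term}*`;             // trailing wildcard, e.g. "conf*"
      if (fuzzy > 0) return `${term}~${fuzzy}`;  // edit distance, e.g. "comand~1"
      return term;
    })
    .join(' ');
}

// Hypothetical helper: never let a malformed query crash the UI.
function safeSearch(index, queryString) {
  if (queryString.trim() === '') return [];
  try {
    return index.search(queryString);
  } catch (err) {
    // Assumption: parse failures are identified by name 'QueryParseError'
    if (err && err.name === 'QueryParseError') return [];
    throw err; // anything else is unexpected, so surface it
  }
}

// Stand-in for a Lunr index: rejects field-scoped queries, matches everything else
const stubIndex = {
  search(q) {
    if (q.includes(':')) throw { name: 'QueryParseError', message: 'unrecognised field' };
    return [{ ref: '2', score: 1 }];
  },
};

console.log(buildQuery('adv config', { prefix: true })); // "adv* config*"
console.log(buildQuery('title:foo +bar'));               // "title foo bar"
console.log(safeSearch(stubIndex, 'nosuchfield:foo'));   // []
```

One design note: stripping operators trades power-user syntax for predictability. An application that wants to expose `+term` or `field:term` to users could skip `buildQuery` and rely on `safeSearch` alone.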
Best Practices
- Pre-build and Serialize Your Index: For static sites or applications with stable data, build the Lunr index at compile time (e.g., during your CI/CD pipeline or static site generation) and ship the serialized JSON. This drastically reduces client-side load times and CPU usage.
- Index Only What's Necessary: Avoid indexing entire document bodies if only specific fields (like title, description, tags) are relevant for search. A smaller index is faster to build, serialize, load, and query.
- Manage Index Size: Be mindful of the total size of your index. Extremely large indices can consume significant browser memory and impact performance, especially on mobile devices.
- Customize the Processing Pipeline: Lunr's pipeline allows for custom tokenizers, stemmers, and stop word filters. For non-English languages, integrate appropriate stemmers (e.g., lunr-languages) to improve search relevance.
- Debounce Search Input: When integrating with a UI, debounce user input to avoid firing a search query on every keystroke. This prevents excessive re-rendering and CPU spikes.
- Provide Clear Feedback: Inform users when a search is ongoing (e.g., with a spinner) or when no results are found. This improves the user experience.
- Asynchronous Index Loading: For very large pre-built indices, consider loading and deserializing the index in a Web Worker to prevent blocking the main UI thread.
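The debouncing advice in the list above can be sketched in a few lines. This is a generic illustration rather than anything Lunr provides: `debounce` and `runSearch` are hypothetical names, and the 50 ms delay is an arbitrary value chosen for the demo.

```javascript
// Hypothetical helper: delay calls to `fn` until input has been quiet for `waitMs`.
function debounce(fn, waitMs = 200) {
  let timer = null;
  return function debounced(...args) {
    if (timer !== null) clearTimeout(timer); // cancel the previously scheduled call
    timer = setTimeout(() => {
      timer = null;
      fn.apply(this, args);
    }, waitMs);
  };
}

// Demo: three rapid "keystrokes" result in a single search
const queriesRun = [];
const runSearch = debounce(query => {
  queriesRun.push(query); // in a real UI: run index.search(query) and render results
}, 50);

runSearch('l');
runSearch('lu');
runSearch('lunr');

setTimeout(() => console.log(queriesRun), 100); // [ 'lunr' ]
```

In a browser this would typically be wired up as `input.addEventListener('input', e => runSearch(e.target.value))`, with a "no results" message rendered when the search returns an empty array.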
Anti-Patterns
- Indexing entire documents: Index only specific, relevant fields (e.g., title, tags, a summary) rather than the full, lengthy content of every document.
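One way to avoid that anti-pattern is to project each document down to a slim shape before calling `this.add(...)`. The sketch below is an assumption-laden illustration: `toIndexable` is a hypothetical helper, and the field names and the 200-character summary length are arbitrary choices, not Lunr defaults.

```javascript
// Hypothetical helper: keep only the fields worth indexing, plus a short summary.
function toIndexable(doc, { fields = ['title', 'tags'], summaryFrom = 'body', summaryLength = 200 } = {}) {
  const slim = { id: doc.id };
  for (const field of fields) {
    if (doc[field] !== undefined) slim[field] = doc[field];
  }
  if (typeof doc[summaryFrom] === 'string') {
    // A plain character cut; it may end mid-word, which is acceptable for indexing
    slim.summary = doc[summaryFrom].slice(0, summaryLength);
  }
  return slim;
}

// Usage: a large document shrinks to a handful of small fields
const fullDoc = {
  id: '1',
  title: 'Getting Started with Lunr.js',
  tags: ['javascript', 'search'],
  body: 'Lunr.js is a client-side search library. '.repeat(100),
  html: '<article>...</article>', // never indexed
};
const slimDoc = toIndexable(fullDoc);
console.log(Object.keys(slimDoc)); // [ 'id', 'title', 'tags', 'summary' ]
console.log(slimDoc.summary.length); // 200
```

The index definition would then declare `summary` as a field instead of `body`, and the full documents can still be fetched by `ref` when a result is displayed.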