33.3 Lunr.js Integration: Indexing and Querying
Right, so you’ve decided you want search on your Hugo site. Good for you. It’s a fantastic feature, but let’s be clear: Hugo doesn’t give you a search button to click. You have to build the engine yourself. It’s like buying a fancy car frame and being handed a box of engine parts. Let’s get our hands greasy.
The first and most “classic” way to do this is with Lunr.js. It’s a pure JavaScript, in-browser, full-text search library. The big idea is simple but powerful: during your site build, Hugo generates a massive JSON file containing the text of every page you want to search. Then, when a user visits your site, their browser downloads this JSON file, Lunr.js loads it, builds a search index right there in their browser, and then queries that index. No server required. Neat, huh?
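If “builds a search index” sounds abstract, here is a toy version of the idea: a map from each word to the pages that contain it. This is a minimal sketch for illustration, not how Lunr is implemented. Lunr adds stemming, field boosts, and relevance scoring on top, and by default it matches documents containing any query term, whereas this toy requires all of them. The function names and sample data are hypothetical.

```javascript
// Toy inverted index: maps each lowercase word to the set of document refs containing it.
// Illustration only; this is not Lunr's implementation.
function buildToyIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    const words = `${doc.title} ${doc.body}`.toLowerCase().match(/[a-z0-9]+/g) || [];
    for (const word of words) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(doc.url);
    }
  }
  return index;
}

// Return the refs of documents that contain every word in the query.
function searchToyIndex(index, query) {
  const words = query.toLowerCase().match(/[a-z0-9]+/g) || [];
  if (words.length === 0) return [];
  const sets = words.map(word => index.get(word) || new Set());
  return [...sets[0]].filter(ref => sets.every(set => set.has(ref)));
}

// Hypothetical sample data, shaped like the entries we will generate with Hugo.
const sampleDocs = [
  { title: 'Hugo Templates', url: '/templates/', body: 'Learn how templates render pages.' },
  { title: 'Search with Lunr', url: '/search/', body: 'Build a search index in the browser.' },
];
const toyIndex = buildToyIndex(sampleDocs);
console.log(searchToyIndex(toyIndex, 'search index')); // [ '/search/' ]
```

The point is that the lookup at query time is cheap; the expensive part is building the map, which is exactly the work Lunr does in the visitor’s browser.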
Generating the Search Index (The Hugo Part)
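This is where the magic starts. You need to tell Hugo what data to put in that big JSON file. You do this by creating a template for it at layouts/_default/index.json.json. Yes, the double .json is intentional and a bit silly: the first one names Hugo’s JSON output format and the second is the file suffix. It tells Hugo to use this template to generate a JSON file at the root of your site (/index.json). One catch: Hugo only renders it if JSON is enabled as an output format for the home page, for example home = ["HTML", "RSS", "JSON"] under [outputs] in your site configuration.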
Let’s create a robust one. We’ll include the page’s title, its permalink (so we can link to results), its tags, and its plain text content.
{{/* layouts/_default/index.json.json */}}
{
  "docs": [
    {{ range $index, $page := .Site.RegularPages }}
    {{ if $index }},{{ end }}
    {
      "title": {{ $page.Title | jsonify }},
      "url": {{ $page.Permalink | jsonify }},
      "tags": {{ $page.Params.tags | jsonify }},
      "body": {{ $page.Plain | jsonify }}
    }
    {{ end }}
  ]
}
jsonify is your best friend here. It safely escapes strings for JSON. Never skip it.
Now, run hugo and look in your public/ directory. You should see a fat index.json file. This is your site’s entire corpus, ready for Lunr to devour. The first potential pitfall is size. If you have a huge site (thousands of pages), this JSON file can get massive, impacting page load time. We’ll talk about mitigating that later.
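Before worrying about mitigation, it helps to know which field is eating the bytes. The following is a rough sketch, assuming the docs array has the shape produced by the template above; fieldSizes is a hypothetical helper name, and the sample data is made up.

```javascript
// Report how many bytes each field contributes to the serialized index.
// `docs` is assumed to be the "docs" array from index.json.
function fieldSizes(docs) {
  const encoder = new TextEncoder(); // counts UTF-8 bytes, not characters
  const sizes = {};
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      const bytes = encoder.encode(JSON.stringify(value)).length;
      sizes[field] = (sizes[field] || 0) + bytes;
    }
  }
  return sizes;
}

// Hypothetical sample; in practice, pass in the parsed contents of public/index.json.
const sizeSample = [
  { title: 'A', url: '/a/', tags: ['x'], body: 'hello' },
  { title: 'B', url: '/b/', tags: null, body: 'a much longer body of text' },
];
console.log(fieldSizes(sizeSample));
```

These are uncompressed sizes. Many hosts serve JSON with gzip or Brotli compression, so the number of bytes actually transferred is usually smaller, but the browser still has to parse and index the full text.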
The Lunr.js Client-Side Script (The Browser Part)
Now for the client-side part. You’ll need to include Lunr. You can grab it from a CDN. Create a search.js file and include it in your site’s footer, or stick this in a <script> tag at the end of your search page.
Here’s the full script, which I’ll then break down.
<!-- Load Lunr from a CDN -->
<script src="https://cdn.jsdelivr.net/npm/lunr@2.3.9/lunr.min.js"></script>
<!-- Your search form -->
<input type="text" id="search" placeholder="Search...">
<div id="results"></div>
<script>
  // 1. Wait for the DOM to be ready
  window.addEventListener('DOMContentLoaded', () => {
    // 2. Reference our HTML elements
    const searchInput = document.getElementById('search');
    const resultsContainer = document.getElementById('results');
    // 3. Variables to hold our data and index
    let lunrIndex;
    let pages = [];
    // 4. Fetch the giant JSON file we built
    fetch('/index.json')
      .then(response => {
        // fetch() does not reject on 404/500, so check the status ourselves
        if (!response.ok) throw new Error(`HTTP ${response.status}`);
        return response.json();
      })
      .then(data => {
        pages = data.docs; // Store the pages for later use
        // 5. Build the Lunr index
        lunrIndex = lunr(function () {
          this.ref('url'); // The unique identifier for each document
          this.field('title', { boost: 10 }); // Boost title matches
          this.field('tags', { boost: 5 }); // Boost tag matches
          this.field('body'); // Search the body too
          // 6. Add each document to the index
          pages.forEach(function (page) {
            this.add(page);
          }, this);
        });
      })
      .catch(error => console.error('Error fetching search index:', error));
    // 7. Listen for search input ('input' also fires on paste, unlike 'keyup')
    searchInput.addEventListener('input', (event) => {
      const query = event.target.value.trim();
      resultsContainer.innerHTML = ''; // Clear previous results
      // Don't search for single characters, and bail out if the index isn't ready yet
      if (query.length < 2 || !lunrIndex) {
        return;
      }
      // 8. Perform the search; Lunr throws on malformed query syntax (e.g. "foo:")
      let results;
      try {
        results = lunrIndex.search(query);
      } catch (error) {
        results = [];
      }
      // 9. Display the results
      if (results.length === 0) {
        resultsContainer.innerHTML = '<p>No results found. Try being less specific. Or more. I don\'t know, I just work here.</p>';
      } else {
        results.forEach(result => {
          // Find the full page data from our stored array
          const page = pages.find(p => p.url === result.ref);
          // Build nodes with textContent so a title is never parsed as HTML
          const item = document.createElement('div');
          const heading = document.createElement('h3');
          const link = document.createElement('a');
          link.href = page.url;
          link.textContent = page.title;
          heading.appendChild(link);
          const url = document.createElement('p');
          url.textContent = page.url;
          item.append(heading, url);
          resultsContainer.appendChild(item);
        });
      }
    });
  });
</script>
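One small thing about the result loop: it calls pages.find for every hit, which scans the whole array each time. On a small site nobody will notice. On a big one, you can build a lookup table once after the JSON loads. The following is a sketch of that idea; indexPagesByUrl and resolveResults are hypothetical helper names, and the sample results only mimic the ref and score properties that Lunr returns.

```javascript
// Build a Map from url to page once, right after the JSON is loaded.
function indexPagesByUrl(pages) {
  return new Map(pages.map(page => [page.url, page]));
}

// Turn search results ({ ref, score, ... }) into page objects, skipping unknown refs.
function resolveResults(results, pagesByUrl) {
  return results
    .map(result => pagesByUrl.get(result.ref))
    .filter(page => page !== undefined);
}

// Hypothetical data standing in for index.json and for the output of lunrIndex.search().
const lookupPages = [
  { title: 'Hugo Templates', url: '/templates/' },
  { title: 'Search with Lunr', url: '/search/' },
];
const fakeResults = [
  { ref: '/search/', score: 1.2 },
  { ref: '/gone/', score: 0.4 }, // a ref with no matching page is dropped
];
const pagesByUrl = indexPagesByUrl(lookupPages);
console.log(resolveResults(fakeResults, pagesByUrl).map(page => page.title));
// [ 'Search with Lunr' ]
```

Results come back from Lunr sorted by score, and map plus filter preserve that order, so the best match still ends up first.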
Why This Works and Where It Bites You
The beauty is in its simplicity. It’s entirely static. You host a JSON file and a script. No PHP, no Go, no databases. It’s brilliantly decoupled.
But let’s call out the rough edges, because they matter:
- Index Size: As mentioned, that index.json file is a beast. For a large blog, it can easily be several megabytes. You’re forcing every single potential searcher to download your entire site’s text content. This is the fundamental trade-off. You can mitigate it by being selective in your template. Maybe only include a summary ({{ .Summary }}) instead of the full plain text ({{ .Plain }}).
- Building the Index on the Client: Building the Lunr index happens in the user’s browser. On a slow phone or old laptop, this can block the main thread for a noticeable moment. You’ll see a delay between the page loading and the search box becoming usable. You can use Web Workers to offload this, but that’s a whole other level of complexity.
- Stemming and Language: By default, Lunr handles English stemming (so “running” matches “run”). If your site is in another language, you must load the corresponding Lunr language stemmer (these live in the separate lunr-languages project) and configure it. This is non-optional and often forgotten.
- The ref Pitfall: Notice we use 'url' as the ref. This must be unique. It’s what Lunr gives back to you to find your original data. If you accidentally used a non-unique field (like 'title'), you’d have a very bad time with missing results.
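The ref pitfall in particular is cheap to guard against. Here is a small sketch of a sanity check you could run on the loaded docs before building the index. The helper name findDuplicateRefs is hypothetical, and the sample data is invented to show why 'title' makes a poor ref.

```javascript
// Return every value of `refField` that appears on more than one document.
function findDuplicateRefs(docs, refField = 'url') {
  const seen = new Set();
  const duplicates = new Set();
  for (const doc of docs) {
    const ref = doc[refField];
    if (seen.has(ref)) duplicates.add(ref);
    seen.add(ref);
  }
  return [...duplicates];
}

// Hypothetical sample: two pages share a title but have distinct URLs.
const refSample = [
  { title: 'Changelog', url: '/v1/changelog/' },
  { title: 'Changelog', url: '/v2/changelog/' },
  { title: 'About', url: '/about/' },
];
console.log(findDuplicateRefs(refSample, 'url'));   // []
console.log(findDuplicateRefs(refSample, 'title')); // [ 'Changelog' ]
```

If the check returns anything, log a warning during development rather than silently indexing; a duplicate ref is a content problem you want to hear about.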
Lunr.js is a workhorse. It’s not the shiniest new tool, but it’s reliable, deeply customizable, and teaches you exactly how client-side search works. It’s the foundation everything else is built on.