33.1 Client-Side Search Architecture: JSON Index + JavaScript

Alright, let’s get our hands dirty. Client-side search in Hugo is a bit of a magic trick. We’re going to pre-build a search index (a JSON file that lists every page on your site along with its title, URL, and text) and then hand that file to a JavaScript library that searches it in the visitor’s browser, all without ever bothering a server. It’s fast, it’s static, and it’s surprisingly powerful.

The core idea is beautifully simple, almost stupidly so, which is why it works so well. During your site build (hugo), you use Hugo’s superpowers to serialize all your content into a single, massive JSON file. Then, when a visitor hits your site, you load that JSON file and a JavaScript library (our “search engine”) into their browser. Their computer does the heavy lifting of searching through that data. No PHP, no databases, no fancy server config. Just you, some JSON, and a whole lot of forEach loops.
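To make the idea concrete before any library gets involved, here is a minimal sketch of what "searching in the browser" boils down to: a plain substring scan over an array of page objects. The `pages` sample data and the `naiveSearch` name are made up for illustration; the real index and a fuzzy-matching library come later in this section.

```javascript
// Hypothetical sample data standing in for the real search index.
const pages = [
  { title: "Getting Started", url: "/start/", content: "Install Hugo and create a site." },
  { title: "Templates", url: "/templates/", content: "Hugo templates use Go template syntax." },
  { title: "Deploying", url: "/deploy/", content: "Push the public directory to any static host." },
];

// Case-insensitive substring search over each page's title and content.
function naiveSearch(index, query) {
  const needle = query.trim().toLowerCase();
  const hits = [];
  if (needle === "") return hits; // an empty query matches nothing
  index.forEach((page) => {
    const haystack = `${page.title} ${page.content}`.toLowerCase();
    if (haystack.includes(needle)) hits.push(page);
  });
  return hits;
}

console.log(naiveSearch(pages, "hugo").map((p) => p.url));
```

A scan like this has no typo tolerance and no ranking, which is the gap a dedicated search library fills.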

The Almighty `outputs` Setting

First, you need to tell Hugo to actually create that magic JSON file. This is the most common “why isn’t my search working?!” moment, so pay attention. By default, Hugo won’t generate a JSON file because it doesn’t know you need one. You have to be explicit.

You’ll configure this in your hugo.toml (config.toml on older Hugo versions). The key is the outputs section, which maps page kinds to output formats. You’re telling Hugo: “For the page of kind ‘home’ (i.e., your root /), I want you to output not just HTML, but also a JSON file.”

[outputs]
home = ["HTML", "JSON"]

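Now, once the template from the next section is in place, running hugo produces an index.json file in your public directory. (Without a template, Hugo just warns that it found no layout for the JSON format.) Boom. Foundation laid.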

Crafting the Index JSON Structure

Next, we need to control what goes into that JSON file. Hugo has no built-in JSON template for the home page, so it’s up to us to write one. We do this by creating a template file at layouts/_default/index.json. This file dictates the exact structure of our search index.

Here’s a robust starting point. We’re creating an array of objects, one for each page. Each object contains the essential info our search library will need: the page’s title, its permanent URL, a summary of its content, and crucially, the raw text we want to search against.

{{- $.Scratch.Add "index" slice -}}
{{- range where site.RegularPages "Type" "not in" (slice "search") -}}
{{- $.Scratch.Add "index" (dict "title" .Title "url" .Permalink "summary" .Summary "content" (.Plain | htmlUnescape)) -}}
{{- end -}}
{{- $.Scratch.Get "index" | jsonify -}}

Let’s break down the clever bits:

{{- range where site.RegularPages "Type" "not in" (slice "search") -}}: This is a classic Hugo gotcha. We’re filtering out any pages of type “search”. Why? Because if you have a search results page, the last thing you want is for the search page itself to show up in the search results. It’s embarrassingly meta.
(.Plain | htmlUnescape): This is the gold. .Plain gives us the text of the page with all HTML tags stripped. The htmlUnescape function converts entities like &amp; back into a plain &. This is vital because your JavaScript search library doesn’t care about HTML entities; it just wants clean, searchable text.
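For reference, the template should produce an array whose entries each carry four string fields. The sketch below shows one made-up entry and a small sanity check you could run against the loaded index while debugging; `isIndexEntry` and `findBadEntries` are hypothetical helpers, not part of Hugo or any library.

```javascript
// A made-up entry with the four fields the template emits.
const sampleEntry = {
  title: "Hello World",
  url: "https://example.org/posts/hello-world/",
  summary: "A first post.",
  content: "A first post. Tom & Jerry approve.",
};

const REQUIRED_KEYS = ["title", "url", "summary", "content"];

// Returns true when every required key is present and holds a string.
function isIndexEntry(entry) {
  return (
    entry !== null &&
    typeof entry === "object" &&
    REQUIRED_KEYS.every((key) => typeof entry[key] === "string")
  );
}

// Returns the entries that fail the check, so a broken template is easy to spot.
function findBadEntries(index) {
  return index.filter((entry) => !isIndexEntry(entry));
}

console.log(findBadEntries([sampleEntry, { title: "No URL" }]).length);
```

If `findBadEntries` returns anything for your real index, the template (not the JavaScript) is the first place to look.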

The Client-Side Dance: Fetching and Parsing

Now for the browser’s part. We need to write a script that:

Fetches our index.json file.
Loads a search library (we’ll use Fuse.js for this example).
Hooks up a search input field to perform queries and display results.

Here’s the JavaScript to make it happen. I’m putting this in a file like assets/js/search.js and including it on my search page, after the script tag that loads Fuse.js, because the code relies on the global Fuse constructor.

// assets/js/search.js
document.addEventListener('DOMContentLoaded', function () {
    const searchInput = document.getElementById('search-input');
    const resultsContainer = document.getElementById('search-results');

    // Let's be honest, this is a big file. Log so we know it's working.
    console.log("Fetching search index... please hold.");

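    fetch('/index.json')
        .then(response => {
            // fetch() only rejects on network errors, so check the HTTP status too
            if (!response.ok) {
                throw new Error(`HTTP ${response.status}`);
            }
            return response.json();
        })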
        .then(data => {
            console.log(`Index loaded! ${data.length} pages indexed.`);

            // Configure Fuse.js. These options are a great starting point.
            const fuseOptions = {
                keys: ['title', 'summary', 'content'], // Which keys in our object to search
                includeMatches: true, // We want match info for highlighting later
                minMatchCharLength: 3, // Ignore matches shorter than 3 characters. Be sensible.
                threshold: 0.4, // Fuzziness: 0 is exact, 1 matches anything. Tweak this to your liking.
                ignoreLocation: true // Match anywhere in the text, not just near the start of long content
            };

            const fuseIndex = new Fuse(data, fuseOptions);

            searchInput.addEventListener('input', function (e) {
                const query = e.target.value;
                if (query.length < fuseOptions.minMatchCharLength) {
                    resultsContainer.innerHTML = '<p>Keep typing...</p>'; // Don't be annoying with short queries
                    return;
                }

                const results = fuseIndex.search(query);
                displayResults(results);
            });
        })
        .catch(error => console.error("Error loading search index:", error));

    // Escape text before inserting it into HTML. The content field is plain,
    // unescaped text, so a stray "<" would otherwise be parsed as markup.
    function escapeHtml(text) {
        return String(text)
            .replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(/"/g, '&quot;');
    }

    function displayResults(results) {
        if (results.length === 0) {
            resultsContainer.innerHTML = '<p>No results found. Try a different term?</p>';
            return;
        }

        const html = results.map(result => {
            const { item, matches } = result;
            // A simple preview snippet. This is where you can get fancy.
            // Default to the summary, which Hugo has already rendered as HTML.
            let preview = item.summary;
            const contentMatch = matches.find(m => m.key === 'content');
            if (contentMatch) {
                // Grab the first match and show a snippet around it
                const firstMatch = contentMatch.indices[0];
                const start = Math.max(0, firstMatch[0] - 20);
                const end = Math.min(item.content.length, firstMatch[1] + 80);
                preview = '...' + escapeHtml(item.content.substring(start, end)) + '...';
            }
            return `
                <article>
                    <h4><a href="${escapeHtml(item.url)}">${escapeHtml(item.title)}</a></h4>
                    <p>${preview}</p>
                </article>
            `;
        }).join('');

        resultsContainer.innerHTML = html;
    }
});
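The script asks Fuse.js for match info "for highlighting later", so here is one way that highlighting could look. This is a sketch, not the library's own API: `escapeHtml` and `highlightMatches` are hypothetical helpers, and the code assumes each range is a `[start, end]` pair with an inclusive end, which is how Fuse.js reports `indices` when `includeMatches` is on.

```javascript
// Escape text before it is placed into HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Wrap each [start, end] range (end inclusive) in <mark>, escaping everything else.
function highlightMatches(text, indices) {
  const sorted = [...indices].sort((a, b) => a[0] - b[0]);
  let html = "";
  let cursor = 0;
  for (const [start, end] of sorted) {
    if (start < cursor) continue; // skip ranges that overlap one already emitted
    html += escapeHtml(text.slice(cursor, start));
    html += `<mark>${escapeHtml(text.slice(start, end + 1))}</mark>`;
    cursor = end + 1;
  }
  return html + escapeHtml(text.slice(cursor));
}

console.log(highlightMatches("Hugo <fast> search", [[0, 3], [12, 17]]));
```

Because the helper escapes every slice itself, its return value can be assigned to innerHTML without a separate escaping pass. If you highlight a snippet rather than the full text, remember to shift the indices by the snippet's start offset first.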

The Inevitable Performance Conversation

Let’s address the elephant in the room: you’re shipping your entire site’s content twice—once as HTML and once as JSON. If you have a 500-page blog, that index.json file is going to be massive. This is the fundamental trade-off.

Best practices to avoid crushing your users’ browsers:

Be Selective: Don’t index everything. The example above uses site.RegularPages. Maybe you don’t need to index your 100 “thank you for the email” pages. Use Hugo’s where function to be picky.
Trim the Fat: Are you sure you need the entire .Plain content of every page for the index? For a long documentation page, the first 500 characters might be enough. Consider using .Summary or a custom variable for the search key to keep the JSON lean.
Compress: Ensure your hosting provider (Netlify, Vercel, etc.) serves the .json file with gzip or Brotli compression. This can often cut the transfer size by 70-80%. This isn’t just a nice-to-have; it’s non-negotiable.
Be Transparent: A “Loading search index…” message isn’t a sign of failure; it’s good UX. That JSON file needs to download and parse, which on a slow connection takes time. Acknowledge it.
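One way to act on that last point, sketched under assumptions: defer the download until the visitor first touches the search box, show a status message while it runs, and reuse the same in-flight request afterwards. `createIndexLoader`, `fetchJson`, and `onStatus` are hypothetical names; in a browser, `fetchJson` would wrap the `fetch('/index.json')` call and `onStatus` would write into the results container.

```javascript
// Returns a function that loads the index at most once and reports status text.
function createIndexLoader(fetchJson, onStatus) {
  let pending = null; // cached promise so repeated calls share one download
  return function loadIndex() {
    if (pending === null) {
      onStatus("Loading search index...");
      pending = fetchJson().then(
        (data) => {
          onStatus(""); // clear the message once the index is ready
          return data;
        },
        (error) => {
          pending = null; // allow a retry after a failed download
          onStatus("Search is unavailable right now.");
          throw error;
        }
      );
    }
    return pending;
  };
}

// Stand-in for the network call; counts how often it is invoked.
let fetchCount = 0;
const fakeFetchJson = () => {
  fetchCount += 1;
  return Promise.resolve([{ title: "Home", url: "/" }]);
};

const statuses = [];
const loadIndex = createIndexLoader(fakeFetchJson, (text) => statuses.push(text));
const first = loadIndex();
const second = loadIndex();
console.log(first === second, fetchCount);
```

In the browser you might call `loadIndex()` from the search input's focus event, so visitors who never search never pay for the download.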
