18.1 Storing Data: JSON, YAML, TOML, and CSV in data/

Right, let’s talk about your data/ directory. This is Hugo’s designated “I’m not a page, I’m just data” drawer. It’s where you stash all the structured information your site needs but that doesn’t deserve (or want) its own front matter and Markdown content. Think of it as your site’s personal JSON, YAML, TOML, and CSV junk drawer—hopefully more organized than your actual junk drawer.

The beauty here is that Hugo automatically slurps up every data file it recognizes in data/ and makes it available in your templates under .Site.Data. The file name, minus its extension, becomes the key (so data/config.toml shows up as .Site.Data.config), and subdirectories become nested keys. It’s dead simple and incredibly powerful.

The Format Face-Off: JSON, YAML, TOML, and CSV

You’ve got options, and they all have their place. Let’s break down the usual suspects.

JSON is the old reliable workhorse of the web. It’s unambiguous and universally supported. The downside? It’s a bit fussy for humans to write by hand. No comments, trailing commas will betray you, and it’s generally just…noisy.

data/authors/jane-doe.json:
{
  "name": "Jane Doe",
  "role": "Senior Mischief Manager",
  "social": [
    {
      "platform": "mastodon",
      "handle": "@jane@mastodon.social"
    }
  ]
}
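The fussiness is easy to demonstrate. Here is a minimal Go sketch that uses the standard library's encoding/json package to check three variants of the author record: a clean one, one with a trailing comma, and one with a `//` comment line on top. The helper name isValidJSON is made up for this example, and Hugo's own loader may report such errors differently; the point is only that strict JSON rejects both.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidJSON reports whether s is syntactically valid JSON
// according to Go's standard encoding/json package.
// (Hypothetical helper for illustration only.)
func isValidJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	clean := `{"name": "Jane Doe", "role": "Senior Mischief Manager"}`
	trailingComma := `{"name": "Jane Doe", "role": "Senior Mischief Manager",}`
	withComment := "// a note to self\n" + clean

	fmt.Println("clean:", isValidJSON(clean))
	fmt.Println("trailing comma:", isValidJSON(trailingComma))
	fmt.Println("with comment:", isValidJSON(withComment))
}
```

Only the first variant passes. If you want notes next to your data, that is a good reason to reach for YAML or TOML instead.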

YAML is what you use when you value human readability above all else. No braces, no quotes (mostly), and comments are a thing! It’s perfect for configuration and data you need to hand-edit frequently. The only gotcha is its whitespace sensitivity: indentation is part of the syntax, and YAML does not allow tabs for indentation. (Spoiler: always use spaces.)

# data/authors/jane-doe.yaml
name: Jane Doe
role: Senior Mischief Manager
social:
  - platform: mastodon
    handle: "@jane@mastodon.social"
# See? So much cleaner. And I can leave notes for myself!

TOML is the newcomer that’s gained a lot of love, especially from the Rust crowd (Cargo’s manifest format is TOML) and from Hugo, which is written in Go and has long used TOML for its own configuration. It aims to be a less ambiguous alternative to YAML. It’s great for configuration that’s a mix of simple and complex keys. I find its syntax for nested structures a bit less intuitive than YAML’s indentation, but it’s a solid choice.

# data/config.toml
site_name = "My Brilliant Site"

[author]
name = "Jane Doe"
role = "Senior Mischief Manager"

[[author.social]]
platform = "mastodon"
handle = "@jane@mastodon.social"

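CSV is your friend for tabular data. If you’re pulling in a spreadsheet from Google Sheets or exporting from a database, this is the way. Hugo parses it into a slice of rows, and each row is itself a slice of strings. The parser doesn’t treat the first row specially; by convention it holds the column names, and it’s up to your template to handle it as a header. Every value arrives as a string, so 19.99 and true need converting before you do math or comparisons. It’s wonderfully boring and effective for its specific use case.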

data/products.csv:
id,title,price,featured
1,Widget,19.99,true
2,Gadget,29.99,false

Accessing Your Data in Templates

This is the whole point. Accessing data is a matter of traversing the .Site.Data map. The path you use mirrors your directory structure within data/.

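Accessing the author data from our examples above would look like this. Because the key jane-doe contains a hyphen, it has to be looked up with the index function rather than dotted notation:

{{ with index .Site.Data.authors "jane-doe" }}
  <h3>{{ .name }}</h3>
  <p>Role: {{ .role }}</p>
  {{ range .social }}
    {{/* A fediverse handle is not a URL path, so print it rather than guess a link. */}}
    <span>{{ .platform }}: {{ .handle }}</span>
  {{ end }}
{{ end }}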
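Hyphenated keys such as jane-doe deserve a closer look, because Go's template parser (which Hugo templates are built on) does not accept a hyphen inside a dotted field chain. The following is a minimal sketch using the standard text/template package directly. The render helper and the sampleData map are hypothetical stand-ins for this example, not Hugo's real types.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render parses src as a Go template and executes it against data.
// It returns the parse or execution error, if any.
func render(src string, data any) (string, error) {
	tmpl, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

// sampleData is a hypothetical stand-in for the data map, not Hugo's real type.
func sampleData() map[string]any {
	return map[string]any{
		"authors": map[string]any{
			"jane-doe": map[string]any{"name": "Jane Doe"},
		},
	}
}

func main() {
	// Dotted access with a hyphen is rejected when the template is parsed.
	_, err := render(`{{ .authors.jane-doe.name }}`, sampleData())
	fmt.Println("dotted access failed:", err != nil)

	// The built-in index function looks the key up as a plain string.
	out, err := render(`{{ with index .authors "jane-doe" }}{{ .name }}{{ end }}`, sampleData())
	fmt.Println(out, err)
}
```

The same index pattern works for any key that is not a plain identifier. Renaming the file to use underscores avoids the issue entirely.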

For the CSV file, you’d get a slice of rows. You’d typically range over it.

<table>
{{ range $index, $row := .Site.Data.products }}
  {{ if eq $index 0 }}
    <thead><tr>{{ range $row }}<th>{{ . }}</th>{{ end }}</tr></thead>
  {{ else }}
    <tr>{{ range $row }}<td>{{ . }}</td>{{ end }}</tr>
  {{ end }}
{{ end }}
</table>

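Note: The above is a bit naive. It relies on index 0 being the header row, which is fine for a quick demo. For more control, especially if you’re fetching from a URL, load the file yourself in a partial. Older tutorials use the getCSV function for this; recent Hugo releases deprecate it in favor of resources.Get or resources.GetRemote combined with transform.Unmarshal. That route is also the fallback if your Hugo version does not load CSV files from data/.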
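To make the row shape concrete, here is a small Go sketch built on the standard encoding/csv package. It parses the product listing into a slice of string slices and then pairs each data row with the header to get one map per product. The function names parseRows and toRecords are invented for this example, and it illustrates the data shape only; it makes no claim about how Hugo implements its loader.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseRows reads CSV text into a slice of rows, each row a slice of strings.
func parseRows(text string) ([][]string, error) {
	return csv.NewReader(strings.NewReader(text)).ReadAll()
}

// toRecords pairs every data row with the header row (row 0),
// producing one map per row keyed by column name.
func toRecords(rows [][]string) []map[string]string {
	if len(rows) == 0 {
		return nil
	}
	header := rows[0]
	records := make([]map[string]string, 0, len(rows)-1)
	for _, row := range rows[1:] {
		rec := make(map[string]string, len(header))
		for i, key := range header {
			if i < len(row) { // tolerate short rows
				rec[key] = row[i]
			}
		}
		records = append(records, rec)
	}
	return records
}

func main() {
	rows, err := parseRows("id,title,price,featured\n1,Widget,19.99,true\n2,Gadget,29.99,false\n")
	if err != nil {
		panic(err)
	}
	fmt.Println("header:", rows[0])
	for _, rec := range toRecords(rows) {
		// Values are still strings here; convert before doing arithmetic.
		fmt.Println(rec["title"], rec["price"])
	}
}
```

A template faces the same choice: either branch on the row index, as in the demo, or reshape the rows into keyed records once and work with those.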

Organizing Your Data Directory

Don’t just throw everything in the root of data/. Make subdirectories. It keeps things sane. For a large site, you might have:

data/
  authors/
    jane-doe.yaml
    john-smith.yaml
  products/
    widgets.csv
    gadgets.csv
  site/
    config.yaml
    navigation.yaml

This organization is reflected in your template access: .Site.Data.products.widgets, .Site.Data.site.navigation, and, because of the hyphen, index .Site.Data.authors "jane-doe".
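The mapping from file path to key chain can be sketched in a few lines of Go. Both helpers below are hypothetical and exist only to illustrate the naming rule described here. dataKeys assumes the key is the file name minus its final extension. needsIndex is a deliberately conservative check for keys that are unsafe in dotted notation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// dataKeys maps a slash-separated path inside data/ to the chain of keys
// used to reach it. Hypothetical helper for illustration; not Hugo's code.
func dataKeys(rel string) []string {
	rel = strings.TrimSuffix(rel, path.Ext(rel)) // drop the final extension only
	return strings.Split(rel, "/")
}

// needsIndex reports whether a key is unsafe for dotted access. It is
// conservative: anything other than ASCII letters, digits, and underscores,
// or a leading digit, is flagged.
func needsIndex(key string) bool {
	if key == "" || (key[0] >= '0' && key[0] <= '9') {
		return true
	}
	for _, r := range key {
		isWord := r == '_' || (r >= '0' && r <= '9') ||
			(r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		if !isWord {
			return true
		}
	}
	return false
}

func main() {
	files := []string{"authors/jane-doe.yaml", "products/widgets.csv", "site/navigation.yaml"}
	for _, f := range files {
		keys := dataKeys(f)
		last := keys[len(keys)-1]
		fmt.Println(f, "->", keys, "needs index:", needsIndex(last))
	}
}
```

Running a check like this over a data/ directory is a cheap way to spot file names that will force you into index calls later.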

The Gotchas: Where Things Get Weird

File Names are Keys: Your filename jane-doe.yaml becomes the key jane-doe. Because of the hyphen, dotted access (.Site.Data.authors.jane-doe) is a template parse error; use index .Site.Data.authors "jane-doe" instead, or name the file jane_doe.yaml so plain dotted access works. If you name a file foo.bar.json, the key becomes foo.bar, which likewise needs index and looks strange. Stick to letters, numbers, and underscores if you want dotted access everywhere.
Caching and Live Reload: Hugo is smart about watching your data/ files. Change one, and it should trigger a live reload. Mostly. I’ve occasionally seen it get confused with very deeply nested structures or a huge number of files. Restarting with hugo server --disableFastRender usually kicks it back into gear.
Performance with Massive Data Sets: Hugo is a static generator, not a database server. If you’re trying to load a 50MB JSON file of every product you’ve ever sold, you’re gonna have a bad time. The entire data file is read into memory at build time. For enormous datasets, you’re better off running a script before the build (from your CI pipeline or a Makefile, say) that pulls and filters data from an external API or a proper database and writes a trimmed file into data/.
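For the massive-data case, a pre-build filter might look roughly like this Go sketch. It decodes a product list, keeps only the featured items and only the fields the templates need, and re-encodes the result. The Product fields, the function names, and the data/products.json destination are all assumptions made for illustration. A truly huge input would call for a streaming decoder rather than reading everything at once.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Product holds only the fields the templates need; anything else in the
// source is dropped on decode. Field names are assumptions for this sketch.
type Product struct {
	ID       int     `json:"id"`
	Title    string  `json:"title"`
	Price    float64 `json:"price"`
	Featured bool    `json:"featured"`
}

// featuredOnly returns the products whose Featured flag is set.
func featuredOnly(all []Product) []Product {
	kept := make([]Product, 0, len(all))
	for _, p := range all {
		if p.Featured {
			kept = append(kept, p)
		}
	}
	return kept
}

// trimProducts decodes a JSON array of products and re-encodes the featured ones.
func trimProducts(raw []byte) ([]byte, error) {
	var all []Product
	if err := json.Unmarshal(raw, &all); err != nil {
		return nil, fmt.Errorf("decode products: %w", err)
	}
	return json.MarshalIndent(featuredOnly(all), "", "  ")
}

func main() {
	raw := []byte(`[
	  {"id": 1, "title": "Widget", "price": 19.99, "featured": true, "internal_notes": "drop me"},
	  {"id": 2, "title": "Gadget", "price": 29.99, "featured": false}
	]`)
	out, err := trimProducts(raw)
	if err != nil {
		panic(err)
	}
	// A real pre-build step would write this to a file such as
	// data/products.json (for example with os.WriteFile) before running hugo.
	fmt.Println(string(out))
}
```

The design choice is to do the expensive filtering once, outside Hugo, so the build only ever sees the small file.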

The data/ directory is one of Hugo’s killer features. It effortlessly bridges the gap between a simple static site and a powerful, data-driven application. Use it to drive navigation, build entire pages from a CSV file, or manage shared configuration. Just keep it organized, and it will never let you down.