18.1 Storing Data: JSON, YAML, TOML, and CSV in data/
Right, let’s talk about your data/ directory. This is Hugo’s designated “I’m not a page, I’m just data” drawer. It’s where you stash all the structured information your site needs but that doesn’t deserve (or want) its own front matter and Markdown content. Think of it as your site’s personal JSON, YAML, TOML, and CSV junk drawer—hopefully more organized than your actual junk drawer.
The beauty here is that Hugo, in its infinite wisdom, automatically slurps up the files in data/ and makes them available globally in your templates under the .Site.Data variable. Each file’s name, minus its extension, becomes a key, and subdirectories become nested keys above it. It’s dead simple and incredibly powerful.
The Format Face-Off: JSON, YAML, TOML, and CSV
You’ve got options, and they all have their place. Let’s break down the usual suspects.
JSON is the old reliable workhorse of the web. It’s unambiguous and universally supported. The downside? It’s a bit fussy for humans to write by hand. No comments, trailing commas will betray you, and it’s generally just…noisy.
data/authors/jane-doe.json (JSON has no comment syntax, so the path is a label here, not part of the file):
{
  "name": "Jane Doe",
  "role": "Senior Mischief Manager",
  "social": [
    {
      "platform": "mastodon",
      "handle": "@jane@mastodon.social"
    }
  ]
}
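To see the fussiness in action, here is a minimal sketch that feeds the author document to a strict parser, once as written and once with a trailing comma. It uses Go's standard encoding/json package as a stand-in; the Author and Social types and the parseAuthor helper are made up for this illustration, and Hugo's own loader may report the error differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Social and Author are hypothetical types mirroring the jane-doe example.
type Social struct {
	Platform string `json:"platform"`
	Handle   string `json:"handle"`
}

type Author struct {
	Name   string   `json:"name"`
	Role   string   `json:"role"`
	Social []Social `json:"social"`
}

// parseAuthor decodes one author document with the standard library parser.
func parseAuthor(doc string) (Author, error) {
	var a Author
	err := json.Unmarshal([]byte(doc), &a)
	return a, err
}

const validDoc = `{
  "name": "Jane Doe",
  "role": "Senior Mischief Manager",
  "social": [
    {"platform": "mastodon", "handle": "@jane@mastodon.social"}
  ]
}`

// Same idea, but with a trailing comma after the last field.
const trailingCommaDoc = `{
  "name": "Jane Doe",
  "role": "Senior Mischief Manager",
}`

func main() {
	a, err := parseAuthor(validDoc)
	fmt.Println(a.Name, len(a.Social), err)

	_, err = parseAuthor(trailingCommaDoc)
	fmt.Println("trailing comma rejected:", err != nil)
}
```

Running a check like this over your JSON files before a build is a cheap way to catch the stray comma before it breaks anything.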
YAML is what you use when you value human readability above all else. No braces, no quotes (mostly), and comments are a thing! It’s perfect for configuration and data you need to hand-edit frequently. The only gotcha is its whitespace sensitivity; you must mind your tabs and spaces. (Spoiler: always use spaces).
# data/authors/jane-doe.yaml
name: Jane Doe
role: Senior Mischief Manager
social:
  - platform: mastodon
    handle: "@jane@mastodon.social"
# See? So much cleaner. And I can leave notes for myself!
TOML is the newcomer that’s gained a lot of love, especially from the Rust crowd, whose Cargo tool uses it for package manifests. (Hugo itself is written in Go, but TOML is a first-class format for its configuration.) It aims to be a less ambiguous YAML. It’s great for configuration that’s a mix of simple and complex keys. I find its syntax for nested structures a bit less intuitive than YAML’s indentation, but it’s a solid choice.
# data/config.toml
site_name = "My Brilliant Site"
[author]
name = "Jane Doe"
role = "Senior Mischief Manager"
[[author.social]]
platform = "mastodon"
handle = "@jane@mastodon.social"
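CSV is your friend for tabular data. If you’re pulling in a spreadsheet from Google Sheets or exporting from a database, this is the way. It parses into a slice of rows, each row a slice of strings. The header row gets no special treatment: it is simply row 0, and it’s up to your template to handle it. It’s wonderfully boring and effective for its specific use case. One caveat: the current Hugo documentation advises against placing CSV files in data/ and points to transform.Unmarshal instead, so test this on the version you run.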
data/products.csv (the path is a label here, not a line in the file):
id,title,price,featured
1,Widget,19.99,true
2,Gadget,29.99,false
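It helps to see the shape this data takes once parsed. The sketch below uses Go's standard encoding/csv package on the same three lines; treating that as a stand-in for Hugo's own CSV handling is an assumption, and parseRows is a made-up helper name.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

const productsCSV = `id,title,price,featured
1,Widget,19.99,true
2,Gadget,29.99,false
`

// parseRows reads CSV text into a slice of rows. Every cell stays a string,
// so "19.99" and "true" are text until something converts them.
func parseRows(text string) ([][]string, error) {
	return csv.NewReader(strings.NewReader(text)).ReadAll()
}

func main() {
	rows, err := parseRows(productsCSV)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// The header is just the first row; nothing marks it as special.
	fmt.Println("header:", rows[0])
	for _, row := range rows[1:] {
		fmt.Println("row:", row)
	}
}
```

The takeaway is that prices and booleans arrive as strings, so any comparison or arithmetic needs an explicit conversion first.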
Accessing Your Data in Templates
This is the whole point. Accessing data is a matter of traversing the .Site.Data map. The path you use mirrors your directory structure within data/.
Accessing the author data from our examples above would look like this. Because the key jane-doe contains a hyphen, dot notation won’t parse, so reach for the index function:
{{ with index .Site.Data.authors "jane-doe" }}
  <h3>{{ .name }}</h3>
  <p>Role: {{ .role }}</p>
  {{ range .social }}
    <a href="https://{{ .platform }}.com/{{ .handle }}">{{ .platform }}</a>
  {{ end }}
{{ end }}
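Hugo templates are built on Go's template packages, so the lookup rules can be tried out in isolation. This is a rough sketch using text/template directly: siteData is a hypothetical stand-in for whatever sits behind .Site.Data, and render is a made-up helper. It shows that a hyphenated key fails to parse in dot notation while the index function reaches it without trouble.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// siteData is a hypothetical stand-in for the value behind .Site.Data.
var siteData = map[string]any{
	"authors": map[string]any{
		"jane-doe": map[string]any{
			"name": "Jane Doe",
			"role": "Senior Mischief Manager",
		},
	},
}

// dotted uses dot notation on a hyphenated key; indexed uses index instead.
const dotted = `{{ .authors.jane-doe.name }}`
const indexed = `{{ with index .authors "jane-doe" }}{{ .name }} ({{ .role }}){{ end }}`

// render parses src and executes it against siteData.
func render(src string) (string, error) {
	t, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := t.Execute(&out, siteData); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	_, err := render(dotted)
	fmt.Println("dotted form fails:", err != nil)

	s, err := render(indexed)
	fmt.Println(s, err)
}
```

The same index pattern applies to any key that is not a plain identifier, such as one containing a dot.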
For the CSV file, you’d get a slice of rows. You’d typically range over it.
<table>
  {{ range $index, $row := .Site.Data.products }}
    {{ if eq $index 0 }}
      <thead><tr>{{ range $row }}<th>{{ . }}</th>{{ end }}</tr></thead>
    {{ else }}
      <tr>{{ range $row }}<td>{{ . }}</td>{{ end }}</tr>
    {{ end }}
  {{ end }}
</table>
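Note: The above is a bit naive. It assumes your first row (index 0) is the header, which is fine for a quick demo, but for a real site you’d probably want more control, especially if you’re fetching from a URL. Older Hugo releases offered the getCSV function for this; it has since been deprecated in favor of resources.Get or resources.GetRemote combined with transform.Unmarshal, ideally wrapped in a partial.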
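One common form of "more control" is pairing each row with the header so cells can be looked up by name instead of by position. Here is a minimal sketch of that idea in Go; toRecords is a hypothetical helper for illustration, not something Hugo provides.

```go
package main

import "fmt"

// toRecords pairs each data row with the header row, giving one map per row.
// Rows shorter than the header omit the missing keys; cells beyond the
// header's length are ignored.
func toRecords(rows [][]string) []map[string]string {
	if len(rows) == 0 {
		return nil
	}
	header := rows[0]
	records := make([]map[string]string, 0, len(rows)-1)
	for _, row := range rows[1:] {
		rec := make(map[string]string, len(header))
		for i, key := range header {
			if i < len(row) {
				rec[key] = row[i]
			}
		}
		records = append(records, rec)
	}
	return records
}

func main() {
	rows := [][]string{
		{"id", "title", "price", "featured"},
		{"1", "Widget", "19.99", "true"},
		{"2", "Gadget", "29.99", "false"},
	}
	for _, rec := range toRecords(rows) {
		fmt.Println(rec["title"], rec["price"], rec["featured"])
	}
}
```

Doing this conversion in a pre-build step and saving the result as JSON or YAML gives your templates named fields like .title and .price to work with.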
Organizing Your Data Directory
Don’t just throw everything in the root of data/. Make subdirectories. It keeps things sane. For a large site, you might have:
data/
  authors/
    jane-doe.yaml
    john-smith.yaml
  products/
    widgets.csv
    gadgets.csv
  site/
    config.yaml
    navigation.yaml
This organization is reflected in your template access: .Site.Data.products.widgets, .Site.Data.site.navigation, and, because of the hyphen, index .Site.Data.authors "jane-doe".
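The path-to-key rule is easy to express in a few lines. The sketch below is only an approximation of the naming rule described here, not Hugo's actual implementation, and dataKeys is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// dataKeys is a hypothetical helper: it turns a slash-separated file path
// under data/ into the chain of keys described in the text. Directories
// become nested keys and only the final extension is dropped.
func dataKeys(file string) []string {
	rel := strings.TrimPrefix(path.Clean(file), "data/")
	rel = strings.TrimSuffix(rel, path.Ext(rel))
	return strings.Split(rel, "/")
}

func main() {
	for _, f := range []string{
		"data/authors/jane-doe.yaml",
		"data/products/widgets.csv",
		"data/site/navigation.yaml",
	} {
		fmt.Println(f, "->", strings.Join(dataKeys(f), " > "))
	}
}
```

A helper like this is handy in a lint script that flags file names whose keys would need index rather than dot access.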
The Gotchas: Where Things Get Weird
File Names are Keys: Your filename jane-doe.yaml becomes the key jane-doe. Note the hyphen: Go templates can’t parse it in dot notation, so .Site.Data.authors.jane-doe fails and you need index .Site.Data.authors "jane-doe" instead. If you name a file foo.bar.json, the key becomes foo.bar, which is perfectly valid but needs index for the same reason. Stick to letters, numbers, and underscores if you want plain dot access.
Caching and Live Reload: Hugo is smart about watching your data/ files. Change one, and it should trigger a live reload. Mostly. I’ve occasionally seen it get confused with very deeply nested structures or a huge number of files. A quick hugo server --disableFastRender usually kicks it back into gear.
Performance with Massive Data Sets: Hugo is a static generator, not a database server. If you’re trying to load a 50MB JSON file of every product you’ve ever sold, you’re gonna have a bad time. The entire data file is read into memory at build time. For enormous datasets, you’re better off running a pre-build step (a script in your build pipeline, say) that pulls and filters data from an external API or a proper database before Hugo runs.
The data/ directory is one of Hugo’s killer features. It effortlessly bridges the gap between a simple static site and a powerful, data-driven application. Use it to drive navigation, build entire pages from a CSV file, or manage shared configuration. Just keep it organized, and it will never let you down.