Alright, let’s talk about Loki. If you’ve ever run a kubectl logs command for a pod that’s since been deleted and gotten that sinking feeling of “where did my logs go?”, you understand the why of log aggregation. Loki is Grafana’s answer to this, but it takes a fundamentally different and, frankly, more cost-effective approach than the elephants in the room (I’m looking at you, Elasticsearch).
The core premise is brilliantly simple and a bit contrarian: don’t index the content of the log lines. Index only the labels associated with them (like namespace, pod_name, container_name). When you want to search your logs, you first use those labels to narrow things down to a manageable subset, and then you do a brute-force, grep-style search on that subset. This is the opposite of full-text indexing, where you pay a massive upfront cost in CPU, memory, and storage to index every word so you can find it instantly later. Loki accepts a somewhat slower query so that ingest is cheaper, faster, and simpler. For the vast majority of debugging use cases, this is a trade-off you absolutely want to make.
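To make the "labels first, grep second" idea concrete, here is a minimal Python sketch. It is a toy model I made up for illustration, not Loki's actual implementation: the ToyLogStore class, its push and query methods, and the sample log lines are all hypothetical. The point is only that the index holds label pairs, never the words inside the log lines.

```python
from collections import defaultdict


class ToyLogStore:
    """Toy model of label-only indexing. Hypothetical, not Loki's real code."""

    def __init__(self):
        # The index maps one (label, value) pair to the streams that carry it.
        # Note that log line content never goes into this index.
        self.index = defaultdict(set)
        # A stream is identified by its full label set and holds raw lines.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        stream_id = frozenset(labels.items())
        if stream_id not in self.streams:
            # Index work happens once per new stream, not once per line.
            for pair in stream_id:
                self.index[pair].add(stream_id)
        self.streams[stream_id].append(line)

    def query(self, selector, needle):
        # Step 1: use the label index to narrow down to candidate streams.
        candidates = set(self.streams)
        for pair in selector.items():
            candidates &= self.index.get(pair, set())
        # Step 2: brute-force, grep-style scan over only those streams.
        return [
            line
            for stream_id in candidates
            for line in self.streams[stream_id]
            if needle in line
        ]


store = ToyLogStore()
store.push({"namespace": "prod", "pod_name": "api-1"}, "GET /health 200")
store.push({"namespace": "prod", "pod_name": "api-1"}, "POST /orders 500 timeout")
store.push({"namespace": "dev", "pod_name": "api-2"}, "POST /orders 500 timeout")

# Only the prod stream is scanned; the dev stream is never touched.
print(store.query({"namespace": "prod"}, "timeout"))
# → ['POST /orders 500 timeout']
```

Notice that push does no per-word work at all, which is where the cheap ingest comes from, while query pays for a linear scan of whatever the labels let through. In real Loki the rough equivalent of that last call is a LogQL query like {namespace="prod"} |= "timeout".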