43.7 Centralizing Logs: rsyslog to a SIEM or Log Aggregation Platform
Right, so you’ve got logs spewing out of every server like a firehose. You could try to read them by SSHing into each box and tailing files until your eyes bleed, but let’s be honest: that’s a special kind of masochism reserved for people who also enjoy assembling IKEA furniture without the instructions. The only sane way to make sense of this chaos is to get all those logs off the individual machines and into a central system—a SIEM, an Elasticsearch cluster, a cloud-based log aggregator, whatever. You need a single pane of glass, even if that glass is sometimes a little dirty.
The workhorse for this job on Linux is, and has been for years, rsyslog. It’s powerful, it’s ubiquitous, and its configuration syntax looks like it was designed by a medieval scribe who was also having a bad day. But we love it because it works.
Getting Your Logs Off the Machine
First, we need to tell rsyslog to send its messages somewhere else. This is done with a forwarding rule. The classic method uses the @ syntax for UDP and @@ for TCP. Always, always use TCP. UDP for logs is a prank played on you by network gremlins; you will lose messages during a burst event, precisely when you need them most.
Here’s how you do it. Crack open /etc/rsyslog.conf or, better yet, drop a new file into /etc/rsyslog.d/ (e.g., 10-forward-to-siem.conf). This keeps your changes isolated and safe from package upgrades.
# Send ALL facility.severity messages to our central log host via TCP
*.* @@log-aggregator.example.com:6514
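That *.* means “every facility and every severity.” You might want to be more specific later, but this is a great starting point. The @@ tells it to use TCP. Port 6514 is the IANA-registered port for syslog over TLS (RFC 5425), but note that this rule on its own is still plaintext TCP; a listener that expects TLS will reject it until we add the encryption settings (more on that in a sec).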
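If you’d rather script that drop-in than hand-edit it, here’s a minimal sketch. The function name write_forward_dropin and its arguments are made up for illustration and are not part of rsyslog; the only rsyslog-specific thing it emits is the same selector-plus-@@ rule shown above, with the selector as a parameter so you can narrow it later.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsyslog): write a forwarding drop-in.
# Usage: write_forward_dropin DIR SELECTOR HOST PORT
write_forward_dropin() {
    dir=$1; selector=$2; host=$3; port=$4
    conf="$dir/10-forward-to-siem.conf"
    {
        printf '# Forward %s to %s over plain TCP\n' "$selector" "$host"
        printf '%s @@%s:%s\n' "$selector" "$host" "$port"
    } > "$conf"
    # Print the path so the caller can inspect or validate the file
    printf '%s\n' "$conf"
}
```

Called as root with something like write_forward_dropin /etc/rsyslog.d 'auth,authpriv.*' log-aggregator.example.com 6514, it would forward only authentication messages. Before restarting the service, sudo rsyslogd -N1 runs a configuration check without starting the daemon, which is a cheap way to catch typos.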
Why You Absolutely Must Use TLS
Sending your logs in plaintext across the network is like sending your diary via postcard. Anyone with a packet sniffer can read your authentication failures, your application errors, your everything. It’s a security nightmare. So we encrypt the connection.
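This is where the rsyslog config gets… fun. We need to tell it to use a network stream driver with TLS. On Debian and Red Hat style distros the gtls driver ships in a separate package (rsyslog-gnutls), so install that first. You’ll also need certificates: one on the server, and one on the client as well if the server authenticates its clients. Here’s a config snippet that will make it happen: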
# Configure the TLS connection
$DefaultNetstreamDriver gtls
# Require TLS for the connection
$ActionSendStreamDriverMode 1
# Authenticate the server by the name in its certificate
$ActionSendStreamDriverAuthMode x509/name
# Only talk to servers whose certificate matches this pattern
$ActionSendStreamDriverPermittedPeer *.example.com
# Your actual forwarding rule, now with TLS goodness
*.* @@(o)log-aggregator.example.com:6514
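Notice the (o) in the rule? That option turns on octet-counted framing, which is what syslog over TLS expects. It isn’t what enables TLS, though: the $ActionSendStreamDriver* directives do that, and they apply to the forwarding action that follows them, so order matters. You’ll also need to make sure the CA certificate that signed your log aggregator’s certificate is on this client machine. You’ll typically point to it with a line like: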
$DefaultNetstreamDriverCAFile /etc/ssl/certs/ca-certificates.crt
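The path to your CA bundle will vary by distro, and if your aggregator’s certificate was issued by an internal CA you’ll need to point at that CA’s certificate instead of the system bundle. Like the other global settings, this line conventionally goes near the top of the config, above the forwarding rule. This is the most common point of failure. If the client can’t validate the server’s certificate, the connection fails and nothing on the client makes much of a fuss about it. Which brings me to…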
Debugging the Firehose
When this stuff breaks, it fails quietly. Your messages just vanish into the ether. Your first tool for debugging is the rsyslog log file itself. Check /var/log/syslog or /var/log/messages on the client. Look for errors about TLS handshakes or connection refused.
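To check the TLS side without rsyslog in the picture, you can point openssl s_client at the aggregator and see whether the handshake verifies against your CA file. The sketch below is one way to do that. The function names are invented for this example, and the candidate bundle paths are common distro defaults, not an exhaustive list. If your aggregator’s certificate comes from an internal CA, pass that CA’s file instead. Keep in mind that openssl’s hostname check is its own logic and only approximates what rsyslog’s permitted-peer matching will do.

```shell
#!/bin/sh
# Print the first readable file among the arguments; fail if none is.
first_readable() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Common system CA bundle locations (assumed defaults; adjust for your distro).
find_ca_bundle() {
    first_readable \
        /etc/ssl/certs/ca-certificates.crt \
        /etc/pki/tls/certs/ca-bundle.crt \
        /etc/ssl/ca-bundle.pem \
        /etc/ssl/cert.pem
}

# Attempt a TLS handshake and verify the server certificate and its name.
# Usage: check_tls_peer HOST PORT CAFILE
# Exits non-zero if the connection or the verification fails.
check_tls_peer() {
    openssl s_client -connect "$1:$2" -servername "$1" \
        -CAfile "$3" -verify_hostname "$1" -verify_return_error \
        </dev/null >/dev/null 2>&1
}
```

A typical call would be check_tls_peer log-aggregator.example.com 6514 "$(find_ca_bundle)" followed by a check of the exit status. If it fails, running the same openssl command without the output redirection prints the certificate chain and the verification error, which usually points straight at the wrong CA file or a name mismatch.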
You can also run rsyslogd in the foreground with debug output. It’s gloriously verbose.
sudo systemctl stop rsyslog
sudo rsyslogd -dn
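Now, from a second terminal, trigger a log message (logger "test message"). Watch the output for clues. It’ll tell you exactly why it’s refusing to connect to your server. When you’re done, hit Ctrl+C and bring the service back with sudo systemctl start rsyslog.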
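A plain test message is easy to lose in the noise. One approach, sketched below, is to send a uniquely tagged marker and then search for it on the aggregator, while checking that a TCP connection to the aggregator actually exists. The function names and the siem-test tag are arbitrary choices for this example; logger and ss are the standard util-linux and iproute2 tools.

```shell
#!/bin/sh
# Build a unique, greppable marker: fixed prefix, node name, epoch seconds.
make_marker() {
    printf 'siem-test-%s-%s\n' "$(uname -n)" "$(date +%s)"
}

# Send a message through the local syslog socket at local0.info,
# tagged siem-test so it stands out on the receiving side.
# Usage: send_marker MESSAGE
send_marker() {
    logger -p local0.info -t siem-test "$1"
}

# List established TCP connections whose destination port is PORT.
# Usage: show_forwarding_connections PORT
show_forwarding_connections() {
    ss -tn state established "( dport = :$1 )"
}
```

In practice that looks like m=$(make_marker); send_marker "$m"; show_forwarding_connections 6514, then searching the aggregator for the value of $m. No connection in the ss output means the problem is on the client or the network, not in your SIEM’s parsing.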
Templates: Because the Default Format is Garbage
By default, rsyslog forwards messages as traditional syslog lines: priority, timestamp, hostname, and tag, followed by a free-form message. That free-form part is where parsers go to die. Your SIEM needs to parse these messages, and to do that reliably, it needs structure. The best way to provide this is to send messages in a structured format like JSON.
This is where templates come in. You define a template that formats the log message as a JSON object, and then you apply that template to your forwarding action.
# Define a template that structures the log as JSON
template(name="LogToJson" type="list") {
    constant(value="{")
    constant(value="\"timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
    constant(value="\",\"host\":\"") property(name="hostname")
    constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
    constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
    # The property is called programname, not program
    constant(value="\",\"program\":\"") property(name="programname")
    constant(value="\",\"pid\":\"") property(name="procid")
    # format="json" escapes quotes and control characters in the message
    constant(value="\",\"message\":\"") property(name="msg" format="json")
    constant(value="\"}\n")
}
# Now use that template for our forwarding rule
*.* action(
    type="omfwd"
    protocol="tcp"
    port="6514"
    target="log-aggregator.example.com"
    StreamDriver="gtls"
    StreamDriverMode="1"
    StreamDriverAuthMode="x509/name"
    StreamDriverPermittedPeers="*.example.com"
    template="LogToJson" # <-- This is the key part
)
Yes, the template syntax is arcane. It’s the kind of thing you write once and then copy-paste forever. But the payoff is huge: your SIEM will receive clean, parsed JSON fields instead of a monolithic string it has to guess how to split up.
The real best practice? Test this incrementally. Don’t forward all your logs at once. Start with a specific facility (local0.*) or a single application. Make sure the messages are arriving, parsed correctly, and looking beautiful in your SIEM before you flip the switch on your entire infrastructure. It saves you from the special kind of panic that comes from breaking your primary debugging tool while debugging it.
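One low-risk way to do that first check, sketched here under a couple of assumptions, is to write local0 messages to a local file through the same template before anything leaves the machine. The helper name, the drop-in filename, and the output path are all invented for this example, and it assumes the LogToJson template is defined in a config file that rsyslog loads earlier.

```shell
#!/bin/sh
# Hypothetical helper: write a temporary drop-in that sends local0
# messages to a local file, formatted with the LogToJson template.
# Usage: write_local_json_test DIR OUTFILE
write_local_json_test() {
    dir=$1; outfile=$2
    conf="$dir/20-json-local-test.conf"
    cat > "$conf" <<EOF
# Temporary: inspect the JSON output locally before forwarding anything.
# Assumes the LogToJson template is already defined in an earlier file.
local0.* action(type="omfile" file="$outfile" template="LogToJson")
EOF
    # Print the path so the caller knows what to remove afterwards
    printf '%s\n' "$conf"
}
```

After something like write_local_json_test /etc/rsyslog.d /var/log/siem-json-test.log and a service restart, send a message with logger -p local0.info "json test" and feed the last line of the output file to any JSON parser (python3 -m json.tool or jq both work). If it parses cleanly there, the template is sound and whatever breaks next is the network’s fault. Remove the drop-in when you’re done.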