Home Lab Observability: Logs, Metrics, and Proactive Alerts with Graylog & Zabbix

Our home lab has evolved from a simple collection of devices into a fully observable network and security environment. By combining Graylog and Zabbix, we achieve both log-based security visibility and metric-based performance monitoring, mirroring enterprise-level monitoring architectures.

What You Will Learn

  • How to set up Graylog for centralized firewall and network logging
  • How to deploy Zabbix for metrics monitoring and proactive alerts
  • Secure SNMPv3 configuration for your ASA and switches
  • Segmentation using Zabbix Proxy to reduce server load
  • Monitoring of switch uplinks and ICMP latency/packet loss
  • Implementing alert escalation strategies
  • Tuning Zabbix database for high performance
  • Advanced event correlation to reduce alert noise and detect root causes

This multi-part post walks through:

  • Graylog centralized logging
  • Zabbix metrics monitoring
  • SNMPv3 secure telemetry
  • Zabbix Proxy segmentation
  • Switch uplinks and ICMP monitoring
  • Alert escalation strategies
  • Database tuning
  • Advanced event correlation

Part 1: Graylog — Centralized Log Collection

We started with Graylog to centralize firewall logs from our Cisco ASA 5525-X:

  • Blocked connection tracking
  • Source/destination IP visibility
  • Rule hit frequency dashboards
  • Security incident analysis

Graylog provides event visibility — answering “What happened?” — but cannot provide performance metrics, like WAN utilization or interface saturation.

Screenshot Placeholder: Graylog Dashboard showing ASA firewall blocked connections


Part 2: Introducing Zabbix for Metrics Monitoring

Why Zabbix?

Zabbix adds metric collection, historical trends, and proactive alerts. We now monitor:

  • Interface bandwidth utilization
  • Packet drops and errors
  • Firewall CPU/memory usage
  • ICMP latency and packet loss
  • Switch uplinks

This complements Graylog’s logs with performance visibility — answering “How is the network behaving over time?”

Architecture Overview

Network diagram showing Graylog, Zabbix server, Zabbix proxy, Cisco ASA 5525-X, switches, and device connectivity.


Part 3: Deploying Zabbix

Server Sizing (Lab Scale)

Component | Spec
CPU | 2 vCPU
RAM | 8 GB
Disk | 150 GB SSD
Database | PostgreSQL
Retention | 60 days history, 365 days trends
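To sanity-check the 150 GB disk against the retention settings above, a rough estimate can be sketched in Python. The item count (500), the 60-second polling interval, and the figure of about 90 bytes per stored value are illustrative assumptions, not measurements from this lab; real usage depends on item types and database overhead.

```python
# Rough disk estimate for the Zabbix history and trends tables.
# Assumptions (not from the post): 500 items, 60 s polling, ~90 bytes per stored value.

def history_bytes(items: int, interval_s: int, days: int, bytes_per_value: int = 90) -> int:
    """One row per item per polling interval, kept for `days` days."""
    values_per_day = items * (86400 // interval_s)
    return values_per_day * days * bytes_per_value


def trends_bytes(items: int, days: int, bytes_per_row: int = 90) -> int:
    """Trends hold one aggregated row per item per hour."""
    return items * 24 * days * bytes_per_row


if __name__ == "__main__":
    total = history_bytes(500, 60, 60) + trends_bytes(500, 365)
    print(f"Estimated size: {total / 1024**3:.2f} GiB")
```

Under these assumptions history dominates the total, so shortening history retention or lengthening polling intervals has far more effect on disk use than trimming trends.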

Installation (Ubuntu 22.04 example)

wget https://repo.zabbix.com/zabbix/6.0/ubuntu/pool/main/z/zabbix-release/zabbix-release_latest_6.0+ubuntu22.04_all.deb
dpkg -i zabbix-release_latest_6.0+ubuntu22.04_all.deb
apt update
apt install zabbix-server-pgsql zabbix-frontend-php zabbix-nginx-conf zabbix-sql-scripts zabbix-agent postgresql

Database Setup

sudo -u postgres createuser --pwprompt zabbix
sudo -u postgres createdb -O zabbix zabbix
zcat /usr/share/zabbix-sql-scripts/postgresql/server.sql.gz | sudo -u zabbix psql zabbix

Zabbix Server Config

DBName=zabbix
DBUser=zabbix
DBPassword=<password>

Start Services

systemctl enable zabbix-server zabbix-agent nginx php8.1-fpm
systemctl start zabbix-server zabbix-agent nginx php8.1-fpm

Screenshot Placeholder: Zabbix Server dashboard


Part 4: ASA 5525 SNMPv3 Monitoring

ASA Configuration

conf t
snmp-server group ZABBIX-GROUP v3 priv
snmp-server user zabbix-user ZABBIX-GROUP v3 auth sha AuthPass123 priv aes 128 PrivPass123
snmp-server host inside 192.168.10.10 version 3 zabbix-user
write memory

Zabbix Host Configuration

  • Version: SNMPv3
  • Security Name: zabbix-user
  • Security Level: authPriv
  • Auth Protocol: SHA
  • Auth Pass: AuthPass123
  • Privacy Protocol: AES128
  • Privacy Pass: PrivPass123

Test Connectivity

snmpwalk -v3 -u zabbix-user -l authPriv -a SHA -A AuthPass123 -x AES -X PrivPass123 192.168.10.1

Screenshot Placeholder: SNMPv3 test output


Part 5: Zabbix Proxy for Segmentation

Install Proxy

apt install zabbix-proxy-pgsql postgresql
sudo -u postgres createuser --pwprompt zabbix_proxy
sudo -u postgres createdb -O zabbix_proxy zabbix_proxy
zcat /usr/share/zabbix-sql-scripts/postgresql/proxy.sql.gz | sudo -u zabbix_proxy psql zabbix_proxy

Proxy Config

Server=192.168.1.50
Hostname=ZBX-PROXY-01
DBName=zabbix_proxy
DBUser=zabbix_proxy
DBPassword=<password>
ProxyMode=0

Start Proxy

systemctl enable zabbix-proxy
systemctl start zabbix-proxy

Screenshot Placeholder: Zabbix Proxy registration


Part 6: Switch Uplink & ICMP Monitoring

Switch Uplink Trigger (1Gbps)

avg(/Switch01/net.if.out.util[GigabitEthernet1/0/48],5m) > 75
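The post does not show how the utilization item is calculated. One common way to derive a utilization percentage is from two readings of an SNMP octet counter, sketched below. The function name, the 64-bit counter width, and the sample values are assumptions for illustration.

```python
def utilization_percent(octets_start: int, octets_end: int, interval_s: float,
                        link_bps: int = 1_000_000_000, counter_bits: int = 64) -> float:
    """Percent of link capacity used between two octet-counter samples."""
    # The modulo keeps the delta correct if the counter wrapped once between samples.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 / (interval_s * link_bps) * 100


if __name__ == "__main__":
    # 5.625 GB sent in 60 s on a 1 Gbps uplink sits exactly on the 75% threshold.
    print(utilization_percent(0, 5_625_000_000, 60))
```

The threshold of 75 in the trigger is a percentage of the 1 Gbps line rate, so a different uplink speed only needs a different `link_bps`.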

ICMP Latency & Packet Loss

Latency (above 50 ms): avg(/ASA5525/icmppingsec,5m) > 0.05
Packet Loss (above 5%): avg(/ASA5525/icmppingloss,5m) > 5
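Averaging over five minutes keeps a single slow ping from raising an alert. The sketch below imitates that behaviour outside Zabbix; the class name and sample values are hypothetical, and it is meant only to show the windowing logic, not how Zabbix evaluates triggers internally.

```python
from collections import deque


class AvgTrigger:
    """Fires when the mean of samples from the last `window_s` seconds exceeds `threshold`."""

    def __init__(self, window_s: float, threshold: float):
        self.window_s = window_s
        self.threshold = threshold
        self.samples = deque()  # (timestamp, value) pairs, oldest first

    def add(self, ts: float, value: float) -> bool:
        self.samples.append((ts, value))
        # Drop samples that have fallen out of the evaluation window.
        while self.samples[0][0] <= ts - self.window_s:
            self.samples.popleft()
        mean = sum(v for _, v in self.samples) / len(self.samples)
        return mean > self.threshold


if __name__ == "__main__":
    latency = AvgTrigger(window_s=300, threshold=0.05)
    for ts, rtt in [(0, 0.01), (60, 0.01), (120, 0.01), (180, 0.01), (240, 0.2), (300, 0.2)]:
        print(ts, latency.add(ts, rtt))
```

In this run the first 200 ms spike does not fire because the window average stays under 50 ms; the second consecutive spike pushes the average over the threshold.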

Screenshot Placeholder: Uplink and ICMP graphs


Part 7: Alert Escalation Strategy

WAN Interface Utilization Example

Severity | Threshold | Action
Warning | 70–80% | Slack notification
High | 80–90% | Email + Slack
Disaster | >90% | Phone call / PagerDuty
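The tiers in this table can be expressed as a small lookup, which is handy for testing thresholds before building the matching triggers and actions. This is a sketch: the function name is hypothetical, and treating exactly 80% as High and exactly 90% as High (not Disaster) is an assumption about how the overlapping boundaries are meant to resolve.

```python
def wan_severity(util_pct: float):
    """Return (severity, action) for a WAN utilization percentage, or None below 70%."""
    if util_pct > 90:
        return ("Disaster", "Phone call / PagerDuty")
    if util_pct >= 80:
        return ("High", "Email + Slack")
    if util_pct >= 70:
        return ("Warning", "Slack notification")
    return None


if __name__ == "__main__":
    for pct in (65, 75, 85, 95):
        print(pct, wan_severity(pct))
```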

ICMP Packet Loss Escalation

  • >5% → Initial email
  • >15% sustained → Escalate to senior engineer
  • >30% → Notify ISP
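The packet loss steps above add a time element: the middle tier applies only when loss is sustained. A minimal sketch of that logic follows, assuming "sustained" means three consecutive samples above 15%; that definition and the function name are illustrative, not taken from the lab's configuration.

```python
def loss_escalation(loss_samples, sustained_count: int = 3):
    """Pick an escalation step from recent packet-loss percentages (newest last, non-empty)."""
    latest = loss_samples[-1]
    if latest > 30:
        return "Notify ISP"
    recent = loss_samples[-sustained_count:]
    # Escalate only when every one of the last `sustained_count` samples is above 15%.
    if len(recent) == sustained_count and all(s > 15 for s in recent):
        return "Escalate to senior engineer"
    if latest > 5:
        return "Initial email"
    return None


if __name__ == "__main__":
    print(loss_escalation([20]))          # one bad sample is not yet sustained
    print(loss_escalation([16, 18, 20]))  # three in a row above 15%
```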

Screenshot Placeholder: Alert action configuration


Part 8: Tuning Zabbix Database Performance

  • SSD storage is critical
  • Enable housekeeping and retention (History 60 days, Trends 365 days)
  • Consider TimescaleDB for high-frequency metrics
  • Index key tables: history(itemid, clock)
  • Optimize polling intervals
  • Use Zabbix proxies to offload writes
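The effect of polling intervals on database write load can be estimated as new values per second (NVPS): each item contributes one value per update interval. The item counts and intervals below are hypothetical, chosen only to show how lengthening intervals reduces writes.

```python
def new_values_per_second(item_groups) -> float:
    """item_groups: iterable of (item_count, update_interval_seconds) pairs."""
    return sum(count / interval for count, interval in item_groups)


if __name__ == "__main__":
    # Hypothetical lab: 400 interface items and 100 slow-changing inventory items.
    before = [(400, 30), (100, 60)]
    after = [(400, 60), (100, 300)]
    print(f"before: {new_values_per_second(before):.1f} NVPS")
    print(f"after:  {new_values_per_second(after):.1f} NVPS")
```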

TimescaleDB Example

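CREATE EXTENSION IF NOT EXISTS timescaledb;
-- clock is an integer epoch column, so the chunk interval is given in seconds (86400 = 1 day);
-- migrate_data is needed when the table already contains rows.
SELECT create_hypertable('history', 'clock', chunk_time_interval => 86400, migrate_data => true);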

Screenshot Placeholder: DB performance dashboard


Part 9: Observability Outcome

  • Historical WAN & interface utilization
  • Proactive saturation alerts
  • Packet loss & latency detection
  • CPU/memory monitoring
  • Switch uplink visibility
  • Segmented polling via Proxy
  • Security events from Graylog intact

Screenshot Placeholder: Combined dashboards


Part 10: Next Steps

  • Add additional devices (APs, routers, switches)
  • Build SLA dashboards in Zabbix/Grafana
  • Implement event correlation rules
  • Explore TimescaleDB compression
  • Document alert runbooks

Part 11: Advanced Event Correlation

Why Event Correlation?

  • Reduce noisy alerts
  • Combine related events
  • Trigger escalations only for true root causes

WAN Dependency Example

Trigger Name | Depends On
ASA ICMP Latency High | WAN Interface Down
ASA ICMP Packet Loss | WAN Interface Down
ASA VPN CPU Spike | WAN Interface Down
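The effect of these dependencies is that dependent problems stay quiet while their parent is in a problem state. The sketch below models that suppression in plain Python using the trigger names from the table; the function name is hypothetical, and the code illustrates the idea only, not Zabbix internals.

```python
# Child trigger -> parent trigger, taken from the dependency table.
DEPENDS_ON = {
    "ASA ICMP Latency High": "WAN Interface Down",
    "ASA ICMP Packet Loss": "WAN Interface Down",
    "ASA VPN CPU Spike": "WAN Interface Down",
}


def alerts_to_send(active_problems):
    """Drop any problem whose parent trigger is also in a problem state."""
    active = set(active_problems)
    return sorted(p for p in active if DEPENDS_ON.get(p) not in active)


if __name__ == "__main__":
    print(alerts_to_send(["WAN Interface Down", "ASA ICMP Latency High", "ASA ICMP Packet Loss"]))
    print(alerts_to_send(["ASA ICMP Latency High"]))
```

When the WAN interface is down, only that root cause is reported; when latency is high on its own, the latency alert still goes out.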

Event Correlation Rule Example

  • Tag triggers: type=network, device=ASA5525
  • Condition: >2 events in 5 minutes
  • Operations: Notify Admin via Slack + Email, optionally PagerDuty
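The rule above amounts to counting tagged events inside a sliding five-minute window. A minimal sketch of that counting logic follows. The class name and event format are hypothetical, and this illustrates the condition only; it is not a representation of how the rule is configured in Zabbix.

```python
from collections import deque

REQUIRED_TAGS = {"type": "network", "device": "ASA5525"}


class CorrelationRule:
    """Signals once more than `max_events` matching events fall inside the window."""

    def __init__(self, window_s: int = 300, max_events: int = 2):
        self.window_s = window_s
        self.max_events = max_events
        self.timestamps = deque()

    def record(self, ts: float, tags: dict) -> bool:
        # Events missing any required tag are ignored and never trigger the rule.
        if any(tags.get(k) != v for k, v in REQUIRED_TAGS.items()):
            return False
        self.timestamps.append(ts)
        # Keep only events from the last `window_s` seconds.
        while self.timestamps[0] < ts - self.window_s:
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_events


if __name__ == "__main__":
    rule = CorrelationRule()
    asa = {"type": "network", "device": "ASA5525"}
    for ts in (0, 120, 200):
        print(ts, rule.record(ts, asa))
```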

Screenshot Placeholder: Event correlation rule setup

Integration Summary

Layer | Tool | Function
Logs | Graylog | Security & firewall events
Metrics | Zabbix Server | Performance monitoring & availability
Segmentation | Zabbix Proxy | Distributed polling
Network | SNMPv3 | Encrypted telemetry
Health | ICMP | Latency & packet loss
Event Correlation | Zabbix Rules | Intelligent alert escalation

With these configurations, your home lab now mirrors enterprise NOC/SOC observability workflows.
