Docker Loki Grafana Vector Setup for Unified Container, SSH, and Fail2ban Log Monitoring

This setup uses Docker Compose to quickly deploy a Loki, Grafana, and Vector stack that centrally collects Docker container logs, SSH login logs, and Fail2ban security logs. It solves the common problems of scattered log sources, inefficient search, and slow troubleshooting.

Technical Specifications Snapshot

Deployment Method: Docker Compose
Core Languages / Config: YAML, TOML, LogQL
Log Protocols / Interfaces: Docker Socket, file-based collection, HTTP push to Loki
Visualization Component: Grafana
Storage Component: Loki
Collection Component: Vector
Typical Monitoring Targets: N8N, generic containers, SSH, Fail2ban
Reference Popularity: The original source is a CSDN hands-on article; no GitHub star count is available
Core Dependencies: grafana/loki:2.9.0, grafana/grafana:latest, timberio/vector:0.55.0-alpine

This is a logging stack designed for single-host and lightweight operations scenarios

The value of this architecture is not that it is massive or all-inclusive. Its strength is that it is practical and easy to adopt. For self-hosted services, cloud servers, or home labs, the most common problem is not the absence of logs. It is that logs are scattered across containers, the operating system, and security components, making unified search difficult.

The Loki, Grafana, and Vector combination maps cleanly to three responsibilities: Vector handles collection and forwarding, Loki provides low-cost storage, and Grafana handles querying and visualization. Compared with ELK, this stack is lighter, requires less configuration, and is better suited for fast deployment.

The architecture flow can be summarized in three steps

  1. Vector listens to the Docker Socket and captures container stdout logs.
  2. Vector reads /var/log/auth.log and /var/log/fail2ban.log.
  3. All logs are written to Loki, and Grafana queries them with LogQL.
With that flow in mind, start by preparing a dedicated directory on the host:

mkdir -p ~/apps/log-monitor
cd ~/apps/log-monitor

These commands create an isolated deployment directory so you can manage configuration files and data volumes in one place.

Basic runtime requirements must be in place before deployment

The server should ideally run Ubuntu or Debian with Docker and Docker Compose already installed. Open port 3000 for Grafana and port 3100 for Loki. If you only use the stack on an internal network, restrict source IPs as needed.
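
As a quick preflight, you can verify the tooling and open the two ports. This is a minimal sketch that assumes an Ubuntu or Debian host using ufw; adapt the firewall commands to your environment:

docker --version
docker compose version
sudo ufw allow 3000/tcp # Grafana UI
sudo ufw allow 3100/tcp # Loki HTTP API; restrict source IPs on internal networks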

Another key point is permissions. Vector reads container logs through a mounted docker.sock and reads system logs through a read-only mount of /var/log, so the host log files must allow the process inside the container to read them.
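
You can verify this up front with a quick look at the relevant files on the host. On Debian and Ubuntu, auth.log is typically mode 640 with group adm, which a non-root process cannot read:

ls -l /var/run/docker.sock /var/log/auth.log /var/log/fail2ban.log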

A single docker-compose.yml brings up all three services in one step

version: "3.8"
services:
  loki:
    image: grafana/loki:2.9.0
    container_name: loki
    restart: always
    ports:
      - "3100:3100"
    volumes:
      - ./loki-data:/loki
    command:
      - -config.file=/etc/loki/local-config.yaml

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: always
    ports:
      - "3000:3000"
    volumes:
      - ./grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin123456 # Set the initial Grafana admin password
    depends_on:
      - loki

  vector:
    image: timberio/vector:0.55.0-alpine
    container_name: vector
    restart: always
    volumes:
      - ./vector.toml:/etc/vector/vector.toml # Mount the main Vector configuration
      - /var/run/docker.sock:/var/run/docker.sock # Read container logs
      - /var/log:/host/var/log:ro # Collect system logs in read-only mode

This configuration orchestrates the three core parts of the stack: log storage, visualization, and collection.
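
Before the first start, it is worth letting Compose validate the file. docker compose config parses and renders the configuration and exits non-zero on errors:

docker compose config --quiet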

Vector configuration determines collection scope and stability

A core lesson from the original implementation is to avoid replaying historical logs all at once. If Vector reads a large volume of old logs immediately, Loki can easily return 429 Too Many Requests because of ingestion limits. That makes read_from = "end" a critical setting for stable operation.

At the same time, label design should remain minimal. For beginner-friendly scenarios, a fixed job="vector" label is enough for most searches. Introducing complex dynamic labels too early often creates parsing or routing issues instead of improving observability.

A stable and practical vector.toml looks like this

data_dir = "/var/lib/vector" # Internal state, including file source read checkpoints

[sources.docker_logs]
type = "docker_logs" # Capture stdout/stderr from all running containers
docker_host = "unix:///var/run/docker.sock"

[sources.host_auth]
type = "file"
include = ["/host/var/log/auth.log"]
read_from = "end" # Read only new log entries to avoid overwhelming Loki with historical logs

[sources.fail2ban]
type = "file"
include = ["/host/var/log/fail2ban.log"]
read_from = "end" # Start reading from the end of the file to reduce initialization pressure

[sinks.loki]
type = "loki"
inputs = ["docker_logs", "host_auth", "fail2ban"]
endpoint = "http://loki:3100"
encoding.codec = "json"
labels.job = "vector" # Write all logs with a fixed label to simplify queries

This configuration covers container logs, SSH security logs, and Fail2ban blocking logs while also avoiding several common pitfalls.

The startup and validation workflow should begin by verifying the ingestion path

After the configuration is complete, start the services in detached mode. A reachable Grafana UI does not guarantee that log collection works. Your first real checkpoint should be that Vector logs show no errors and that the Loki data source passes its connection test.

docker compose up -d
docker compose logs -f vector

These commands start the stack and let you monitor the health of the collection component in real time.
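
You can also probe Loki directly from the host. The /ready endpoint and the labels API are standard Loki HTTP endpoints, so these checks work independently of Grafana:

curl -s http://localhost:3100/ready
curl -s http://localhost:3100/loki/api/v1/labels

The first should eventually return ready; the second should list the job label once Vector has shipped at least one event.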

If the logs show Healthcheck passed, Vector has usually passed its basic configuration validation. Next, log in to Grafana, add a Loki data source, set the URL to http://loki:3100, and save it. Once the test succeeds, you can move to Explore.
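
If you prefer configuration as code, Grafana can also provision the data source at startup instead of you adding it by hand. This optional sketch assumes you mount the file into the grafana container, for example as ./grafana-datasources.yaml:/etc/grafana/provisioning/datasources/loki.yaml:

apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    isDefault: true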

The Grafana query phase should start with low-noise statements

The benefit of unified labels is that queries remain straightforward. First verify the full data stream, then gradually narrow the scope to specific containers or security events. This is the fastest way to determine whether the problem lies in collection, permissions, or query conditions.

You can reuse these common LogQL queries directly

{job="vector"}
{container="n8n"}
{job="vector"} |= "sshd"
{job="vector"} |= "fail2ban"

These queries correspond to all logs, N8N container logs, SSH login logs, and Fail2ban ban-event logs.
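
Once these basics return data, LogQL's metric functions help with trends. Fail2ban writes lines containing Ban followed by the offending IP, so the following counts ban events over the last hour:

count_over_time({job="vector"} |= "Ban" [1h])

If you later decide you do want a real container label, the Vector Loki sink accepts templated labels; adding labels.container = "{{ container_name }}" to the sink would populate it from the docker_logs source, at the cost of more streams in Loki.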

The main risks in this setup are concentrated in four operational details

  1. Grafana limits the number of returned log lines by default, so a large log volume can make you incorrectly assume that system logs are missing.
  2. If Vector reads too many historical logs at once, Loki will throttle ingestion with 429 responses.
  3. When host log file permissions are insufficient, file sources may be configured correctly but still return no results.
  4. TOML is format-sensitive, so manual edits can easily introduce syntax errors.

This permission fix is commonly used when system logs do not appear

sudo chmod 644 /var/log/auth.log /var/log/fail2ban.log

This command relaxes the read permissions on SSH and Fail2ban log files so Vector can access them.
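
The TOML pitfall also has a cheap guard. Vector ships a validate subcommand, and assuming the image's default entrypoint is the vector binary itself, you can run it through Compose before restarting anything:

docker compose run --rm vector validate /etc/vector/vector.toml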

If the Grafana page appears empty, first increase the query limit from 1000 to 10000, then combine it with |= "keyword" to reduce noise. If you encounter a 429 error, focus on whether read_from = "end" is actually in effect instead of blindly restarting services.
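
If throttling persists even though read_from = "end" is in effect, you can raise Loki's ingestion limits. This is optional and assumes you mount a custom config file over /etc/loki/local-config.yaml; the relevant keys sit under limits_config:

limits_config:
  ingestion_rate_mb: 8
  ingestion_burst_size_mb: 16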

This solution works well as a lightweight logging foundation

For personal operations, small to medium projects, and self-hosted multi-container services, this Loki + Grafana + Vector architecture provides a strong balance between deployment complexity, resource usage, and maintainability. It does not aim for advanced analytics first. Instead, it solves the more urgent problem: getting all logs into one searchable view.

If you want to expand later, you can add more file sources, container labels, alerting rules, and Grafana dashboards. For the first phase, however, keep the configuration minimal and make sure the ingestion path stays stable before adding complexity.
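
As a concrete sketch of that expansion path, one more file source is only a few lines in vector.toml plus an entry in the sink's inputs list. The path below is a hypothetical example; substitute whatever your service actually writes under /var/log:

[sources.nginx_access]
type = "file"
include = ["/host/var/log/nginx/access.log"] # Hypothetical example path
read_from = "end"

Then extend the sink with inputs = ["docker_logs", "host_auth", "fail2ban", "nginx_access"].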

FAQ

1. Why choose Loki + Grafana + Vector instead of ELK?

Because this combination is lighter, faster to deploy, and less resource-intensive, especially for single-server environments, self-hosted services, and small to medium logging workloads.

2. Why must read_from = "end" be set?

It prevents Vector from ingesting a massive amount of historical logs on first startup, which can trigger Loki write throttling and produce 429 errors.

3. Grafana opens normally, but I cannot find SSH or Fail2ban logs. What should I check first?

First check the permissions on /var/log/auth.log and /var/log/fail2ban.log. Then inspect the Vector logs and confirm that your query uses the correct keywords.

Summary

This article presents a lightweight log monitoring architecture built with Docker Compose. Vector collects Docker container logs, SSH login logs, and Fail2ban blocking logs, sends them to Loki, and Grafana provides unified search and visualization. The solution is well suited for self-hosted services and small to medium servers that need a practical centralized logging stack.