Observability

Observability is opt-in. The default validator stack ships no observability containers — most operators either go without or scrape the validator's /metrics endpoints with their own external Prometheus.

Two independent opt-in feature flags. They compose freely (use none, one, or both):

  • Alloy push (--with-alloy) — lightweight; Alloy + cAdvisor forward metrics/logs/traces to a remote Prometheus / Loki / Tempo you already operate. Recommended when you have (or can point at) a central observability backend.
  • Local monitoring (--with-local-monitoring) — opt-in heavy mode; the validator host also runs Prometheus + Grafana + Loki + Tempo + alert rules. Useful for self-contained dashboards on the same box.

Combinations:

| --with-alloy | --with-local-monitoring | Result |
| --- | --- | --- |
| no | no | Default. No observability containers. /metrics still on :21100. |
| yes | no | Push to your remote backend (recommended). |
| no | yes | Local Prometheus + Grafana + Loki + Tempo + alerts only. |
| yes | yes | Both — local dashboards AND remote forwarding. |

What Linera itself exposes (always on)

Every Linera service exposes Prometheus metrics on port 21100 and OTLP traces via the LINERA_OTLP_EXPORTER_ENDPOINT env var (already wired by the chart and compose). Logs go to stdout in structured JSON when RUST_LOG is set (default).

These endpoints are visible whether you run the observability overlays or not — point any external scraper at them.
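If you already run your own Prometheus, a minimal scrape job against these always-on endpoints might look like the sketch below (the hostname is an illustrative placeholder; 21100 is the metrics port documented above):

```yaml
# Sketch of an external Prometheus scrape job for the always-on
# /metrics endpoints. The target hostname is a placeholder.
scrape_configs:
  - job_name: linera-validator
    metrics_path: /metrics
    static_configs:
      - targets:
          - validator.example.com:21100
```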

Docker Compose

Default — no observability containers

./scripts/deploy-validator.sh validator.example.com admin@example.com

You still get the /metrics endpoint on port 21100 of every service. Hook your own Prometheus to it.

Alloy push (lightweight opt-in)

./scripts/deploy-validator.sh --with-alloy \
  validator.example.com admin@example.com

Fill the push endpoints in .env:

PROMETHEUS_OTLP_URL=https://prometheus.example.com/otlp
PROMETHEUS_OTLP_USER=...
PROMETHEUS_OTLP_PASS=...
LOKI_PUSH_URL=https://loki.example.com/loki/api/v1/push
LOKI_PUSH_USER=...
LOKI_PUSH_PASS=...
TEMPO_OTLP_URL=https://tempo.example.com/tempo/otlp
TEMPO_OTLP_USER=...
TEMPO_OTLP_PASS=...

Without these set, Alloy silently drops data — there's nowhere to forward it.
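As an illustrative pre-flight check — a hypothetical helper, not part of the deploy script — you could verify the push endpoints are present in the environment before bringing the stack up:

```shell
# Hypothetical pre-flight helper (not part of deploy-validator.sh):
# report any unset Alloy push endpoint before starting the stack.
# Only the *_URL variables are checked; the USER/PASS pairs may be
# left empty when the remote backend is unauthenticated.
check_alloy_env() {
  missing=0
  for var in PROMETHEUS_OTLP_URL LOKI_PUSH_URL TEMPO_OTLP_URL; do
    eval "val=\${$var}"
    if [ -z "$val" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_alloy_env || echo "fill these in .env before deploying with --with-alloy"
```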

Local monitoring (heavy opt-in)

Brings up Prometheus + Grafana + Loki + Tempo + alert rules on the host:

./scripts/deploy-validator.sh --with-local-monitoring \
  validator.example.com admin@example.com

Combine with --with-alloy to push the same data out to a remote backend in parallel:

./scripts/deploy-validator.sh --with-alloy --with-local-monitoring \
  validator.example.com admin@example.com

Ports:

  • 3000 — Grafana (login admin / ${GRAFANA_ADMIN_PASSWORD:-admin})
  • 9090 — Prometheus
  • 3100 — Loki
  • 3200 — Tempo
  • 12345 — Alloy UI

Dashboards live in docker/dashboards/; drop in a new .json and re-run docker compose up to pick it up. Alert rules live in docker/alerts.rules.yml.

Layering it yourself

Without the deploy script:

cd docker

# Default (off)
docker compose -f docker-compose.yml up -d

# Alloy push
docker compose -f docker-compose.yml -f docker-compose.alloy.yml up -d

# Local monitoring
docker compose -f docker-compose.yml -f docker-compose.local-monitoring.yml up -d

# Both — alloy pushes AND on-box dashboards
docker compose \
  -f docker-compose.yml \
  -f docker-compose.alloy.yml \
  -f docker-compose.local-monitoring.yml \
  up -d

Helm

The Helm chart follows the same opt-in / composable philosophy as the compose stack, but at finer granularity. The chart itself ships no observability stack — Prometheus, Grafana, Loki, Tempo are your cluster's concern. The chart only emits the CRs your existing stack needs to start scraping, alerting, and rendering dashboards.

Three independent flags. Enable any combination:

| Flag | Emits | Requires |
| --- | --- | --- |
| serviceMonitor.enabled | ServiceMonitor targeting shards + proxy /metrics. | Prometheus Operator (or Alloy configured to discover ServiceMonitors). |
| prometheusRule.enabled | PrometheusRule with default Linera alerts + yours. | Prometheus Operator + PrometheusRule CRD. |
| dashboards.enabled | One ConfigMap per JSON dashboard, sidecar-labelled. | Grafana with the dashboard sidecar enabled. |

Defaults are all false — a bare helm install produces no observability resources, same as a bare docker compose up produces no observability containers.

# my-values.yaml — turn on everything
serviceMonitor:
  enabled: true
  labels:
    release: prometheus        # match your Prometheus's serviceMonitorSelector
prometheusRule:
  enabled: true
dashboards:
  enabled: true

Remote push (Alloy / agent-based forwarding)

The Helm chart does not ship an Alloy sidecar — that's a cluster concern, not a per-chart concern. If you want the same "push to a remote Prometheus / Loki / Tempo" behaviour the compose --with-alloy flag gives you, install Alloy (or the Grafana Agent, or OpenTelemetry Collector) as a separate release in the cluster and point it at the ServiceMonitor created by serviceMonitor.enabled=true.
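As a sketch of that setup — the component names come from Alloy's own configuration language, but the remote-write URL is a placeholder and your real config will need credentials and relabelling to taste — an in-cluster Alloy could discover the chart's ServiceMonitor and forward the scraped series like this:

```alloy
// Discover ServiceMonitor objects (including the one the chart emits
// when serviceMonitor.enabled=true) and scrape their targets.
prometheus.operator.servicemonitors "linera" {
  forward_to = [prometheus.remote_write.central.receiver]
}

// Forward everything to your central Prometheus (placeholder URL).
prometheus.remote_write "central" {
  endpoint {
    url = "https://prometheus.example.com/api/v1/write"
  }
}
```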

Customising alerts

Extend the shipped rules without forking the chart:

prometheusRule:
  enabled: true
  extraRules:
    - alert: MyCustomAlert
      expr: up{...} == 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: 

Dashboards shipped with the chart

dashboards.enabled=true emits one ConfigMap per JSON file under helm/linera-validator/dashboards/, recursively, labelled for the Grafana sidecar. Current set:

  • linera/general.json — validator overview (proxy + shard health, rps, error rate)
  • linera/execution.json — execution layer metrics
  • linera/views.json — view cache / materialisation metrics
  • linera/storage/storage.json — generic storage layer metrics
  • linera/storage/rocksdb.json — RocksDB-specific counters (when dual-store enabled)
  • linera/storage/scylladb.json — ScyllaDB-specific counters
  • linera/vms/ethereum.json — EVM-runtime dashboards
  • profiling/cpu.json, profiling/jemalloc-memory.json — continuous-profiling views (Pyroscope)
  • scylla/scylla-{overview,detailed,advanced,cql,ks,os,alternator}.json — upstream ScyllaDB dashboards (tag 6.2)
  • scylla-manager/scylla-manager.json — ScyllaDB Manager dashboard (tag 3.4)
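For reference, each emitted ConfigMap follows the usual Grafana sidecar convention; the resource name and label key below are illustrative (check the chart's templates for the exact ones it uses):

```yaml
# Illustrative shape of one dashboard ConfigMap. The Grafana sidecar
# watches for a label (commonly grafana_dashboard: "1") and loads the
# JSON payload as a dashboard.
apiVersion: v1
kind: ConfigMap
metadata:
  name: linera-validator-dashboard-general
  labels:
    grafana_dashboard: "1"
data:
  general.json: |
    { "title": "Linera Validator Overview", "panels": [] }
```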

Drop additional JSON files into the same directory and re-run helm upgrade — the chart picks them up automatically.