Monitoring guide

Monitor Loki-VL-proxy as Client -> Proxy -> VictoriaLogs

The right monitoring model is not a flat pile of proxy counters. It is a flow: downstream client demand on the left, proxy translation and cache behavior in the middle, and upstream VictoriaLogs latency or errors on the right.

Route-aware labels
The main request metrics split by `system`, `direction`, `endpoint`, `route`, and `status`. Use `route`, not only `endpoint`, in dashboards.

Client vs upstream
End-to-end and backend latency are separate histograms. This is how you isolate proxy overhead from VictoriaLogs slowness.

Cache layers visible
Hits, misses, window cache, peer cache, and resource metrics are all exported. Useful for cost and latency tuning.

Logs fill the gap
Structured request logs carry `proxy.overhead_ms`, `proxy.duration_ms`, and `upstream.duration_ms`. Per-request decomposition belongs in logs.
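Per-request decomposition belongs in logs, but an aggregate approximation can be derived from the two latency histograms. As a sketch, subtracting the upstream p95 from the downstream p95 gives a directional proxy-overhead signal; quantiles are not additive, so treat the result as a trend indicator, not a measurement:

```python
# Sketch: a fleet-wide proxy-overhead signal as the difference of two p95s.
# The two queries below use the histogram names documented for this proxy;
# the subtraction of quantiles is an approximation, not exact overhead.
DOWNSTREAM_P95 = (
    'histogram_quantile(0.95, sum by (le) ('
    'rate(loki_vl_proxy_request_duration_seconds_bucket'
    '{system="loki", direction="downstream"}[5m])))'
)
UPSTREAM_P95 = (
    'histogram_quantile(0.95, sum by (le) ('
    'rate(loki_vl_proxy_backend_duration_seconds_bucket'
    '{system="vl", direction="upstream"}[5m])))'
)

def overhead_signal_query() -> str:
    """Combine the two PromQL fragments into one panel expression."""
    return f"{DOWNSTREAM_P95} - {UPSTREAM_P95}"

print(overhead_signal_query())
```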
  • Clients and Grafana: downstream rate, status, and latency
  • Proxy: translation and cache behavior
  • Upstream: VictoriaLogs latency and errors
  • Proxy itself: resources and fleet health
| Metric family | What it answers operationally |
| --- | --- |
| `loki_vl_proxy_requests_total` | Request rate and error rate by `system`, `direction`, `endpoint`, `route`, and `status`. |
| `loki_vl_proxy_request_duration_seconds` | Client-visible end-to-end latency on each normalized downstream route. |
| `loki_vl_proxy_backend_duration_seconds` | VictoriaLogs or rules-backend latency on the upstream side, split by route. |
| `loki_vl_proxy_cache_hits_by_endpoint` / `loki_vl_proxy_cache_misses_by_endpoint` | Route-aware cache efficiency for label browsing, metadata, patterns, and query paths. |
| `loki_vl_proxy_window_*` | Long-range `query_range` window cache, prefilter, merge, and adaptive parallelism behavior. |
| `loki_vl_proxy_peer_cache_*` | Fleet-cache hits, misses, peer failures, and cluster member counts in multi-replica topologies. |
| `loki_vl_proxy_process_*` | Runtime CPU, memory, disk, network, PSI, and file-descriptor health for the proxy itself. |
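The first family in the table covers the usual opening question: how much downstream traffic is failing, and where. As a sketch, a per-route 5xx ratio can be written as below; this assumes the `status` label carries the numeric HTTP status code, so server errors match a `5..` regex:

```promql
# Downstream 5xx ratio by route, assuming `status` holds the numeric HTTP status.
sum by (route) (
  rate(loki_vl_proxy_requests_total{
    system="loki",
    direction="downstream",
    status=~"5.."
  }[5m])
)
/
sum by (route) (
  rate(loki_vl_proxy_requests_total{
    system="loki",
    direction="downstream"
  }[5m])
)
```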

Downstream p95 by route

Start here when users say Grafana feels slow. This is the client-visible latency across normalized Loki routes.

histogram_quantile(
  0.95,
  sum by (le, route) (
    rate(loki_vl_proxy_request_duration_seconds_bucket{
      system="loki",
      direction="downstream"
    }[5m])
  )
)

Upstream p95 by route

Use this to see whether the pain is really VictoriaLogs or rules-backend slowness, rather than the proxy path itself.

histogram_quantile(
  0.95,
  sum by (le, route) (
    rate(loki_vl_proxy_backend_duration_seconds_bucket{
      system="vl",
      direction="upstream"
    }[5m])
  )
)

Cache hit ratio by route

Use this on metadata-heavy and repeated dashboard paths to see whether the cache stack is actually reducing backend work.

sum by (route) (
  rate(loki_vl_proxy_cache_hits_by_endpoint{
    system="loki",
    direction="downstream"
  }[5m])
)
/
clamp_min(
  sum by (route) (
    rate(loki_vl_proxy_cache_hits_by_endpoint{
      system="loki",
      direction="downstream"
    }[5m]) +
    rate(loki_vl_proxy_cache_misses_by_endpoint{
      system="loki",
      direction="downstream"
    }[5m])
  ),
  1
)

What logs add that metrics do not

  • `http.route` for the normalized path.
  • `loki.api.system` and `proxy.direction` for path orientation.
  • `proxy.overhead_ms` for proxy-only time on a request.
  • `upstream.duration_ms` when the backend call exists.
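The field names above suggest an additive decomposition: total request time splits into upstream time plus proxy-only time. A minimal sketch of reading that out of a log record, assuming the fields decompose this way (the record itself is illustrative, not real output):

```python
# Sketch: per-request decomposition from a structured request log record.
# Field names come from the proxy's documented log fields; the additive
# relationship (duration = upstream + overhead) is an assumption here.

def overhead_ms(record: dict) -> float:
    """Proxy-only time for one request; when there was no backend call
    (no `upstream.duration_ms` field), the whole duration is proxy time."""
    total = record["proxy.duration_ms"]
    upstream = record.get("upstream.duration_ms", 0.0)
    return total - upstream

record = {
    "http.route": "/loki/api/v1/query_range",  # illustrative values
    "loki.api.system": "loki",
    "proxy.direction": "downstream",
    "proxy.duration_ms": 412.7,
    "upstream.duration_ms": 396.2,
}
print(overhead_ms(record))
```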

The questions operators should answer quickly

  • Which downstream routes are slow or erroring right now?
  • Is VictoriaLogs slow on the same routes or only the proxy path?
  • Which routes are missing cache and forcing backend work repeatedly?
  • Are peer-cache failures or resource saturation pushing more traffic upstream?
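The last question can be watched directly in metrics. The exact series names under the `loki_vl_proxy_peer_cache_*` prefix are not listed here, so the counter below is an assumption; confirm the real suffix against the proxy's `/metrics` output before alerting on it:

```promql
# Peer-cache failure rate per pod. The metric name under the
# loki_vl_proxy_peer_cache_* prefix is assumed -- verify on /metrics.
sum by (pod) (
  rate(loki_vl_proxy_peer_cache_failures_total[5m])
)
```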

Resource view that stays consistent

  • Use `loki_vl_proxy_process_*` families for CPU, memory, disk, network, and PSI.
  • Read network and disk as up/down time-series, not only point-in-time stats.
  • Watch file descriptors and resident memory by pod for slow leak or churn patterns.
  • Keep runtime charts beside route health so regressions are easier to correlate.
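For the file-descriptor and resident-memory checks above, two per-pod panels along these lines work. The exact suffixes under the `loki_vl_proxy_process_*` prefix are assumptions here, so confirm them against `/metrics`:

```promql
# Two separate panel queries; series names are assumed -- verify on /metrics.
max by (pod) (loki_vl_proxy_process_open_fds)               # slow FD leaks
max by (pod) (loki_vl_proxy_process_resident_memory_bytes)  # RSS growth or churn
```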