Skip to main content

Roadmap

Completed​

  • LogQL -> LogsQL translation (stream selectors, line filters, parsers, metric queries)

  • Response format conversion (VL NDJSON -> Loki streams, VL stats -> Prometheus matrix/vector)

  • Request coalescing (singleflight: N queries -> 1 backend request)

  • Rate limiting (per-client token bucket + global concurrency)

  • Circuit breaker (closed->open->half-open with built-in defaults)

  • Query normalization (sort matchers, collapse whitespace for cache keys)

  • Tiered cache (per-endpoint TTLs, L1 in-memory, L2 bbolt on-disk)

  • Multitenancy (string->int tenant mapping, numeric passthrough, SIGHUP reload)

  • WebSocket tail (/loki/api/v1/tail -> VL NDJSON streaming)

  • L2 disk cache with gzip compression, write-back buffer (encryption via cloud provider)

  • OTLP telemetry push (gzip/zstd compression, TLS)

  • HTTP hardening (timeouts, body limits, security headers)

  • Index stats, volume, volume_range via VL /select/logsql/hits

  • Query fingerprinting + analytics (/debug/queries)

  • Graceful HTTP server shutdown (SIGTERM/SIGINT)

  • Grafana datasource config (maxLines, basic auth, backend timeout, TLS, header/cookie forwarding, optional listener mTLS)

  • Derived fields (regex extraction for trace linking)

  • Chunked streaming (Transfer-Encoding: chunked for large results)

  • OTel label translation (bidirectional dot<->underscore for 50+ fields)

  • Custom field remapping (-field-mapping)

  • Per-tenant metrics (request rate, latency, error rate by X-Scope-OrgID)

  • Client error breakdown (bad_request, rate_limited, not_found, body_too_large)

  • Grafana dashboard & alerting rules (Helm PrometheusRule CR)

  • Write safeguard (/push blocked with 405)

  • | decolorize proxy-side ANSI stripping

  • | ip("CIDR") proxy-side IP range filtering

  • | line_format full Go templates

  • pprof, SIGHUP reload, and rate-limit response headers

  • Per-endpoint cache/backend metrics, CB state gauge

  • Fuzz testing (1.2M+ executions, no panics)

  • Nested binary metric queries (sum(rate(...)) / sum(rate(...)))

  • /loki/api/v1/patterns proxy-side Loki-style Drain tokenizer/clustering extraction

  • direction parameter (forward/backward sort)

  • quantile_over_time() mapped to VL quantile

  • label_format multi-rename (comma-separated)

  • Extended binary ops (%, ^, ==, !=, >, <, >=, <=)

  • Datasource compatibility handlers (/rules, /alerts, /config)

  • Playwright UI e2e tests

  • /tail browser and ops coverage (origin policy, native fallback, ingress recovery, upstream 401/403/5xx parity)

  • Multi-tenant Explore and Logs Drilldown coverage for __tenant_id__, labels, series, and detected field/label browser/resource surfaces

  • Delete API endpoint with safeguards (confirmation, tenant scoping, audit logging)

  • without() clause detection and clear error message

  • IsScalar supports negative and scientific notation

  • Circuit breaker half-open metrics fix

  • Tenant map reload race condition fix

  • group() outer aggregation — inner metric translated normally, proxy normalises all values to 1 (v1.21.0)

  • label_replace() — proxy post-processing via marker suffix; Prometheus no-match semantics (v1.21.0)

  • label_join() — proxy post-processing via marker suffix; missing src labels skipped (v1.21.0)

  • count_values() — returns descriptive error (not translatable to VL; VL has no group-by-value primitive) (v1.21.0)

  • Circuit breaker sliding-window failure counting — 30s window replaces consecutive-failure model; sporadic slow-query resets no longer open the breaker (v1.18.0)

  • Bare label matcher error — app="value" (missing braces) returns HTTP 400 with descriptive Loki-style parse error instead of silently emitting malformed VL syntax (v1.20.0)

  • detected_level inference from _msg content — proxy infers level from JSON/logfmt log body when not present in stream labels; Drilldown volume API gains automatic parser unpacking (v1.20.0)

  • Deterministic log stream ordering for multi-window queries — streams sorted by canonical key, per-stream values sorted by timestamp before response emission (v1.21.1)

  • bool modifier on comparison operators — stripped at translation (applyOp returns 1/0 for all comparisons)

  • Field-specific parser | json field1, field2 / | logfmt field1, field2 — maps to full unpack (VL extracts all fields)

  • Backslash-escaped quotes in stream selectors — findMatchingBrace handles \"

  • Binary expression detection before metric query (fixes rate(...) > 0 being misrouted)

  • Peer cache design doc + headless service Helm template

  • Performance-focused optimization pass (buffer pools, sync.Pool, connection-pool tuning, cache hot-path benchmarks)

  • Complete Helm chart: 11 templates, GOMEMLIMIT auto-calc, HTTPRoute

  • Broad test suite, CI bench job, and regression gates

  • Coverage and quality-gate reinforcement for runtime, middleware, cache, and proxy-path tests

  • Tier0 compatibility-edge cache with bounded memory budget, safe GET-only guardrails, and reload invalidation

  • Fleet shadow-copy validation for 3-peer cache reuse plus Tier0/fleet micro-benchmarks

  • Named tenant routing enhancements — YAML/JSON tenant map file with mtime-polling hot-reload and SIGHUP; label-based tenant isolation (-tenant-label injects {field="orgID"} into VL queries); -forward-tenant-header passthrough for Lakehouse; uint32 validation at load time; LogsQL injection prevention via backslash-before-quote escaping (v1.31.x)

  • Exhaustive LogQL parity machine — 316 parse/translation cases covering stream selectors, line filters, parsers, metric queries, binary ops, subqueries, offset modifier, unwrap unit conversion, field-specific parsers, named regexp groups, vector matching, complex pipelines, and edge cases; LogQL syntax validator for Loki error parity; comprehensive pipeline parity tests; all 14 previously tracked proxy_bug and proxy_strict KnownGaps resolved (v3.7.1)

  • | drop field=value matcher semantics — proxy applies conditional field removal via ParseDropConditions + applyDropConditions post-processing; the value predicate is now respected (previously the field was always dropped regardless of value) (v3.7.1)

  • structuredMetadata vs parsedFields classification — proxy correctly classifies fields by comparing against _msg JSON content, preventing structured metadata from being misclassified as parsedFields (v3.7.1)

  • Non-OTel structured metadata e2e tests — push tests for plain structured metadata (non-OTel) added to log-generator; default label-style changed to underscores and metadata-field-mode to translated (v3.7.1)

  • Config examples folder (examples/) — runnable env files, tenant map YAML, Docker Compose stack, Grafana datasource provisioning, systemd unit, Kubernetes ConfigMap with full annotated flag/env reference

Recently Shipped​

  • on()/ignoring()/group_left()/group_right() vector matching (0.22.0)
  • @ timestamp modifier (0.19.0)
  • unwrap duration()/bytes() unit conversion (0.21.0)
  • Subquery syntax rate(...)[1h:5m] — proxy-side evaluation (0.23.0)
  • LRU cache eviction (0.21.0)
  • Peer cache Phase 1 implementation (DNS discovery + peer fetch) (0.24.0)
  • System metrics in /metrics (CPU, memory, IO, network via /proc) (0.19.0)
  • Native VL stream selector optimization for known _stream_fields (0.23.0)
  • PR quality report workflow with coverage, compatibility, and performance delta comments (0.26.0)
  • Add bounded peer hot-read-ahead (top-N hot keys with per-interval key/byte/concurrency budgets, jitter, and tenant fairness) to improve non-owner local hit rates without causing peer traffic storms (1.0.25)
  • Add regression/perf suite for collapse forwarding and hot-read-ahead interactions (owner/non-owner paths, coalescing efficiency, backend offload delta) (1.0.25)
  • Promote compose-backed e2e fleet cache smoke coverage into required GitHub Actions for pull requests and post-merge main runs (0.27.7)
  • fastjson + streaming rewrite of stats hot path — eliminate per-entry heap allocations in collectRangeMetricSamples and stats response assembly (1.30.0–1.31.2)
  • stats_query_range fast path for grouped and ungrouped metric queries — sum by (...) count/rate/bytes queries bypass raw log scan; heavy workload throughput 4× vs cold proxy baseline (1.30.0)
  • Cold storage backend routing — split queries across hot VL and cold Victoria Lakehouse by configurable time boundary (1.28.0)
  • allocation pool pass — gzip reader pool, gob buffer pool, aggregator scalar eliminations, patternJoinBuilderPool, logfmt single-pass scanning, series/stats buffer pooling (1.31.2)
  • Parser-stage guard removal from fast path — rate/count_over_time/bytes_rate/bytes_over_time with | json/| logfmt use VL native stats for tumbling windows (1.31.2)

Committed — decided and in progress​

Everything below is decided work with its direction already proven in the codebase — no exploratory items are listed here.

  • Tighten remaining merged-tenant Drilldown metadata accuracy for field and label cardinality surfaces
  • Convert more upstream Loki, Logs Drilldown, and VictoriaLogs edge cases into regression tests (ongoing — LogQL parity machine at 555+ cases)
  • Malformed drop/keep matcher → HTTP 400 error — enforced by the typed LogQL AST validator in the query/query_range handlers (regression-locked), and rules-migrate now runs the same validator so malformed rule expressions fail conversion instead of silently skipping the broken stage
  • Migrate remaining string-based translation paths to the typed logsql AST builder and roll logql.Translate into the proxy handlers — the typed translation path (internal/logql/translate.go) is implemented and tested; the handler call-site migration is the remaining step
  • Expand browser-level multi-tenant Explore and Drilldown scenarios where API parity already exists but UI combinations still need live regression coverage
  • | keep field=value stream label mutation — matcher form now applies to stream labels in addition to structured metadata and parsed fields (v1.50.1)
  • Parallel multi-tenant fanout — goroutine-per-tenant dispatch with sync.WaitGroup (v1.43.0)
  • Streaming backward hot+cold merge — ring-buffer reverse with bounded memory (maxRingSize=5000) and early termination (v1.43.0)