Roadmap
Completed
-
LogQL -> LogsQL translation (stream selectors, line filters, parsers, metric queries)
-
Response format conversion (VL NDJSON -> Loki streams, VL stats -> Prometheus matrix/vector)
-
Request coalescing (singleflight: N queries -> 1 backend request)
-
Rate limiting (per-client token bucket + global concurrency)
-
Circuit breaker (closed->open->half-open with built-in defaults)
-
Query normalization (sort matchers, collapse whitespace for cache keys)
-
Tiered cache (per-endpoint TTLs, L1 in-memory, L2 bbolt on-disk)
-
Multitenancy (string->int tenant mapping, numeric passthrough, SIGHUP reload)
-
WebSocket tail (
/loki/api/v1/tail-> VL NDJSON streaming) -
L2 disk cache with gzip compression, write-back buffer (encryption via cloud provider)
-
OTLP telemetry push (gzip/zstd compression, TLS)
-
HTTP hardening (timeouts, body limits, security headers)
-
Index stats, volume, volume_range via VL
/select/logsql/hits -
Query fingerprinting + analytics (
/debug/queries) -
Graceful HTTP server shutdown (SIGTERM/SIGINT)
-
Grafana datasource config (maxLines, basic auth, backend timeout, TLS, header/cookie forwarding, optional listener mTLS)
-
Derived fields (regex extraction for trace linking)
-
Chunked streaming (Transfer-Encoding: chunked for large results)
-
OTel label translation (bidirectional dot<->underscore for 50+ fields)
-
Custom field remapping (
-field-mapping) -
Per-tenant metrics (request rate, latency, error rate by X-Scope-OrgID)
-
Client error breakdown (bad_request, rate_limited, not_found, body_too_large)
-
Grafana dashboard & alerting rules (Helm PrometheusRule CR)
-
Write safeguard (
/pushblocked with 405) -
| decolorizeproxy-side ANSI stripping -
| ip("CIDR")proxy-side IP range filtering -
| line_formatfull Go templates -
pprof, SIGHUP reload, and rate-limit response headers
-
Per-endpoint cache/backend metrics, CB state gauge
-
Fuzz testing (1.2M+ executions, no panics)
-
Nested binary metric queries (
sum(rate(...)) / sum(rate(...))) -
/loki/api/v1/patternsproxy-side Loki-style Drain tokenizer/clustering extraction -
directionparameter (forward/backward sort) -
quantile_over_time()mapped to VL quantile -
label_formatmulti-rename (comma-separated) -
Extended binary ops (
%,^,==,!=,>,<,>=,<=) -
Datasource compatibility handlers (
/rules,/alerts,/config) -
Playwright UI e2e tests
-
/tailbrowser and ops coverage (origin policy, native fallback, ingress recovery, upstream401/403/5xxparity) -
Multi-tenant Explore and Logs Drilldown coverage for
__tenant_id__, labels, series, and detected field/label browser/resource surfaces -
Delete API endpoint with safeguards (confirmation, tenant scoping, audit logging)
-
without()clause detection and clear error message -
IsScalarsupports negative and scientific notation -
Circuit breaker half-open metrics fix
-
Tenant map reload race condition fix
-
boolmodifier on comparison operators — stripped at translation (applyOp returns 1/0 for all comparisons) -
Field-specific parser
| json field1, field2/| logfmt field1, field2— maps to full unpack (VL extracts all fields) -
Backslash-escaped quotes in stream selectors — findMatchingBrace handles
\" -
Binary expression detection before metric query (fixes
rate(...) > 0being misrouted) -
Peer cache design doc + headless service Helm template
-
Performance-focused optimization pass (buffer pools, sync.Pool, connection-pool tuning, cache hot-path benchmarks)
-
Complete Helm chart: 11 templates, GOMEMLIMIT auto-calc, HTTPRoute
-
Broad test suite, CI bench job, and regression gates
-
Coverage and quality-gate reinforcement for runtime, middleware, cache, and proxy-path tests
-
Tier0 compatibility-edge cache with bounded memory budget, safe GET-only guardrails, and reload invalidation
-
Fleet shadow-copy validation for 3-peer cache reuse plus Tier0/fleet micro-benchmarks
Planned
-
on()/ignoring()/group_left()/group_right()vector matching (v0.22.0) -
@timestamp modifier (v0.19.0) -
unwrap duration()/bytes()unit conversion (v0.21.0) - Subquery syntax
rate(...)[1h:5m]— proxy-side evaluation (v0.23.0) - LRU cache eviction (v0.21.0)
- Peer cache Phase 1 implementation (DNS discovery + peer fetch) (v0.24.0)
- System metrics in /metrics (CPU, memory, IO, network via /proc) (v0.19.0)
- Native VL stream selector optimization for known
_stream_fields(v0.23.0) - PR quality report workflow with coverage, compatibility, and performance delta comments (v0.26.0)
- Tighten remaining merged-tenant Drilldown metadata accuracy for field and label cardinality surfaces
- Convert more upstream Loki, Logs Drilldown, and VictoriaLogs edge cases into regression tests
- Expand browser-level multi-tenant Explore and Drilldown scenarios where API parity already exists but UI combinations still need live regression coverage
- Add bounded peer hot-read-ahead (top-N hot keys with per-interval key/byte/concurrency budgets, jitter, and tenant fairness) to improve non-owner local hit rates without causing peer traffic storms (v1.0.25)
- Add regression/perf suite for collapse forwarding and hot-read-ahead interactions (owner/non-owner paths, coalescing efficiency, backend offload delta) (v1.0.25)
- Promote compose-backed e2e fleet cache smoke coverage into required GitHub Actions for pull requests and post-merge
mainruns (v0.27.7)