// the find

filodb/FiloDB

★ 1,463 · Scala · Apache-2.0 · updated Jun 2026

Distributed Prometheus time series database

FiloDB is a distributed time series database built on Cassandra and Kafka, designed to store and query millions of Prometheus-compatible metrics. It uses columnar compression and off-heap memory to keep data in memory cheaply, and it speaks PromQL through an HTTP API that Grafana can point at directly. Target audience: teams running their own metrics infrastructure who need something between bare Prometheus and fully managed observability.

First-class histogram columns are a genuinely good idea: storing all buckets together rather than as separate 'le' time series gives real compression wins and faster quantile queries. The step-multiple range notation ([1i], [2i]), which expresses the lookback as a multiple of the query step, solves a real PromQL footgun where hardcoded lookbacks silently break as the Grafana step changes. Off-heap memory management means GC pauses don't spike your query latency when the working set is large. The separation of shard keys from partition keys lets you scope queries to a subset of shards for low-latency dashboard use cases.
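To make the step-multiple idea concrete, here is a minimal sketch of what resolving [Ni] against the query step amounts to. It is not FiloDB's parser: the function name expand_step_multiples and the regex are hypothetical, and the only assumption carried over is that [Ni] means "N times the query step".

```python
import re

# Matches a range selector written as a step multiple, e.g. "[2i]".
# Ordinary PromQL ranges such as "[5m]" do not match and are left alone.
_STEP_MULTIPLE = re.compile(r"\[(\d+)i\]")


def expand_step_multiples(query: str, step_seconds: int) -> str:
    """Rewrite every [Ni] range as a concrete [<N * step>s] lookback.

    Hypothetical helper for illustration only; FiloDB resolves this
    notation internally and its implementation may differ.
    """
    def _replace(match):
        multiple = int(match.group(1))
        return f"[{multiple * step_seconds}s]"

    return _STEP_MULTIPLE.sub(_replace, query)


# The same dashboard query at two different Grafana steps.
at_15s = expand_step_multiples("rate(http_requests_total[2i])", 15)
at_60s = expand_step_multiples("rate(http_requests_total[2i])", 60)
```

With a hardcoded [30s], zooming out to a 60-second step would leave each evaluation point looking at less data than one step covers. With [2i] the lookback grows along with the step, so the window always spans two steps.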

The dependency stack is brutal: Cassandra, Kafka, Zookeeper, Akka Cluster, plus optional Rust native components and a C compiler. You're standing up six infrastructure pieces before writing a single byte of metrics data. PromQL coverage is self-reported at ~60%, which means standard Grafana dashboards will hit missing functions at inconvenient moments. The README still shows Travis CI badges and Scala 2.12 artifacts, and Spark support is deprecated with no clear migration path; the project shows signs of organizational attrition. There is no runtime shard reconfiguration: the shard count is baked into the dataset config, and changing it requires a full re-ingest.
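Why a fixed shard count tends to force a re-ingest can be sketched with a toy model. This assumes a plain hash-modulo assignment, which is a simplification and not FiloDB's actual shard mapping; shard_for and moved_fraction are hypothetical names used only for illustration.

```python
import zlib


def shard_for(series_key: str, num_shards: int) -> int:
    """Toy shard assignment: stable hash of the key, modulo the shard count.

    Assumption for illustration; the real mapping in FiloDB differs.
    """
    return zlib.crc32(series_key.encode("utf-8")) % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys that land on a different shard after a resize."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    )
    return moved / len(keys)


# Synthetic series keys; real shard keys would come from metric labels.
keys = [f"app-{i}/http_requests_total" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 8)
```

Under this model, any key whose shard changes has its existing data sitting where queries no longer look, which is why a resize ends up meaning a rewrite of the data rather than a config change.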
