// the find

MentatInnovations/datastream.io

★ 915 · Python · Apache-2.0 · updated Mar 2020

An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana

datastream.io is a Python CLI and library for streaming anomaly detection over tabular time-series data, with optional output to a Bokeh dashboard or Elasticsearch/Kibana. It wraps scikit-learn detectors behind a thin abstraction and handles the plumbing of re-streaming data at a configurable speed. It is aimed at data scientists who want to prototype anomaly detection pipelines without wiring up all the visualization themselves.

The sklearn AnomalyMixin bridge is genuinely useful — it lets you drop in any existing sklearn-compatible model without rewriting it from scratch. Auto-detection of the time dimension from CSV columns is a small but real convenience. The Bokeh + Jupyter embedding path means you can run the whole thing inside a notebook without a separate server. The plugin architecture for custom detectors (inherit from AnomalyDetector, implement train/update/score) is clean and minimal.
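To make the train/update/score contract concrete, here is a minimal sketch of what a custom detector could look like. The `AnomalyDetector` class below is a hypothetical stand-in written for this example, not imported from the library, and the method signatures (each taking a list of floats) are assumptions inferred from the method names alone. `RunningGaussian` is likewise an illustrative name.

```python
import math
from abc import ABC, abstractmethod


class AnomalyDetector(ABC):
    """Hypothetical stand-in for the library's base class (assumed interface)."""

    @abstractmethod
    def train(self, x):
        """Fit the detector from scratch on an initial batch."""

    @abstractmethod
    def update(self, x):
        """Fold a new batch into the existing state."""

    @abstractmethod
    def score(self, x):
        """Return one anomaly score per input value."""


class RunningGaussian(AnomalyDetector):
    """Scores points by absolute z-score against a running mean and variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def train(self, x):
        # Reset state, then reuse the incremental update for the first batch.
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        self.update(x)

    def update(self, x):
        for value in x:
            self.n += 1
            delta = value - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (value - self.mean)

    def score(self, x):
        # Population standard deviation; zero spread means nothing can be scored.
        std = math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0
        if std == 0.0:
            return [0.0 for _ in x]
        return [abs(v - self.mean) / std for v in x]


detector = RunningGaussian()
detector.train([10.0, 11.0, 9.0, 10.0, 10.0])
scores = detector.score([10.0, 25.0])  # the second value sits far from the mean
```

Keeping `train` as a reset followed by `update` means the batch and streaming paths share one code path, which is the main appeal of a three-method interface like this.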

Abandoned since March 2020. The Python ecosystem has moved fast, and dependencies like Bokeh and the Elasticsearch client have had breaking changes since then, so expect version-pinning pain on install. It only ships 1D detectors by default (Gaussian1D, percentile1d), so multivariate anomaly detection requires you to bring your own model, with no guidance on how to structure it. There is no persistence of trained model state: every run starts cold, which makes this useless for real production streaming where incremental learning has to carry over between runs. Elasticsearch support is locked to version 5.x (based on the README), a release line that reached end-of-life several major versions ago.
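If you need state to survive between runs, you would have to add it yourself. The sketch below shows one possible shape for that, assuming the detector's state is a handful of plain numbers. `TinyDetector`, `save_state`, `load_state`, and the file name are all hypothetical and not part of the library.

```python
import json
import os
import tempfile


class TinyDetector:
    """Hypothetical detector whose whole state is a running count and mean."""

    def __init__(self, n=0, mean=0.0):
        self.n = n
        self.mean = mean

    def update(self, x):
        for value in x:
            self.n += 1
            self.mean += (value - self.mean) / self.n


def save_state(detector, path):
    # Write to a temp file and rename, so a crash cannot leave a half-written file.
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump({"n": detector.n, "mean": detector.mean}, fh)
    os.replace(tmp, path)


def load_state(path):
    with open(path) as fh:
        state = json.load(fh)
    return TinyDetector(n=state["n"], mean=state["mean"])


# First run: learn from some data, then persist.
state_path = os.path.join(tempfile.mkdtemp(), "detector.json")
first_run = TinyDetector()
first_run.update([1.0, 2.0, 3.0])
save_state(first_run, state_path)

# Second run: resume from the saved state instead of starting cold.
second_run = load_state(state_path)
second_run.update([4.0])
```

JSON is used here rather than pickle because the state is just numbers; a wrapped sklearn model would need a different serialization approach.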
