finds.dev

// the find

risinglightdb/risinglight

★ 1,835 · Rust · Apache-2.0 · updated Aug 2025

An educational OLAP database system.

RisingLight is a from-scratch OLAP database written in Rust, built explicitly for learning how databases work. It implements its own storage engine (columnar rowsets with RLE and dictionary encoding), an optimizer built on egg (e-graph rewriting), and a full SQL pipeline from parser to executor. The target audience is CS students and developers who want to read real database internals without wading through decades of production cruft.

The storage layer is unusually complete for an educational project: it has its own columnar block format with multiple encoding types (RLE, dictionary, primitive), a manifest-based version manager, compaction, and delete vectors. The planner uses egg for rule-based optimization, which is a legitimate modern approach rather than a toy hand-rolled rewriter. Full TPC-H coverage across all 22 queries gives a concrete correctness bar. The sqllogictest-based test suite means SQL behavior is verifiable and regressions are likely to be caught.
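To make the encoding names concrete, here is a minimal sketch of run-length and dictionary encoding over an in-memory column. It is not RisingLight's block format: the function names (`rle_encode`, `rle_decode`, `dict_encode`) and the `(value, run_length)` layout are assumptions chosen for illustration only.

```rust
use std::collections::HashMap;

/// Run-length encoding: consecutive equal values collapse into (value, run_length).
/// Illustrative only; the on-disk layout of a real engine will differ.
fn rle_encode<T: PartialEq + Clone>(column: &[T]) -> Vec<(T, u32)> {
    let mut runs: Vec<(T, u32)> = Vec::new();
    for value in column {
        if let Some((last, count)) = runs.last_mut() {
            if *last == *value {
                // Same value as the current run: extend it.
                *count += 1;
                continue;
            }
        }
        // First value, or a value change: start a new run.
        runs.push((value.clone(), 1));
    }
    runs
}

/// Inverse of `rle_encode`: expand every run back into individual values.
fn rle_decode<T: Clone>(runs: &[(T, u32)]) -> Vec<T> {
    let mut out = Vec::new();
    for (value, count) in runs {
        for _ in 0..*count {
            out.push(value.clone());
        }
    }
    out
}

/// Dictionary encoding: each distinct string is stored once, and the column
/// becomes a list of small integer codes pointing into that dictionary.
fn dict_encode(column: &[&str]) -> (Vec<String>, Vec<u32>) {
    let mut dictionary: Vec<String> = Vec::new();
    let mut index: HashMap<&str, u32> = HashMap::new();
    let mut codes = Vec::with_capacity(column.len());
    for &value in column {
        let code = *index.entry(value).or_insert_with(|| {
            // Unseen value: append it and hand out the next code.
            dictionary.push(value.to_string());
            (dictionary.len() - 1) as u32
        });
        codes.push(code);
    }
    (dictionary, codes)
}
```

RLE pays off on sorted or low-churn columns, where long runs are common. Dictionary encoding pays off on low-cardinality string columns, where the codes are much smaller than the strings they replace.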

Linux/macOS only, with no Windows support, which cuts off a chunk of the student audience it's aimed at. There is no WAL or crash recovery; if the process dies mid-write, data integrity is undefined, which is a significant gap for anyone trying to understand OLAP durability. The optimizer cost model appears thin: there's a cost.rs file but no statistics-driven cardinality estimation feeding into join ordering, so query plans on larger datasets will often be poor in non-obvious ways. A Python extension exists, but its docs are sparse, making it hard to use as an embedded engine from data science tooling.
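Why missing statistics hurt join ordering can be shown with a toy model. The sketch below is not taken from RisingLight's cost.rs: the `Estimates` struct, the `JoinOrder` enum, and the rule "prefer the smaller intermediate result" are all assumptions made for illustration. It compares two orders for a chain join A ⋈ B ⋈ C, once with uniform default guesses and once with real row counts and selectivities.

```rust
/// The two join orders considered for the chain A-B-C (hypothetical names).
#[derive(Debug, PartialEq, Clone, Copy)]
enum JoinOrder {
    AbThenC, // (A JOIN B) JOIN C
    BcThenA, // (B JOIN C) JOIN A
}

/// Row counts and join selectivities the optimizer believes to be true.
struct Estimates {
    a_rows: f64,
    b_rows: f64,
    c_rows: f64,
    sel_ab: f64, // fraction of A x B pairs that satisfy the A-B predicate
    sel_bc: f64, // fraction of B x C pairs that satisfy the B-C predicate
}

/// Estimated size of the first (intermediate) join result for a given order.
/// The final result has the same size in both orders, so only this term differs.
fn intermediate_rows(e: &Estimates, order: JoinOrder) -> f64 {
    match order {
        JoinOrder::AbThenC => e.a_rows * e.b_rows * e.sel_ab,
        JoinOrder::BcThenA => e.b_rows * e.c_rows * e.sel_bc,
    }
}

/// Pick the order with the smaller intermediate result; ties keep the
/// order in which the query was written (AbThenC).
fn pick_order(e: &Estimates) -> JoinOrder {
    let ab = intermediate_rows(e, JoinOrder::AbThenC);
    let bc = intermediate_rows(e, JoinOrder::BcThenA);
    if bc < ab { JoinOrder::BcThenA } else { JoinOrder::AbThenC }
}
```

With uniform guesses (say 1,000 rows per table and a selectivity of 0.1 for every predicate) the two orders tie, so the written order wins. With real figures such as a 1,000,000-row A, a 1,000-row B, and a 10-row C, joining B and C first produces a far smaller intermediate result, which is the choice a statistics-free model cannot see.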

