// the find
gazette/core
Build platforms that flexibly mix SQL, batch, and stream processing paradigms
Gazette is a distributed streaming platform for Go that uses cloud object storage (S3, GCS, or Azure) as the backing store for its 'journal' abstraction: append-only logs that are simultaneously live streams and immutable files in a data lake. It targets teams that want a single data substrate they can query with SQL and batch jobs and also consume as a low-latency stream, without running two separate systems. According to the README, it has been in production for five years.
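To make the journal idea concrete, here is a toy in-memory sketch of an append log whose older spans are sealed into immutable fragments while the tail stays live. It is not Gazette's API: the `Journal` type, `NewJournal`, `Append`, `ReadFrom`, and the `maxFragment` threshold are all hypothetical names, and the sealed byte slices merely stand in for files in object storage.

```go
package main

import (
	"errors"
	"fmt"
)

// Journal is a toy stand-in for an append-only log. Hypothetical, not Gazette's API.
type Journal struct {
	fragments   [][]byte // sealed, immutable spans (files in object storage, in a real system)
	live        []byte   // open tail that is still being appended to
	sealedBytes int64    // total bytes held in sealed fragments
	maxFragment int      // seal the tail once it reaches this many bytes
}

func NewJournal(maxFragment int) *Journal {
	return &Journal{maxFragment: maxFragment}
}

// Append writes rec to the tail and returns the byte offset at which it begins.
func (j *Journal) Append(rec []byte) int64 {
	off := j.sealedBytes + int64(len(j.live))
	j.live = append(j.live, rec...)
	if len(j.live) >= j.maxFragment {
		// Seal the tail. Later appends start a fresh slice, so the
		// sealed fragment is never modified again.
		j.fragments = append(j.fragments, j.live)
		j.sealedBytes += int64(len(j.live))
		j.live = nil
	}
	return off
}

// ReadFrom returns every byte from offset up to the current write head,
// reading across sealed fragments and the live tail alike.
func (j *Journal) ReadFrom(offset int64) ([]byte, error) {
	head := j.sealedBytes + int64(len(j.live))
	if offset < 0 || offset > head {
		return nil, errors.New("offset out of range")
	}
	spans := append(append([][]byte{}, j.fragments...), j.live)
	var out []byte
	var start int64
	for _, s := range spans {
		end := start + int64(len(s))
		if offset < end {
			from := offset - start
			if from < 0 {
				from = 0
			}
			out = append(out, s[from:]...)
		}
		start = end
	}
	return out, nil
}

func main() {
	j := NewJournal(8)
	j.Append([]byte("alpha\n"))
	off := j.Append([]byte("beta\n")) // pushes the tail past 8 bytes, so it is sealed
	j.Append([]byte("gamma\n"))       // lands in a new live tail
	data, _ := j.ReadFrom(off)
	fmt.Printf("%d sealed fragment(s), read %q\n", len(j.fragments), data)
}
```

The point of the sketch is that a reader addresses the log by byte offset and does not care whether the bytes come from a sealed fragment or the live tail. That is the property that lets one dataset serve both batch and streaming readers.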
The journal-as-files design is genuinely clever: you get millisecond-latency reads from the broker and direct SQL/batch access to the same data in object storage, with no ETL step in between. The consumer framework handles exactly-once semantics with a recovery log, which is one of the hardest parts of stream processing to get right. The allocator uses a sparse push-relabel flow network for shard assignment; that's real distributed systems engineering, not naive round-robin. Store backends are pluggable (RocksDB, SQLite, JSON, SQL), so you can pick the right state store per workload.
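For readers unfamiliar with why a recovery log helps with exactly-once processing, the general idea can be sketched as committing application state and the input offset together, so that a crash rolls both back to the same point. This is a generic illustration and not Gazette's consumer API or its actual recovery protocol: `Checkpoint`, `Consumer`, `Recover`, `Consume`, and `Commit` are hypothetical names, and the `committed` field stands in for durable storage.

```go
package main

import "fmt"

// Checkpoint pairs application state with the input offset that state reflects.
type Checkpoint struct {
	Offset int
	Counts map[string]int
}

// Consumer is a hypothetical word counter. Not Gazette's API.
type Consumer struct {
	committed Checkpoint     // stands in for durable storage such as a recovery log
	offset    int            // in-memory read position, lost on crash
	counts    map[string]int // in-memory state, lost on crash
}

func copyCounts(m map[string]int) map[string]int {
	out := make(map[string]int, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

func NewConsumer() *Consumer {
	c := &Consumer{committed: Checkpoint{Counts: map[string]int{}}}
	c.Recover()
	return c
}

// Recover discards in-memory progress and restores the last committed checkpoint.
func (c *Consumer) Recover() {
	c.offset = c.committed.Offset
	c.counts = copyCounts(c.committed.Counts)
}

// Consume applies records from the current offset up to (but excluding) limit.
func (c *Consumer) Consume(input []string, limit int) {
	for c.offset < limit && c.offset < len(input) {
		c.counts[input[c.offset]]++
		c.offset++
	}
}

// Commit persists state and offset as one unit, so they can never disagree.
func (c *Consumer) Commit() {
	c.committed = Checkpoint{Offset: c.offset, Counts: copyCounts(c.counts)}
}

func main() {
	input := []string{"a", "b", "a", "c", "a"}
	c := NewConsumer()

	c.Consume(input, 2)
	c.Commit()

	c.Consume(input, 4) // processed but never committed
	c.Recover()         // simulated crash and restart

	c.Consume(input, len(input)) // replays records 2 and 3, then continues
	c.Commit()
	fmt.Println(c.committed.Offset, c.committed.Counts)
}
```

Records 2 and 3 are processed twice in wall-clock terms, but their first application is thrown away with the uncommitted state, so each record affects the committed counts exactly once. A real system also has to make the commit itself atomic and durable, which is where most of the difficulty lives.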
etcd as a coordination dependency is a significant operational burden: you're running three systems (broker, etcd, your app) before you process a single record. The consumer framework is Go-only with no cross-language client story, so this is a non-starter if your team writes Python or JVM services. 792 stars after five years in production suggests limited adoption, which means sparse community help and few battle-tested patterns you can copy. The RocksDB and SQLite store backends require cgo, which complicates builds and cross-compilation in ways the README doesn't flag upfront.