// the find
lw-lin/streaming-readings
Paper readings on streaming systems
A Chinese-language reading list of 20+ papers on stream processing systems, spanning 2002–2016. It covers the theoretical foundations behind Storm, Flink, Spark Streaming, MillWheel, and the Dataflow model. It is aimed at engineers who want to understand *why* these systems are designed the way they are, not just how to use them.
The curation traces a clear intellectual lineage. Watermarks, for example, are followed from punctuation semantics (2002) through out-of-order processing (OOP, 2008) to MillWheel, which helps you see how ideas compound. Each entry has a substantive Chinese-language annotation that goes beyond the abstract, calling out specific design decisions and tradeoffs. Coverage of both academic papers (VLDB, SIGMOD, SOSP) and industry papers (Facebook, Twitter, Google) gives a realistic picture of what actually shipped versus what was only theorized. The 2013–2016 era papers (Flink checkpoint, Drizzle, Heron) are particularly relevant for understanding the systems you're likely running today.
Last updated February 2022, so the entire post-Flink-1.13 era is missing: no Materialize, no Kafka Streams maturation, no RisingWave, nothing on streaming SQL standardization efforts. The README is Chinese-only with no English summaries, which shuts out non-Chinese readers from a list whose papers were originally written in English. There is no organization by topic (fault tolerance vs. unified batch/stream vs. state management). The list is purely chronological, so finding papers on a specific problem requires reading everything. At 735 stars and 155 forks with zero code, it's essentially a bookmark list, and maintainers drift; several linked PDFs will likely be dead or moved.