// the find
andkret/Cookbook
The Data Engineering Cookbook
A Markdown-based reference book covering the full data engineering stack: Kafka, Spark, Hadoop, cloud platforms, NoSQL stores, and ML pipelines. It's aimed at people transitioning into data engineering or preparing for interviews, not at practitioners looking for implementation depth.
Roadmaps segmented by starting point (analyst, scientist, software engineer) are genuinely useful for career orientation. The case studies section links to real engineering blogs from Netflix, Airbnb, Spotify, etc. — primary sources are better than paraphrase. The 1001 interview questions section covers the actual topics hiring panels hit. Active maintenance (last push yesterday) and 15k stars mean broken links get fixed.
Almost no runnable code: the 'Code Examples' folder is three files and a stale Spark snippet from what appears to be a YouTube series. For a 'cookbook', you'd expect working recipes, not just explanations and links. The Hadoop/MapReduce coverage is heavy and reads like 2018; dbt gets a passing mention, while modern tooling like DuckDB, Iceberg, or Polars doesn't appear. The author's paid academy is promoted throughout, which blurs the line between reference material and funnel.