// the find
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
A large, carefully curated list of papers and engineering blog posts from major tech companies (Airbnb, Netflix, Uber, Google, Meta, and others) describing how they built ML systems in production. It is not a library or framework; it is a reading list, organized by problem domain. The audience is ML engineers and data scientists who want to learn from real deployments rather than academic benchmarks.
Breadth of coverage is hard to beat: 30+ topic areas, from data quality and feature stores through recsys, NLP, CV, and MLOps. Primary sources are preferred: papers and first-party engineering posts, not summaries or tutorials. Company and year tags on every entry make it easy to filter by era or by who you trust. The domain groupings (search vs. recsys vs. forecasting) are well-chosen and match how practitioners think about these problems.
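Because every entry carries company and year tags, filtering can be scripted. The sketch below is not part of the repo. It assumes each entry is a numbered Markdown line of the form "1. [Title](url) `Company` `Year`", and the sample entries, `Entry`, `parse_entries`, and `filter_entries` are all hypothetical names made up for illustration. Check the pattern against the real README before relying on it.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    title: str
    url: str
    company: str
    year: int


# Assumed entry shape (not verified against the real README):
#   1. [Title](url) `Company` `2020`
ENTRY_RE = re.compile(
    r"^\s*\d+\.\s+\[(?P<title>.+?)\]\((?P<url>\S+?)\)"
    r".*?`(?P<company>[^`]+)`\s+`(?P<year>\d{4})`\s*$"
)

# Hypothetical sample data, standing in for the README text.
SAMPLE = """\
## Hypothetical Section
1. [Example Post A](https://example.com/a) `ExampleCo` `2019`
2. [Example Post B](https://example.com/b) `OtherCo` `2021`
3. [Example Post C](https://example.com/c) `ExampleCo` `2022`
"""


def parse_entries(text: str) -> list[Entry]:
    """Return every line that matches the assumed entry shape."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            entries.append(
                Entry(m["title"], m["url"], m["company"], int(m["year"]))
            )
    return entries


def filter_entries(
    entries: list[Entry],
    company: Optional[str] = None,
    since: Optional[int] = None,
) -> list[Entry]:
    """Keep entries from `company` (if given) published in `since` or later."""
    return [
        e
        for e in entries
        if (company is None or e.company == company)
        and (since is None or e.year >= since)
    ]


if __name__ == "__main__":
    for e in filter_entries(parse_entries(SAMPLE), company="ExampleCo", since=2020):
        print(e.year, e.company, e.title, e.url)
```

Lines that do not match the assumed shape (headings, blank lines, entries tagged differently) are skipped silently, so compare the parsed count against the README before trusting a filtered view.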
The list stopped being updated in mid-2024, so the LLM-era MLOps stack (evals, RAG pipelines, prompt management, inference optimization) is essentially absent. Nothing signals which entries are worth reading and which are marketing fluff from a company's PR team. The README is the entire repo: no scripts, no tooling, nothing you can run. Maintenance has always depended on one person, and it shows: contribution activity dropped sharply after 2022.