// the find

hhursev/recipe-scrapers

★ 2,176 · Python · MIT · updated Jun 2026

Python package for scraping recipes data

A Python library for extracting structured recipe data (ingredients, instructions, times, etc.) from cooking websites. It handles JSON-LD, Microdata, and OpenGraph markup generically, and adds per-site scrapers for 600+ sites where the generic path fails. Useful for anyone building meal planning apps, grocery list tools, or recipe databases.
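To make the "generic path" concrete, here is a minimal stdlib-only sketch of pulling a schema.org Recipe object out of a page's JSON-LD. This is not the library's implementation: `find_recipe` and `_JsonLdCollector` are names invented for illustration, and Microdata and OpenGraph are not handled at all.

```python
import json
from html.parser import HTMLParser
from typing import Any, Iterator, List, Optional


class _JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buf: List[str] = []
        self.blocks: List[str] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so buffer them.
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def _walk(node: Any) -> Iterator[dict]:
    """Yield every dict nested anywhere inside a decoded JSON value."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from _walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from _walk(item)


def find_recipe(html: str) -> Optional[dict]:
    """Return the first JSON-LD object whose @type includes "Recipe", else None."""
    collector = _JsonLdCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed markup is common; try the next block
        for obj in _walk(data):
            types = obj.get("@type")
            if types == "Recipe" or (isinstance(types, list) and "Recipe" in types):
                return obj
    return None
```

Even this toy version has to cope with `@graph` wrappers, list-valued `@type`, and broken JSON, which hints at why sites that get the markup wrong need their own scraper.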

The per-site scraper architecture is the right call: schema.org markup on recipe sites is notoriously inconsistent and often wrong, so explicit fallbacks for Allrecipes, BBC Good Food, and others are what make this work in production. Test coverage looks solid, with CI on every push. Separating HTML parsing from HTTP fetching (scrape_html vs scrape_me) is good API design, since it lets you plug in your own session handling, proxies, or caching. Active maintenance is evident from the 600+ individual site files and a push two days ago.
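A rough sketch of what that fetch/parse split buys you: the fetcher is any callable you control, and the parser only ever sees HTML. `CachingFetcher`, `load_recipe`, and `parse_with_recipe_scrapers` are hypothetical helpers, not part of the library. The `scrape_html(html, org_url=...)` call and the `title()`, `ingredients()`, and `instructions()` methods reflect my understanding of the package's API, so check them against the version you install.

```python
from typing import Callable, Dict

Fetcher = Callable[[str], str]        # url -> html
Parser = Callable[[str, str], dict]   # (html, url) -> recipe fields


def parse_with_recipe_scrapers(html: str, url: str) -> dict:
    """Parse already-fetched HTML (assumed recipe-scrapers API, see lead-in)."""
    # Imported lazily so the rest of this sketch runs without the package.
    from recipe_scrapers import scrape_html

    scraper = scrape_html(html, org_url=url)
    return {
        "title": scraper.title(),
        "ingredients": scraper.ingredients(),
        "instructions": scraper.instructions(),
    }


class CachingFetcher:
    """Hypothetical wrapper: caches HTML by URL in memory around any fetcher."""

    def __init__(self, fetch: Fetcher) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}
        self.misses = 0

    def __call__(self, url: str) -> str:
        if url not in self._cache:
            self.misses += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def load_recipe(
    url: str,
    fetch: Fetcher,
    parse: Parser = parse_with_recipe_scrapers,
) -> dict:
    """Fetch with whatever transport the caller supplies, then parse offline."""
    return parse(fetch(url), url)
```

In practice `fetch` would wrap your own HTTP session (proxies, retries, headers); because parsing never touches the network, the same cached HTML can be re-parsed after a scraper fix without hitting the site again.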

The per-site file approach is a maintenance tax that scales poorly: 600+ Python files, each of which needs updating when its site redesigns. There's no version pinning strategy for when a supported site breaks; you just get a silent parse failure or wrong data. There's no async support, which matters if you're scraping at any real volume: you'll need to wrap calls in asyncio.to_thread or run a thread pool yourself. Ingredient parsing is explicitly out of scope, so you get '2 cups flour' as a string, and splitting it into quantity, unit, and ingredient is your problem.
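The asyncio.to_thread workaround is short enough to sketch. This assumes Python 3.9+ and a blocking `scrape_one(url)` function that you supply (for example, fetch plus parse); `scrape_many` and `max_concurrency` are illustrative names, not anything the library ships.

```python
import asyncio
from typing import Callable, Dict, Iterable


async def scrape_many(
    urls: Iterable[str],
    scrape_one: Callable[[str], object],
    max_concurrency: int = 8,
) -> Dict[str, object]:
    """Run a blocking scrape_one(url) per URL in worker threads.

    Returns a dict mapping each URL to its result, or to the exception
    it raised, so one broken site does not abort the whole batch.
    """
    limit = asyncio.Semaphore(max_concurrency)

    async def worker(url: str):
        async with limit:  # cap the number of in-flight scrapes
            try:
                return url, await asyncio.to_thread(scrape_one, url)
            except Exception as exc:
                return url, exc

    pairs = await asyncio.gather(*(worker(u) for u in urls))
    return dict(pairs)
```

Call it with `asyncio.run(scrape_many(urls, scrape_one))`. Returning exceptions instead of raising keeps failures visible per URL, which matters given that a broken site scraper otherwise tends to fail quietly.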
