
// the find

gocolly/colly

★ 25,334 · Go · Apache-2.0 · updated May 2026

Elegant Scraper and Crawler Framework for Golang

Colly is a Go scraping/crawling framework built around a collector pattern with callback hooks for HTML elements, requests, and responses. It handles the plumbing — concurrency, rate limiting, cookies, caching, robots.txt — so you can focus on what to extract. Best fit for structured data extraction from static or lightly-dynamic sites.

The callback-based API maps cleanly to how web pages are structured: OnHTML, OnRequest, OnResponse, and OnError each do one thing. Per-domain rate limiting and concurrency controls are built in, so a single LimitRule is usually all it takes to avoid hammering a target and getting banned. The storage interface is properly abstracted, so you can swap the default in-memory visited-URL store for Redis or a custom backend without touching your scraper logic. 25k stars and an active example library mean most common scraping patterns (login flows, sitemaps, queues, proxy rotation) already have working reference code.

No JavaScript rendering: Colly is pure HTTP, so anything behind a React SPA or lazy-loaded content is invisible to it. You'll need chromedp or Rod for that, and they don't compose well with Colly. The distributed scraping story is underdeveloped: the queue package exists but there's no built-in coordination layer, so 'distributed' in practice means you wire it yourself with Redis and hope. Maintenance has slowed noticeably: the CHANGELOG hasn't had a meaningful entry in years, several open issues have stale workarounds in the comments, and the README now prominently features a proxy vendor sponsorship, which is a mild smell. Error handling is easy to get wrong: in async mode a failed request is silently dropped unless you register OnError, and in sync mode it surfaces only if you check the error returned by Visit, which is a footgun for anyone not reading the docs carefully.

