
// the find

donnemartin/viz

★ 812 · Python · NOASSERTION · updated Jan 2018

Visualize GitHub's most popular repos. http://www.donnemartin.com/viz

A Python pipeline that pulls GitHub API data, wrangles it with pandas, and feeds it into Tableau Public for interactive visualizations of the most-starred repos by year and language. Built by Donne Martin as a one-person project around 2015-2017, it's a historical snapshot tool rather than a live dashboard.

The data pipeline is clean: GitHub API → pandas notebook → CSV → Tableau is a straightforward, reproducible chain you can follow and rerun. The per-language breakdown across multiple time windows (1-, 3-, and 6-month rolling, plus annual) is genuinely useful for spotting language trends that all-time stats obscure. The frozen annual datasets are a nice archival touch, since you can diff 2015 against 2016 without the numbers shifting under you. Geocoding user locations via Google Maps adds a dimension most GitHub stat tools skip.
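For a feel of what the first half of that chain involves, here is a minimal sketch of the "API → rows → CSV" step. It is not the repo's notebook code: it uses only the standard library instead of pandas, the function names (`build_search_url`, `flatten_items`, `rows_to_csv`) are hypothetical, and the column selection is an assumption. The endpoint and the `language:` / `created:` qualifiers are GitHub's public repository search API.

```python
import csv
import io
from urllib.parse import urlencode

# GitHub's public repository search endpoint.
SEARCH_URL = "https://api.github.com/search/repositories"

# Columns kept in the CSV (an assumed selection, not the repo's schema).
FIELDS = ["full_name", "language", "stars", "created_at"]


def build_search_url(language, created_from, created_to, per_page=100):
    """Build a search URL for the most-starred repos of one language
    created inside a date window (dates as YYYY-MM-DD strings)."""
    query = f"language:{language} created:{created_from}..{created_to}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{SEARCH_URL}?{urlencode(params)}"


def flatten_items(items):
    """Reduce the 'items' list of a search response to flat rows."""
    return [
        {
            "full_name": item["full_name"],
            "language": item.get("language"),
            "stars": item["stargazers_count"],
            "created_at": item["created_at"],
        }
        for item in items
    ]


def rows_to_csv(rows):
    """Serialize rows to CSV text that a BI tool could import."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Fetching the URL (with any HTTP client, plus pagination and rate-limit handling) is deliberately left out; the point is only that each stage has a plain, inspectable output, which is what makes the chain easy to rerun.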

Abandoned in January 2018: the data stops at 2017, and the live Tableau Public dashboard has almost certainly expired or reset, which makes the interactive half of the pitch dead on arrival. The visualization layer is locked into Tableau, a proprietary tool that requires a reader install for offline use; the repo offers no path to reproducing the visuals in anything open. Stars-only ranking bakes in the repo-age bias that the FAQ acknowledges but doesn't actually fix: a 2016 repo with two years of accumulated stars can outrank a 2017 repo that's genuinely better. There is no automation or scheduling; every refresh is a manual re-run of a Jupyter notebook, so 'continually updated' was aspirational.
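To make the age bias concrete, here is a small sketch of one possible correction: ranking by stars per day of repo age instead of raw stars. This is an illustration, not something the repo implements; the `stars_per_day` helper and both sample repos are made up.

```python
from datetime import date


def stars_per_day(stars, created, as_of):
    """Average stars gained per day since creation (age floored at 1 day)."""
    age_days = max((as_of - created).days, 1)
    return stars / age_days


# Made-up repos: the older one has more total stars, the newer one grows faster.
repos = [
    {"name": "older/repo", "stars": 7300, "created": date(2016, 1, 1)},
    {"name": "newer/repo", "stars": 5000, "created": date(2017, 1, 1)},
]
as_of = date(2018, 1, 1)

# Raw star count favors the repo that has had longer to accumulate.
by_stars = sorted(repos, key=lambda r: r["stars"], reverse=True)

# Age-normalized rate puts the faster-growing repo first.
by_rate = sorted(
    repos,
    key=lambda r: stars_per_day(r["stars"], r["created"], as_of),
    reverse=True,
)

print([r["name"] for r in by_stars])
print([r["name"] for r in by_rate])
```

A per-day rate is crude (star growth is rarely linear), but it shows how the two orderings can disagree on the same data.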

