// the find
tintinweb/smart-contract-sanctuary
🐦🌴🌴🌴🦕 A home for ethereum smart contracts. 🏠
A mass archive of Etherscan-verified Solidity source files across 9 EVM-compatible chains, organized as a top-level git index repo with one submodule per chain. Primarily useful as a training/research dataset for smart contract analysis, security tooling, or ML models. It is a corpus, not a library or framework.
Having multi-chain coverage (Ethereum, Arbitrum, BSC, Polygon, etc.) in one place makes cross-chain analysis practical without stitching together separate sources. The advertised twice-daily automated updates are meant to keep it reasonably fresh for a passive archive. The address-based directory sharding avoids the filesystem limits that would kill a flat layout at this scale. Academic citation metadata is included, which matters if you're publishing research using this data.
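To make the sharding point concrete, here is a minimal Python sketch of address-prefix sharding. The exact layout is an assumption for illustration: `shard_path` is a hypothetical helper, and the two-character prefix and the `<address>_<name>.sol` filename pattern are not confirmed by this review, so check the repo's actual tree before relying on them.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def shard_path(root: str, address: str, contract_name: str,
               prefix_len: int = 2) -> PurePosixPath:
    """Map a contract address to a sharded path (hypothetical layout).

    Using the first `prefix_len` hex characters of the address as a
    directory name spreads files over 16**prefix_len buckets instead of
    putting every file in one directory.
    """
    hex_part = address.lower()
    if hex_part.startswith("0x"):
        hex_part = hex_part[2:]
    # An EVM address is 20 bytes, i.e. 40 hex characters.
    if len(hex_part) != 40 or not set(hex_part) <= HEX_DIGITS:
        raise ValueError(f"not a 20-byte hex address: {address!r}")
    filename = f"{hex_part}_{contract_name}.sol"
    return PurePosixPath(root) / hex_part[:prefix_len] / filename


# Made-up address, used only to show the shape of the result.
example = shard_path("contracts/mainnet", "0x" + "AB" * 20, "Token")
print(example)
```

With a two-character prefix, each chain directory holds at most 256 subdirectories, so the per-directory file count drops by roughly that factor compared with a flat layout.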
The full recursive clone is 2GB+ and growing, which makes it a painful data dependency for any CI pipeline or script that needs it. The last push was in June 2024, so either the automation quietly died or the update cadence slipped badly. `contracts.json` is explicitly documented as incomplete ('the filesystem may contain more files'), so any tooling that relies on the index rather than directory traversal will silently miss contracts. There is no deduplication: the same contract code deployed at different addresses appears once per address, including across chains.
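Both of the last two problems can be worked around on the consumer side. The sketch below walks the checked-out tree directly instead of trusting `contracts.json`, and groups files by a SHA-256 hash of their contents so duplicates can be collapsed. The function name `group_by_content` and the `*.sol` glob are assumptions for illustration, and the hash only catches byte-identical files: copies that differ in whitespace, comments, or license headers will land in separate groups.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_by_content(root: Path) -> Dict[str, List[Path]]:
    """Group every .sol file under `root` by the SHA-256 of its bytes.

    Walks the filesystem rather than reading an index file, so files
    that are missing from the index are still found.
    """
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*.sol")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return dict(groups)


def unique_files(root: Path) -> List[Path]:
    """Return one representative path per distinct file content."""
    return [paths[0] for paths in group_by_content(root).values()]
```

Calling `unique_files(Path("contracts"))` on a local checkout (path assumed) yields one file per distinct source text. This only sees submodules that have actually been cloned, so a partial checkout gives a partial result.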