// the find
Doriandarko/RepoToTextForLLMs
Automate the analysis of GitHub repositories for LLMs with RepoToTextForLLMs. Fetch READMEs, structure, and non-binary files efficiently. Outputs include analysis prompts to aid in comprehensive repo evaluation.
A single Python script that dumps a GitHub repo's README, file tree, and all non-binary file contents into one text blob for feeding to an LLM. Exactly what it says on the tin — no more, no less. Useful if you need a quick one-shot repo dump and don't want to set up something heavier.
Single-file, zero-config tool that does one thing — easy to audit and modify. Iterative traversal avoids Python recursion limits on deep trees. Skips binary files automatically, which matters when you're trying to stay under context limits. MIT license, no strings attached.
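To make the traversal point concrete, here is a minimal sketch of an explicit-stack walk that skips binary files. It is not the script's actual code: it walks a local directory rather than calling the GitHub API, and the names `looks_binary` and `dump_tree`, along with the NUL-byte heuristic, are assumptions chosen for illustration.

```python
import os


def looks_binary(data: bytes) -> bool:
    # Assumed heuristic for this sketch: any NUL byte marks the file as binary.
    return b"\x00" in data


def dump_tree(root: str) -> str:
    """Concatenate every text file under `root`, using a stack instead of recursion."""
    chunks = []
    stack = [root]  # directories still to visit
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Defer the subdirectory; depth never touches the call stack.
                stack.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                with open(entry.path, "rb") as fh:
                    data = fh.read()
                if looks_binary(data):
                    continue  # keep binary blobs out of the prompt
                rel = os.path.relpath(entry.path, root)
                text = data.decode("utf-8", errors="replace")
                chunks.append(f"--- {rel} ---\n{text}")
    return "\n".join(chunks)
```

Because pending directories live in a plain list, nesting depth is bounded by memory rather than by Python's recursion limit.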
Last commit was May 2024 and the project looks abandoned — 789 stars but no recent activity. No filtering by file extension or directory, so you'll dump test fixtures, generated files, and lock files alongside the code you actually care about. No token counting or context-window budget management, so large repos will silently blow past your model's limit. The README still has a hardcoded 'YOUR TOKEN HERE' placeholder rather than proper env-var guidance, which is a bad look for a tool that handles GitHub tokens.
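If you do fork it, the three gaps above are small to patch. The sketch below shows one way, under stated assumptions: the skip lists, the `wanted`, `estimate_tokens`, `pack`, and `github_token` helpers, the roughly-four-characters-per-token estimate, and the `GITHUB_TOKEN` variable name are all illustrative choices, not anything the project ships.

```python
import os
from pathlib import PurePosixPath
from typing import Dict, List, Optional, Tuple

# Hypothetical filter lists; tune them for the repo you are dumping.
SKIP_DIRS = {".git", "node_modules", "dist", "build", "fixtures"}
SKIP_NAMES = {"package-lock.json", "yarn.lock", "poetry.lock", "Cargo.lock"}
KEEP_EXTS = {".py", ".md", ".toml"}


def wanted(path: str) -> bool:
    """Return True if a repo-relative path passes the directory, name, and extension filters."""
    p = PurePosixPath(path)
    if any(part in SKIP_DIRS for part in p.parts[:-1]):
        return False
    if p.name in SKIP_NAMES:
        return False
    return p.suffix in KEEP_EXTS


def estimate_tokens(text: str) -> int:
    # Crude assumption: about four characters per token. Swap in a real tokenizer if you have one.
    return len(text) // 4 + 1


def pack(files: Dict[str, str], budget: int) -> Tuple[List[str], List[str]]:
    """Pick files in sorted order until the token budget is spent; report what was left out."""
    included: List[str] = []
    dropped: List[str] = []
    used = 0
    for path in sorted(files):
        if not wanted(path):
            continue
        cost = estimate_tokens(files[path])
        if used + cost > budget:
            dropped.append(path)  # surfaced to the caller instead of silently overflowing
            continue
        used += cost
        included.append(path)
    return included, dropped


def github_token() -> Optional[str]:
    # Read the token from the environment rather than pasting it into the source.
    return os.environ.get("GITHUB_TOKEN")
```

Returning the dropped paths means the caller can see when a repo did not fit, rather than finding out from a truncated or rejected prompt.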