// the find

Doriandarko/RepoToTextForLLMs

★ 789 · Python · updated May 2024

Automate the analysis of GitHub repositories for LLMs with RepoToTextForLLMs. Fetch READMEs, structure, and non-binary files efficiently. Outputs include analysis prompts to aid in comprehensive repo evaluation.

A single Python script that dumps a GitHub repo's README, file tree, and all non-binary file contents into one text blob for feeding to an LLM. Exactly what it says on the tin — no more, no less. Useful if you need a quick one-shot repo dump and don't want to set up something heavier.

Single-file, zero-config tool that does one thing — easy to audit and modify. Iterative traversal avoids Python recursion limits on deep trees. Skips binary files automatically, which matters when you're trying to stay under context limits. MIT license, no strings attached.
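
For a sense of what iterative traversal plus binary skipping looks like, here is a minimal sketch. It is not the script's actual code: the nested-dict `Tree` stand-in, the function names, and the NUL-byte heuristic are all assumptions made for illustration.

```python
from typing import Dict, Iterator, Tuple, Union

# Hypothetical in-memory stand-in for a repo: directories are dicts, files are bytes.
Tree = Dict[str, Union["Tree", bytes]]


def looks_binary(data: bytes, sample_size: int = 1024) -> bool:
    """Heuristic (an assumption): content is binary if its first bytes contain a NUL."""
    return b"\x00" in data[:sample_size]


def iter_text_files(root: Tree) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) for every non-binary file, without recursion."""
    stack = [("", root)]  # an explicit stack replaces the Python call stack
    while stack:
        prefix, node = stack.pop()
        for name in sorted(node):
            child = node[name]
            path = f"{prefix}{name}"
            if isinstance(child, dict):
                # Directory: defer it instead of recursing into it.
                stack.append((path + "/", child))
            elif not looks_binary(child):
                yield path, child.decode("utf-8", errors="replace")
```

Because pending directories live in a list rather than in nested function calls, depth is bounded by memory, not by Python's recursion limit.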

Last commit was May 2024 and the project looks abandoned — 789 stars but no recent activity. No filtering by file extension or directory, so you'll dump test fixtures, generated files, and lock files alongside the code you actually care about. No token counting or context-window budget management, so large repos will silently blow past your model's limit. The README still has a hardcoded 'YOUR TOKEN HERE' placeholder rather than proper env-var guidance, which is a bad look for a tool that handles GitHub tokens.
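
If you fork it, the three gaps above are small to patch. The sketch below is one way to do it, not anything the repo ships: the skip lists, the 4-characters-per-token estimate, the function names, and the `GITHUB_TOKEN` variable name are all assumptions chosen for illustration.

```python
import os
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple

# All names and defaults below are illustrative assumptions, not part of the repo.
SKIP_DIRS = {"node_modules", "dist", "build", "fixtures", ".git"}
SKIP_NAMES = {"package-lock.json", "poetry.lock", "yarn.lock"}
KEEP_SUFFIXES = {".py", ".md", ".toml"}


def wanted(path: str) -> bool:
    """Keep a file only if no directory, name, or suffix rule excludes it."""
    p = PurePosixPath(path)
    if any(part in SKIP_DIRS for part in p.parts[:-1]):
        return False
    return p.name not in SKIP_NAMES and p.suffix in KEEP_SUFFIXES


def estimate_tokens(text: str) -> int:
    """Crude approximation (about 4 characters per token); not a real tokenizer."""
    return len(text) // 4 + 1


def pack(files: Iterable[Tuple[str, str]], budget: int) -> Tuple[List[str], List[str]]:
    """Return (included, skipped) paths so overflow is reported instead of silent."""
    included: List[str] = []
    skipped: List[str] = []
    used = 0
    for path, text in files:
        if not wanted(path):
            continue  # filtered out by the rules above, not a budget casualty
        cost = estimate_tokens(text)
        if used + cost > budget:
            skipped.append(path)
            continue
        included.append(path)
        used += cost
    return included, skipped


def github_token() -> str:
    """Read the token from the environment instead of editing the source."""
    token = os.environ.get("GITHUB_TOKEN")
    if not token:
        raise RuntimeError("Set GITHUB_TOKEN before running.")
    return token
```

Swap `estimate_tokens` for your model's real tokenizer if the budget needs to be exact; the character heuristic is only good enough to warn you before a dump overshoots.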
