// the find
chiphuyen/sotawhat
Returns latest research results by crawling arxiv papers and summarizing abstracts. Helps you stay afloat with so many new papers everyday.
A CLI tool that searches arXiv, scrapes recent papers matching a keyword, and summarizes their abstracts using NLTK. Aimed at researchers or engineers who want a quick terminal-based way to scan what's been published on a topic without opening a browser.
Simple, focused interface: one command, one keyword, useful output with no ceremony. Summarizing abstracts rather than just listing titles saves real time. MIT licensed with a pip-installable package structure, so it's easy to drop into a workflow. Covers a genuinely useful niche that arXiv's own search UI handles poorly for quick lookups.
The last meaningful commit appears to be years old, and the Heroku web UI it references is almost certainly dead (Heroku ended free dynos in 2022). The summarization is NLTK-based extractive summarization from 2018, not LLM-based, so it's essentially sentence ranking over abstract text that is already short. There is no way to filter by date range, so a keyword search can return 2018 and 2024 papers side by side with no way to narrow them down. The SSL certificate workaround and Windows encoding hack documented in the README are warning signs that setup friction was real even when the project was actively maintained.
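To make "sentence ranking" concrete, here is a minimal sketch of frequency-based extractive summarization. It is not sotawhat's actual code: the function names, the tiny stopword list, and the regex sentence splitter are all assumptions made for illustration, and it uses only the standard library rather than NLTK.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; NLTK ships a much fuller one).
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "and", "or", "to", "is", "are",
    "we", "for", "with", "that", "this", "our", "it", "by", "as",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased alphanumeric tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def rank_sentences(text, top_n=1):
    """Return the top_n highest-scoring sentences, kept in original order.

    A sentence's score is the average document-wide frequency of its
    content words, so sentences built from the abstract's recurring
    terms rank highest.
    """
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        if not tokens:
            return 0.0
        return sum(freq[w] for w in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:top_n])  # restore reading order
    return [sentences[i] for i in keep]
```

On a typical four-to-six-sentence abstract, a ranker like this can only pick sentences the reader would have skimmed anyway, which is why the summarization step adds limited value here.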