// the find
lilianweng/emoji-semantic-search
Search the most relevant emojis given a natural language query
A semantic search tool that finds relevant emojis from natural language queries using OpenAI embeddings. It's a demo/toy project by Lilian Weng (OpenAI researcher): clever for what it is, and a useful starting point if you need emoji autocomplete in a product.
Pre-built embedding index ships with the repo so you don't have to generate it yourself. The architecture is simple and easy to follow: embedding lookup, cosine similarity, done. The live demo at emojisearch.app shows it actually works. Good reference implementation for anyone learning how to build a semantic search over a small fixed corpus.
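To make "embedding lookup, cosine similarity, done" concrete, here is a minimal sketch of that pattern in plain Python. It is not the repo's code: the function names, the labels, and the 3-dimensional vectors are made up for illustration, and a real index would hold model-generated embeddings with far more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=3):
    """Return the k (label, score) pairs most similar to query_vec."""
    scored = [(label, cosine(query_vec, vec)) for label, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy "embeddings", invented for this example only.
TOY_INDEX = {
    "fire": [0.9, 0.1, 0.0],
    "droplet": [0.0, 0.2, 0.9],
    "sun": [0.8, 0.3, 0.1],
}

# Stand-in for an embedded query; in practice this vector would come
# from the same embedding model that produced the index.
results = top_k([1.0, 0.2, 0.0], TOY_INDEX, k=2)
print([label for label, _ in results])
```

With a corpus this small, a brute-force scan over every entry is fine; an approximate nearest-neighbour index only starts to matter at much larger scales.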
Last commit was January 2023. OpenAI's embedding APIs have changed since then, and it's unclear whether the pre-built index still matches current API output. Hard dependency on OpenAI, with no option to swap in a local embedding model. The corpus is a static emoji-data.txt; there's no path to update it as new emoji are released. Flask backend with no auth, rate limiting, or any production hardening: copy the idea, don't deploy this as-is.
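If you do borrow the pre-built index, one way to test the staleness worry is to re-embed a handful of entries and compare them with the stored vectors. The sketch below assumes a hypothetical `embed` callable that wraps whichever embedding model you use; `check_index`, the 0.99 threshold, and the toy data are illustrative choices, not anything from the repo.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def check_index(index, embed, sample_labels, min_similarity=0.99):
    """Re-embed a few labels and flag stored vectors that no longer match.

    `embed` is a hypothetical callable: text in, vector out.
    Returns a list of (label, reason) pairs for suspicious entries.
    """
    problems = []
    for label in sample_labels:
        stored = index[label]
        fresh = embed(label)
        if len(fresh) != len(stored):
            problems.append((label, "dimension mismatch"))
        elif cosine(stored, fresh) < min_similarity:
            problems.append((label, "vector drift"))
    return problems

# Toy stored index and a fake embedder, both invented for this example.
stored_index = {"fire": [1.0, 0.0], "sun": [0.0, 1.0], "moon": [1.0, 0.0]}
fake_model = {"fire": [1.0, 0.0], "sun": [1.0, 0.0, 0.0], "moon": [0.0, 1.0]}

problems = check_index(stored_index, fake_model.get, ["fire", "sun", "moon"])
print(problems)
```

Any flagged entry suggests the whole index should be regenerated with the current model, since vectors from different models are not comparable.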