// the find

FreedomIntelligence/LLMZoo

★ 2,944 · Python · Apache-2.0 · updated Nov 2023

⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡

A 2023-era research project from CUHK Shenzhen that fine-tuned BLOOMZ and LLaMA into multilingual chat models (Phoenix and Chimera), with an attached evaluation framework for comparing open-source LLMs. Aimed at researchers who wanted a reproducible ChatGPT-like baseline across non-English languages when nothing else covered that gap. That gap no longer exists.

The multilingual training data pipeline is well-documented — the two-step translation+generation approach for bootstrapping instruction data in 40+ languages is concrete and replicable. The evaluation framework ships with actual answer files and scoring scripts rather than just claiming numbers. INT4/INT8 quantization support was baked in early, and the delta-weight distribution workaround for LLaMA's license was the correct call at the time.

Last commit is November 2023 — this is a research artifact, not a maintained library. The base models (BLOOMZ-7b1-mt, LLaMA 7/13B) are two generations behind; there is no path to current architectures. Chimera requires delta-weight reconstruction from the original LLaMA weights, which is fiddly and depends on a third-party patched AutoGPTQ repo that may itself be dead. The evaluation methodology uses GPT-4 as a judge comparing against ChatGPT, which embeds significant bias and was already a questionable benchmark design when this shipped.
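For readers unfamiliar with delta-weight distribution, here is a minimal sketch of the reconstruction step, assuming the usual convention that each released delta tensor is the fine-tuned weight minus the original base weight. The function name `apply_delta` and the list-of-floats `Tensor` stand-in are illustrative choices for this sketch, not the repo's actual tooling, which operates on real checkpoints.

```python
from typing import Dict, List

# Stand-in for a real tensor type; a flat list of floats keeps the sketch dependency-free.
Tensor = List[float]


def apply_delta(base: Dict[str, Tensor], delta: Dict[str, Tensor]) -> Dict[str, Tensor]:
    """Rebuild fine-tuned weights as base + delta, parameter by parameter.

    Assumes both state dicts share the same parameter names and shapes.
    """
    if base.keys() != delta.keys():
        mismatched = sorted(set(base) ^ set(delta))
        raise KeyError(f"parameter names differ between base and delta: {mismatched}")

    target: Dict[str, Tensor] = {}
    for name, base_w in base.items():
        delta_w = delta[name]
        if len(base_w) != len(delta_w):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned weight.
        target[name] = [b + d for b, d in zip(base_w, delta_w)]
    return target


if __name__ == "__main__":
    base = {"layer.weight": [1.0, 2.0]}
    delta = {"layer.weight": [0.5, -1.0]}
    print(apply_delta(base, delta))
```

The arithmetic is trivial. The fiddly part in practice is obtaining the exact base checkpoint the delta was computed against, since a different conversion or tokenizer revision can produce mismatched names or shapes.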
