// the find

yangjianxin1/Firefly

★ 6,639 · Python · updated Oct 2024

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Firefly is a Chinese-first LLM fine-tuning toolkit supporting full-parameter training, LoRA, and QLoRA across a wide range of open-source models. It targets researchers and practitioners who want to fine-tune models like Qwen2.5, Llama3, or Mixtral without writing training boilerplate from scratch. The project is heavily oriented toward Chinese NLP use cases but works for English too.

Config-file-driven training is a genuine usability win: you swap a JSON file to change model, mode, and task type without touching code. The per-model chat template alignment in component/template.py is the kind of tedious-but-necessary work that saves real debugging time. QLoRA results are published on the Open LLM Leaderboard and are competitive: firefly-mixtral-8x7b scores 70.16, ahead of Yi-34B-Chat, which makes it a credible public benchmark rather than a cherry-picked internal number. The Unsloth integration reduces memory use (42% on Llama3-8B), which is practical for anyone training on consumer hardware.

The last commit was in October 2024, so there is no support for Llama 3.1/3.2, Qwen2.5 72B, Phi-4, or DeepSeek-V3. The dependency story is a mess: different model families require manually uninstalling packages (xformers for Baichuan2, flash-attn for Qwen), which will burn time on any fresh setup. Experiment tracking out of the box is limited to TensorBoard, with no W&B or MLflow hooks. The 1024-token training length hard-coded across most configs will limit anyone doing long-context fine-tuning unless they manually edit every config file.
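
Editing the training length by hand in every config is tedious, so a small script can do it in bulk. The sketch below is not part of Firefly. It assumes the configs are flat JSON objects and that the length lives under a key named `max_seq_length`; both the key name and the helper `set_max_seq_length` are assumptions for illustration, so check them against the actual config files before running it on a real checkout.

```python
import json
import tempfile
from pathlib import Path


def set_max_seq_length(config_dir, new_length, key="max_seq_length"):
    """Rewrite `key` in every JSON config under config_dir; return changed paths.

    The key name is an assumption about the config layout, not a confirmed
    Firefly setting.
    """
    changed = []
    for path in sorted(Path(config_dir).rglob("*.json")):
        config = json.loads(path.read_text(encoding="utf-8"))
        # Skip files that are not JSON objects or that do not define the key.
        if not isinstance(config, dict) or key not in config:
            continue
        # Skip files that already have the requested length.
        if config[key] == new_length:
            continue
        config[key] = new_length
        path.write_text(
            json.dumps(config, indent=4, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
        changed.append(path)
    return changed


def demo():
    """Exercise the helper on throwaway configs instead of a real checkout."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical config contents, written only for this demonstration.
        (root / "a.json").write_text(
            json.dumps({"max_seq_length": 1024, "model_name_or_path": "placeholder"}),
            encoding="utf-8",
        )
        (root / "b.json").write_text(
            json.dumps({"learning_rate": 2e-4}), encoding="utf-8"
        )
        changed = set_max_seq_length(root, 4096)
        new_value = json.loads((root / "a.json").read_text(encoding="utf-8"))[key_name()]
        return [p.name for p in changed], new_value


def key_name():
    """Return the assumed config key, kept in one place so it is easy to change."""
    return "max_seq_length"
```

On a real checkout you would call `set_max_seq_length` with the directory that holds the training configs and review the result with `git diff` before training. Longer sequences also raise memory use, so batch size or gradient accumulation may need adjusting alongside the length.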
