// the find

EricLBuehler/candle-lora

★ 174 · Rust · MIT · updated Apr 2025

Low rank adaptation (LoRA) for Candle.

LoRA fine-tuning for HuggingFace's Candle framework, written in Rust. Swaps Linear, Conv1d, Conv2d, and Embedding layers with LoRA counterparts via a proc-macro that keeps model struct changes minimal. Aimed at Rust ML practitioners who want to fine-tune models without leaving the Candle ecosystem.

The proc-macro approach is genuinely ergonomic: adding `#[derive(AutoLoraConvert)]` and `replace_layer_fields` to an existing Candle model struct is much less invasive than manually wiring adapter layers. Weight merging is implemented, so a merged adapter adds no inference overhead at deployment time. Pre-converted transformer examples (Llama, Mistral, BERT, Falcon, etc.) lower the barrier to entry considerably. The author also built mistral.rs on top of this, so at least one production-grade consumer exists as evidence that it actually works.
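To make the weight-merging point concrete, here is a minimal sketch of the underlying arithmetic in plain Rust. It does not use candle-lora's API: the `Mat` type and the `forward_unmerged` and `merge` functions are hypothetical names made up for illustration, and the `alpha / r` scaling is the standard LoRA convention, assumed here rather than taken from the crate. The idea is that folding the low-rank update into the base weight once, as W' = W + (alpha / r) · B · A, gives the same output as running the adapter path on every forward call.

```rust
/// Row-major dense matrix, kept deliberately simple (hypothetical helper, not a Candle type).
#[derive(Clone, Debug)]
struct Mat {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

impl Mat {
    fn new(rows: usize, cols: usize, data: Vec<f32>) -> Self {
        assert_eq!(data.len(), rows * cols);
        Mat { rows, cols, data }
    }

    /// Plain triple-loop matrix product: self (m x k) times other (k x n).
    fn matmul(&self, other: &Mat) -> Mat {
        assert_eq!(self.cols, other.rows);
        let mut out = vec![0.0f32; self.rows * other.cols];
        for i in 0..self.rows {
            for k in 0..self.cols {
                let a = self.data[i * self.cols + k];
                for j in 0..other.cols {
                    out[i * other.cols + j] += a * other.data[k * other.cols + j];
                }
            }
        }
        Mat::new(self.rows, other.cols, out)
    }

    /// Element-wise self + scale * other.
    fn add_scaled(&self, other: &Mat, scale: f32) -> Mat {
        assert_eq!((self.rows, self.cols), (other.rows, other.cols));
        let data: Vec<f32> = self
            .data
            .iter()
            .zip(&other.data)
            .map(|(x, y)| x + scale * y)
            .collect();
        Mat::new(self.rows, self.cols, data)
    }
}

/// Unmerged forward pass: y = W x + (alpha / r) * B (A x).
/// The rank r is read from the number of rows of A.
fn forward_unmerged(w: &Mat, a: &Mat, b: &Mat, alpha: f32, x: &Mat) -> Mat {
    let r = a.rows as f32;
    let base = w.matmul(x);
    let delta = b.matmul(&a.matmul(x));
    base.add_scaled(&delta, alpha / r)
}

/// One-time merge: W' = W + (alpha / r) * B A.
fn merge(w: &Mat, a: &Mat, b: &Mat, alpha: f32) -> Mat {
    let r = a.rows as f32;
    w.add_scaled(&b.matmul(a), alpha / r)
}

fn main() {
    // W: 2x3 base weight, A: 1x3 and B: 2x1 (rank 1), x: 3x1 input.
    let w = Mat::new(2, 3, vec![1.0, 0.0, 2.0, 0.0, 1.0, -1.0]);
    let a = Mat::new(1, 3, vec![1.0, 2.0, 3.0]);
    let b = Mat::new(2, 1, vec![0.5, -0.5]);
    let x = Mat::new(3, 1, vec![1.0, 1.0, 1.0]);
    let alpha = 2.0;

    let y_unmerged = forward_unmerged(&w, &a, &b, alpha, &x);
    let y_merged = merge(&w, &a, &b, alpha).matmul(&x);

    // Both paths agree; after merging, inference is a single matmul.
    assert_eq!(y_unmerged.data, y_merged.data);
    println!("{:?}", y_merged.data);
}
```

After the merge, the adapter matrices are no longer needed at inference time, which is why a merged model runs at the same cost as the original layer.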

The weight naming is incompatible with the safetensors key names used by peft, so you can't import adapters trained in Python or export to Python tooling, a significant portability problem for anyone in a mixed stack. The candle-lora-transformers models are forks of candle-transformers, so keeping them in sync with upstream Candle as architectures evolve is a maintenance burden, and it is already showing: the last commit was in April 2025, while Candle itself moves fast. No training loop or optimizer is included, so you still have to wire that yourself, and 'LoRA fine-tuning in Rust' is more aspirational than turnkey. 174 stars and 33 forks after two years suggest limited adoption outside the author's own mistral.rs project.
