// the find

Epistates/pmetal

★ 296 · Rust · NOASSERTION · updated Jun 2026

PMetal: high-performance Apple Silicon framework for local LLM inference, LoRA/QLoRA fine-tuning, serving, quantization, and MLX/Metal acceleration.

PMetal is a Rust-native ML platform for Apple Silicon covering the full stack: local LLM inference, LoRA/QLoRA fine-tuning, GGUF quantization, model merging, knowledge distillation, and a multi-Mac distributed training cluster. It wraps MLX via a C++ FFI bridge and adds custom Metal GPU kernels on top. It is aimed at Mac developers who want to run and fine-tune models locally without leaving the Apple ecosystem.

The TurboQuant KV cache compression is technically interesting: random rotation plus Lloyd-Max quantization with QJL residual correction is an actual implementation of recent research rather than a wrapper around an existing library. The Thunderbolt-aware cluster formation (mDNS auto-discovery, fabric priority ring, automatic fallback on cable unplug) removes a real annoyance of multi-Mac setups and needs no configuration. The 20-crate workspace uses feature-gated compilation, so you can pull in only what you need, and the `easy` API gives a clean entry point for common tasks. Tier-based kernel tuning per GPU family (Base/Pro/Max/Ultra), auto-detected at startup, is the right design: no manual flags are needed to get correct performance.
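
For readers who haven't met Lloyd-Max quantization, here is a minimal sketch of the scalar version in plain Rust. It covers only the quantizer step: the random rotation and QJL residual correction are left out, and every name in it (`lloyd_max`, `nearest_level`, `quantize`, `dequantize`) is hypothetical, not PMetal's API. The idea is to alternate between assigning each sample to its nearest reconstruction level and moving each level to the mean of the samples assigned to it.

```rust
/// Index of the reconstruction level closest to `x`.
fn nearest_level(levels: &[f32], x: f32) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (k, &level) in levels.iter().enumerate() {
        let dist = (x - level).abs();
        if dist < best_dist {
            best = k;
            best_dist = dist;
        }
    }
    best
}

/// Fits `num_levels` reconstruction levels to `samples` with Lloyd-Max
/// iterations. Illustrative only; assumes a non-empty sample set.
fn lloyd_max(samples: &[f32], num_levels: usize, iterations: usize) -> Vec<f32> {
    assert!(!samples.is_empty() && num_levels > 0);
    let (min, max) = samples
        .iter()
        .fold((f32::INFINITY, f32::NEG_INFINITY), |(lo, hi), &x| {
            (lo.min(x), hi.max(x))
        });
    // Start from levels spaced uniformly across the sample range.
    let mut levels: Vec<f32> = (0..num_levels)
        .map(|i| min + (max - min) * (i as f32 + 0.5) / num_levels as f32)
        .collect();
    for _ in 0..iterations {
        let mut sums = vec![0.0f32; num_levels];
        let mut counts = vec![0usize; num_levels];
        // Assignment step: each sample goes to its nearest level.
        for &x in samples {
            let k = nearest_level(&levels, x);
            sums[k] += x;
            counts[k] += 1;
        }
        // Update step: each level moves to the mean of its samples.
        // Levels with no samples keep their previous value.
        for k in 0..num_levels {
            if counts[k] > 0 {
                levels[k] = sums[k] / counts[k] as f32;
            }
        }
    }
    levels
}

/// Encodes a value as the index of its nearest level (at most 256 levels).
fn quantize(levels: &[f32], x: f32) -> u8 {
    assert!(levels.len() <= 256);
    nearest_level(levels, x) as u8
}

/// Decodes an index back to its reconstruction level.
fn dequantize(levels: &[f32], code: u8) -> f32 {
    levels[code as usize]
}
```

In a KV cache setting the levels would be fitted to the distribution of (rotated) cache values, and each value would then be stored as a small integer code instead of a float. How PMetal actually wires this together is not something this sketch claims to show.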

296 stars and 20 forks as of June 2026 is a thin adoption signal for a project this ambitious in scope; the gap between what's listed and what's actually battle-tested is unknown and likely large. Several architectures (Pixtral, Qwen2-VL, Whisper, T5) have code in the tree but aren't wired into the dispatcher, so the headline model support table is misleading for anyone who wants to use those models. Per-architecture partial-layer execution for the distributed case is explicitly documented as not done yet: the cluster feature works for gradient all-reduce, but model parallelism (serving a model that doesn't fit on one Mac) is still a promise. The C++ MLX bridge adds a build-time dependency and a CMake step that will break for anyone without a working Xcode toolchain, and the README doesn't surface this as a gotcha.
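
To make the all-reduce versus model-parallelism distinction concrete, here is a rough single-process sketch of what a gradient all-reduce computes. The function name `all_reduce_mean` is hypothetical, and this is not PMetal's cluster code: a real implementation would exchange buffers between machines over the network. What it illustrates is that every worker ends up with the same averaged gradient, which presumes each worker already holds a full copy of the model.

```rust
/// Simulates an all-reduce (mean) over per-worker gradient buffers that all
/// live in one process. After the call, every worker holds the element-wise
/// average of all workers' gradients.
fn all_reduce_mean(worker_grads: &mut [Vec<f32>]) {
    let num_workers = worker_grads.len();
    if num_workers == 0 {
        return;
    }
    let len = worker_grads[0].len();
    assert!(
        worker_grads.iter().all(|g| g.len() == len),
        "all workers must hold gradients of the same shape"
    );

    // Reduce: sum each gradient element across workers, then average.
    let mut mean = vec![0.0f32; len];
    for grads in worker_grads.iter() {
        for (acc, &g) in mean.iter_mut().zip(grads) {
            *acc += g;
        }
    }
    for acc in mean.iter_mut() {
        *acc /= num_workers as f32;
    }

    // Broadcast: every worker receives the same averaged gradient.
    for grads in worker_grads.iter_mut() {
        grads.copy_from_slice(&mean);
    }
}
```

Because each worker needs gradients for every parameter, this pattern speeds up training of a model that fits on one Mac but does nothing for a model that doesn't. That case needs layers split across machines, which is the part described above as not done yet.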
