// the find

underneathall/pinferencia

★ 543 · Python · Apache-2.0 · updated Feb 2023

Python + Inference - Model Deployment library in Python. Simplest model inference server ever.

Pinferencia wraps any Python model or function into a FastAPI REST endpoint plus a Streamlit GUI in about 5 lines of code. It's aimed at ML practitioners who want to demo or prototype a model without writing API boilerplate. Last commit was February 2023, so it's effectively unmaintained.

The register/serve abstraction is genuinely minimal: you hand it any callable and it works, with no framework-specific adapters. KServe v1/v2 API compatibility is a real differentiator for teams already in a Kubeflow or Triton environment who want a lighter local dev story. 100% statement and branch test coverage is unusual for a project this size and makes the codebase safer to fork. The built-in Streamlit frontend with pluggable templates (image-to-text, translation, camera input) saves meaningful time for demos.
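To make the "hand it any callable" idea concrete, here is a minimal stdlib-only sketch of a register/dispatch layer. It is an illustration of the pattern, not Pinferencia's actual code or API: `ModelRegistry`, `SumModel`, and the `entrypoint` argument as used here are hypothetical names chosen for the example.

```python
from typing import Any, Callable, Dict, Optional


class ModelRegistry:
    """Hypothetical stand-in for a register/serve layer; not Pinferencia's API."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], Any]] = {}

    def register(self, model_name: str, model: Any, entrypoint: Optional[str] = None) -> None:
        # A plain function is stored directly; for an object, look up the named method.
        handler = getattr(model, entrypoint) if entrypoint else model
        if not callable(handler):
            raise TypeError(f"{model_name!r} does not resolve to a callable")
        self._handlers[model_name] = handler

    def predict(self, model_name: str, data: Any) -> Any:
        # A real server would invoke this from an HTTP route handler.
        return self._handlers[model_name](data)


class SumModel:
    """Toy model object whose prediction method is named explicitly at registration."""

    def predict(self, data):
        return sum(data)


registry = ModelRegistry()
registry.register("sum", SumModel(), entrypoint="predict")  # object plus method name
registry.register("upper", str.upper)                       # bare function

result_sum = registry.predict("sum", [1, 2, 3])
result_upper = registry.predict("upper", "hello")
print(result_sum, result_upper)
```

The point of the sketch is that nothing in the registry knows which ML framework produced the model; anything callable can be registered under a name and dispatched to.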

Dead since February 2023: no commits in over three years, open issues unattended, and an LGTM badge that links to a service that has itself shut down. No support for async inference, batching, or model versioning, so it tops out at single-request prototypes and can't be pushed toward any real serving load. The Streamlit dependency is baked into the main install path, dragging in a heavy frontend stack for users who only want the REST API. No GPU-aware worker pool or queue, so if your model takes 10 seconds, requests just block sequentially.
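The cost of sequential blocking is easy to see in a toy timing sketch. This does not use Pinferencia; `slow_model` is a hypothetical stand-in that sleeps for 0.1 seconds in place of a real 10-second inference, and the thread pool is just one assumed way a server could overlap requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY_S = 0.1  # scaled-down stand-in for a slow model (the review's example is 10 s)


def slow_model(x: int) -> int:
    # Hypothetical model: sleeps to simulate inference latency, then doubles the input.
    time.sleep(DELAY_S)
    return x * 2


inputs = [1, 2, 3, 4]

# Sequential handling: each request waits for the previous one to finish.
start = time.perf_counter()
sequential_results = [slow_model(x) for x in inputs]
sequential_elapsed = time.perf_counter() - start

# Pooled handling: a small worker pool lets the waits overlap.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(inputs)) as pool:
    pooled_results = list(pool.map(slow_model, inputs))
pooled_elapsed = time.perf_counter() - start

print(f"sequential: {sequential_elapsed:.2f}s, pooled: {pooled_elapsed:.2f}s")
```

With four requests, the sequential path takes roughly four times the per-request latency, while the pooled path takes roughly one. Threads only help when the model releases the GIL while it works (sleeping, I/O, most native inference calls); a CPU-bound pure-Python model would need separate processes instead.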
