
// the find

jingsongliujing/OnnxOCR

★ 1,815 · Python · Apache-2.0 · updated Jun 2026

A lightweight OCR system based on PaddleOCR, decoupled from the PaddlePaddle deep learning training framework, with ultra-fast inference speed.

OnnxOCR takes PaddleOCR's detection/recognition models and re-exports them to ONNX, then runs everything through ONNXRuntime instead of the PaddlePaddle framework. The result is a self-contained OCR pipeline that covers text detection, recognition, table structure, layout analysis, document-to-Markdown, and license plates, all without a ~2GB training framework as a runtime dependency. The target audience is anyone who needs PaddleOCR-quality results in a production Python service without the Paddle install.

The single inference_engine.py choke point is genuinely good architecture — GPU/NPU provider swaps happen in one file, not scattered across six model wrappers. PP-OCRv5 support (added May 2025) covers Simplified Chinese, Traditional Chinese, Pinyin, English, and Japanese in a single model, which matters for mixed-script documents. The source-level vendoring of rapid_layout, rapid_table, and rapid_doc means you're not chasing three separate package versions that may drift against each other. The Qwen3.5-2B ONNX pipeline for post-OCR structured extraction is a practical addition — OCR then local LLM extraction with no cloud calls is a real deployment pattern.
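To make the "one choke point" idea concrete, here is a minimal sketch of what centralised provider selection can look like with ONNX Runtime. It is not taken from the repo's inference_engine.py: the names `PREFERRED`, `pick_providers`, and `make_session` are hypothetical, and the preference order (CUDA, then Huawei's CANN as a stand-in for "NPU", then CPU) is an assumption.

```python
from typing import List, Sequence

# Assumed preference order, for illustration only. The strings themselves are
# standard ONNX Runtime execution provider names.
PREFERRED = (
    "CUDAExecutionProvider",
    "CANNExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available: Sequence[str],
                   preferred: Sequence[str] = PREFERRED) -> List[str]:
    """Keep the preferred providers that are available, in preference order.

    CPU is always appended as a last resort so session creation has a fallback.
    """
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def make_session(model_path: str):
    """Hypothetical single entry point that every model wrapper would call."""
    # Imported lazily so the selection logic above can be tested without
    # onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

With this shape, moving a deployment from GPU to CPU or NPU means editing one tuple rather than each detection, recognition, or layout wrapper.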

The rapid_doc subtree is enormous (60+ files vendored wholesale), and it carries its own inference engine, download utilities, and YAML configs that partially duplicate what the top-level package already provides — this will quietly diverge over time. There are no accuracy benchmarks anywhere in the repo: no CER numbers, no comparison to PaddleOCR 3.0 on a standard dataset, just the claim that recognition accuracy is 'consistent.' Model download is fragile for non-China users — HuggingFace is listed as the fallback, but the download script defaults to ModelScope and the Qwen model has no HuggingFace mirror at all. The Flask API service has no authentication and accepts base64 images up to 200 MB by default, which is a bad default for anything exposed beyond localhost.
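If you do expose the API beyond localhost, the two missing guards are cheap to add. The sketch below is framework-agnostic and not from the repo: `is_authorized`, `decode_image`, and the 10 MB cap are all hypothetical choices. In a Flask app you would call these at the top of the request handler, and Flask's own `MAX_CONTENT_LENGTH` config key can additionally cap the raw request body.

```python
import base64
import binascii
import hmac

# Assumed cap for illustration; far below the 200 MB default mentioned above.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def is_authorized(auth_header, expected_token):
    """Check an 'Authorization: Bearer <token>' header in constant time."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    supplied = auth_header[len("Bearer "):]
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


def decode_image(b64_text, max_bytes=MAX_IMAGE_BYTES):
    """Decode a base64 image, rejecting oversized or malformed payloads."""
    # Every 4 base64 characters encode at most 3 bytes, so the encoded length
    # bounds the decoded size and lets us refuse before allocating anything.
    if len(b64_text) > 4 * ((max_bytes + 2) // 3):
        raise ValueError("image too large")
    try:
        data = base64.b64decode(b64_text, validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid base64") from exc
    if len(data) > max_bytes:
        raise ValueError("image too large")
    return data
```

A handler would return 401 when `is_authorized` is false and 400 or 413 when `decode_image` raises, before any image reaches the OCR models.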
