// the find

Defalt-Meh/Sentinel-C

C · MIT · updated Aug 2025

A tiny, CPU-only edge agent that locally: (1) detects/redacts PII & risky content, (2) classifies user intent to pick the right tool, and (3) decides whether to call the cloud LLM (Azure OpenAI) or answer locally, all routed by the blazing-fast C MLP.

A local PII detection sidecar that combines regex validators with a tiny C MLP to flag and redact emails, credit cards, JWTs, and similar sensitive content. The C extension (Framework-C) is the centerpiece and is genuinely fast, with sub-millisecond inference on CPU. Aimed at developers who want a lightweight pre-filter before sending text to a cloud LLM.
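To make the regex-plus-validator idea concrete, here is a minimal sketch of how such a pre-filter might work. It is not taken from the Sentinel-C source: the patterns, the `redact` function, and the placeholder tokens are all assumptions chosen for illustration. A regex proposes candidates, and a checksum (Luhn for cards, mod-97 for IBANs) decides whether a candidate is actually redacted.

```python
import re

# Hypothetical patterns for illustration; the repo's real patterns may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")   # 13 to 19 digits, optional separators
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_ok(iban: str) -> bool:
    """Mod-97 check: move the first four characters to the end,
    map letters to 10..35, and require the remainder to be 1."""
    rearranged = iban[4:] + iban[:4]
    number = "".join(str(int(c, 36)) for c in rearranged)
    return int(number) % 97 == 1


def redact(text: str) -> str:
    """Replace emails, and only checksum-valid cards and IBANs, with tags."""
    def card(m):
        digits = re.sub(r"\D", "", m.group())
        return "[CARD]" if luhn_ok(digits) else m.group()

    def iban(m):
        return "[IBAN]" if iban_ok(m.group()) else m.group()

    text = EMAIL_RE.sub("[EMAIL]", text)
    text = IBAN_RE.sub(iban, text)
    text = CARD_RE.sub(card, text)
    return text
```

The point of the checksum step is precision: a 16-digit order number that fails Luhn is left alone, so the deterministic layer rarely redacts text that merely looks like PII.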

The regex+validator layer is the right design choice: deterministic detection for well-structured PII (Luhn-validated card numbers, mod-97 IBAN checks) runs before the model does. The featurizer is honest about what it does: char n-gram hashing with FNV-1a is simple, fast, and reproducible. The per-input model comparison panel (Framework-C vs PyTorch vs sklearn) is a useful sanity-check tool, not just a demo gimmick. Sub-millisecond C inference at batch size 1 on commodity hardware is a real advantage if you're processing high-frequency streams.
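For readers unfamiliar with the hashing trick, the featurizer idea can be sketched as follows. This is an assumption-laden illustration, not the repo's code: the n-gram size, the vector dimension, the lowercasing, and the normalization are all hypothetical choices, and the sketch hashes UTF-8 byte n-grams, which coincide with character n-grams only for ASCII text. The FNV-1a constants are the standard 32-bit ones.

```python
from typing import List

FNV_OFFSET = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193   # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def featurize(text: str, n: int = 3, dim: int = 256) -> List[float]:
    """Hash every n-gram into one of `dim` buckets and return
    normalized bucket counts (n and dim are hypothetical defaults)."""
    vec = [0.0] * dim
    data = text.lower().encode("utf-8")
    for i in range(len(data) - n + 1):
        vec[fnv1a_32(data[i:i + n]) % dim] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec
```

Because there is no learned vocabulary, the same input always maps to the same fixed-length vector, which is what makes a featurizer like this reproducible across a C implementation and a Python reference.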

The MLP trains at startup on synthetic data from a hardcoded generator, with no mechanism to bring your own labeled data or improve recall on domain-specific PII patterns. Zero stars, no tests, and a single-file server mean this is early-prototype territory, not something you'd trust in a compliance-sensitive pipeline. The description promises intent classification and a cloud LLM routing gate, but neither exists in the repo; the actual code only does PII detection. Windows is unsupported (the quickstart covers macOS/Linux only, and there is no CI).
