// the find
mangiucugna/json_repair
Repair malformed JSON from LLMs, APIs, logs, and user input in Python.
A Python library that fixes malformed JSON (missing brackets, stray prose, truncated values, single-quoted strings) without losing the actual content. The primary use case is cleaning up LLM output before passing it to json.loads(). At ~5k stars it's clearly filling a real gap.
Drop-in replacement API: json_repair.loads() mirrors json.loads() exactly, so you can swap it in one line. Schema-guided repair with Pydantic v2 support is genuinely useful: coercing '1' to int or dropping extra fields before your validator runs saves a layer of pre-processing code. Streaming support via stream_stable means you can run it on partial LLM output mid-generation without waiting for the full response. The codebase is properly split by concern (parse_array, parse_string, parse_object) rather than one 2000-line monster, which matters when you need to debug why a specific repair decision was made.
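A minimal sketch of the one-line swap, assuming the package is installed under the import name json_repair. The fallback import and the helper name repair_partial are illustrative, not part of the library. The stream_stable keyword on repair_json() is the one named above; check its exact signature against the current README before relying on it.

```python
import json

# The swap itself: import loads from json_repair instead of json.
# The ImportError fallback only keeps this sketch runnable when the
# package is not installed; in real code you would import it directly.
try:
    from json_repair import loads  # tolerant parser
except ImportError:
    from json import loads  # strict stdlib parser

# Valid JSON parses the same way with either implementation.
data = loads('{"name": "Ada", "tags": ["x", "y"]}')


def repair_partial(buffer: str) -> str:
    """Repair the text received so far from a streaming response.

    Hypothetical helper: it assumes repair_json() accepts stream_stable
    and returns a JSON string.
    """
    import json_repair  # lazy import; needs the package installed

    return json_repair.repair_json(buffer, stream_stable=True)
```

Calling repair_partial() on each accumulated buffer, rather than on individual chunks, is the usage the streaming flag seems intended for.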
Schema repair is explicitly flagged as beta, and 'salvage' mode in particular does things like reorder array items to match object property order. That's a data mutation footgun if you don't read the docs carefully. The repair heuristics are fundamentally best-effort: if an LLM truncates mid-value with no structural signal, the library fills in an empty string or null, and you have no way to know that happened without inspecting the output. There is no async API: if you're in an async FastAPI handler streaming from an LLM, you're either calling this synchronously in a hot path or wrapping it in run_in_executor. The orjson pattern shown in the docs works, but it puts you back to writing your own try/except fallback anyway.
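The two workarounds mentioned above (a hand-written try/except fallback, and pushing the synchronous call off the event loop) can be sketched with the standard library alone. The function names and the was_repaired flag are hypothetical, and repair stands for any tolerant parser you pass in, such as json_repair.loads. The flag is one way to surface that a repair happened, since the library itself won't tell you.

```python
import asyncio
import json
from typing import Any, Callable, Tuple


def loads_flagged(text: str, repair: Callable[[str], Any]) -> Tuple[Any, bool]:
    """Return (value, was_repaired) so callers can tell repaired data apart."""
    try:
        # Strict parse first: well-formed input never touches the repair path.
        return json.loads(text), False
    except json.JSONDecodeError:
        # Best-effort path: the result may contain filled-in values.
        return repair(text), True


async def loads_flagged_async(
    text: str, repair: Callable[[str], Any]
) -> Tuple[Any, bool]:
    """Run the synchronous parse in a worker thread (Python 3.9+)."""
    return await asyncio.to_thread(loads_flagged, text, repair)
```

In a handler this would look like value, was_repaired = await loads_flagged_async(raw, json_repair.loads), with was_repaired logged or used to trigger stricter validation. A worker thread keeps the event loop responsive, but the parse still competes for the GIL, so it is not a substitute for a real async API.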