// the find
AlexJReid/scribe
scribe turns healthcare X12 EDI into auditable events: 837 claims and 835 remits become versioned aggregates and ledger balance projections
scribe is a C binary that parses healthcare X12 EDI files (837 claims, 835 remits, 834 enrollment, 270/271 eligibility) and emits structured domain events with byte offsets and control numbers into an immutable journal. It then stitches those events into versioned claim aggregates with a SQLite read store and outbox for downstream fan-out. Aimed at backend engineers building healthcare data pipelines who are tired of writing fragile regex parsers against raw EDI.
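To make the byte-offset idea concrete, here is a minimal sketch of a segment scanner in C. It is not scribe's code: `seg_event` and `scan_segments` are hypothetical names, and the sketch assumes `~` terminates a segment and `*` separates elements, whereas a real parser reads both delimiters from the ISA header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event record; not scribe's actual schema. */
typedef struct {
    size_t offset;    /* byte offset of the segment's first byte */
    size_t length;    /* bytes in the segment, excluding the terminator */
    size_t position;  /* 1-based segment position in the input */
    char   id[4];     /* segment ID such as "ST" or "CLM", NUL-terminated */
} seg_event;

/* Scan buf[0..len) and fill out[] with at most max events.
 * Assumption: '~' ends a segment and '*' separates elements.
 * Returns the number of events written. */
static size_t scan_segments(const char *buf, size_t len,
                            seg_event *out, size_t max)
{
    assert(buf != NULL && out != NULL);

    size_t n = 0, i = 0;
    while (i < len && n < max) {
        /* Skip line breaks that some senders place after the terminator. */
        while (i < len && (buf[i] == '\r' || buf[i] == '\n'))
            i++;

        size_t start = i;
        while (i < len && buf[i] != '~')
            i++;
        if (i == len)
            break;  /* trailing bytes without a terminator: emit nothing */

        /* The segment ID is the text before the first '*', capped at 3 bytes. */
        size_t seglen = i - start;
        size_t idlen = 0;
        while (idlen < seglen && idlen < 3 && buf[start + idlen] != '*')
            idlen++;

        seg_event *e = &out[n];
        e->offset = start;
        e->length = seglen;
        e->position = n + 1;
        memcpy(e->id, buf + start, idlen);
        e->id[idlen] = '\0';

        n++;
        i++;  /* step past the '~' terminator */
    }
    return n;
}
```

Because each record keeps `offset` and `length`, a consumer can reopen the original file and read back exactly the bytes that produced an event, without re-parsing anything before them.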
The event sourcing model is well thought out. Every event carries a byte offset and segment position, so any fact can be traced back to its exact location in the original EDI file, which is genuinely useful for audit and debugging. PHI tokenization is a first-class design decision rather than an afterthought: raw PHI stays out of normal event flows and only resolves through a separate vault. The delta export as a self-contained SQLite file is a practical handoff mechanism that avoids building HTTP APIs for batch consumers. It ships as a single static binary with no runtime dependencies beyond SQLite, OpenSSL, and zstd, which is the right call for a tool that runs as a cron job or Lambda trigger.
One star, zero forks, and a very recent first push: this is early personal-project territory. The disclaimer that it is not a full X12/TR3 validator is load-bearing, because real claims adjudication environments will hit edge cases in payer-specific X12 dialects that a hobby parser won't cover. The parser is written in C, which is fine for performance but raises the barrier to contribution and debugging, and memory safety bugs in the EDI parsing paths (variable-length segments, nested loops, external file input) are a real risk without evidence of fuzzing. SQLite as the read store is practical for demos but will become a bottleneck at payer or clearinghouse volume. The README acknowledges this but punts it to 'add a managed database later,' which is not a plan. No test coverage numbers are visible, and the test suite appears to be hand-written unit tests against synthetic fixtures, not anything resembling production EDI diversity.
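For a sense of what fuzzing evidence could look like, here is a minimal libFuzzer harness sketch in C. `parse_edi` is a hypothetical stand-in that only counts `~` terminators; scribe's real entry point and its signature are not known here, so the stand-in would need to be swapped for the actual parser call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parser entry point. It only counts
 * '~'-terminated segments; replace it with the real parse call. */
static size_t parse_edi(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    size_t segments = 0;
    for (size_t i = 0; i < size; i++)
        if (data[i] == '~')
            segments++;
    return segments;
}

/* libFuzzer entry point: the fuzzer calls this repeatedly with mutated
 * inputs and reports any crash or sanitizer finding. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    parse_edi(data, size);
    return 0;  /* libFuzzer expects 0 from a normal run */
}
```

A harness like this is typically built with `clang -g -fsanitize=fuzzer,address` so that AddressSanitizer flags out-of-bounds reads and writes triggered by malformed segments.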