// the find

bbcarchdev/twine

★ 10 · C · Apache-2.0 · updated Jan 2018

An RDF workflow engine

Twine is a C-based RDF processing pipeline engine from the BBC Archive team, built around a plugin architecture where input handlers parse incoming data into RDF graphs and processors transform or store them. It's designed for bulk RDF ingestion workflows — think loading GeoNames dumps or S3-hosted RDF into a SPARQL store. Last commit was January 2018; this is abandoned infrastructure code.

The workflow model is clean: a workflow is just a comma-separated list of named processors in a config file, which is about as simple as it can get for what it does. The MIME-type-driven input dispatch is a sensible design that keeps format-specific code isolated in plugins. The CLI and the daemon share the same workflow engine, so you can test a pipeline synchronously before deploying it as a queue-driven service. Debian packaging and init scripts are included, which was practical ops hygiene for 2017.
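To make the two ideas above concrete, here is a minimal Python sketch of MIME-type dispatch feeding a comma-separated processor chain. It is not Twine's API: Twine is written in C, and every name here (`INPUT_HANDLERS`, `PROCESSORS`, `run_workflow`, the `dedupe` and `store` processors, the toy N-Triples parser) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
Graph = List[Triple]  # stand-in for an RDF graph: a list of (s, p, o) triples

# Hypothetical registries; Twine's real plugin registration is done in C.
INPUT_HANDLERS: Dict[str, Callable[[str], Graph]] = {}
PROCESSORS: Dict[str, Callable[[Graph], Graph]] = {}
STORE: Graph = []  # stand-in for a SPARQL store


def input_handler(mime: str):
    """Register a parser for one MIME type."""
    def register(fn):
        INPUT_HANDLERS[mime] = fn
        return fn
    return register


def processor(name: str):
    """Register a named processor that a workflow string can refer to."""
    def register(fn):
        PROCESSORS[name] = fn
        return fn
    return register


@input_handler("application/n-triples")
def parse_ntriples(payload: str) -> Graph:
    # Toy parser: splits "s p o ." lines on whitespace. Not spec-compliant.
    triples = []
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        s, p, o = line.rstrip(" .").split(None, 2)
        triples.append((s, p, o))
    return triples


@processor("dedupe")
def dedupe(graph: Graph) -> Graph:
    # Drop repeated triples while preserving first-seen order.
    seen = set()
    out = []
    for triple in graph:
        if triple not in seen:
            seen.add(triple)
            out.append(triple)
    return out


@processor("store")
def store(graph: Graph) -> Graph:
    STORE.extend(graph)
    return graph


def run_workflow(workflow: str, mime: str, payload: str) -> Graph:
    """Pick a parser by MIME type, then run the named processors in order."""
    names = [n.strip() for n in workflow.split(",") if n.strip()]
    if mime not in INPUT_HANDLERS:
        raise ValueError(f"no input handler registered for {mime}")
    graph = INPUT_HANDLERS[mime](payload)
    for name in names:
        graph = PROCESSORS[name](graph)
    return graph


if __name__ == "__main__":
    data = "<a> <b> <c> .\n<a> <b> <c> .\n<a> <b> <d> .\n"
    print(run_workflow("dedupe,store", "application/n-triples", data))
```

The same `run_workflow` function could be called from a one-shot command or from a queue consumer loop, which is the property the review credits to the shared CLI and daemon engine.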

Dead since 2018 with 10 stars: this is internal BBC tooling that was open-sourced but never grew a community. The build system is GNU autotools, so there is a non-trivial bootstrap step before you can even compile it. It depends on five BBC-internal libraries (libcluster, libmq, libawsclient, etc.) that are themselves likely stale or unmaintained. The README describes the 'update modules' as 'currently undocumented', a red flag that the abstraction wasn't finished when development stopped.
