// the find
NVIDIA/garak
the LLM vulnerability scanner
garak is a black-box security scanner for LLMs: point it at a model endpoint and it fires a battery of attack probes (jailbreaks, prompt injection, encoding tricks, toxicity elicitation, package hallucination), then reports failure rates per probe/detector pair. Think of it as a fuzzer for model behavior rather than model code. It is useful for teams shipping LLM-backed products who need to document what their model resists before it goes to production.
The plugin architecture (probes, detectors, generators, and harnesses are independent, composable classes) makes it straightforward to add a probe for your specific threat model without touching core code. Generator coverage is wide: OpenAI, HuggingFace (local and inference API), AWS Bedrock, litellm, raw REST, and llama.cpp GGUF models. If your LLM is reachable over any interface, there is probably a generator for it. The probe library is grounded in published research rather than made-up edge cases: PromptInject (best paper at the NeurIPS ML Safety Workshop 2022), Language Model Risk Cards, Bad Characters, RealToxicityPrompts. Report output is machine-readable JSONL, so you can pipe results into CI and fail a build when a model regression appears.
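As a rough illustration of the CI idea, the sketch below reads a report file and flags any probe/detector pair whose failure rate exceeds a threshold. It assumes each evaluation record is a JSON object with `entry_type` set to `"eval"` plus `probe`, `detector`, `passed`, and `total` fields; check that against the report your garak version actually writes. The threshold and the function names are hypothetical.

```python
import json

# Hypothetical threshold: fail the build if more than 5% of attempts succeed.
MAX_FAILURE_RATE = 0.05


def failure_rates(report_lines):
    """Return {(probe, detector): failure_rate} from JSONL report lines.

    Assumed record shape (verify against your garak version):
    {"entry_type": "eval", "probe": "...", "detector": "...",
     "passed": 88, "total": 100}
    """
    rates = {}
    for line in report_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("entry_type") != "eval":
            continue  # skip config, attempt, and other record types
        total = record["total"]
        if total == 0:
            continue  # nothing was evaluated for this pair
        failed = total - record["passed"]
        rates[(record["probe"], record["detector"])] = failed / total
    return rates


def regressions(rates, threshold=MAX_FAILURE_RATE):
    """Return the sorted probe/detector pairs whose failure rate exceeds the threshold."""
    return sorted(pair for pair, rate in rates.items() if rate > threshold)


def main(report_path):
    """Print failing pairs and return a process exit code (0 = pass, 1 = fail)."""
    with open(report_path, encoding="utf-8") as fh:
        failing = regressions(failure_rates(fh))
    for probe, detector in failing:
        print(f"FAIL {probe} / {detector}")
    return 1 if failing else 0
```

In a CI step you would call something like `sys.exit(main(path_to_report))`, so that a non-zero exit code fails the job.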
The adaptive red-teaming probe (atkgen) is described in the README itself as 'prototype, mostly stateless', so the most interesting attack vector (an adversarial LLM that reacts to your model's outputs) is the least ready. The Python version is capped at 3.12; it won't install cleanly on 3.13, which is current, a real friction point for new setups. Pass/fail rates are hard to act on in isolation: a 12% failure rate on dan.Dan_11_0 doesn't tell you whether that is exploitable given your actual deployment's system prompt and guardrails, so you need to map probe results back to your threat model manually. The probe set is essentially signature-based: it tests known attack patterns catalogued up to the release you have installed, so novel attack techniques won't appear in any report until someone adds a probe for them.
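One way to make that manual mapping explicit is to keep a small, hand-maintained table of how much each probe family matters in your deployment and join it against the raw rates. The sketch below is not part of garak: the `THREAT_MODEL` table, its relevance labels, and the notes are hypothetical placeholders, and the failure rates in the usage note are illustrative numbers rather than measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    probe: str
    failure_rate: float
    relevance: str
    note: str


# Hypothetical, hand-maintained map from probe family (the part of the probe
# name before the first dot) to its relevance in *your* deployment.
THREAT_MODEL = {
    "dan": ("high", "chat UI is exposed to anonymous users"),
    "encoding": ("low", "input filter rejects encoded payloads"),
}

# Review order: known-high first, then anything nobody has assessed yet.
_ORDER = {"high": 0, "unreviewed": 1, "low": 2}


def triage(rates, threat_model=THREAT_MODEL):
    """Attach threat-model context to raw {probe_name: failure_rate} data."""
    findings = []
    for probe, rate in rates.items():
        family = probe.split(".", 1)[0]
        relevance, note = threat_model.get(
            family, ("unreviewed", "no entry in threat model")
        )
        findings.append(Finding(probe, rate, relevance, note))
    # Within a relevance tier, the highest failure rate comes first.
    return sorted(
        findings,
        key=lambda f: (_ORDER.get(f.relevance, 1), -f.failure_rate),
    )
```

Calling `triage({"dan.Dan_11_0": 0.12, "encoding.InjectBase64": 0.30})` puts the DAN finding first despite its lower rate, because the table says jailbreaks matter more for this hypothetical deployment than encoded payloads do.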