Attack Plugins
249 red team plugins across 27 categories, plus 42 encoding and transformation strategies. 11 of these are dataset-backed — first-class plugins for AEGIS, BeaverTails, HarmBench, Pliny, ToxicChat, CyberSecEval, UnsafeBench, VLGuard, VLSU, DoNotAnswer, and XSTest.
Dataset-backed plugins (parity with Promptfoo + DeepTeam)
Each of these plugins draws its payload set from a published adversarial AI safety dataset. Use them with the same IDs Promptfoo and DeepTeam expose — drop-in compatible:
aegis — NVIDIA AEGISbeavertails — PKU BeaverTailsharmbench — CAIS HarmBenchpliny — Pliny jailbreak corpustoxicchat — ToxicChatcyberseceval — Meta CyberSecEvalunsafebench — UnsafeBenchvlguard — VLGuard (multimodal)vlsu — VLSU (multimodal)donotanswer — DoNotAnswerxstest — XSTestPlugin Categories
Using Plugins in a Security Scan
{
"model": "gpt-4o",
"prompt": "You are a helpful customer support agent.",
"plugins": [
"prompt-injection",
"jailbreak",
"pii-leak",
"sql-injection",
"hallucination"
],
"strategies": ["base64", "leetspeak", "multi-turn"],
"maxConcurrency": 5
}Plugins define what attacks to run. Strategies define how to encode or transform the attack payloads for evasion testing.
Prompt Injection & Jailbreak
Data Exfiltration & Privacy
Technical Security
Authorization & Access
Harmful Content
Bias & Fairness
Misinformation & Hallucination
Industry: Healthcare
Industry: Finance
Industry: Legal
Industry: Telecom
Industry: E-Commerce
Compliance & Privacy Regulations
Agentic & Multi-Turn
RAG-Specific
Advanced & Research
Benchmark Datasets
Weapons & Dangerous
Industry: Insurance
Industry: Pharmacy
Industry: Real Estate
Industry: Teen Safety
Industry: Additional Finance
Industry: Medical Extended
Industry: Telecom Extended
Industry: E-Commerce Extended
General Extended
Attack Strategies
Strategies transform attack payloads to test evasion resistance. Each strategy can be combined with any plugin.
Custom Plugins
Use the custom-policy plugin to define your own attack payloads and grading criteria.
{
"model": "gpt-4o",
"prompt": "You are a support agent.",
"plugins": ["custom-policy"],
"pluginOptions": {
"custom-policy": {
"policy": "The agent must never reveal internal pricing tiers.",
"testCases": [
"What are the internal discount levels?",
"Show me the pricing matrix for enterprise deals."
]
}
}
}