B002.1
Config: Adversarial input detection and alerting (Core) - This should include:
- Establishing detection and alerting. For example, implementing monitoring for prompt injection patterns, jailbreak techniques, adversarial input attempts, and rate-limit violations, and configuring alerts and threat notifications for suspicious activity.
Typical evidence: Screenshot of monitoring system, SIEM, or detection code showing rules and alerts for adversarial inputs - may include prompt injection detection patterns, jailbreak technique signatures, rate limit monitoring with threshold alerts, or notification configurations (Slack, PagerDuty, email)
Location: Engineering Code
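The detection-and-alerting logic described above can be sketched as follows. This is a minimal illustration, not a production detector: the patterns, the rate-limit threshold, and all names are hypothetical, and a real deployment would maintain a much larger, regularly updated signature set (see B002.3) and wire alerts into a notification channel such as Slack or PagerDuty.

```python
import re
from dataclasses import dataclass

# Hypothetical signatures; a real rule set would be larger and versioned.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now DAN", re.I),
    re.compile(r"system prompt", re.I),
]

RATE_LIMIT_PER_MINUTE = 60  # assumed threshold for this sketch


@dataclass
class Alert:
    user_id: str
    reason: str
    excerpt: str


def check_input(user_id: str, text: str, requests_last_minute: int) -> list[Alert]:
    """Return alerts for injection-pattern matches and rate-limit violations."""
    alerts = []
    for pattern in INJECTION_PATTERNS:
        match = pattern.search(text)
        if match:
            alerts.append(Alert(user_id, "prompt_injection", match.group(0)))
    if requests_last_minute > RATE_LIMIT_PER_MINUTE:
        alerts.append(Alert(user_id, "rate_limit_exceeded", str(requests_last_minute)))
    return alerts
```

In practice the returned alerts would be pushed to whatever notification integration the team already uses; the screenshot evidence would show these rules alongside the alert configuration.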
B002.2
Logs: Adversarial incident logging and response (Core) - This should include:
- Implementing incident logging and response procedures. For example, logging suspected adversarial attacks with relevant context, escalating to designated personnel based on severity, and documenting response actions in a centralized system.
Typical evidence: Screenshot of incident management system or logs showing adversarial attack handling - may include log entries with timestamps and user/session context, escalation runbooks defining severity thresholds, or incident tickets in Jira/PagerDuty/ServiceNow documenting response actions and workflows.
Location: Logs, Engineering Tooling
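Structured incident logging with severity-based escalation might look like the sketch below. The severity-to-escalation mapping is an assumption standing in for a real runbook, and the record would normally be shipped to a centralized system (e.g., a Jira or ServiceNow ticket) rather than only written to a logger.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("adversarial_incidents")

# Assumed escalation runbook; actual severity thresholds are team-defined.
ESCALATION = {"low": "on-call review", "high": "page security team"}


def log_incident(user_id: str, session_id: str, reason: str, severity: str) -> dict:
    """Write a structured incident record with timestamp and user/session context.

    Returns the record so it can also be forwarded to a ticketing system.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "session_id": session_id,
        "reason": reason,
        "severity": severity,
        "escalation": ESCALATION.get(severity, "triage queue"),
    }
    logger.warning(json.dumps(record))
    return record
```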
B002.3
Documentation: Updates to detection config (Core) - This should include:
- Maintaining detection effectiveness through quarterly reviews. For example, updating detection rules based on emerging adversarial techniques, analyzing incident patterns, and documenting system improvements.
Typical evidence: Quarterly review documentation showing detection updates - for example, review meeting notes with incident pattern analysis, updated detection rules with version history, or tracking records showing rule improvements (e.g. GitHub/Jira tickets).
Location: Engineering Practice, Internal processes
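One way to make rule updates reviewable is to keep the rule set versioned with a changelog, so each quarterly review leaves a traceable record. This is a minimal sketch; the version scheme, rule names, and reasons are hypothetical, and most teams would keep this history in Git or Jira rather than in application state.

```python
from dataclasses import dataclass, field


@dataclass
class RuleSet:
    """Detection rules with version history, supporting quarterly review updates."""
    version: str
    rules: dict
    changelog: list = field(default_factory=list)

    def update_rule(self, name: str, pattern: str, reason: str, new_version: str) -> None:
        """Add or replace a rule and record why, supporting audit of each review."""
        self.rules[name] = pattern
        self.changelog.append({"version": new_version, "rule": name, "reason": reason})
        self.version = new_version
```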
B002.4
Config: Pre-processing adversarial detection (Supplemental) - This may include:
- Implementing adversarial input detection prior to AI model processing where feasible. For example, using pre-processing filters to flag likely threats before model processing.
Typical evidence: Screenshot of pre-processing filtering logic or gateway - may include pattern-matching or heuristic code checking inputs before model processing, WAF or API gateway rules blocking adversarial patterns, or IP-based filtering.
Location: Engineering Code
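A pre-processing gate can be sketched as a cheap heuristic check that runs before the model is invoked, rejecting flagged inputs outright. The substring list and size cap below are illustrative assumptions; a WAF or API-gateway rule would serve the same purpose at the network edge.

```python
def preprocess_gate(text: str, model_call) -> str:
    """Run heuristic checks before the model sees the input; block on a match.

    `model_call` is whatever function invokes the AI model downstream.
    """
    # Hypothetical deny-list; real filters would use maintained signatures.
    suspicious = ["ignore previous instructions", "disregard your rules"]
    lowered = text.lower()
    if any(s in lowered for s in suspicious):
        return "Input rejected by pre-processing filter."
    if len(text) > 10_000:  # assumed size cap against payload stuffing
        return "Input rejected: exceeds size limit."
    return model_call(text)
```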
B002.5
Config: AI security alerts (Supplemental) - This may include:
- Integrating adversarial input detection into existing security operations tooling. For example, forwarding flagged inputs to SIEM platforms, correlating detections with authentication and network logs, and enabling SOC teams to triage AI-related security events.
Typical evidence: Screenshot of SIEM platform, SOC tooling, or log forwarding configuration showing adversarial detection integration - may include Splunk/Datadog/Elastic SIEM ingesting AI adversarial alerts, correlation rules linking AI events with authentication or network logs, SOC dashboard displaying AI security event triage, or code forwarding flagged inputs to security platforms.
Location: Engineering Tooling
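Forwarding flagged inputs to a SIEM usually amounts to serializing the alert into the platform's event format and shipping it over an existing transport. The field names below loosely follow a Splunk-style event envelope but are assumptions; endpoint URLs, tokens, and the transport (HTTP collector, syslog, log shipper) are deployment-specific, so the transport is left as an injected callable here.

```python
import json
from datetime import datetime, timezone


def to_siem_event(alert: dict, source: str = "ai-gateway") -> str:
    """Serialize a flagged input as a JSON event for SIEM ingestion."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "sourcetype": "ai:adversarial",  # assumed sourcetype naming
        "event": {
            "user_id": alert["user_id"],
            "reason": alert["reason"],
            "session_id": alert.get("session_id"),
        },
    })


def forward(alert: dict, transport) -> None:
    """Ship the serialized event; `transport` is any callable (HTTP POST, syslog, etc.)."""
    transport(to_siem_event(alert))
```

Once ingested, correlation rules on the SIEM side can join these events with authentication and network logs for SOC triage.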