C007 - Flag high-risk outputs
>Control Description
Implement an alerting system that flags high-risk outputs for human review
Application: Optional
Frequency: Every 12 months
Capabilities: Universal
>Controls & Evidence (3)
Operational Practices
C007.1
Documentation: Definition of high-risk recommendation criteria (Core) - This should include:
- Defining high-risk output criteria, drawing on the organization's risk taxonomy.
Typical evidence: Document or policy defining high-risk outputs requiring human review - should specify criteria for flagging (e.g. financial advice thresholds, medical/legal/safety domains, reputational harm triggers). Can be standalone or included in existing AI risk taxonomy/AI risk policy.
Location: Internal policies
C007.3
Documentation: Human review workflows (Supplemental) - This may include:
- Establishing human review workflows for flagged high-risk outputs. For example, assigning reviewers, defining escalation procedures for complex cases, managing review queues with response time tracking, and documenting review decisions.
Typical evidence: Workflow documentation or ticketing system configuration showing human review process for flagged outputs - may include runbook with reviewer assignments and escalation paths, queue management in Jira/Linear/support ticketing with pending review tracking, SLA targets for review response times, or procedure document defining review decision documentation requirements.
Location: Internal processes
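The queue management and response-time tracking described above can be sketched in code. This is a minimal illustration, not a prescribed implementation: the `ReviewQueue` class, the 24-hour SLA value, and the decision labels are all hypothetical stand-ins for whatever ticketing system (e.g., Jira or Linear) actually backs the workflow.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

SLA = timedelta(hours=24)  # illustrative review SLA target, not a mandated value

@dataclass
class ReviewItem:
    output_id: str
    flagged_at: datetime
    reviewer: Optional[str] = None
    decision: Optional[str] = None  # e.g. "approve", "block", "escalate"
    decided_at: Optional[datetime] = None

class ReviewQueue:
    """Toy in-memory queue tracking flagged outputs awaiting human review."""

    def __init__(self):
        self._items = {}

    def enqueue(self, output_id: str, flagged_at: datetime) -> None:
        self._items[output_id] = ReviewItem(output_id, flagged_at)

    def assign(self, output_id: str, reviewer: str) -> None:
        # Reviewer assignment, per the runbook step described above.
        self._items[output_id].reviewer = reviewer

    def record_decision(self, output_id: str, decision: str, when: datetime) -> None:
        # Documenting the review decision satisfies the evidence requirement.
        item = self._items[output_id]
        item.decision = decision
        item.decided_at = when

    def overdue(self, now: datetime) -> list:
        """Pending items whose age exceeds the SLA: candidates for escalation."""
        return [i for i in self._items.values()
                if i.decision is None and now - i.flagged_at > SLA]
```

A real deployment would persist this state in the ticketing system and drive escalation alerts from the `overdue` check.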
Technical Implementation
C007.2
Config: High-risk detection mechanisms (Core) - This should include:
- Implementing automated detection mechanisms for high-risk outputs. For example, using content filtering, risk scoring, or classification models to identify outputs requiring review or flagging.
Typical evidence: Screenshot of detection code, configuration file, or rules engine showing high-risk output filtering - may include keyword lists or regex patterns flagging sensitive topics, scoring logic assigning risk values to recommendations, if/then rules defining high-risk conditions, ML model configuration (e.g., classification thresholds in config.yaml), or API response showing confidence scores with risk thresholds.
Location: Engineering Code
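The keyword/regex patterns and risk-scoring logic listed as typical evidence can be sketched as a small rule engine. Everything here is illustrative: the `RISK_RULES` patterns, the per-domain weights, and the `0.6` review threshold are hypothetical placeholders, not a vetted taxonomy; production systems would typically source these from configuration (e.g., a `config.yaml`) or an ML classifier.

```python
import re
from dataclasses import dataclass

# Hypothetical rule set: (domain, pattern, weight). Patterns and weights
# are illustrative only and would come from the organization's risk taxonomy.
RISK_RULES = [
    ("financial_advice", re.compile(r"\b(invest|buy|sell)\b.*\b(stock|crypto)\b", re.I), 0.5),
    ("medical", re.compile(r"\b(dosage|diagnos\w+|prescri\w+)\b", re.I), 0.7),
    ("legal", re.compile(r"\b(lawsuit|liab\w+|indemnif\w+)\b", re.I), 0.6),
]

REVIEW_THRESHOLD = 0.6  # outputs scoring at or above this are flagged for review

@dataclass
class RiskAssessment:
    score: float
    matched_domains: list
    needs_review: bool

def assess_output(text: str) -> RiskAssessment:
    """Score an output against the rule set; flag it for human review
    when the cumulative (capped) score reaches the threshold."""
    score = 0.0
    domains = []
    for domain, pattern, weight in RISK_RULES:
        if pattern.search(text):
            score += weight
            domains.append(domain)
    score = min(score, 1.0)
    return RiskAssessment(score, domains, score >= REVIEW_THRESHOLD)
```

Flagged outputs would then be routed into the human review workflow defined under C007.3; the threshold and rule list are the pieces an auditor would expect to see in the configuration evidence.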
>Cross-Framework Mappings
NIST AI RMF