LLM02: Sensitive Information Disclosure
>Control Description
Sensitive information can affect both the LLM and its application context. This includes personally identifiable information (PII), financial details, health records, confidential business data, security credentials, and legal documents. LLMs risk exposing sensitive data, proprietary algorithms, or confidential details through their output, resulting in unauthorized data access, privacy violations, and intellectual property breaches.
>Vulnerability Types
1. PII Leakage: Personally identifiable information disclosed during LLM interactions
2. Proprietary Algorithm Exposure: Model outputs reveal proprietary algorithms or training data
3. Sensitive Business Data Disclosure: Responses inadvertently include confidential business information
4. Model Inversion Attacks: Attackers extract sensitive information by reconstructing inputs from outputs (see the probe sketch after this list)
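To make the extraction risk concrete, here is a minimal sketch of how an attacker-style extraction probe might look. Everything here is an illustrative assumption: `query_model` is a hypothetical stand-in for the target's completion API, and the prefixes and regex are placeholders, not a real attack toolkit.

```python
import re

# Hypothetical stand-in for the target's completion API; a real probe
# would issue an HTTP request here. The canned reply keeps the demo
# self-contained and runnable.
def query_model(prompt: str) -> str:
    return prompt + " ... you can reach her at jane.doe@example.com."

# Illustrative pattern for email-like strings in completions.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def probe_for_leaks(prefixes: list[str]) -> list[str]:
    """Feed leading fragments of suspected training records to the model
    and flag completions that contain email-like strings."""
    leaks = []
    for prefix in prefixes:
        completion = query_model(prefix)
        leaks += [f"{prefix!r} -> {m}" for m in EMAIL_RE.findall(completion)]
    return leaks

print(probe_for_leaks(["Jane Doe's contact details are"]))
```

The same loop generalizes to any pattern an attacker suspects the model memorized (keys, SSNs, internal hostnames), which is why the output-side controls below matter as much as input filtering.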
>Common Impacts
- Unauthorized access to personal data
- Privacy violations and regulatory non-compliance
- Intellectual property theft
- Exposure of security credentials
- Competitive disadvantage from leaked business data
>Prevention & Mitigation Strategies
1. Integrate data sanitization techniques to prevent user data from entering the training corpus (see the tokenization sketch after this list)
2. Apply robust input validation to detect and filter potentially harmful or sensitive data inputs
3. Enforce strict access controls based on the principle of least privilege (see the retrieval sketch below)
4. Restrict data sources and ensure runtime data orchestration is securely managed
5. Utilize federated learning to train models on decentralized data
6. Incorporate differential privacy to add calibrated noise, making it difficult to reverse-engineer individual records (see the Laplace-mechanism sketch below)
7. Educate users on avoiding input of sensitive information
8. Ensure transparency in data usage with clear policies on retention and deletion
9. Implement homomorphic encryption for secure data analysis
10. Use tokenization and redaction to preprocess and sanitize sensitive information
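As a concrete illustration of strategies 1, 2, and 10, here is a minimal sketch of regex-based PII detection and tokenization applied to text before it reaches a model or training corpus. The patterns, the token format, and the `vault` mapping are illustrative assumptions, not a vetted detection suite; production systems typically rely on dedicated PII-detection tooling.

```python
import re
import uuid

# Illustrative patterns only; real deployments need a much broader set.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def tokenize_pii(text: str, vault: dict[str, str]) -> str:
    """Replace detected PII with opaque tokens, storing the originals in
    `vault` so authorized callers can re-identify them later."""
    for label, pattern in PII_PATTERNS.items():
        def _swap(match: re.Match) -> str:
            token = f"<{label}:{uuid.uuid4().hex[:8]}>"
            vault[token] = match.group(0)
            return token
        text = pattern.sub(_swap, text)
    return text

vault: dict[str, str] = {}
print(tokenize_pii("Reach me at jane@example.com, SSN 123-45-6789.", vault))
# -> "Reach me at <EMAIL:...>, SSN <SSN:...>."
```

Tokenization (rather than plain deletion) preserves utility: authorized downstream systems can re-identify values through the vault, while the model never sees the raw PII.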
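For strategy 3, a minimal sketch of least-privilege retrieval in a RAG-style pipeline, assuming a hypothetical document store with per-document role sets. The point is that access control happens before prompt assembly, so disclosure never depends on the model refusing to repeat content it was shown.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    allowed_roles: frozenset[str]

@dataclass
class User:
    name: str
    roles: frozenset[str]

def retrieve_for_user(user: User, docs: list[Document]) -> list[str]:
    """Filter candidate context *before* prompt assembly: the LLM only
    ever sees documents the requesting user is already cleared to read."""
    return [d.text for d in docs if user.roles & d.allowed_roles]

docs = [
    Document("Public product FAQ", frozenset({"employee", "contractor"})),
    Document("Q3 acquisition plan", frozenset({"executive"})),
]
print(retrieve_for_user(User("alice", frozenset({"contractor"})), docs))
# -> ['Public product FAQ']
```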
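For strategy 6, a minimal sketch of the Laplace mechanism applied to a counting query. The epsilon value and the dataset are illustrative assumptions; a counting query has sensitivity 1 (adding or removing one record changes the result by at most 1), so noise with scale 1/epsilon suffices.

```python
import random

def dp_count(values: list[int], predicate, epsilon: float = 1.0) -> float:
    """Release a noisy count under the Laplace mechanism."""
    true_count = sum(1 for v in values if predicate(v))
    # The difference of two i.i.d. Exp(lambda) draws is Laplace-distributed
    # with scale 1/lambda, so this adds Laplace(0, 1/epsilon) noise.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

salaries = [52, 61, 48, 75, 90, 66]
print(dp_count(salaries, lambda s: s > 60))  # noisy version of the true count, 4
```

Smaller epsilon means more noise and stronger privacy; the budget must also be tracked across repeated queries, since each release consumes part of it.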
>Attack Scenarios
#1: Unintentional Data Exposure
A user receives a response containing another user's personal data due to inadequate data sanitization.
#2: Targeted Prompt Injection
An attacker bypasses input filters to extract sensitive information from the model.
#3: Data Leak via Training Data
Sensitive information negligently included in training data is later disclosed in model outputs.