LLM02: Sensitive Information Disclosure
>Control Description
Sensitive information can affect both the LLM and its application context. This includes personally identifiable information (PII), financial details, health records, confidential business data, security credentials, and legal documents. LLMs risk exposing sensitive data, proprietary algorithms, or confidential details through their output, resulting in unauthorized data access, privacy violations, and intellectual property breaches.
>Vulnerability Types
1. PII Leakage: Personally identifiable information disclosed during LLM interactions
2. Proprietary Algorithm Exposure: Model outputs reveal proprietary algorithms or training data
3. Sensitive Business Data Disclosure: Responses inadvertently include confidential business information
4. Model Inversion Attacks: Attackers extract sensitive information by reconstructing inputs from outputs (see the probe sketch after this list)
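To make the extraction risk concrete, here is a minimal sketch of how an attacker-style extraction probe might look. Everything here is an illustrative assumption: `query_model` is a hypothetical stand-in for the target's completion API, and the prefixes and regex are placeholders, not a real attack toolkit.

```python
import re

# Hypothetical stand-in for the target's completion API; a real probe
# would issue an HTTP request here. The canned reply keeps the demo
# self-contained and runnable.
def query_model(prompt: str) -> str:
    return prompt + " ... you can reach her at jane.doe@example.com."

# Illustrative pattern for email-like strings in completions.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def probe_for_leaks(prefixes: list[str]) -> list[str]:
    """Feed leading fragments of suspected training records to the model
    and flag completions that contain email-like strings."""
    leaks = []
    for prefix in prefixes:
        completion = query_model(prefix)
        leaks += [f"{prefix!r} -> {m}" for m in EMAIL_RE.findall(completion)]
    return leaks

print(probe_for_leaks(["Jane Doe's contact details are"]))
```

The same loop generalizes to any pattern an attacker suspects the model memorized (keys, SSNs, internal hostnames), which is why the output-side controls below matter as much as input filtering.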
>Common Impacts
- Unauthorized access to personal data
- Privacy violations and regulatory non-compliance
- Intellectual property theft
- Exposure of security credentials
- Competitive disadvantage from leaked business data
>Prevention & Mitigation Strategies
1. Integrate data sanitization techniques to prevent user data from entering the training corpus (see the tokenization sketch after this list)
2. Apply robust input validation to detect and filter potentially harmful or sensitive data inputs
3. Enforce strict access controls based on the principle of least privilege (see the retrieval sketch below)
4. Restrict data sources and ensure runtime data orchestration is securely managed
5. Utilize federated learning to train models on decentralized data
6. Incorporate differential privacy to add calibrated noise, making it difficult to reverse-engineer individual records (see the Laplace-mechanism sketch below)
7. Educate users on avoiding input of sensitive information
8. Ensure transparency in data usage with clear policies on retention and deletion
9. Implement homomorphic encryption for secure data analysis
10. Use tokenization and redaction to preprocess and sanitize sensitive information
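As a concrete illustration of strategies 1, 2, and 10, here is a minimal sketch of regex-based PII detection and tokenization applied to text before it reaches a model or training corpus. The patterns, the token format, and the `vault` mapping are illustrative assumptions, not a vetted detection suite; production systems typically rely on dedicated PII-detection tooling.

```python
import re
import uuid

# Illustrative patterns only; real deployments need a much broader set.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def tokenize_pii(text: str, vault: dict[str, str]) -> str:
    """Replace detected PII with opaque tokens, storing the originals in
    `vault` so authorized callers can re-identify them later."""
    for label, pattern in PII_PATTERNS.items():
        def _swap(match: re.Match) -> str:
            token = f"<{label}:{uuid.uuid4().hex[:8]}>"
            vault[token] = match.group(0)
            return token
        text = pattern.sub(_swap, text)
    return text

vault: dict[str, str] = {}
print(tokenize_pii("Reach me at jane@example.com, SSN 123-45-6789.", vault))
# -> "Reach me at <EMAIL:...>, SSN <SSN:...>."
```

Tokenization (rather than plain deletion) preserves utility: authorized downstream systems can re-identify values through the vault, while the model never sees the raw PII.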
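For strategy 3, a minimal sketch of least-privilege retrieval in a RAG-style pipeline, assuming a hypothetical document store with per-document role sets. The point is that access control happens before prompt assembly, so disclosure never depends on the model refusing to repeat content it was shown.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    allowed_roles: frozenset[str]

@dataclass
class User:
    name: str
    roles: frozenset[str]

def retrieve_for_user(user: User, docs: list[Document]) -> list[str]:
    """Filter candidate context *before* prompt assembly: the LLM only
    ever sees documents the requesting user is already cleared to read."""
    return [d.text for d in docs if user.roles & d.allowed_roles]

docs = [
    Document("Public product FAQ", frozenset({"employee", "contractor"})),
    Document("Q3 acquisition plan", frozenset({"executive"})),
]
print(retrieve_for_user(User("alice", frozenset({"contractor"})), docs))
# -> ['Public product FAQ']
```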
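For strategy 6, a minimal sketch of the Laplace mechanism applied to a counting query. The epsilon value and the dataset are illustrative assumptions; a counting query has sensitivity 1 (adding or removing one record changes the result by at most 1), so noise with scale 1/epsilon suffices.

```python
import random

def dp_count(values: list[int], predicate, epsilon: float = 1.0) -> float:
    """Release a noisy count under the Laplace mechanism."""
    true_count = sum(1 for v in values if predicate(v))
    # The difference of two i.i.d. Exp(lambda) draws is Laplace-distributed
    # with scale 1/lambda, so this adds Laplace(0, 1/epsilon) noise.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

salaries = [52, 61, 48, 75, 90, 66]
print(dp_count(salaries, lambda s: s > 60))  # noisy version of the true count, 4
```

Smaller epsilon means more noise and stronger privacy; the budget must also be tracked across repeated queries, since each release consumes part of it.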
>Attack Scenarios
#1: Unintentional Data Exposure
A user receives a response containing another user's personal data due to inadequate data sanitization.
#2: Targeted Prompt Injection
An attacker bypasses input filters to extract sensitive information from the model.
#3: Data Leak via Training Data
Sensitive information negligently included in training data is later disclosed in model outputs.