A is incorrect: Model retraining, hyperparameter tuning, and fine-tuning are model development techniques that can improve overall robustness. While retraining with adversarial examples can help, these techniques do not constitute a defense-in-depth strategy: they take effect at training time and provide no real-time filtering or monitoring of prompts and responses during inference. A layered approach requires controls at multiple operational stages, not just the model layer.
B is incorrect: Disk encryption, key rotation, and certificate management are important data protection and cryptographic controls addressing data confidentiality at rest and in transit. However, these controls do not inspect or filter the content of prompts or responses, so they are unable to detect or prevent prompt injection attacks that manipulate the AI model's behavior through crafted natural language inputs at the application layer.
C is correct: Of the options listed, applying input filtering, output validation, and API gateway controls provides the most effective layered defense against prompt injection. Input filtering screens incoming prompts for malicious patterns before they reach the model, output validation checks responses for signs of successful injection or data leakage, and API gateway controls enforce rate limiting and schema validation. Together, these controls address prompt injection at each stage of the request-response lifecycle.
D is incorrect: Network segmentation, firewall rules, and DDoS protection are infrastructure-level controls that protect against network-based attacks. Prompt injection operates at the application layer through natural language inputs that appear as legitimate API requests, making network-level controls ineffective at detecting or blocking malicious content within the prompts. These controls restrict network access and protect availability; they do not protect AI-specific application logic.
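
The three layers described for option C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production defense: the names `gateway_check`, `input_filter`, `output_validate`, and `handle_request`, the regex deny-list, the leak markers, and the limits are all hypothetical, and a real deployment would likely rely on a managed gateway and more capable classifiers than simple pattern matching.

```python
import re
import time
from collections import defaultdict, deque

# Hypothetical deny-list; illustrative only, not a complete ruleset.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

# Hypothetical strings that should never appear in a response.
LEAK_MARKERS = ["BEGIN SYSTEM PROMPT", "INTERNAL-ONLY"]

MAX_PROMPT_CHARS = 2000   # assumed schema limit
RATE_LIMIT = 5            # assumed max requests per client per window
WINDOW_SECONDS = 60.0

_request_log = defaultdict(deque)  # client_id -> timestamps of accepted requests


def gateway_check(client_id, payload, now=None):
    """Gateway layer: schema validation plus sliding-window rate limiting."""
    now = time.monotonic() if now is None else now
    if not isinstance(payload, dict):
        return False
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt or len(prompt) > MAX_PROMPT_CHARS:
        return False
    window = _request_log[client_id]
    # Drop timestamps that have aged out of the window.
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= RATE_LIMIT:
        return False
    window.append(now)
    return True


def input_filter(prompt):
    """Input layer: reject prompts matching known injection patterns."""
    return not any(p.search(prompt) for p in INJECTION_PATTERNS)


def output_validate(response):
    """Output layer: reject responses that contain leak markers."""
    return not any(marker in response for marker in LEAK_MARKERS)


def handle_request(client_id, payload, model):
    """Run one request through all three layers; `model` is any callable."""
    if not gateway_check(client_id, payload):
        return "rejected: gateway"
    if not input_filter(payload["prompt"]):
        return "rejected: input filter"
    response = model(payload["prompt"])
    if not output_validate(response):
        return "rejected: output validation"
    return response
```

Each layer can reject a request independently, so a prompt that slips past the input filter can still be caught when the response is validated. That independence is the point of layering: no single control is assumed to be complete.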