Summary

Completed

In this module, you learned about the fundamental concepts of AI security. You explored how AI security differs from traditional cybersecurity—particularly because of the nondeterministic nature of generative AI and the expanded attack surface created by natural language interfaces. You also learned about the significance of responsible AI and industry-standard frameworks like OWASP Top 10 for LLM Applications and MITRE ATLAS.

You examined the three layers of AI architecture—usage, application, and platform—and the distinct security concerns at each layer. You then explored five categories of AI-specific attacks:

  • Jailbreaking: Techniques that bypass safety guardrails, including direct injection, crescendo attacks, and encoding tricks
  • Prompt injection: Direct and indirect (XPIA) attacks that manipulate model behavior through malicious instructions
  • Model manipulation: Model poisoning and data poisoning attacks that compromise the model during training
  • Data exfiltration: Unauthorized extraction of models, training data, or interaction data
  • Overreliance: The human behavioral risk of accepting AI output without verification

For each attack type, you learned about layered mitigation strategies that combine technical controls, monitoring, and human oversight. AI security is a rapidly evolving field—new attack techniques and countermeasures continue to emerge. Staying current with frameworks like OWASP, MITRE ATLAS, and NIST AI RMF is essential for maintaining effective security controls.

Other resources

To continue your learning journey, go to: