Common AI Failure Modes and How to Think About Them

AI systems can fail in ways that look very different from traditional software bugs. Instead of a crash and a stack trace, you get hallucinations, biased decisions, unstable behavior under distribution shift, or adversarial inputs that quietly break the system.

Typical failure modes include:

  • Hallucinations – the model confidently asserts fabricated facts.
  • Bias – outputs differ unfairly across demographic groups.
  • Adversarial attacks – deliberately crafted inputs trigger surprising errors.
  • Data drift – the distribution of input data shifts over time.
  • Concept drift – the relationship between inputs and labels changes.
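Of the failure modes above, data drift is the most mechanical to monitor. One common approach (a sketch, not a prescription from the pillar page) is a two-sample Kolmogorov–Smirnov test comparing a recent window of a numeric input feature against a reference window from training time; the threshold and sample sizes below are illustrative assumptions you would tune per feature.

```python
import bisect
import random

def ks_statistic(reference, live):
    """Two-sample Kolmogorov–Smirnov statistic:
    the largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    liv = sorted(live)

    def ecdf(sample, x):
        # Fraction of the sample that is <= x.
        return bisect.bisect_right(sample, x) / len(sample)

    return max(abs(ecdf(ref, x) - ecdf(liv, x))
               for x in set(ref) | set(liv))

random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(1000)]  # training-time window
stable    = [random.gauss(0.0, 1.0) for _ in range(1000)]  # live data, no drift
shifted   = [random.gauss(1.5, 1.0) for _ in range(1000)]  # live data, simulated drift

DRIFT_THRESHOLD = 0.1  # illustrative cutoff; tune per feature in practice
print(ks_statistic(reference, stable) < DRIFT_THRESHOLD)    # True: distributions match
print(ks_statistic(reference, shifted) >= DRIFT_THRESHOLD)  # True: drift flagged
```

A flagged feature does not by itself mean the model is wrong; it is a signal to investigate, retrain, or compare predictions against fresh labels, which is where concept-drift monitoring comes in.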

For guidance on designing tests and monitoring around these risks, see the AI Quality Assurance pillar page.