
What is AI Safety and Why Should You Care?

By Audrey Tim, January 30, 2026

Over the past year, as I’ve been spending more time speaking to builders, policymakers, and communities about AI, one thing has become increasingly clear: AI safety isn’t fringe anymore.

How many of you have seen this research from Marc Zao-Sanders? “Therapy / companionship” ranks as the top Gen AI use case in 2025, with “organising my life”, “finding purpose”, “enhanced learning”, and “generating code” rounding out the top five.

This technology is no longer confined to research labs or experimental demos. As we embed more AI in our lives and it advances at a rapid pace, the question is no longer just what we can build, but how we ensure these systems are developed in ways that genuinely benefit society.

Why AI safety is a growing field

AI safety has gained attention not because of catastrophic risks alone (“What if AI takes over the world!”) but because failures now risk becoming systemic.

Today, AI is embedded in:

  • content recommendation and moderation
  • hiring and performance evaluation
  • healthcare triage and decision support
  • mental health support
  • financial services
  • public-facing digital services

When systems like these fail, the impact isn’t limited to edge cases. It spreads at scale.

This is why AI safety researchers, industry labs, and regulators are now actively grappling with how to govern and deploy these systems responsibly. Singapore, for example, introduced a new Model AI Governance Framework for Agentic AI to provide guidance on technical and non-technical measures for responsible deployment.

Why AI safety matters for developers

AI systems behave differently from traditional software. They’re probabilistic, so their failure modes are harder to predict, and their behaviour can change when deployed in new contexts. If safety considerations are deferred until after deployment, debugging becomes expensive and complex, and failures risk damaging reputations.

Incorporating safety thinking early, such as robust evaluation and appropriate human oversight, helps teams build systems that are more reliable and resilient as they scale. AI safety, in this sense, is not about slowing innovation, but about making innovation sustainable over the long term.

Why AI safety matters for everyday people

You don’t need to build AI to be affected by it.

AI systems increasingly shape the information we see, the services we can access, and the decisions made about us. And here’s the kicker: these decisions usually happen in ways that are invisible or difficult to challenge. When these systems are poorly designed or insufficiently tested, the consequences can include:

  • being misled by confident but incorrect outputs (hallucinations, as we’ve seen);
  • decisions made without transparency or recourse;
  • uneven impacts across different communities;
  • and more broadly, erosion of trust in digital systems.

Three AI safety concepts worth knowing

If you’re new to the field, these three concepts form the backbone of many AI safety discussions (I’ll dive deeper in future articles).

1) Alignment / Specification Failures: When an AI system optimises for a poorly defined or incomplete objective, leading it to behave in ways that satisfy its technical goal but conflict with human intent or values.

  • Example: A content moderation system is optimised to reduce the number of reported posts. Using reports as a proxy for harm, it learns to suppress posts from certain groups or topics that are more likely to be flagged, not because they violate rules, but because they attract attention. The system meets its metric, but silences legitimate voices in the process.
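
To make the proxy problem concrete, here is a small, purely hypothetical Python sketch (the topics, numbers, and function names are all invented for illustration): a policy scored only on how many reports it receives can look better on its metric by hiding posts that attract reports, even when they break no rules.

```python
# Purely hypothetical sketch: a moderation "policy" judged only on how many
# reports it receives, not on whether posts actually violate the rules.
posts = [
    {"topic": "minority_rights", "violates_rules": False, "report_rate": 0.30},
    {"topic": "cooking",         "violates_rules": False, "report_rate": 0.01},
    {"topic": "harassment",      "violates_rules": True,  "report_rate": 0.25},
]

def expected_reports_if_suppressing_noisy_topics(posts, threshold=0.2):
    """Proxy objective: hide anything that tends to attract reports,
    regardless of whether it breaks the rules."""
    return sum(p["report_rate"] for p in posts if p["report_rate"] < threshold)

def expected_reports_if_enforcing_rules(posts):
    """Intended objective: only remove posts that actually violate the rules."""
    return sum(p["report_rate"] for p in posts if not p["violates_rules"])

print(expected_reports_if_suppressing_noisy_topics(posts))  # 0.01, the "better" metric score
print(expected_reports_if_enforcing_rules(posts))           # 0.31
# The first policy only achieves its lower score by silencing the
# legitimate "minority_rights" post.
```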

2) Robustness: How well a system performs under real-world conditions, including edge cases, ambiguity, and misuse, not just in controlled testing environments.

  • Example: A healthcare decision-support system performs accurately during trials, but struggles when used across different hospitals with varying data quality and patient profiles. As a result, its recommendations become less reliable in certain contexts, potentially affecting clinical judgement and patient outcomes.
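
As a toy illustration of this kind of distribution shift, here is a hedged Python sketch using scikit-learn on synthetic data; the “hospitals” and noise levels are invented, and the point is only that accuracy can drop when data quality changes while the model stays the same.

```python
# Purely hypothetical sketch of distribution shift: the same model, evaluated
# at the hospital it was trained on versus a hospital with noisier data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_hospital_data(n, measurement_noise):
    lab_value = rng.normal(0, 1, n)      # true underlying lab result
    y = (lab_value > 0).astype(int)      # outcome depends on the true value
    measured = lab_value + rng.normal(0, measurement_noise, n)  # what the model sees
    return measured.reshape(-1, 1), y

X_trial, y_trial = make_hospital_data(1000, measurement_noise=0.1)  # trial site
X_new, y_new = make_hospital_data(1000, measurement_noise=1.5)      # new site

model = LogisticRegression().fit(X_trial, y_trial)
print("trial hospital accuracy:", accuracy_score(y_trial, model.predict(X_trial)))
print("new hospital accuracy:  ", accuracy_score(y_new, model.predict(X_new)))
# Accuracy drops at the new site even though the model is unchanged;
# that gap is a robustness problem.
```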

3) Interpretability: The ability for humans to understand or meaningfully inspect how an AI system produces its outputs.

  • Example: An automated loan approval system rejects an applicant, but neither the user nor the organisation deploying it can clearly explain why. Without insight into how the decision was made, it becomes difficult to assess whether the outcome was fair, correct, or compliant with policy. No explanation = no accountability.
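
By way of contrast, here is a deliberately simple, hypothetical sketch of a model whose decisions can be inspected: a linear score where a rejection decomposes into per-feature contributions. The features, weights, and threshold are made up for illustration.

```python
# Purely hypothetical sketch: a linear scoring model is simple enough that a
# rejection can be broken down into per-feature contributions.
# Feature names, weights, and the threshold are invented for illustration.
weights = {"income": 0.4, "credit_history_years": 0.3, "existing_debt": -0.6}
applicant = {"income": 0.5, "credit_history_years": 0.2, "existing_debt": 0.9}
approval_threshold = 0.0

contributions = {name: weights[name] * applicant[name] for name in weights}
score = sum(contributions.values())

print("decision:", "approve" if score >= approval_threshold else "reject")
for name, value in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"  {name}: {value:+.2f}")
# With an opaque model, no such breakdown may exist at all, which is
# exactly the accountability gap described above.
```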

Want to learn more? Start here

As I’ve been learning more about AI safety, a few beginner-friendly resources have stood out:

- Paper: Key Concepts in AI Safety by Georgetown’s CSET, which provides a plain-language overview of the field’s core ideas.

- Book: The Alignment Problem by Brian Christian, an accessible introduction to how AI systems can behave in unintended ways, and why aligning them with human values is such a challenge.

- Podcast: For Humanity: An AI Risk Podcast by John Sherman, featuring thoughtful conversations on threats posed by AGI and key questions in the AI safety space.

- Course: BlueDot Impact’s AI Alignment course. A structured, non-technical introduction to modern AI safety thinking, designed for people from diverse backgrounds. For technical folks, you could also check out its Technical AI Safety course.