AI safety is the broad field aimed at keeping AI systems beneficial, covering near-term harms like toxic output and hallucination as well as longer-term risks like loss of control and Misalignment. The founding missions of Anthropic and OpenAI, the work of the Google DeepMind safety team, and research institutes like MIRI helped define the modern shape of the discipline. The day-to-day toolkit includes Red Teaming, evaluation Benchmark suites, Constitutional AI, and Interpretability research. As systems approach Frontier Model territory, AI safety has graduated from a research niche into an active topic of government policy.
Glossary · Beginner · 2014
AI Safety
The research and engineering field focused on making AI systems behave as intended and avoid causing unintended harm.
- EN — English term: AI Safety
- TR — Turkish term: Yapay Zeka Güvenliği