開発者が青少年向けに安全なAI体験を構築する支援
OpenAIは、開発者がgpt-oss-safeguardを使用してAIシステムにおける年齢固有のリスクを軽減するためのプロンプトベースのティーン安全ポリシーをリリースした。
キーポイント
ティーン向けAI安全ポリシーの提供
OpenAIが開発者向けに、ティーン(10代)の安全を確保するためのプロンプトベースのポリシーをリリースした。
gpt-oss-safeguardの活用
開発者はgpt-oss-safeguardを使用して、AIシステムにおける年齢固有のリスクをモデレートできるようになる。
開発者支援の目的
このリリースは、開発者がより安全なAI体験をティーン向けに構築するのを支援することを目的としている。
影響分析・編集コメントを表示
影響分析
この発表は、AI業界における未成年者保護の実践的な取り組みを示しており、開発者コミュニティに対して具体的な安全対策の提供という点で意義がある。ただし、技術的な革新性は限定的であり、既存の安全対策の拡張という位置付けである。
編集コメント
AIの安全対策が具体的なユースケース(ティーン保護)に焦点を当てて進化していることを示すニュース。開発者向けの実用的なガイドライン提供という点で評価できるが、技術的なブレークスルーというよりは既存枠組みの適用事例と言える。
OpenAI、GPT-OSS-Safeguardを利用する開発者向けにプロンプトベースのティーン安全ポリシーを公開。AIシステムにおける年齢層特有のリスクを軽減するモデレートを支援します。
原文を表示
Today, we’re releasing prompt-based safety policies(opens in a new window) to help developers create age-appropriate protections for teens. Built to work with our open-weight safety model, gpt-oss-safeguard(opens in a new window), these policies simplify how developers turn safety requirements into usable classifiers for real-world systems.We released open weight models to democratize access to powerful AI and support broad innovation. At the same time, we believe safety and innovation go hand in hand, and that developers should have access to capable models as well as the tools and policies to deploy them safely and responsibly. We developed these policies to support developers in their safety efforts to protect young users, with input from trusted external organizations including Common Sense Media(opens in a new window) and everyone.ai(opens in a new window).We recognize that teens and adults have different needs, and that teens need additional protections. These policies are designed to help developers account for those differences and build experiences that are both empowering and appropriate for younger users.Today’s release builds on that foundation. We’re making these safety policies available to developers to support them in deploying safety protections for teens and helping democratize access across the open weights ecosystem. While safety classifiers like gpt-oss-safeguard can detect harmful content, they depend on clear definitions of what that content is. In practice, one of the biggest challenges developers face is defining policies that accurately capture teen-specific risks and can be consistently applied in real systems. Even experienced teams often struggle to translate high-level safety goals into precise, operational rules, especially since it requires both subject matter expertise and deep AI knowledge. This can lead to gaps in protection, inconsistent enforcement, or overly broad filtering. Clear, well-scoped policies are a critical foundation for effective safety systems.To address this challenge, we are releasing a set of safety policies(opens in a new window), tailored to common risks faced by teens and informed by careful review of existing research about teens’ unique developmental differences. These policies are structured as prompts that can be directly used with gpt-oss-safeguard(opens in a new window) and other reasoning models, enabling developers to more easily apply consistent safety standards across their systems. The initial release includes policies covering:Graphic violent contentGraphic sexual contentHarmful body ideals and behaviorsDangerous activities and challengesRomantic or violent roleplayAge-restricted goods and servicesThese policies can be used for real-time content filtering, as well as offline analysis of user-generated content.By structuring policies as prompts, developers can more easily integrate them into existing workflows, adapt them to their use cases, and iterate over time.We worked with external organizations including Common Sense Media(opens in a new window) and everyone.ai(opens in a new window) to inform the development of these policies. Their expertise helped shape the scope of content to cover, strengthen the structure of the prompts, and refine the edge cases to consider when evaluating them. This work reflects an ongoing effort to collaborate with experts and the broader ecosystem to improve how AI systems support young people.“One of the biggest gaps in AI safety for teens has been the lack of clear, operational policies that developers can build from. Many times, developers are starting from scratch. These prompt-based policies help set a meaningful safety floor across the ecosystem, and because they're released as open source, they can be adapted and improved over time. We're encouraged to see this kind of infrastructure being made available broadly, and we hope it catalyzes more shared youth-safety starting points across the industry.” —Robbie Torney, Head of AI & Digital Assessments, Common Sense Media“Efforts like this that make youth safety policies more operational are valuable because they help translate expert knowledge into guidance that can be used in real systems. Content policies are an important first step, and they also open the door to broader work on how model behavior can shape youth-relevant risks over time. Inspired by this work and our own research, everyone.ai(opens in a new window) has also created an initial behavioral policy focused on risks like exclusivity and overreliance."—Dr. Mathilde Cerioli, Chief Scientist at everyone.AIThe policies are intended as a starting point, not as a comprehensive or final definition or guarantee of teen safety. Each application has unique risks, audiences and contexts, and developers are best positioned to understand the risks that their products and AI integrations may present. We strongly encourage developers to adapt and extend these policies based on their specific needs and combine them with other safeguards such as product design decisions, user controls, teen-friendly transparency, monitoring systems and thoughtful, age-appropriate responses. We believe a layered defense in depth approach is essential to building safer AI systems. These policies draw from our internal experience, but they do not reflect the full extent of OpenAI’s internal policies or safeguards. Developers and organizations can adapt these policies to their specific applications, translate them into different languages, and extend them to cover additional risk areas. Over time, we hope this contributes to a more robust and shared foundation for implementing safety policies in AI systems.
関連記事
今日のまとめ
AI日報で今日の重要ニュースをまとめ読み