Research
gpt-oss-safeguard technical report
This technical report introduces gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, two open-weight reasoning models post-trained from the gpt-oss models for policy-driven content labeling. It presents baseline safety evaluations of both models, using the underlying gpt-oss models as the comparison baseline. For practitioners, the models provide open-weight tools for enforcing specific content policies in AI systems.
gpt-oss-safeguard, safety, evaluation, models