Anthropic partners with NNSA to deploy AI classifier for nuclear proliferation risk detection

AI Impact Summary

Anthropic is partnering with the U.S. Department of Energy’s National Nuclear Security Administration (NNSA) to develop AI-powered safeguards against the misuse of its Claude models for nuclear proliferation. The core of this effort is a new classifier that automatically identifies concerning nuclear-related conversations with 96% accuracy; it has been deployed initially on Claude traffic. Combined with ongoing risk assessments, this proactive approach is a step toward mitigating the national security threats posed by increasingly capable AI models, and it offers a model for other AI developers to follow.

Affected Systems

Claude, Anthropic API

Date: not specified
Change type: capability
Severity: medium
