Anthropic detects and prevents AI labs from distilling Claude's capabilities
Action Required
The proliferation of unprotected AI models enabled by distillation attacks could compromise national security and undermine Anthropic's ability to maintain a competitive advantage.
AI Impact Summary
Anthropic is proactively addressing a significant security threat: industrial-scale distillation attacks by AI labs seeking to illicitly extract capabilities from Claude. Three labs (DeepSeek, Moonshot, and MiniMax) have been identified engaging in these attacks, generating millions of exchanges with Claude to train their own less capable models. This poses a national security risk because dangerous capabilities could proliferate without safeguards, particularly through open-sourced distilled models.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: critical