US Federal News Bureau
Written by: CDO Magazine Bureau
Updated 5:43 PM UTC, Tue September 9, 2025
Anthropic is partnering with the National Nuclear Security Administration (NNSA) to develop a tool that detects potentially concerning nuclear-related conversations with AI systems. The effort is part of ongoing work with the small but critical agency responsible for safeguarding the U.S. nuclear stockpile.
While AI can accelerate scientific discovery, such as identifying new chemical compounds for clean energy, it also raises the risk of inadvertently revealing information that could aid nuclear weapon design. To address this, Anthropic, the NNSA, and the Energy Department’s national laboratories built a classifier that distinguishes benign nuclear conversations from troubling ones with 96% accuracy.
The model was trained on an NNSA-curated list of nuclear risk indicators and tested against 300 synthetic prompts to preserve user privacy. NNSA experts validated and refined the system, expanding on earlier red-teaming collaborations.
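The approach described above, a curated list of risk indicators scored against conversations and validated on synthetic prompts, can be illustrated with a toy sketch. The indicator terms, weights, and threshold below are invented placeholders, not NNSA's actual (non-public) list, and the deployed system is a trained model rather than simple keyword matching:

```python
# Toy sketch of an indicator-based conversation classifier.
# All indicator terms and weights are hypothetical placeholders;
# the real NNSA-curated indicators are not public.

RISK_INDICATORS = {
    "enrichment cascade": 3,   # hypothetical high-concern term
    "weapons-grade": 3,
    "criticality": 2,
    "reactor safety": -2,      # hypothetical benign nuclear-energy context
    "medical isotope": -2,
}

def classify(conversation: str, threshold: int = 3) -> str:
    """Label a conversation 'concerning' if weighted indicator hits reach a threshold."""
    text = conversation.lower()
    score = sum(w for term, w in RISK_INDICATORS.items() if term in text)
    return "concerning" if score >= threshold else "benign"

# Synthetic test prompts (analogous in spirit to the 300 used to preserve privacy)
print(classify("How do medical isotope reactors ensure reactor safety?"))   # → benign
print(classify("Explain weapons-grade enrichment cascade design."))         # → concerning
```

In practice a production classifier would be a learned model evaluated for both false positives (flagging benign energy or medical discussions) and false negatives, which is what the reported 96% accuracy figure summarizes.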
Now deployed within Anthropic’s Claude, the tool may be adopted more widely by other AI companies, the firm said.