US Federal News Bureau
Written by: CDO Magazine Bureau
Updated 2:54 PM UTC, Wed February 26, 2025
Scale AI is collaborating with the U.S. AI Safety Institute (AISI) to enhance testing methods for frontier AI models. This partnership will introduce new evaluation techniques co-developed by AISI and Scale’s research division, the Safety, Evaluation, and Alignment Lab (SEAL).
Under the agreement, model developers of all sizes can voluntarily access these testing methods.
These evaluations will provide reliable assessments of model performance across key domains, including math, reasoning, and AI coding. Companies can test their models efficiently with Scale and, if desired, share the results with a global network of AI safety institutes.
“AISI will use evaluation data to better understand AI technology and inform the creation of standards and policies. Similar to college readiness tests, these AI evaluations will provide AI developers with information on their models’ real world performance,” Scale AI said in a blog post.
Last year, Scale AI also partnered with the U.S. Department of Defense’s (DoD) Chief Digital and Artificial Intelligence Office (CDAO), serving as a test and evaluation (T&E) partner for AI companies, to create a comprehensive T&E framework for the responsible use of large language models (LLMs) within the DoD.
Scale AI is helping develop customized benchmark tests tailored to DoD use cases, integrate them into the T&E platform, and aid the CDAO’s T&E strategy for LLMs.
The move aims to furnish a safety framework for AI deployment, assess model performance, offer real-time feedback to warfighters, and develop specialized evaluation sets for military applications.