This week, authorities from the UK, EU, US and seven other countries gathered in San Francisco to launch the “International Network of AI Safety Institutes.”
The meeting, which took place at the Presidio Golden Gate Club, covered managing the risks of AI-generated content, testing foundation models, and conducting risk assessments for advanced AI systems. AI safety institutes from Australia, Canada, France, Japan, Kenya, the Republic of Korea, and Singapore also officially joined the Network.
In addition to the signing of a mission statement, more than $11 million in funding was allocated to research into AI-generated content, and the results of the Network's first joint safety testing exercise were reviewed. Attendees included regulatory officials, AI developers, academics, and civil society leaders, there to inform the debate on emerging AI challenges and potential safeguards.
The meeting built on the progress made at the previous AI Safety Summit in May, which took place in Seoul. There, the 10 nations agreed to foster “international cooperation and dialogue on artificial intelligence in the face of its unprecedented advances and impact on our economies and societies.”
“The International Network of AI Safety Institutes will serve as a collaborative forum, bringing together technical expertise to address AI safety risks and best practices,” according to the European Commission. “By recognizing the importance of cultural and linguistic diversity, the Network will work towards a unified understanding of AI safety risks and mitigation strategies.”
Member AI safety institutes will have to demonstrate their progress in AI safety testing and assessments at the Paris AI Action Summit in February 2025 so they can advance regulatory discussions.
Key results from the conference
Signed mission statement
The mission statement commits Network members to collaborate in four areas:
- Research: Collaborate with the AI safety research community and share findings.
- Testing: Develop and share best practices for testing advanced AI systems.
- Guidance: Facilitate shared approaches to interpreting AI safety test results.
- Inclusion: Share information and technical tools to expand participation in AI safety science.
More than $11 million allocated to AI safety research
In total, Network members and several nonprofit organizations announced more than $11 million in funding for research to mitigate the risk of AI-generated content. Child sexual abuse material, non-consensual sexual images and the use of AI for fraud and impersonation were highlighted as key areas of concern.
Funding will be prioritized for researchers investigating digital content transparency techniques and model safeguards to prevent the generation and distribution of harmful content. Grants will also be considered for scientists developing technical mitigations and socio-scientific and humanistic assessments.
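One simple example of a digital content transparency technique is content fingerprinting. The sketch below is purely illustrative and not tied to any announced project or tool: it hashes generated media and checks later uploads against a registry of known AI-generated content. The function names and the in-memory registry are hypothetical, and real systems would also rely on approaches such as watermarking and provenance metadata.

```python
# Illustrative only: fingerprinting as one form of content transparency.
# The registry here is a hypothetical in-memory set, not a real service.
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a stable SHA-256 fingerprint for a piece of content."""
    return hashlib.sha256(data).hexdigest()


registry: set[str] = set()  # fingerprints of content known to be AI-generated


def register_generated(data: bytes) -> None:
    """Record a newly generated item so it can be recognized later."""
    registry.add(fingerprint(data))


def is_known_generated(data: bytes) -> bool:
    """Check whether this exact content was previously registered."""
    return fingerprint(data) in registry


register_generated(b"example synthetic image bytes")
print(is_known_generated(b"example synthetic image bytes"))  # True
print(is_known_generated(b"unrelated content"))              # False
```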
The US AI Safety Institute also published a series of voluntary approaches to address the risks of AI-generated content.
Results of the first joint testing exercise discussed
The Network has completed its first joint testing exercise on Meta's Llama 3.1 405B, analyzing its general knowledge, multilingual capabilities, and closed-domain hallucinations, where a model provides information from outside the realm of the material it was instructed to refer to.
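To make the idea of a closed-domain hallucination check concrete, here is a minimal, illustrative sketch: it flags answer sentences whose content words barely overlap with the supplied source document. The `unsupported_sentences` function and the overlap threshold are simplifications of our own, not the Network's actual evaluation methodology, which has not been published in this detail.

```python
# A naive closed-domain hallucination check: flag sentences in the model's
# answer that share few content words with the source it was told to use.
import re


def unsupported_sentences(source_text: str, answer: str, min_overlap: float = 0.5) -> list[str]:
    """Return answer sentences with low word overlap against the source."""
    source_words = set(re.findall(r"[a-z0-9']+", source_text.lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = set(re.findall(r"[a-z0-9']+", sentence.lower()))
        if not words:
            continue
        overlap = len(words & source_words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged


# The second sentence introduces details absent from the source, so it is flagged.
source = "The Network met at the Presidio Golden Gate Club in San Francisco."
answer = "The Network met in San Francisco. The meeting lasted three weeks and ended in Tokyo."
print(unsupported_sentences(source, answer))
```

Real evaluations use far stronger methods (human review, entailment models, multilingual test sets), but the sketch shows why small methodological choices, such as the overlap threshold above, can shift results.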
The exercise raised several considerations for how AI safety testing could be improved across languages, cultures, and contexts, such as the impact that minor methodological differences and model optimization techniques can have on evaluation results. Larger joint testing exercises will take place ahead of the Paris AI Action Summit.
Shared basis for risk assessments agreed
The Network has agreed on a shared scientific basis for AI risk assessments, including that they should be actionable, transparent, comprehensive, multi-stakeholder, iterative, and reproducible. Members discussed how this could be implemented.
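As a purely illustrative aid (not an official Network tool), the six agreed properties could be tracked as a simple checklist; the class and field names below are our own assumptions.

```python
# Hypothetical checklist mirroring the six agreed properties of a risk assessment.
from dataclasses import dataclass, fields


@dataclass
class RiskAssessmentChecklist:
    actionable: bool = False
    transparent: bool = False
    comprehensive: bool = False
    multi_stakeholder: bool = False
    iterative: bool = False
    reproducible: bool = False

    def unmet(self) -> list[str]:
        """Return the criteria this assessment does not yet satisfy."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


assessment = RiskAssessmentChecklist(actionable=True, transparent=True, reproducible=True)
print(assessment.unmet())  # ['comprehensive', 'multi_stakeholder', 'iterative']
```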
US 'Testing Risks of AI for National Security' task force established
Finally, the new TRAINS task force was established. Led by the US AI Safety Institute, it includes experts from other US agencies, including the Departments of Commerce, Defense, Energy, and Homeland Security. Members will test AI models to help manage national security risks in areas such as radiological and nuclear security, chemical and biological security, cybersecurity, critical infrastructure, and military capabilities.
SEE: Apple joins the US government's voluntary commitment to AI safety
This reinforces how important the intersection of AI and the military is in the US. Last month, the White House released the first-ever National Security Memorandum on Artificial Intelligence, which directed the Department of Defense and US intelligence agencies to accelerate their adoption of AI in national security missions.
Speakers addressed the balance between AI innovation and safety
US Secretary of Commerce Gina Raimondo gave the keynote address on Wednesday. She told attendees that “advancing AI is the right thing to do, but moving forward as quickly as possible, just because we can, without thinking about the consequences, is not the smartest thing to do,” according to TIME.
The battle between progress and safety in AI has been a point of contention between governments and technology companies in recent months. While the intent is to keep consumers safe, regulators risk limiting their access to the latest technologies, which could bring tangible benefits. Google and Meta have both openly criticized European AI regulation, referring to the region's AI Act, suggesting it will quash the region's innovation potential.
Raimondo said the US AI Safety Institute “is not in the business of stifling innovation,” according to the AP. “But here's the thing. Safety is good for innovation. Safety generates trust. Trust accelerates adoption. Adoption leads to greater innovation.”
She also emphasized that nations have an “obligation” to manage risks that could negatively impact society, for example by causing unemployment and security breaches. “Let us not let our ambition blind us and allow ourselves to sleepwalk towards our own ruin,” she said via the AP.
Dario Amodei, CEO of Anthropic, also gave a talk highlighting the need for security testing. He said that while “people today laugh when chatbots say something a little unpredictable,” this indicates how essential it is to control AI before it takes on more nefarious capabilities, according to Fortune.
Global AI safety institutes have been popping up over the past year
The first meeting of AI authorities took place at Bletchley Park in Buckinghamshire, UK, about a year ago. It saw the launch of the UK AI Safety Institute, which has three main objectives:
- Evaluating existing AI systems.
- Performing foundational AI safety research.
- Sharing information with other national and international actors.
The United States has its own AI Safety Institute, formally established by NIST in February 2024, which has been appointed chair of the Network. It was created to work on the priority actions outlined in the AI Executive Order issued in October 2023, which include developing standards for the safety and security of AI systems.
SEE: OpenAI and Anthropic sign agreements with the US AI Safety Institute
In April, the UK government formally agreed to collaborate with the US on developing tests for advanced AI models, largely by sharing developments made by their respective AI Safety Institutes. An agreement made in Seoul saw similar institutes created in the other nations that joined the collaboration.
Clarifying the US's position on AI safety was especially important at the San Francisco conference, as the country as a whole does not currently offer it overwhelming support. President-elect Donald Trump has vowed to repeal the Executive Order when he returns to the White House. California Governor Gavin Newsom, who was in attendance, also vetoed the controversial AI regulation bill SB 1047 in late September.