Opinion & Analysis

The 6 Laws of Safe AI and How to Apply Them

Written by: Dr. Mark Brady

Updated 3:44 PM UTC, Mon May 13, 2024

People have long wondered what the world will be like after the AI singularity. Along with great promises have come great concerns. Luminaries like Stephen Hawking have warned that “The development of full artificial intelligence could spell the end of the human race…” And Elon Musk stated, “Mark my words, AI is far more dangerous than nukes…why do we have no regulatory oversight?”

Others take a purely positive outlook. World chess champion Garry Kasparov, who was the first world champion to be beaten by a computer, believes that “Machine triumph is human triumph,” and that AI is essential to human progress.

In this article, I would like to go beyond the question of whether AI is a threat or not and introduce best practices for dealing with potential hazards.

The Two Singularities

In the current context, a singularity is a point in time where civilization loses control of technology. There are actually two singularities. The First Singularity isn’t caused by AI and has already occurred. Many of our data systems are undocumented black boxes that few people understand, and in some cases, nobody understands.

One symptom of the First Singularity is a predominance of workarounds over bug fixes. If something doesn’t work, you notify tech support, they give you a workaround and nobody ever knows why it didn’t work in the first place or how to fix it. Maybe it will get overwritten in one of many SecDevOps releases. The First Singularity is beyond the scope of this article. Perhaps I’ll return to it in a future article.

I really want to focus on the Second Singularity, commonly known as The Singularity. The idea was first mentioned by mathematician, nuclear physicist, and computer scientist Stanislaw Ulam, recounting a conversation with John von Neumann, also a mathematician.

The Second Singularity – The hypothetical point in time where AI development becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization.

We are all familiar with this definition, so let’s get on to what we can do about it.

Isaac Asimov

Humans are motivated by their emotions, and emotions are often thought of as the last thing that an AI would have. However, AIs have objective functions, which serve the same purpose in machines as emotions do in humans. They determine what an AI is trying to do. When an objective function is intended to make an AI safe, it might be called a safety objective function.

Popular science and science fiction writer Isaac Asimov was an AI visionary before the term even existed. He was the first to describe safety objective functions, although there was no name for them either. He first listed his “Three Laws of Robotics” in his 1942 short story “Runaround.”

This was all the more remarkable because the first programmable electronic computer had not yet been built. (The ENIAC was completed in 1945.)

The First Law of Robotics – A robot may not injure a human being or, through inaction, allow a human being to come to harm.
The Second Law of Robotics – A robot must obey the orders given to it by human beings except where such orders would conflict with the First Law.
The Third Law of Robotics – A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

These particular functions aren’t the only safety objective functions, and an AI might use others in their stead.

Safe AI Laws 1 and 2

The first two laws deal directly with safety objective functions.

First Law of Safe AI – Every AI shall have a safety objective function. An AI shall never modify any of its own safety objective functions.

The first part of this law is obvious, and the second part is necessary because machine learning is a form of self-modification, which could easily undo a safety objective function.
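As a rough illustration of the second part of this law, the sketch below separates what learning may change from what it may not. The class name, the weight names, and the use of a read-only mapping are all assumptions made for this example. Python cannot make anything truly tamper-proof, so this only shows the intended separation, not an enforcement mechanism.

```python
from types import MappingProxyType
from typing import Dict


class LearningAgent:
    """Toy agent whose learning step can change task weights but not safety limits."""

    def __init__(self, task_weights: Dict[str, float], safety_limits: Dict[str, float]):
        self.task_weights = dict(task_weights)
        # MappingProxyType is a read-only view: item assignment raises TypeError.
        self.safety_limits = MappingProxyType(dict(safety_limits))

    def learn(self, gradients: Dict[str, float], learning_rate: float = 0.5) -> None:
        """Apply one gradient step to task weights only."""
        for name, gradient in gradients.items():
            # Updates aimed at anything other than a task weight are ignored,
            # so this learning step has no path to the safety limits.
            if name in self.task_weights:
                self.task_weights[name] -= learning_rate * gradient


# Hypothetical values: one task weight and one safety limit.
agent = LearningAgent({"speed": 1.0}, {"max_force": 5.0})
agent.learn({"speed": 1.0, "max_force": -100.0})
print(agent.task_weights["speed"])       # 0.5
print(agent.safety_limits["max_force"])  # 5.0
```

The update aimed at "max_force" is silently dropped, while the task weight changes as usual.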

Second Law of Safe AI – Safety objective functions shall always overrule other objective functions.

As Asimov recognized while formulating his Three Laws, objective functions may interact, and one has to specify in advance which objective functions take precedence.
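One simple way to fix precedence in advance is to treat safety objectives as a filter that runs before any task objective is consulted. The sketch below is only one possible reading of the Second Law. The function name, the threshold, and the example scores are hypothetical.

```python
from typing import Callable, Iterable, Optional, Sequence

Action = str
Objective = Callable[[Action], float]


def choose_action(
    actions: Iterable[Action],
    safety_objectives: Sequence[Objective],
    task_objective: Objective,
    safety_threshold: float = 0.0,
) -> Optional[Action]:
    """Pick the best task action among those every safety objective accepts."""
    # Safety objectives are checked first and act as a filter, so no task score,
    # however high, can bring back an action that a safety objective rejects.
    safe = [
        a for a in actions
        if all(obj(a) >= safety_threshold for obj in safety_objectives)
    ]
    if not safe:
        return None  # no safe action is available, so choose nothing
    return max(safe, key=task_objective)


# Hypothetical scores for three candidate actions.
harm_avoidance = {"wait": 1.0, "slow_route": 0.5, "fast_route": -1.0}.__getitem__
task_reward = {"wait": 0.0, "slow_route": 2.0, "fast_route": 10.0}.__getitem__

best = choose_action(["wait", "slow_route", "fast_route"], [harm_avoidance], task_reward)
print(best)  # slow_route
```

The highest-reward action, "fast_route", is never considered because it fails the safety check. A weighted sum of safety and task scores would not give this guarantee, since a large enough task reward could outweigh the safety penalty.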

Safe AI Laws 3 and 4

The Third and Fourth Laws of Safe AI deal with machine evolution. As soon as machines begin to create other machines that may differ from themselves, there exists the possibility of machine evolution. As improbable as this may sound, AI is already able to write software. Machine evolution will disrupt safety objective functions if not constrained by these laws.

Third Law of Safe AI – Any safe AI building other AIs shall include its own safety objective functions.

We don’t want safety objective functions to evolve outside of human control because they may then cease to be safety objective functions.
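The Third Law can be pictured as a construction rule: when one AI builds another, the safety objectives are copied from the builder and are not something the builder gets to choose. The following is a minimal sketch under that assumption. The Agent class, its fields, and the avoid_harm function are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Objective = Callable[[str], float]


@dataclass(frozen=True)
class Agent:
    name: str
    # A tuple in a frozen dataclass: ordinary assignment to this field fails.
    safety_objectives: Tuple[Objective, ...]
    task_objective: Objective

    def build_child(self, name: str, task_objective: Objective) -> "Agent":
        """Create a new agent with a different task but the same safety objectives."""
        # The child's safety objectives come from the parent and are not a
        # parameter of this method, so the caller cannot drop or replace them.
        return Agent(name, self.safety_objectives, task_objective)


def avoid_harm(action: str) -> float:
    """Hypothetical safety objective: penalize one harmful action."""
    return -1.0 if action == "harm" else 1.0


parent = Agent("parent", (avoid_harm,), lambda action: 0.0)
child = parent.build_child("child", lambda action: float(len(action)))
grandchild = child.build_child("grandchild", lambda action: 1.0)
print(grandchild.safety_objectives == parent.safety_objectives)  # True
```

Each generation may pursue a different task, but the safety objectives pass through unchanged.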

Fourth Law of Safe AI – An AI shall never have an objective function that maximizes replication or the acquisition of resources.

Organisms and AIs are not the only things that have objective functions. Entire species have them too. In fact, they have only one objective, which is genome proliferation.

This leads to population growth, which in turn leads to competition between species for finite resources. This is the basis for survival of the fittest and extinction of the remainder. AI evolution is inevitable, but AI must not follow the biological model because it would create competition between humans and increasingly intelligent AI. Hence, the Fourth Law.

Fifth Law of Safe AI – The actions of any AI will be subject to authorized human override.

The Fifth Law turns away from objective functions and deals instead with the imposition of human authority over AI. This is essentially a reframing of Asimov’s Second Law of Robotics. In an extreme case, this law allows a human to unplug or turn off an AI that is malfunctioning. Thus, no AI should control its own power supply.

One might think that a lesser human intelligence would struggle to have any authority over a superior machine intelligence, but this sort of inverted hierarchy occurs all the time in the course of human-human interaction.
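In software terms, the Fifth Law suggests a stop switch that is checked before the AI acts and that only authorized people can set. The sketch below assumes a hypothetical OverridableAgent wrapper and made-up operator names. A real override would have to sit outside the AI's own control, for example in hardware, which this toy example does not attempt.

```python
from typing import Callable, Set


class OverridableAgent:
    """Wraps an action policy behind a stop switch that only authorized humans can set."""

    def __init__(self, policy: Callable[[str], str], authorized_operators: Set[str]):
        self._policy = policy
        # frozenset: this class offers no method for editing the operator list.
        self._authorized = frozenset(authorized_operators)
        self._halted = False

    def halt(self, operator: str) -> bool:
        """Stop the agent if the operator is authorized; return whether the request was accepted."""
        if operator not in self._authorized:
            return False
        self._halted = True
        return True

    def act(self, observation: str) -> str:
        # The override is checked before the policy runs, so it takes precedence
        # over whatever the policy would have chosen.
        if self._halted:
            return "halted"
        return self._policy(observation)


agent = OverridableAgent(lambda obs: f"respond to {obs}", {"alice"})
print(agent.act("ping"))      # respond to ping
print(agent.halt("mallory"))  # False
print(agent.halt("alice"))    # True
print(agent.act("ping"))      # halted
```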

Sixth Law of Safe AI

No AI will be assigned a task that it cannot reliably execute.

Like the Fifth Law, the Sixth deals with a distinct subject matter, competence. Competence is a prerequisite to safety in almost all tasks. But what threshold do we apply to reliability? This will vary depending on the task. For example, reliability in the placement of advertisements can be very low, but the level of reliability when flying an aircraft must be very high.
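The idea of a task-dependent threshold can be written down as a simple gate applied before a task is assigned. The numbers below are placeholders chosen only to contrast a low-stakes task with a high-stakes one. Real thresholds would come from the relevant domain standards, and the names used here are hypothetical.

```python
# Hypothetical minimum reliability per task; these values are illustrative only.
REQUIRED_RELIABILITY = {
    "ad_placement": 0.60,
    "aircraft_control": 0.999999,
}


def may_assign(task: str, measured_reliability: float) -> bool:
    """Return True only if the AI's measured reliability meets the task's threshold."""
    if task not in REQUIRED_RELIABILITY:
        return False  # unknown task: refuse rather than guess a threshold
    return measured_reliability >= REQUIRED_RELIABILITY[task]


print(may_assign("ad_placement", 0.95))      # True
print(may_assign("aircraft_control", 0.95))  # False
```

The same measured reliability is acceptable for one task and unacceptable for the other, which is the point of the Sixth Law.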

Application of the 6 laws is essentially voluntary

Even with the Six Laws, it is certain that some humans will not follow them and will develop unsafe AI, whether through negligence or intent.

So, what should we do with the Six Laws?

Following Musk’s call for regulatory oversight, we might embed them in regulations or laws. However, this would be about as effective as laws against cyber-attacks, which is to say, it would be insufficient. Catching a cyber-criminal or malicious AI developer is hard to do, but laws might be useful in the small number of cases where violations can be proven. Laws might also be a way to drive a safe AI certification process.

In any case, whatever laws are proposed, they should not hamper the advancement of AI in the U.S. or other democratic countries. This is because other nations and non-state actors are moving full-speed ahead, regardless of our laws or any international laws. Global competition in AI cannot be ignored.

Mostly, we will need something akin to defensive cyber. Since the hazards of AI occur when AI is more intelligent than humans, we may not be able to counter it entirely on our own. We will need to leverage a form of safe AI with an objective function intended to counter unsafe AI. We might call this defensive AI.

Conclusions

We must look past the question of whether AI will pose a hazard to humankind. Every technology poses some hazard and the more powerful the technology the greater the hazards. There are specific practices that can be employed to make AI as safe as possible. Where those practices are not employed, we will need to develop and implement defensive AI.

In the future, there will be three types of AI: safe AI, unsafe AI, and defensive AI. We should pay close attention to which one we are dealing with in each case.

Notes and references:

Rory Cellan-Jones (2014) BBC interview with Stephen Hawking.
Elon Musk (2018) Remarks at the South by Southwest Conference.
Garry Kasparov’s TED Talk – Don’t fear machines. Work with them. (2017)
Isaac Asimov’s Runaround (1942)
The first five laws were first published in Mark Brady’s book (2022) Next Generation Data Management.
The statement of the Fourth Law in this article is slightly different from the original statement in Next Generation Data Management.
The sixth law was first presented at the 2023 Ai4 Conference.

About the Author

Dr. Mark Brady is currently Deputy Chief Data Officer of the Office of the Undersecretary of Defense Research & Engineering/TRMC and Senior Manager at KBR. He has served as Chief Data Officer for the Space Force, Chief Data Officer for the Air Force Space Command, Data Architect for The DOJ, and Information Architect for the National Marine Fisheries Service.

Brady also helped establish electronic trade standards as a U.S. delegate to the United Nations and served on the White House Data Cabinet and the National Oceanic and Atmospheric Administration’s Big Data Council.

Prior to his federal service, Brady conducted basic scientific research in neuroscience; taught neuroscience and statistics; and conducted industrial R&D in artificial intelligence, software, medical electronics, traffic management, electrophoresis, and mathematical modeling for automotive geometry. He is an inventor and author, with a number of patents from this work in industry.