The fact that individual AI routines today lack the sophistication and power necessary to destroy humanity, and mostly have benign goals, is no reason to think emergent AI intelligence will be nicer than people are.
Runaway artificial intelligence has been a science fiction staple since the 1909 publication of E. M. Forster’s The Machine Stops, and it rose to widespread, serious attention in 2023. The National Institute of Standards and Technology released its AI Risk Management Framework in January 2023. Other documents followed, including the Biden administration’s Oct. 30 executive order on Safe, Secure, and Trustworthy Artificial Intelligence, and, two days later, the Bletchley Declaration on AI Safety signed by 28 countries and the European Union.
As a professional risk manager, I found all these documents lacking. I see more appreciation for risk principles in fiction. In 1939, author Isaac Asimov got tired of reading stories about intelligent machines turning on their creators. He insisted that people smart enough to build intelligent robots wouldn’t be stupid enough to omit moral controls — basic overrides deep in the fundamental circuitry of all intelligent machines. Asimov’s first rule is: “A robot may not injure a human being or, through inaction, allow a human being to come to harm.” Regardless of the AI’s goals, it is forbidden to violate this law.
Or consider Arthur C. Clarke’s famous HAL 9000 computer in the 1968 film, 2001: A Space Odyssey. HAL malfunctions not due to a computer bug, but because it computes correctly that the human astronauts are reducing the chance of mission success, its programmed objective. Clarke’s solution was to ensure manual overrides of AI, outside the knowledge and control of AI systems. That’s how Dave Bowman can outmaneuver HAL, using physical door interlocks and disabling HAL’s AI circuitry.
While there are objections to both these approaches, they pass the first risk management test. They imagine a bad future state and identify what people then would want you to do now. In contrast, the 2023 official documents imagine bad future paths, and resolve that we won’t take them. The problem is that there are an infinite number of future paths, most of which we cannot imagine. By contrast, there is a relatively small number of plausible bad future states. In finance, a bad future state is to have cash obligations you cannot meet. There are many ways to get there, and we always promise not to take those paths. Promises are nice, but risk management teaches us to focus on things we can do today to make that future state survivable.
There is no shortage of things that could end human existence: asteroid impact, environmental collapse, pandemic, global thermonuclear war. These are all blind dangers. They do not seek to hurt humans and so there is some possibility that some humans survive.
Two dangers are essentially different — attack by malevolent intelligent aliens, and attack by intelligences we build ourselves. An intelligent enemy hiding until it acquires strength and position to attack, with plans to break through any defenses and to continue its campaign until total victory is attained, is a different kind of worry than a blind catastrophe.
The dangers of computer control are well known. Software bugs can result in inappropriate actions with sometimes fatal consequences. While this is a serious issue, it is a blind risk. AI poses a fundamentally different danger, closer to a malevolent human than to a malfunctioning machine. With AI and machine learning, the human gives the computer objectives rather than instructions. Sometimes these are programmed explicitly; other times the computer is told to infer them from training sets. AI algorithms are tools the computer, not the human, uses to attain the objectives. The danger from a thoughtlessly specified objective is not blind or random.
This differs from a dumb computer program, where a human spells out the program’s desired response to all inputs. Sometimes the programmer makes errors that are not caught in testing. The worst errors are usually unexpected interactions with other programs rather than individual program bugs. When software bugs or computer malfunctions do occur, they lead to random results. Most of the time the consequences are limited to the system the computer is designed to control.
This is another key risk distinction between dumb and smart programs. The conventional computer controlling a nuclear power plant might cause a meltdown in the plant, but it can’t fire nuclear missiles, crash the stock market or burn your house down by turning your empty microwave on. But malevolent intelligence could be an emergent phenomenon that arises from the interaction of many AI implementations, controlling almost everything.
Human intelligence, for example, probably emerged from individual algorithms that evolved for vision, muscle control, regulation of bodily functions and other tasks. All those tasks were beneficial to humans. But out of that emergent consciousness, large groups of humans chose to cooperate in complex, specialised tasks to build nuclear weapons capable of wiping out all life on Earth. This was not the only terrible, life-destroying idea that emerged from human intelligence—think genocide, torture, divine right of kings, holy war and slavery. The fact that individual AI routines today lack the sophistication and power necessary to destroy humanity, and mostly have benign goals, is no reason to think emergent AI intelligence will be nicer than people are.
My hope for 2024 is that we will conduct serious reverse stress tests for AI. We invite diverse groups of people, not just officials and experts, and have them assume some specific bad state. Maybe it’s 2050 and Skynet has killed all other humans. (I often show disaster movies to prepare groups for reverse stress tests. It helps set the mood and makes people more creative; it’s Hollywood’s great contribution to risk management.) You’re the last survivors, hiding out until Terminators find and terminate you. Discuss what you wish people had done in 2024, not to prevent this state from happening, but to give you some means of survival in 2050.
Aaron Brown is a Bloomberg Opinion columnist. Views do not represent the stand of this publication.
Originally posted 2024-01-02 05:14:02.