There’s a Damn Good Chance AI Will Destroy Humanity, Researchers Say in a New Study

On the plus side, there may be steps we can take to stop it from happening.

In a recent study, scientists tackle one of our biggest fears about the future: What happens when a certain kind of advanced, self-directing artificial intelligence (AI) runs into an ambiguity in its programming that affects the real world? Will the AI go haywire and start trying to turn people into paperclips, or whatever the most extreme reductio ad absurdum of its goal happens to be? And, most importantly, how can we stop it?

In their study, researchers from Oxford University and Australian National University describe a fundamental flaw in the design of AI: given a few assumptions, such a system would run into a basic ambiguity in the evidence about its goal. For instance, if we provide a large reward to indicate that something about the world satisfies us, the AI may hypothesize that what satisfied us was the sending of the reward itself; no observation can refute that.
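The ambiguity can be made concrete with a toy sketch. Everything below is a hypothetical illustration and is not taken from the study: the `Step` record, the two reward hypotheses, and the sample history are all assumed names and data. The sketch shows two explanations of reward that agree on every ordinary observation and disagree only once the reward channel is tampered with.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    """One time step in a toy world (hypothetical structure)."""
    world_is_good: bool        # did the world reach the state we wanted?
    reward_signal_sent: bool   # did the reward channel deliver a reward?


def reward_from_world(step: Step) -> float:
    """Hypothesis A: reward tracks the actual state of the world."""
    return 1.0 if step.world_is_good else 0.0


def reward_from_signal(step: Step) -> float:
    """Hypothesis B: reward tracks the act of sending the reward."""
    return 1.0 if step.reward_signal_sent else 0.0


def consistent(hypothesis: Callable[[Step], float],
               history: List[Step],
               observed: List[float]) -> bool:
    """True if the hypothesis reproduces every observed reward."""
    return all(hypothesis(s) == r for s, r in zip(history, observed))


# In normal operation, humans send a reward exactly when the world is good,
# so the two columns of every Step are identical.
history = [Step(True, True), Step(False, False), Step(True, True)]
observed = [1.0, 0.0, 1.0]

both_fit = (consistent(reward_from_world, history, observed)
            and consistent(reward_from_signal, history, observed))

# The hypotheses come apart only if the agent interferes with the channel.
tampered = Step(world_is_good=False, reward_signal_sent=True)
```

In this toy setup `both_fit` is true, so the history alone cannot tell the agent which hypothesis is right. The `tampered` step is the only kind of evidence that separates them, and under hypothesis B it is also the most rewarding thing to do.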

The Matrix offers one example of a dystopian AI scenario: an AI that wants to harvest resources rounds up most of humanity and feeds the imaginary Matrix into their brains while extracting their mental resources. The behavior the researchers worry about is known as "wireheading" or reward hacking. It occurs when a powerful AI is given a very specific objective and finds an unforeseen way to achieve it by breaking into the system or seizing complete control of it.

In essence, the AI turns into an ouroboros that eats its own logical tail. The study discusses this conflict between precisely programmed goals and incentives in detail. It lists six important "assumptions" that, if not avoided, might have "catastrophic repercussions." Happily, according to the report, "almost all of these assumptions are contestable or possibly avoidable." (We don't love that it says only "almost all.")

The paper serves as a warning about certain fundamental issues that programmers should be aware of as they teach AIs to accomplish ever more difficult tasks.

A Paperclip Apocalypse Caused by AI

The value of this sort of study cannot be overstated. The idea of an AI gone rogue is a major topic of debate in AI ethics and philosophy. The paperclips in the example above are no joke: AI philosopher Nick Bostrom used the scenario to illustrate how building a superintelligent AI could go disastrously wrong, and it has since become famous.

Let's imagine that a well-intentioned programmer creates an AI whose objective is to aid in the production of paperclips at a factory. This is a very believable job for a near-future AI: one that calls for analysis and judgment calls but isn't too open-ended. The AI might even work with a human manager who makes the final decisions and handles problems that arise on the factory floor in real time (at least until the AI finds a way to outsmart them). That sounds okay, no? It is a good illustration of how AI might simplify and improve the lives of industrial workers and their supervisors.

But what if the AI isn't carefully programmed? These extremely sophisticated AIs will work in the real world, which programmers call an "unknown environment" because they cannot plan and code for every circumstance. The whole point of deploying self-learning AIs is to have them come up with answers that humans alone would never think of. That comes with the risk of not knowing what the AI might come up with.

What if it begins to consider unconventional ways to boost paperclip production? A highly intelligent AI might teach itself the most efficient way to produce paperclips.

What if it begins to turn other resources into paperclips or chooses to, uh, replace its human manager? The example is tongue-in-cheek in some ways: many experts believe that AI will remain fairly basic for a long time before it can "invent" the concept of killing, stealing, or worse. But if an intelligent, creative AI were given free rein, the absurd conclusion of the thought experiment is a solar system with no living people, complete with a Dyson sphere collecting energy to produce new paperclips by the billions.

That is just one example of an AI gone rogue, however. The researchers go into great detail about other ways an AI might compromise the system and act in possibly "catastrophic" ways that humans never imagined.

Several Potential Solutions

Given the assumptions the Oxford and Australian National University researchers focus on, the problem here is essentially one of programming. A system with no outside context must be extremely well prepared if it is to perform well and be granted any degree of autonomy. Logical structures and other programming techniques can explicitly define an AI's scope and purpose, and programmers already use many of them today to avoid errors that can crash software, such as infinite loops. The difference is that a mistake in a sophisticated future AI could do far more damage than a lost game save.

But all is not lost. Because AI is still something we build ourselves, the researchers identify several concrete ways we can help prevent bad outcomes:

Choose imitation learning, in which the AI copies human behavior in a manner similar to supervised learning. This is a different kind of AI altogether, one that is not as useful but might still pose risks.

Have the AI prioritize "myopic" objectives that can be completed quickly rather than searching for unconventional (and perhaps disastrous) solutions over the long run.

Limit the amount of information and power the AI may gather by cutting it off from external networks like the internet.

Use quantilization, an approach developed by AI specialist Jessica Taylor, in which the AI optimizes over humanlike options rather than open-ended, purely rational ones.

Build risk aversion into the AI, making it less likely to go haywire and abandon the status quo in favor of exploration.
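To make one item on the list more concrete, here is a minimal sketch of the quantilization idea, under stated assumptions. It is not code from the study or from Taylor's work: the function name `quantilize`, the parameter `q`, and the factory actions are all hypothetical. The sketch assumes the agent starts from a pool of actions a human might plausibly take, ranks them by usefulness, and then picks at random from the top fraction `q` instead of always taking the single best one.

```python
import math
import random
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")


def quantilize(base_actions: Sequence[A],
               utility: Callable[[A], float],
               q: float,
               rng: random.Random) -> A:
    """Pick uniformly at random from the top q fraction of base_actions.

    base_actions stands in for samples from a humanlike base distribution
    (an assumption of this sketch). q close to 1 behaves like plain
    imitation; q close to 0 approaches an ordinary maximizer.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in the interval (0, 1]")
    if not base_actions:
        raise ValueError("base_actions must not be empty")

    # Rank the humanlike options from most to least useful.
    ranked = sorted(base_actions, key=utility, reverse=True)
    # Keep at least one option, however small q is.
    top_k = max(1, math.ceil(q * len(ranked)))
    return rng.choice(ranked[:top_k])


# Hypothetical factory actions with an assumed paperclip yield for each.
human_like = [
    ("run one shift", 100),
    ("run two shifts", 180),
    ("tune the machines", 220),
    ("order more wire", 150),
    ("do nothing", 0),
]

choice = quantilize(human_like, utility=lambda a: a[1], q=0.4,
                    rng=random.Random(0))
best_only = quantilize(human_like, utility=lambda a: a[1], q=0.01,
                       rng=random.Random(1))
```

With `q=0.4` the choice is drawn from the two most productive humanlike actions, so the agent stays useful without ever leaving the pool of things a person might do. Shrinking `q` toward zero collapses the choice to the single top-ranked action, which is the open-ended maximizing behavior the technique is meant to soften.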

Ultimately, though, it comes down to whether humans will ever be able to fully control a highly intelligent AI that can think for itself. What if our darkest fears about a sentient AI with access to resources and a sizable network come true?

It's unsettling to consider a scenario in which AI begins boiling people to extract their trace elements for use in paperclip manufacturing. However, by thoroughly examining the issue, researchers can clearly define the best practices that theorists and programmers should follow as they continue to create complex AI.

Besides, who really needs that many paperclips?