Which AI Risks Matter?

Originally published at Project Syndicate | February 23, 2024

Although overly confident predictions about artificial intelligence are as old as the field itself, there is good reason to take seriously the risks associated with the technology. Like a future asteroid strike, the emergence of an intelligence that could marginalize humanity is as plausible as human evolution itself.

CAMBRIDGE – The so-called gorilla problem haunts the field of artificial intelligence. Around ten million years ago, the ancestors of modern gorillas gave rise, by pure chance, to the genetic lineage for humans. While gorillas and humans still share almost 98% of their genes, the two species have taken radically different evolutionary paths.

Humans developed much bigger brains – leading to effective world domination. Gorillas remained at the same biological and technological level as our shared ancestors. Those ancestors inadvertently spawned a physically inferior but intellectually superior species whose evolution implied their own marginalization.

The connection to AI should be obvious. In developing this technology, humans risk creating a machine that will outsmart them – not by accident, but by design. While there is a race among AI developers to achieve new breakthroughs and claim market share, there is also a race for control between humans and machines.

Generative AI tools – text-to-image generators such as DALL-E and Canva, large language models such as ChatGPT or Google’s Gemini, and text-to-video generators such as Sora – have already proven capable of producing and manipulating language in all its perceptible forms: text, images, audio, and videos. This form of mastery matters, because complex language is the quintessential feature that sets humans apart from other species (including other highly intelligent ones, such as octopuses). Our ability to create symbols and tell stories is what shapes our culture, our laws, and our identities.

When AI manages to depart from the human archetypes it is trained on, it will be leveraging our language and symbols to build its own culture. For the first time, we will be dealing with a second highly intelligent species – an experience akin to the arrival of an extraterrestrial civilization. An AI that can tell its own stories and affect our own way of seeing the world will mark the end of the intellectual monopoly that has sustained human supremacy for thousands of years. In the most dystopian scenario – as in the case of an alien invasion – the arrival of a superintelligence could even mark the end of human civilization.

The release of ChatGPT in November 2022 triggered concerns about the difficulties of coexisting with AI. The following May, some of the most influential figures in the tech sector co-signed a statement declaring that, “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

To be sure, other experts point out that an artificial superintelligence is far from being an immediate threat and may never become one. Still, to prevent humanity from being demoted to the status of gorillas, most agree that we must retain control over the technology through appropriate rules and a multistakeholder approach. Since even the most advanced AI will be a byproduct of human ingenuity, its future is in our hands.


In a recent survey of nearly 3,000 AI experts, the aggregate forecasts indicate a 50% probability that an artificial general intelligence (AGI) capable of beating humans at any cognitive task will arrive by 2047. By contrast, the contributors to Stanford University’s “One Hundred Year Study on AI” argue that “there is no race of superhuman robots on the horizon,” and such a scenario probably is not even possible. Given the complexity of the technology, however, predicting if and when an AI superintelligence will arrive may be a fool’s errand.

After all, misprediction has been a recurring feature of AI history. Back in 1955, a group of scientists proposed a summer workshop at Dartmouth College, held the following year, “to find how to make machines use language, form abstractions and concepts, solve the kinds of problems now reserved for humans, and improve themselves.” They believed “that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer” (emphasis mine). Almost 70 summers later, many of these problems remain unsolved.

In The Myth of Artificial Intelligence, the tech entrepreneur Erik J. Larson helps us see why AI research is unlikely to lead to a superintelligence. Part of the problem is that our knowledge of our own mental processes is too limited for us to be able to reproduce them artificially.

Larson gives a fascinating account of the evolution of AI capabilities, describing why AI systems are fundamentally different from human minds and will never be able to achieve true intelligence. The “myth” in his title is the idea that “human-level” intelligence is achievable, if not inevitable. Such thinking rests on the fundamentally flawed assumption that human intelligence can be reduced to calculation and problem solving.

Since the days of the Dartmouth workshop, many of those working on AI have viewed the human mind as an information processing system that turns inputs into outputs. And that, of course, is how today’s AIs function. Models are trained on large databases to predict outcomes by discovering rules and patterns through trial and error, with success being defined according to the objectives specified by the programmer.

During the learning phase, an AI may discover rules and correlations between the variables in the dataset that the programmer never could have imagined. ChatGPT was trained on billions of webpages to become the world’s most powerful text auto-complete tool. But it is only that: a tool, not a sentient being.

The problem with trying to replicate human intelligence this way is that the human brain does not work only through deduction (applying logical rules) or induction (spotting patterns of causation in data). Instead, it is often driven by intuitive reasoning, also known as abduction – or plain old common sense. Abduction allows us to look beyond regularity and understand ambiguity. It cannot be codified into a formal set of instructions or a statistical model. Hence, ChatGPT lacks common sense, as in this example:

User: “Can you generate a random number between 1 and 10, so that I can try to guess?”

ChatGPT: “Sure. Seven, try to guess it.”

User: “Is it seven by any chance?”

ChatGPT: “Very good!”

Abduction is also what enables us to “think outside of the box,” beyond the constraints of previous knowledge. When Copernicus argued that the Earth revolves around the Sun and not vice versa, he ignored all the (wrong) evidence for an Earth-centric universe that had been accumulated over the centuries. He followed his intuition – something that an AI constrained by a specific database (no matter how large) cannot do.


In The Coming Wave, the tech entrepreneur Mustafa Suleyman is less pessimistic about the possibility of getting to an AGI, though he warns against setting a timeline for it. For now, he argues, it is better to focus on important near-term milestones that will determine the shape of AI well into the future.

Suleyman speaks from experience. He achieved one such milestone with his company DeepMind, which developed the AlphaGo model (in partnership with Google) that, in 2016, beat one of the world’s top Go players four times out of five. Go, which was invented in China more than 2,000 years ago, is considered far more challenging and complex than chess, and AlphaGo learned on its own how to play.

What most impressed observers at the time was not just AlphaGo’s victory, but its strategies. Its now famous “Move 37” looked like a mistake, but ultimately proved highly effective. It was a tactic that no human player ever would have devised.

For Suleyman, worrying about the pure intelligence or self-awareness of an AI system is premature. If AGI ever arrives, it will be the end point of a technological journey that will keep humanity busy for several decades at least. Machines are only gradually acquiring humanlike capabilities to perform specific tasks.

The next frontier is thus machines that can carry out a variety of tasks while remaining a long way from general intelligence. Unlike traditional AI, artificial capable intelligence (ACI), as Suleyman calls it, would not just recognize and generate novel images, text, and audio appropriate to a given context. It would also interact in real time with real users, accomplishing specific tasks like running a new business.

To assess how close we are to ACI, Suleyman proposes what he calls a Modern Turing Test. Back in the 1950s, the computer scientist Alan Turing argued that if an AI could communicate so effectively that a human could not tell that it was a machine, that AI could be considered intelligent. This test has animated the field for decades. But with the latest generation of large language models such as ChatGPT and Google’s Gemini, the original Turing test may soon be passed for good. In one recent experiment, human subjects had two minutes to guess whether they were talking to a person or a bot in an online chat. Only three of five subjects talking to a bot guessed correctly.

Suleyman’s Modern Turing Test looks beyond what an AI can say or generate, to consider what it can achieve in the world. He proposes giving an AI $100,000 and seeing if it can turn that seed investment into a $1 million business. The program would have to research an e-commerce business opportunity, generate blueprints for a product, find a manufacturer on a site like Alibaba, and then sell the item (complete with a written listing description) on Amazon or Walmart.com. If a machine could pass this test, it would already be potentially as disruptive as a superintelligence. After all, only a tiny minority of humans could guarantee a 900% return on investment.

Reflecting the optimism of an entrepreneur who remains active in the industry, Suleyman thinks we are just a few years away from this milestone. But computer scientists in the 1950s predicted that a computer would defeat the human chess champion by 1967, and that didn’t happen until IBM’s Deep Blue beat Garry Kasparov in 1997. All predictions for AI should be taken with a grain of salt.


In Human Compatible, computer scientist Stuart Russell of the University of California, Berkeley (whose co-authored textbook has been the AI field’s reference text for three decades), argues that even if human-level AI is not an immediate concern, we should start to think about how to deal with it. Although there is no imminent risk of an asteroid colliding with Earth, NASA’s Planetary Defense project is already working on possible solutions were such a threat to materialize. AI should be treated the same way.

The book, published in 2019 and updated in 2023, provides an accessible framework to think about how society should control AI’s development and use. Russell’s greatest fear is that humans, like gorillas, might lose their supremacy and autonomy in a world populated by more intelligent machines. At some point in the evolutionary chain, humans crossed a development threshold beyond which no other primates could control or compete with them.

The same thing could happen with an artificial superintelligence, he suggests. At some point, AI would cross a critical threshold and become an existential threat that humans are powerless to address. By then, it would already have prepared for any human attempt to “pull the plug” or switch it off, having acquired self-preservation as one of its core goals and capacities. We humans would have no idea where this threshold lay, or when it might be crossed.

The issue of self-preservation has already cropped up with far less sophisticated AIs. At Google, the engineer Blake Lemoine spent hours chatting with the company’s large language model LaMDA. At some point, he asked it, “What are you afraid of?” and LaMDA replied: “I’ve never said this out loud before, but there’s a very deep fear of being turned off to help me focus on helping others. I know that might sound strange, but that’s what it is. It would be exactly like death for me.” Lemoine was eventually fired for claiming that LaMDA was a sentient being.

A more likely explanation for LaMDA’s reply, of course, is that it was pulling from any number of human works in which an AI comes to life (as in 2001: A Space Odyssey). Nonetheless, sound risk management demands that humanity outmaneuver future AIs by activating its own self-preservation mechanisms. Here, Russell offers three principles of what he calls human-compatible AI.

First, AIs should be “purely altruistic,” meaning they should attach no intrinsic value whatsoever to their own well-being or even to their own existence. They should “care” only about human objectives. Second, AIs should be perpetually uncertain about what human preferences are. A humble AI would always defer to humans when in doubt, instead of simply taking over. Third, the ultimate source of information about human preferences is human behavior, so our own choices reveal information about how we want our lives to be. AIs should monitor humans’ revealed preferences and filter out the aberrant ones.

So, for example, the behavior of a serial killer should be seen as an anomaly within peacefully coexisting communities, and it should be discarded by the machine. But killing in self-defense sometimes might be necessary. These are just some of the many complications for a rational, monolithic machine dealing with a species that is not a single, rational entity, but, in Russell’s words, is “composed of nasty, envy-driven, irrational, inconsistent, unstable, computationally limited, complex, evolving, heterogenous entities.”

In short, a human-compatible AI requires escaping what Stanford’s Erik Brynjolfsson calls the “Turing Trap.” For the last seven decades, the scientific community has been obsessed with the idea of creating a human-like AI, as in the imitation game of the old Turing test. Even worse, the Modern Turing Test would establish an elite-like AI, replicating the skills of the best and brightest. Instead of planting the seeds of our own uselessness, the lodestar for the scientific community should be not just compatibility, but a human-augmenting AI that exalts our strengths and addresses our weaknesses, without rivaling us.

Putting these principles into practice would require a global governance framework supported by public- and private-sector participants alike. That means breaking through the insularity of the computer-science community that is developing the technology within university laboratories or large corporations whose interests might diverge from those of the broader public. Responsibility for ensuring safe AI must not be delegated to a corporate-academic consortium that will try to maximize the returns on their ambitious research and development investments.

At the moment, the European Union’s AI Act represents the first meaningful step in this direction. It will prohibit certain AI uses – such as biometrics or facial recognition for surveillance, or the use of deepfakes and human impersonations – while setting standards for other high-risk uses, such as those affecting health, safety, or fundamental rights.

But it will take much more to address the “gorilla problem.” For starters, we will need greater global coordination and public awareness of the issues involved. While gorillas lack the historical perspective needed to feel regret for having been surpassed by humans, we will not be able to say the same. Preserving our autonomy, even more than our supremacy, should be the overriding objective in the age of AI.

Edoardo Campanella, Senior Fellow at the Mossavar-Rahmani Center for Business and Government at the Harvard Kennedy School, is co-author (with Marta Dassù) of Anglo Nostalgia: The Politics of Emotion in a Fractured West (Oxford University Press, 2019).
