Years ago I was honored to share this blog with Eliezer Yudkowsky. One of his main topics then was AI risk; he was one of the few people talking about it back then. We debated this topic here, and while we disagreed I felt we made progress in understanding each other and exploring the issues. I assigned a much lower probability than he did to his key “foom” scenario.
Recently AI risk has become something of an industry, with far more going on than I can keep track of. Many call working on it one of the most effectively altruistic things one can possibly do. But I’ve searched a bit and as far as I can tell that foom scenario is still the main reason for society to be concerned about AI risk now. Yet there is almost no recent discussion evaluating its likelihood, and certainly nothing that goes into as much depth as did Eliezer and I. Even Bostrom’s book-length treatment basically just assumes the scenario. Many seem to think it obvious that if one group lets one AI get out of control, the whole world is at risk. It’s not (obvious).
As I just revisited the topic while revising Age of Em for paperback, let me try to summarize part of my position again here.
For at least a century, every decade or two we’ve seen a burst of activity and concern about automation. The last few years have seen another such burst, with increasing activity in AI research and commerce, and also increasing concern expressed that future smart machines might get out of control and destroy humanity. Some argue that these concerns justify great efforts today to figure out how to keep future AI under control, and to more closely watch and constrain AI research efforts. Approaches considered include kill switches, requiring prior approval for AI actions, and designing AI motivational systems to make AIs want to help, and not destroy, humanity.
Consider, however, an analogy with organizations. Today, the individuals and groups who create organizations and their complex technical systems are often well-advised to pay close attention to how to maintain control of such organizations and systems. A loss of control can lead not only to a loss of the resources invested in creating and maintaining such systems, but also to liability and retaliation from the rest of the world.
But exactly because individuals usually have incentives to manage their organizations and systems reasonably well, the rest of us needn’t pay much attention to the internal management of others’ organizations. In our world, most firms, cities, nations, and other organizations are much more powerful and, yes, smarter than are most individuals, and yet they remain largely under control in most important ways. For example, none have so far destroyed the world. Smaller-than-average organizations can typically exist and even thrive without being forcefully absorbed into larger ones. And outsiders can often influence and gain from organization activities via control institutions like elections, boards of directors, and voting stock.
Mostly, this is all achieved neither via outside action approval nor via detailed knowledge and control of motivations. We instead rely on law, competition, social norms, and politics. If a rogue organization seems to harm others, it can be accused of legal violations, as can its official owners and managers. Those who feel hurt can choose to interact with it less. Others who are not hurt may choose to punish the rogue informally for violating informal norms, and get rewarded by associates for such efforts. And rogues may be excluded from political coalitions, which can hurt them via the policies of governments and other large organizations.
AI and other advanced technologies may eventually give future organizations new options for internal structures, and those introducing such innovations should indeed consider their risks for increased chances of losing control. But it isn’t at all clear why the rest of us should be much concerned about this, especially many decades or centuries before such innovations may appear. Why can’t our usual mechanisms for keeping organizations under control, outlined above, keep working? Yes, innovations might perhaps create new external consequences, ones with which those outside of the innovating organization would need to deal. But given how little we now understand about the issues, architectures, and motivations of future AI systems, why not mostly wait and deal with any such problems later?
Yes, our usual methods do fail at times; we’ve had wars, revolutions, theft, and lies. In particular, each generation has had to accept slowly losing control of the world to succeeding generations. While prior generations can typically accumulate and then spend savings to ensure a comfortable retirement, they no longer rule the world. Wills, contracts, and other organizational commitments have not been enough to prevent this. Some find this unacceptable, and seek ways to enable a current generation, e.g., humans today, to maintain strong control over all future generations, be they biological, robotic or something else, even after such future generations have become far more capable than the current generation. To me this problem seems both very hard, and not obviously worth solving.
Returning to the basic problem of rogue systems, some foresee a rapid local “intelligence explosion”, sometimes called “foom”, wherein one initially small system quickly becomes vastly more powerful than the entire rest of the world put together. And, yes, if such a local explosion might happen soon, then it could make more sense for the rest of us today, not just those most directly involved, to worry about how to keep control of future rogue AI.
In a prototypical “foom,” or local intelligence explosion, a single AI system starts with a small supporting team. Both the team and its AI have resources and abilities that are tiny on a global scale. This team finds and then applies a big innovation in system architecture to its AI system, which as a result greatly improves in performance. (An “architectural” change is just a discrete change with big consequences.) Performance becomes so much better that this team plus AI combination can now quickly find several more related innovations, which further improve system performance. (Alternatively, instead of finding architectural innovations the system might enter a capability regime which contains a large natural threshold effect or scale economy, allowing a larger system to have capabilities well out of proportion to its relative size.)
During this short period of improvement, other parts of the world, including other AI teams and systems, improve much less. Once all of this team’s innovations are integrated into its AI system, that system is now more effective than the entire rest of the world put together, at least at one key task. That key task might be theft, i.e., stealing resources from the rest of the world. Or that key task might be innovation, i.e., improving its own abilities across a wide range of useful tasks.
That is, even though an entire world economy outside of this team, including other AIs, works to innovate, steal, and protect itself from theft, this one small AI team becomes vastly better at some combination of (1) stealing resources from others while preventing others from stealing from it, and (2) innovating to make this AI “smarter,” in the sense of being better able to do a wide range of mental tasks given fixed resources. As a result of being better at these things, this AI quickly grows the resources under its control and becomes in effect more powerful than the entire rest of the world economy put together. So, in effect it takes over the world. All of this happens within a space of hours to months.
(The hypothesized power advantage here is perhaps analogous to that of the first team to make an atomic bomb, if that team had had enough other supporting resources to enable it to use the bomb to take over the world.)
Note that to believe in such a local explosion scenario, it is not enough to believe that eventually machines will be very smart, even much smarter than are humans today. Or that this will happen soon. It is also not enough to believe that a world of smart machines can overall grow and innovate much faster than we do today. One must in addition believe that an AI team that is initially small on a global scale could quickly become vastly better than the rest of the world put together, including other similar teams, at improving its internal abilities.
If a foom-like explosion can quickly make a once-small system more powerful than the rest of the world put together, the rest of the world might not be able to use law, competition, social norms, or politics to keep it in check. Safety can then depend more on making sure that such exploding systems start from safe initial designs.
In another post I may review arguments for and against the likelihood of foom. But in this one I’m content to just point out that the main reason for society, as opposed to particular projects, to be concerned about AI risk is either foom, or an ambition to place all future generations under the tight control of a current generation. So a low estimate of the probability of foom can imply a much lower social value from working on AI risk now.
Added Aug 4: I made a twitter poll on motives for AI risk concern:
If AI Risk is priority now, why? Foom: 1 AI takes over world, Value Drift: default future has bad values, or Collapse: property rights fail
— robin hanson (@robinhanson) August 4, 2017
Surely we will be in a much better position to learn to control such things when actual versions exist around us. They will start small and weak and gradually increase in ability. If they can control each other to keep a peace among themselves, then we can use friendlier ones to help us with our control problems. The party that deploys a version of which it loses control directly loses value; that is not a common pool problem.