Stewart and Dawkins on the unintended consequences of powerful technologies

Richard Dawkins and Jon Stewart discussed existential risk on the Sept. 24, 2013 edition of The Daily Show. Here’s how it went down:

STEWART: Here’s my proposal… for the discussion tonight. Do you believe that the end of our civilization will be through religious strife or scientific advancement? What do you think in the long run will be more damaging to our prospects as a human race?

In reply, Dawkins said that Martin Rees (of CSER) thinks humanity has a 50% chance of surviving the 21st century, and one cause for such worry is that powerful technologies could get into the hands of religious fanatics. Stewart replied:

STEWART: … [But] isn’t there a strong probability that we are not necessarily in control of the unintended consequences of our scientific advancement?… Don’t you think it’s even more likely that we will create something [for which] the unintended consequence… is worldwide catastrophe?

DAWKINS: That is possible. It’s something we have to worry about… Science is the most powerful way to do whatever you want to do. If you want to do good, it’s the most powerful way to do good. If you want to do evil, it’s the most powerful way to do evil.

STEWART: … You have nuclear energy and you go this way and you can light the world, but you go this [other] way, and you can blow up the world. It seems like we always try [the blow up the world path] first.

DAWKINS: There is a suggestion that one of the reasons that we don’t detect extraterrestrial civilizations is that when a civilization reaches the point where it could broadcast radio waves that we could pick up, there’s only a brief window before it blows itself up… It takes many billions of years for evolution to reach the point where technology takes off, but once technology takes off, it’s then an eye-blink — by the standards of geological time — before…

STEWART: … It’s very easy to look at the dark side of fundamentalism… [but] sometimes I think we have to look at the dark side of achievement… because I believe the final words that man utters on this Earth will be: “It worked!” It’ll be an experiment that isn’t misused, but will be a rolling catastrophe.

DAWKINS: It’s a possibility, and I can’t deny it. I’m more optimistic than that.

STEWART: … [I think] curiosity killed the cat, and the cat never saw it coming… So how do we put the brakes on our ability to achieve, or our curiosity?

DAWKINS: I don’t think you can ever really stop the march of science in the sense of saying “You’re forbidden to exercise your natural curiosity in science.” You can certainly put the brakes on certain applications. You could stop manufacturing certain weapons. You could have… international agreements not to manufacture certain types of weapons…

Bostrom’s unfinished fable of the sparrows

Nick Bostrom’s new book, Superintelligence: Paths, Dangers, Strategies, was published today in the UK by Oxford University Press. It opens with a fable about some sparrows and an owl:

It was the nest-building season, but after days of long hard work, the sparrows sat in the evening glow, relaxing and chirping away.

“We are all so small and weak. Imagine how easy life would be if we had an owl who could help us build our nests!”

“Yes!” said another. “And we could use it to look after our elderly and our young.”

“It could give us advice and keep an eye out for the neighborhood cat,” added a third.

Then Pastus, the elder-bird, spoke: “Let us send out scouts in all directions and try to find an abandoned owlet somewhere, or maybe an egg. A crow chick might also do, or a baby weasel. This could be the best thing that ever happened to us, at least since the opening of the Pavilion of Unlimited Grain in yonder backyard.”

The flock was exhilarated, and sparrows everywhere started chirping at the top of their lungs.

Only Scronkfinkle, a one-eyed sparrow with a fretful temperament, was unconvinced of the wisdom of the endeavor. Quoth he: “This will surely be our undoing. Should we not give some thought to the art of owl-domestication and owl-taming first, before we bring such a creature into our midst?”

Replied Pastus: “Taming an owl sounds like an exceedingly difficult thing to do. It will be difficult enough to find an owl egg. So let us start there. After we have succeeded in raising an owl, then we can think about taking on this other challenge.”

“There is a flaw in that plan!” squeaked Scronkfinkle; but his protests were in vain as the flock had already lifted off to start implementing the directives set out by Pastus.

Just two or three sparrows remained behind. Together they began to try to work out how owls might be tamed or domesticated. They soon realized that Pastus had been right: this was an exceedingly difficult challenge, especially in the absence of an actual owl to practice on. Nevertheless they pressed on as best they could, constantly fearing that the flock might return with an owl egg before a solution to the control problem had been found.

It is not known how the story ends, but the author dedicates this book to Scronkfinkle and his followers.

For some ideas of what Scronkfinkle and his friends can work on before the others return with an owl egg, see here and here.

How to study superintelligence strategy

(Last updated Feb. 11, 2015.)

What could an economics graduate student do to improve our strategic picture of superintelligence? What about a computer science professor? A policy analyst at RAND? A program director at IARPA?

In the last chapter of Superintelligence, Nick Bostrom writes:

We find ourselves in a thicket of strategic complexity and surrounded by a dense mist of uncertainty. Though many considerations have been discerned, their details and interrelationships remain unclear and iffy — and there might be other factors we have not thought of yet. How should we act in this predicament?

… Against a backdrop of perplexity and uncertainty, [strategic] analysis stands out as being of particularly high expected value. Illumination of our strategic situation would help us target subsequent interventions more effectively. Strategic analysis is especially needful when we are radically uncertain not just about some detail of some peripheral matter but about the cardinal qualities of the central things. For many key parameters, we are radically uncertain even about their sign…

The hunt for crucial considerations… will often require crisscrossing the boundaries between different academic disciplines and other fields of knowledge.

Bostrom does not, however, provide a list of specific research projects that could illuminate our strategic situation and thereby “help us target subsequent interventions more effectively.”

Below is my personal list of studies which could illuminate our strategic situation with regard to superintelligence. I’m hosting it on my personal site rather than MIRI’s blog to make it clear that this is not “MIRI’s official list of project ideas.” Other researchers at MIRI would, I’m sure, put together a different list.

Books I finished reading in June 2014

I read Thiel’s Zero to One (2014) in May, but forgot to mention it in the May books post. I enjoyed it very much. His key argument is that progress comes from monopolies, not from strong competition, so we should encourage certain kinds of monopolies. I generally agree. I also agree with Thiel that technological progress has slowed since the 70s, with the (lone?) exception of IT.

The Info Mesa (2003), by Ed Regis, is fine but less interesting than Great Mambo Chicken (which I’m currently reading) and Nano (which I finished last month).

The Atomic Bazaar (2003), by William Langewiesche, tells the story of nuclear trafficking and the rise of poor countries with nuclear weapons programs, and especially the activities of Abdul Qadeer Khan. It was pretty good, though I wish it had done a better job of explaining the limits, opportunities, and incentives at play in the nuclear arms trade.

Age of Ambition (2014), by Evan Osnos, is a fantastically rich portrait of modern China. Highly recommended.

Human Accomplishment (2003), by Charles Murray, is a fine specimen of quantitative historical analysis. The final chapters are less persuasive than the rest of the book, but despite this, terms like “magisterial” and “tour de force” come to mind. Murray does an excellent job walking the reader through his methodology, its pros and cons, the reasons for it, and the conclusions that can and can’t be drawn from it. You’ll probably like this if you enjoyed Pinker’s Better Angels of Our Nature.

Superintelligence (2014), by Nick Bostrom, is a fantastic summary of the last ~15 years of strategic thinking about machine superintelligence from (largely) FHI and MIRI, the two institutes focused most directly on the issue. If you want to get a sense of what’s been learned during that time, first read Bostrom’s 1997 paper on superintelligence (and other topics), and then read his new book. It comes out in the UK on July 3rd and in the USA on September 3rd. Highly recommended.

The Honest Truth about Dishonesty (2013), by Dan Ariely, is as fun and practical as the other Ariely books. Recommended.

Expertise vs. intelligence and rationality

When you’re not sure what to think about something, or what to do in a certain situation, do you instinctively turn to a successful domain expert, or to someone you know who seems generally very smart?

I think most people don’t respect individual differences in intelligence and rationality enough. But some people in my local community tend to exhibit the opposite failure mode. They put too much weight on a person’s signals of explicit rationality (“Are they Bayesian?”), and place too little weight on domain expertise (and the domain-specific tacit rationality that often comes with it).

This comes up pretty often during my work for MIRI. We’re considering how to communicate effectively with academics, or how to win grants, or how to build a team of researchers, and some people (not necessarily MIRI staff) will tend to lean heavily on the opinions of the most generally smart people they know, even though those smart people have no demonstrated expertise or success on the issue being considered. In contrast, I usually collect the opinions of some smart people I know, and then mostly just do what people with a long track record of success on the issue say to do. And that dumb heuristic seems to work pretty well.

Yes, there are nuanced judgment calls I have to make about who has expertise on what, exactly, and whether MIRI’s situation is sufficiently analogous for the expert’s advice to work at MIRI. And I must be careful to distinguish credentials-expertise from success-expertise (aka RSPRT-expertise). And this process doesn’t work for decisions on which there are no success-experts, like long-term AI forecasting. But I think it’s easier for smart people to overestimate their ability to model problems outside their domains of expertise, and easier to underestimate all the subtle things domain experts know, than vice-versa.

Will AGI surprise the world?

Yudkowsky writes:

In general and across all instances I can think of so far, I do not agree with the part of your futurological forecast in which you reason, “After event W happens, everyone will see the truth of proposition X, leading them to endorse Y and agree with me about policy decision Z.”

Example 2: “As AI gets more sophisticated, everyone will realize that real AI is on the way and then they’ll start taking Friendly AI development seriously.”

Alternative projection: As AI gets more sophisticated, the rest of society can’t see any difference between the latest breakthrough reported in a press release and that business earlier with Watson beating Ken Jennings or Deep Blue beating Kasparov; it seems like the same sort of press release to them. The same people who were talking about robot overlords earlier continue to talk about robot overlords. The same people who were talking about human irreproducibility continue to talk about human specialness. Concern is expressed over technological unemployment the same as today or Keynes in 1930, and this is used to fuel someone’s previous ideological commitment to a basic income guarantee, inequality reduction, or whatever. The same tiny segment of unusually consequentialist people are concerned about Friendly AI as before. If anyone in the science community does start thinking that superintelligent AI is on the way, they exhibit the same distribution of performance as modern scientists who think it’s on the way, e.g. Hugo de Garis, Ben Goertzel, etc.

My own projection goes more like this:

As AI gets more sophisticated, and as more prestigious AI scientists begin to publicly acknowledge that AI is plausibly only 2-6 decades away, policy-makers and research funders will begin to respond to the AGI safety challenge, just like they began to respond to CFC damages in the late 70s, to global warming in the late 80s, and to synbio developments in the 2010s. As for society at large, I dunno. They’ll think all kinds of random stuff for random reasons, and in some cases this will seriously impede effective policy, as it does in the USA for science education and immigration reform. Because AGI lends itself to arms races and is harder to handle adequately than global warming or nuclear security are, policy-makers and industry leaders will generally know AGI is coming but be unable to fund the needed efforts and coordinate effectively enough to ensure good outcomes.

At least one clear difference between my projection and Yudkowsky’s is that I expect AI-expert performance on the problem to improve substantially as a greater fraction of elite AI scientists begin to think about the issue in Near mode rather than Far mode.

As a friend of mine suggested recently, current elite awareness of the AGI safety challenge is roughly where elite awareness of the global warming challenge was in the early 80s. Except, I expect elite acknowledgement of the AGI safety challenge to spread more slowly than it did for global warming or nuclear security, because AGI is tougher to forecast in general, and involves trickier philosophical nuances. (Nobody was ever tempted to say, “But as the nuclear chain reaction grows in power, it will necessarily become more moral!”)

Still, there is a worryingly non-negligible chance that AGI explodes “out of nowhere.” Sometimes important theorems are proved suddenly after decades of failed attempts by other mathematicians, and sometimes a computational procedure is sped up by 20 orders of magnitude with a single breakthrough.

Some alternatives to “Friendly AI”

What does MIRI’s research program study?

The most established term for this was coined by MIRI founder Eliezer Yudkowsky: “Friendly AI.” The term has some advantages, but it might suggest that MIRI is trying to build C-3PO, and it sounds a bit whimsical for a serious research program.

What about safe AGI or AGI safety? These terms are probably easier to interpret than Friendly AI. Also, people like being safe, and governments like saying they’re funding initiatives to keep the public safe.

A friend of mine worries that these terms could provoke a defensive response (in AI researchers) of “Oh, so you think me and everybody else in AI is working on unsafe AI?” But I’ve never actually heard that response to “AGI safety” in the wild, and AI safety researchers regularly discuss “software system safety” and “AI safety” and “agent safety” and more specific topics like “safe reinforcement learning” without provoking negative reactions from people doing regular AI research.

I’m more worried that a term like “safe AGI” could provoke a response of “So you’re trying to make sure that a system which is smarter than humans, and able to operate in arbitrary real-world environments, and able to invent new technologies to achieve its goals, will be safe? Let me save you some time and tell you right now that’s impossible. Your research program is a pipe dream.”

My reply goes something like “Yeah, it’s way beyond our current capabilities, but lots of things that once looked impossible are now feasible because people worked really hard on them for a long time, and we don’t think we can get the whole world to promise never to build AGI just because it’s hard to make safe, so we’re going to give AGI safety a solid try for a few decades and see what can be discovered.” But that’s probably not all that reassuring.

The Antikythera Mechanism

From Murray’s Human Accomplishment:

The problem with the standard archaeological account of human accomplishment from [the ancient world] is not that the picture is incomplete (which is inevitable), but that the data available to us leave so many puzzles.

The Antikythera Mechanism is a case in point… The Antikythera Mechanism is a bronze device about the size of a brick. It was recovered in 1901 from the wreck of a trading vessel that had sunk near the southern tip of Greece sometime around –65. Upon examination, archaeologists were startled to discover imprints of gears in the corroded metal. So began a half-century of speculation about what purpose the device might have served.

Finally, in 1959, science historian Derek de Solla Price figured it out: the Antikythera Mechanism was a mechanical device for calculating the positions of the sun and moon. A few years later, improvements in archaeological technology led to gamma radiographs of the Mechanism, revealing 22 gears in four layers, capable of simulating several major solar and lunar cycles, including the 19-year Metonic cycle that brings the phases of the moon back to the same calendar date. What made this latter feat especially astonishing was not just that the Mechanism could reproduce the 235 lunations in the Metonic cycle, but that it used a differential gear to do so. Until then, it was thought that the differential gear had been invented in 1575.

See also Wikipedia.