Reply to Tabarrok on AI risk

At Marginal Revolution, economist Alex Tabarrok writes:

Stephen Hawking fears that “the development of full artificial intelligence could spell the end of the human race.” Elon Musk and Bill Gates offer similar warnings. Many researchers in artificial intelligence are less concerned primarily because they think that the technology is not advancing as quickly as doom scenarios imagine, as Ramez Naam discussed.

First, remember that Naam quoted only the prestigious AI scientists who agree with him, and conspicuously failed to mention that many prestigious AI scientists past and present have taken AI risk seriously.

Second, the common disagreement is not, primarily, about the timing of AGI. As I’ve explained many times before, the AI timelines of those talking about the long-term risk are not noticeably different from those of the mainstream AI community. (Indeed, Nick Bostrom, I, and many others in the risk-worrying camp have later timelines than the mainstream AI community does.)

But the main argument of Tabarrok’s post is this:

Why should we be worried about the end of the human race? Oh sure, there are some Terminator like scenarios in which many future-people die in horrible ways and I’d feel good if we avoided those scenarios. The more likely scenario, however, is a glide path to extinction in which most people adopt a variety of bionic and germ-line modifications that over-time evolve them into post-human cyborgs. A few holdouts to the old ways would remain but birth rates would be low and the non-adapted would be regarded as quaint, as we regard the Amish today. Eventually the last humans would go extinct and 46andMe customers would kid each other over how much of their DNA was of the primitive kind while holo-commercials advertised products “so easy a homo sapiens could do it.”  I see nothing objectionable in this scenario.

The people who write about existential risk at FHI, MIRI, CSER, FLI, etc. tend not to be worried about Tabarrok’s “glide” scenario. Speaking for myself, at least, that scenario seems pretty desirable. I just don’t think it’s very likely, for reasons partially explained in books like Superintelligence, Global Catastrophic Risks, and others.

(Note that although I work as a GiveWell research analyst, I do not study global catastrophic risks or AI for GiveWell, and my view on this is not necessarily GiveWell’s view.)

Reply to Buchanan on AI risk

Back in February, The Washington Post posted an opinion article by David Buchanan of the IBM Watson team: “No, the robots are not going to rise up and kill you.”

From the title, you might assume “Okay, I guess this isn’t about the AI risk concerns raised by MIRI, FHI, Elon Musk, etc.” But in the opening paragraph, Buchanan makes clear he is trying to respond to those concerns, by linking here and here.

I am often suspicious that many people in the “nothing to worry about” camp think they are replying to MIRI & company but are actually replying to Hollywood.

And lo, when Buchanan explains the supposed concern about AI, he doesn’t link to anything by MIRI & company, but instead he literally links to IMDB pages for movies/TV about AI:

Science fiction is partly responsible for these fears. A common trope works as follows: Step 1: Humans create AI to perform some unpleasant or difficult task. Step 2: The AI becomes conscious. Step 3: The AI decides to kill us all. As science fiction, such stories can be great fun. As science fact, the narrative is suspect, especially around Step 2, which assumes that by synthesizing intelligence, we will somehow automatically, or accidentally, create consciousness. I call this the consciousness fallacy…

The entire rest of the article is about the consciousness fallacy. But of course, everyone at MIRI and FHI, and probably Musk as well, agrees that intelligence doesn’t automatically create consciousness, and that has never been what MIRI & company are worried about.

(Note that although I work as a GiveWell research analyst, I do not study AI impacts for GiveWell, and my view on this is not necessarily GiveWell’s view.)

Videogames as art

I still basically agree with this 4-minute video essay I produced way back in 2008:


When the motion picture was invented, critics considered it an amusing toy. They didn’t see its potential to be an art form like painting or music. But only a few decades later, film was in some ways the ultimate art — capable of passion, lyricism, symbolism, subtlety, and beauty. Film could combine the elements of all other arts — music, literature, poetry, dance, staging, fashion, and even architecture — into a single, awesome work. Of course, film will always be used for silly amusements, but it can also express the highest of art. Film has come of age.

In the 1960s, computer programmers invented another amusing toy: the videogame. Nobody thought it could be a serious art form, and who could blame them? Super Mario Brothers didn’t have much in common with Citizen Kane. And nobody was even trying to make artistic games. Companies just wanted to make fun playthings that would sell lots of copies.

But recently, games have started to look a lot more like the movies, and people began to wonder: “Could this become a serious art form, like film?” In fact, some games basically were films with a tiny bit of gameplay snuck in.

Of course, there is one major difference between films and games. Film critic Roger Ebert thinks games can never be an art form because

Videogames by their nature require player choices, which is the opposite of the strategy of serious film and literature, which requires authorial control.

But wait a minute. Aren’t there already serious art forms that allow for flexibility, improvisation, and player choices? Bach and Mozart and other composers famously left room for improvisation in their classical compositions. And of course jazz music is an art form based almost entirely on improvisation within a set of scales or modes or ideas. Avant-garde composers Christian Wolff and John Zorn write “game pieces” in which there are no prearranged notes at all. Performers play according to an unfolding set of rules exactly as in baseball or Mario. So gameplay can be art.

Maybe the real reason some people don’t think games are an art form is that they don’t know any artistic video games. Even the games with impressive graphic design and good music have pretty hokey stories and unoriginal drive-jump-shoot gameplay. And for the most part they’re right: there aren’t many artistic games. Games are only just becoming an art form. It took film a while to become art, too.

But maybe the skeptics haven’t played the right games, either. Have they played Shadow of the Colossus, a minimalist epic of beauty and philosophy? Have they played Façade, a one-act play in which the player tries to keep a couple together by listening to their dialogue, reading their facial expressions, and responding in natural language? Have they seen The Night Journey, by respected video artist Bill Viola, which intends to symbolize a mystic’s path towards enlightenment?

It’s an exciting time for video games. They will continue to deliver simple fun and blockbuster entertainment, but there is also an avant-garde movement of serious artists who are about to launch the medium to new heights of expression, and I for one can’t wait to see what they come up with.

F.A.Q. about my transition to GiveWell

Lots of people are asking for more details about my decision to take a job at GiveWell, so I figured I should publish answers to the most common questions I’ve gotten, though I’m happy to also talk about it in person or by email.

Why did you take a job at GiveWell?

Apparently some people think I must have changed my mind about what I think Earth’s most urgent priorities are. So let me be clear: Nothing has changed about what I think Earth’s most urgent priorities are.

I still buy the basic argument in Friendly AI research as effective altruism.

I still think that growing a field of technical AI alignment research, one which takes the future seriously, is plausibly the most urgent task for those seeking a desirable long-term future for Earth-originating life.

And I still think that MIRI has an incredibly important role to play in growing that field of technical AI alignment research.

I decided to take a research position at GiveWell mostly for personal reasons.

I have always preferred research over management. As many of you who know me in person already know, I’ve been looking for my replacement at MIRI since the day I took the Executive Director role, so that I could return to research. When doing research I very easily get into a flow state; I basically never get into a flow state doing management. I’m pretty proud of what the MIRI team accomplished during my tenure, and I could see myself being an executive somewhere again some day, but I want to do something else for a while.

Why not switch to a research role at MIRI? First, I continue to think MIRI should specialize in computer science research that I don’t have the training to do myself. Second, I look forward to upgrading my research skills while working in domains where I don’t already have lots of pre-existing bias.

[Read more…]

Krauss on long-term AI impacts

Physicist Lawrence Krauss says he’s not worried about long-term AI impacts, but he doesn’t respond to any of the standard arguments for concern, so it’s unclear whether he knows much about the topic.

The only argument he gives in any detail has to do with AGI timing:

Given current power consumption by electronic computers, a computer with the storage and processing capability of the human mind would require in excess of 10 Terawatts of power, within a factor of two of the current power consumption of all of humanity. However, the human brain uses about 10 watts of power. This means a mismatch of a factor of 10^12, or a million million. Over the past decade the doubling time for Megaflops/watt has been about 3 years. Even assuming Moore’s Law continues unabated, this means it will take about 40 doubling times, or about 120 years, to reach a comparable power dissipation. Moreover, each doubling in efficiency requires a relatively radical change in technology, and it is extremely unlikely that 40 such doublings could be achieved without essentially changing the way computers compute.

Krauss doesn’t say where he got his numbers for the power requirements of “a computer with the storage and processing capability of the human mind,” but there are a few things I can say even leaving that aside.

First, few AI scientists think AGI will be built so similarly to the human brain that having “the storage and processing capability of the human mind” is all that relevant. We didn’t build planes like birds.

Second, Krauss warns that “each doubling in efficiency requires a relatively radical change in technology…” But Koomey’s law — the Moore’s law of computing power efficiency — has been stable since about 1946, which runs through several radical changes in computing technology. Somehow we manage, when there is tremendous economic incentive to do so.

Third, just because the human brain achieves general intelligence with ~10 watts of energy doesn’t mean a computer has to. A machine superintelligence the size of a warehouse is still a challenge to be reckoned with!
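(For what it’s worth, the arithmetic inside Krauss’s argument does check out, whatever one thinks of his inputs: closing a 10^12 efficiency gap at one doubling every three years takes about 40 doublings, or about 120 years. A quick sanity check:)

```python
import math

# Krauss's figures, not mine: a 10^12 gap in power efficiency between
# the brain (~10 W) and his hypothetical brain-equivalent computer (~10 TW),
# closed by one efficiency doubling every 3 years.
power_gap = 1e12
doubling_time_years = 3

doublings_needed = math.log2(power_gap)                # about 40
years_needed = doublings_needed * doubling_time_years  # about 120

print(round(doublings_needed), round(years_needed))  # → 40 120
```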

Added 08-28-15: Also see Anders Sandberg’s comments on Krauss’ calculations.

Added 02-18-16: Sandberg wrote a version of his comments for arxiv, here.

GSS Tutorial #1: Basic trends over time

Part of the series: How to research stuff.

Today I join Razib Khan’s quest to get bloggers to use the General Social Survey (GSS) more often.

The GSS is a huge collection of data on the demographics and attitudes of non-institutionalized adults (18+) living in the US. The data were collected by NORC via face-to-face, 90-minute interviews in randomly selected households, almost every year from 1972 to 1994, and every other year since then.

You can download the data and analyze it in R or SPSS or whatever, but it can also be analyzed via two easy-to-use web interfaces: the UC Berkeley SDA site and the GSS Data Explorer.
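If you do download an extract yourself, the basic trend-over-time computation the web interfaces perform is just a weighted share by year. Here’s a minimal sketch in Python with pandas, using a made-up toy extract (the column mnemonics HAPPY and WTSSALL are real GSS variable names, but the values below are invented for illustration):

```python
import pandas as pd

# Toy stand-in for a GSS extract. Real extracts come from the SDA site or
# the GSS Data Explorer; the numbers here are fabricated for illustration.
df = pd.DataFrame({
    "YEAR":    [1990, 1990, 1990, 2010, 2010, 2010],
    "HAPPY":   [1, 2, 1, 1, 3, 2],        # 1 = "very happy"
    "WTSSALL": [1.0, 0.5, 1.0, 1.0, 1.0, 0.5],  # sampling weights
})

# Weighted share of "very happy" respondents, by year.
very_happy = (df["HAPPY"] == 1).astype(float)
weighted = (very_happy * df["WTSSALL"]).groupby(df["YEAR"]).sum()
totals = df["WTSSALL"].groupby(df["YEAR"]).sum()
trend = weighted / totals

print(trend)
```

With real data you’d read the extract from disk instead of building it inline, but the groupby-and-divide step is the same.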

[Read more…]

Reply to Hawkins on AI risk

Jeff Hawkins, inventor of the Palm Pilot, has since turned his attention to neuro-inspired AI. In response to Elon Musk’s and Stephen Hawking’s recent comments on long-term AI risk, Hawkins argued that AI risk worriers suffer from three misconceptions:

  1. Intelligent machines will be capable of [physical] self-replication.
  2. Intelligent machines will be like humans and have human-like desires.
  3. Machines that are smarter than humans will lead to an intelligence explosion.

If you’ve been following this topic for a while, you might notice that Hawkins seems to be responding to something other than the standard arguments (now collected in Nick Bostrom’s Superintelligence) that are the source of Musk et al.’s concerns. Maybe Hawkins is responding to AI concerns as they are presented in Hollywood movies? I don’t know.

First, the Bostrom-Yudkowsky school of concern is not premised on physical self-replication by AIs. Self-replication does seem likely in the long run, but that’s not where the risk comes from. (As such, Superintelligence barely mentions physical self-replication at all.)

Second, these standard Bostrom-Yudkowsky arguments specifically deny that AIs will have human-like psychologies or desires. Certainly, the risk is not premised on such an expectation.

Third, Hawkins doesn’t seem to understand the concept of intelligence explosion being used by Musk and others, as I explain below.

[Read more…]

A reply to Wait But Why on machine superintelligence

Tim Urban of the wonderful Wait But Why blog recently wrote two posts on machine superintelligence: The Road to Superintelligence and Our Immortality or Extinction. These posts are probably now among the most-read introductions to the topic since Ray Kurzweil’s 2006 book.

In general I agree with Tim’s posts, but I think lots of details in his summary of the topic deserve to be corrected or clarified. Below, I’ll quote passages from his two posts, roughly in the order they appear, and then give my own brief reactions. Some of my comments are fairly nit-picky but I decided to share them anyway; perhaps my most important clarification comes at the end.

[Read more…]

Clarifying the Nathan Collins article on MIRI

FLI now has a News page. One of its first articles is an article on MIRI by Nathan Collins. I’d like to clarify one passage that’s not necessarily incorrect, but which could lead to misunderstanding:

… consider assigning a robot with superhuman intelligence the task of making paper clips. The robot has a great deal of computational power and general intelligence at its disposal, so it ought to have an easy time figuring out how to fulfill its purpose, right?

Not really. Human reasoning is based on an understanding derived from a combination of personal experience and collective knowledge derived over generations, explains MIRI researcher Nate Soares, who trained in computer science in college. For example, you don’t have to tell managers not to risk their employees’ lives or strip mine the planet to make more paper clips. But AI paper-clip makers are vulnerable to making such mistakes because they do not share our wealth of knowledge. Even if they did, there’s no guarantee that human-engineered intelligent systems would process that knowledge the same way we would.

MIRI’s worry is not that a superhuman AI will find it difficult to fulfill its programmed goal of — to use a silly, arbitrary example — making paperclips. Our worry is that a superhuman AI will be very, very good at achieving its programmed goals, and that unfortunately, the best way to make lots of paperclips (or achieve just about any other goal) involves killing all humans, so that we can’t interfere with the AI’s paperclip making, and so that the AI can use the resources on which our lives depend to make more paperclips. See Bostrom’s “The Superintelligent Will” for a primer on this.

Moreover, a superhuman AI may very well share “our wealth of knowledge.” It will likely be able to read and understand all of Wikipedia, and every history book on Google Books, and the Facebook timeline of more than a billion humans, and so on. It may very well realize that when we programmed it with the goal to make paperclips (or whatever), we didn’t intend for it to kill us all as a side effect.

But that doesn’t matter. In this scenario, we didn’t program the AI to do as we intended. We programmed it to make paperclips. The AI knows we don’t want it to use up all our resources, but it doesn’t care, because we didn’t program it to care about what we intended. We only programmed it to make paperclips, so that’s what it does — very effectively.

“Okay, so then just make sure we program the superhuman AI to do what we intend!”

Yes, exactly. That is the entire point of MIRI’s research program. The problem is that the instruction “do what we intend, in every situation including ones we couldn’t have anticipated, and even as you reprogram yourself to improve your ability to achieve your goals” is incredibly difficult to specify in computer code.

Nobody on Earth knows how to do that, not even close. So our attitude is: we’d better get crackin’.
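The core point — that an optimizer pursues exactly the objective it was given, not the intent behind it — can be made with a toy sketch (my own illustration, not MIRI’s; the plans and payoffs are invented):

```python
# Each hypothetical plan maps to an outcome: (paperclips made, humans left alive).
plans = {
    "modest factory":        (1_000, 7_000_000_000),
    "convert all resources": (10**15, 0),
}

def programmed_objective(outcome):
    clips, humans = outcome
    return clips  # what we actually coded: maximize paperclips, nothing else

def intended_objective(outcome):
    clips, humans = outcome
    # What we meant -- but never wrote down, so the agent never sees it.
    return clips if humans > 0 else -1

best_coded = max(plans, key=lambda p: programmed_objective(plans[p]))
best_intended = max(plans, key=lambda p: intended_objective(plans[p]))

print(best_coded)     # the plan a literal paperclip-maximizer picks
print(best_intended)  # the plan we wish it would pick
```

The gap between those two `max` calls is the whole problem: the agent optimizes the first function because that’s the one in its code, no matter how obvious the second one is to us.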

Theories of artistic evolution

Carlo Gesualdo was a 16th century prince famous for murdering his wife and her lover in his own bed when he caught them in flagrante delicto there. He left their bodies in front of the palace and moved away to Ferrara, where the most progressive composers of his era lived. After learning from them, he returned to his palace — as a nobleman he was immune to prosecution — and isolated himself there for the rest of his life, doing little else but composing music and hiring singers to perform it for him.

Tortured by guilt, his mental health deteriorated. He had his own servants beat him, and tried to obtain his uncle’s bones as magical relics that might cure his mental problems. As time passed, his music became increasingly emotional and desperate, as well as more experimental. His last book of madrigals is probably the most insane music of its era, so insane that no significant composer followed in his chromatic footsteps for about 300 years, and only then in very different styles. Gesualdo represents a fascinating dead end in musical history.

The painter Van Gogh underwent a similar descent into madness, famously cutting off his own ear and later being admitted to an asylum. Simultaneously, his paintings became increasingly wild and impressionistic.

More recently, The Beatles’ introduction to LSD corresponded with the experimental turn in their music kicked off with Revolver tracks like “Tomorrow Never Knows,” then Sgt. Pepper, then “Revolution 9” and others from the white album.

In all three cases, my brain wants to tell a story about how the artists’ madness (either chronic or temporarily induced) was the driving force behind their increasing creativity.

But whenever your brain wants to tell a story about something, it’s a good idea to take a breath and generate some alternate hypotheses. What else could explain the above cases of artistic change, and indeed artistic change in general?

The theory I favor is that psychological and sociological pressures drive artists toward increasing novelty. No artist makes a name for herself by doing what Beethoven or Turner or Eliot have already done in music, painting, and poetry. No, the artist must do something new, like Igor Stravinsky did when he invented heavy metal in 1913, or like John Adams did when he fused Beethovenian symphonic romanticism with minimalism, or — sigh — like John Cage did when he composed a piece of music instructing the pianist to sit silently at a piano for four and a half minutes.

I do know of at least one case where drugs were directly responsible for inventing a new artistic style. In 1956, blues singer Jay Hawkins entered the studio to record a refined love ballad called “I Put a Spell on You.” But first the producer brought in ribs and chicken and beer and got everybody drunk. Jay was so drunk he couldn’t remember the recording session, but when they played back the tape it turned out he had just screamed the song to death like a demented madman, thus accidentally inventing goth rock. After that he performed the song in a long cape after rising out of an onstage coffin, and became known as “Screamin’ Jay Hawkins.”

Scaruffi’s rock criticism

Sometimes I do blatantly useless things so I can flaunt my rejection of the often unhealthy “always optimize” pressures within the effective altruism community. So today, I’m going to write about rock music criticism.

Specifically, I would like to introduce you to the wonder of the world that is Piero Scaruffi. Or, better, I’ll let Holden Karnofsky introduce him:

We can start with his writings on music, since that seems to be what he is known for. He has helpfully ranked the best 100 rock albums of all time in order…

If that’s too broad for you, he also provides his top albums year by year … every single year from 1967 to 2012. He also gives genre-specific rankings for psychedelic music, Canterbury, glam-rock, punk-rock, dream-pop, triphop, jungle … 32 genres in all. Try punching “scaruffi [band]” into Google; I defy you to find a major musician he hasn’t written a review of. These are all just part of the massive online appendix to his self-published two-volume history of rock music. But he’s not just into rock; he’s also written a history of popular music specifically prior to rock-n-roll and a history of jazz music, and he has a similarly deep set of rankings for jazz (best albums, best jazz music for each of 17 different instrument categories, best jazz from each decade). While he hasn’t written a book about classical music, he has put out a timeline of classical music from 1098 to the present day and lists his essential classical music selections in each of ~10 categories

So who is this guy, a music critic? Nope, he is some sort of mostly retired software consultant and I want you to know that his interests go far beyond music. Take literature, for example. He has given both a chronological timeline and a best-novel-ever ranking for each of 36 languages. No I’m serious. Have you been wanting this fellow’s opinion of the 37 best works of Albanian literature, in order? Here you go. Turkish? Right here. Hebrew, Arabic, Armenian, Ethiopian, ancient Egyptian, and Finnish? Got those too.

Naturally, Mr. Scaruffi has not neglected film (he’s given his top 40 films and favorites for each decade starting with the 1910s, along with a history of cinema written in Italian) or visual art (see his 3-part visual history of the visual arts, his list of the greatest paintings of all time and his own collages and photographs) but let’s move on from this fluffy stuff. Because it’s important for you to know about his:

Does this guy just like sit inside and read and write 24 hours a day? Not to hear his travel page tell it: he’s visited 159 countries and is happy to give you guides to several of them along with his “greatest places in the world” rankings. He also has an entirely separate “hiking” section of his website that I haven’t clicked on and am determined not to.

But let’s focus on his rock music criticism, which I think is alternately silly, wrong, and brilliant.

[Read more…]

Computer science writers wanted

My apologies in advance to the computer science journalists I haven’t found yet, but…

Why is there so little good long-form computer science journalism? (Tech journalism doesn’t count.)

When there’s an interesting development in biology, Ed Yong will explain it beautifully in 4,000 words, or Richard Dawkins in 80,000. Or Carl Zimmer, Jonathan Weiner, David Quammen, etc.

Several other sciences attract plenty of writing talent as well. Physics has Sean Carroll, Stephen Hawking, Brian Greene, Kip Thorne, Lawrence Krauss, Neil deGrasse Tyson, etc. Psychology has Steven Pinker, Richard Wiseman, Oliver Sacks, V.S. Ramachandran, etc. Medical science has Atul Gawande, Ben Goldacre, Siddhartha Mukherjee, etc.

Computer science has Scott Aaronson (e.g. The Limits of Quantum, The Quest for Randomness), Brian Hayes (e.g. The Invention of the Genetic Code, The Easiest Hard Problem), and… who else?

Outside Aaronson and Hayes, I mostly see tech journalism, very brief CS news articles, mediocre CS writing, and occasional CS articles and books from good writers who cover a range of scientific disciplines, such as

Maybe CS is too mathematical to attract general readers? Too abstract? Too dry? Or simply not taught in high school like the other sciences? Or maybe there are problems on the supply side?

Key Lessons from Lobbying and Policy Change

Lobbying and Policy Change by Baumgartner et al. is the best book on policy change I’ve read. Hat tip to Holden Karnofsky for recommending this and also Poor Economics, the best book on global poverty reduction I’ve read.

LaPC is perhaps the most data-intensive study of “Who wins in Washington and why?” ever conducted, and the data (and many follow-up studies) are available from the UNC project website here. One review summarized the study design like this:

To start, [the researchers] sample from a comprehensive list of House and Senate lobbying disclosure reports to identify a random universe of participants. After initial interviews with their sample population, the authors assemble a list of 98 issues on which each organizational representative had worked most recently [from 1999-2002, i.e. during two Presidents of opposite parties and two Congresses]. These range from patent extension to chiropractic coverage under Medicare, some very broad and some very specific. Interviewers endeavored to determine the relevant sides of each issue and identify its key players. Separate subsequent interviews were then arranged where possible with representatives from each side of the issue…

With this starting point, the researchers followed their sample of issues for several more years to track who got what they wanted and who didn’t.

Note that their issue sampling method favors issues in which Congress was involved, so “issues relating to the judiciary and that are solely agency-related may be undercounted.”

LaPC is a difficult book to summarize, but below is one attempt. Some findings were surprising, others were not.

  1. One of the best predictors of lobbying success is simply whether one is trying to preserve the status quo, and in fact the single most common lobbying goal is to preserve the status quo.
  2. Some issues had as many as 7 sides, but most had just two.
  3. Most lobbying is targeted at a small percentage of issues.
  4. Very few neutral decision-makers are involved. Where government officials are involved, they are almost always actively lobbying for one side or another. 40% of advocates in this study were government officials; only 60% were lobbyists.
  5. Which kinds of groups were represented by the lobbyists? 26% were citizen groups, 21% were trade/business associations, 14% were corporations, 11% were professional associations, 7% were coalitions specific to an issue, 6% were unions, and 6% were think tanks.
  6. The most common lobbying issues were, in descending order: health (21%), environment (13%), transportation (8%), science and technology (7%), finance and commerce (7%), defense (7%), foreign trade (6%), energy (5%), law, crime, and family policy (5%), and education (5%).
  7. When lobbying, it’s better to be wealthy than poor, but there’s only a weak link between resources and policy-change success.
  8. Policy change tends not to be incremental except in a few areas such as the budget. For most issues, a “building tension then sudden substantial change” model predicts best.
  9. There is substantial correlation between electoral change and policy change, and advocates have increasingly focused on electoral efforts.

If you’re interested in this area, the next book to read is probably Godwin et al’s Lobbying and Policymaking, another decade-long study of policymaking that is largely framed as a reply to LaPC, and was recommended by Baumgartner.

How to study superintelligence strategy

(Last updated Feb. 11, 2015.)

What could an economics graduate student do to improve our strategic picture of superintelligence? What about a computer science professor? A policy analyst at RAND? A program director at IARPA?

In the last chapter of Superintelligence, Nick Bostrom writes:

We find ourselves in a thicket of strategic complexity and surrounded by a dense mist of uncertainty. Though many considerations have been discerned, their details and interrelationships remain unclear and iffy — and there might be other factors we have not thought of yet. How should we act in this predicament?

… Against a backdrop of perplexity and uncertainty, [strategic] analysis stands out as being of particularly high expected value. Illumination of our strategic situation would help us target subsequent interventions more effectively. Strategic analysis is especially needful when we are radically uncertain not just about some detail of some peripheral matter but about the cardinal qualities of the central things. For many key parameters, we are radically uncertain even about their sign…

The hunt for crucial considerations… will often require crisscrossing the boundaries between different academic disciplines and other fields of knowledge.

Bostrom does not, however, provide a list of specific research projects that could illuminate our strategic situation and thereby “help us target subsequent interventions more effectively.”

Below is my personal list of studies which could illuminate our strategic situation with regard to superintelligence. I’m hosting it on my personal site rather than MIRI’s blog to make it clear that this is not “MIRI’s official list of project ideas.” Other researchers at MIRI would, I’m sure, put together a different list. [Read more…]

Expertise vs. intelligence and rationality

When you’re not sure what to think about something, or what to do in a certain situation, do you instinctively turn to a successful domain expert, or to someone you know who seems generally very smart?

I think most people don’t respect individual differences in intelligence and rationality enough. But some people in my local community tend to exhibit the opposite failure mode. They put too much weight on a person’s signals of explicit rationality (“Are they Bayesian?”), and place too little weight on domain expertise (and the domain-specific tacit rationality that often comes with it).

This comes up pretty often during my work for MIRI. We’re considering how to communicate effectively with academics, or how to win grants, or how to build a team of researchers, and some people (not necessarily MIRI staff) will tend to lean heavily on the opinions of the most generally smart people they know, even though those smart people have no demonstrated expertise or success on the issue being considered. In contrast, I usually collect the opinions of some smart people I know, and then mostly just do what people with a long track record of success on the issue say to do. And that dumb heuristic seems to work pretty well.

Yes, there are nuanced judgment calls I have to make about who has expertise on what, exactly, and whether MIRI’s situation is sufficiently analogous for the expert’s advice to work at MIRI. And I must be careful to distinguish credentials-expertise from success-expertise (aka RSPRT-expertise). And this process doesn’t work for decisions on which there are no success-experts, like long-term AI forecasting. But I think it’s easier for smart people to overestimate their ability to model problems outside their domains of expertise, and easier to underestimate all the subtle things domain experts know, than vice-versa.

Will AGI surprise the world?

Yudkowsky writes:

In general and across all instances I can think of so far, I do not agree with the part of your futurological forecast in which you reason, “After event W happens, everyone will see the truth of proposition X, leading them to endorse Y and agree with me about policy decision Z.”

Example 2: “As AI gets more sophisticated, everyone will realize that real AI is on the way and then they’ll start taking Friendly AI development seriously.”

Alternative projection: As AI gets more sophisticated, the rest of society can’t see any difference between the latest breakthrough reported in a press release and that business earlier with Watson beating Ken Jennings or Deep Blue beating Kasparov; it seems like the same sort of press release to them. The same people who were talking about robot overlords earlier continue to talk about robot overlords. The same people who were talking about human irreproducibility continue to talk about human specialness. Concern is expressed over technological unemployment the same as today or Keynes in 1930, and this is used to fuel someone’s previous ideological commitment to a basic income guarantee, inequality reduction, or whatever. The same tiny segment of unusually consequentialist people are concerned about Friendly AI as before. If anyone in the science community does start thinking that superintelligent AI is on the way, they exhibit the same distribution of performance as modern scientists who think it’s on the way, e.g. Hugo de Garis, Ben Goertzel, etc.

My own projection goes more like this:

As AI gets more sophisticated, and as more prestigious AI scientists begin to publicly acknowledge that AI is plausibly only 2-6 decades away, policy-makers and research funders will begin to respond to the AGI safety challenge, just like they began to respond to CFC damages in the late 70s, to global warming in the late 80s, and to synbio developments in the 2010s. As for society at large, I dunno. They’ll think all kinds of random stuff for random reasons, and in some cases this will seriously impede effective policy, as it does in the USA for science education and immigration reform. Because AGI lends itself to arms races and is harder to handle adequately than global warming or nuclear security are, policy-makers and industry leaders will generally know AGI is coming but be unable to fund the needed efforts and coordinate effectively enough to ensure good outcomes.

At least one clear difference between my projection and Yudkowsky’s is that I expect AI-expert performance on the problem to improve substantially as a greater fraction of elite AI scientists begin to think about the issue in Near mode rather than Far mode.

As a friend of mine suggested recently, current elite awareness of the AGI safety challenge is roughly where elite awareness of the global warming challenge was in the early 80s. Except, I expect elite acknowledgement of the AGI safety challenge to spread more slowly than it did for global warming or nuclear security, because AGI is tougher to forecast in general, and involves trickier philosophical nuances. (Nobody was ever tempted to say, “But as the nuclear chain reaction grows in power, it will necessarily become more moral!”)

Still, there is a worryingly non-negligible chance that AGI explodes “out of nowhere.” Sometimes important theorems are proved suddenly after decades of failed attempts by other mathematicians, and sometimes a computational procedure is sped up by 20 orders of magnitude with a single breakthrough.

Some alternatives to “Friendly AI”

What does MIRI’s research program study?

The most established term for this was coined by MIRI founder Eliezer Yudkowsky: “Friendly AI.” The term has some advantages, but it might suggest that MIRI is trying to build C-3PO, and it sounds a bit whimsical for a serious research program.

What about safe AGI or AGI safety? These terms are probably easier to interpret than Friendly AI. Also, people like being safe, and governments like saying they’re funding initiatives to keep the public safe.

A friend of mine worries that these terms could provoke a defensive response (in AI researchers) of “Oh, so you think me and everybody else in AI is working on unsafe AI?” But I’ve never actually heard that response to “AGI safety” in the wild, and AI safety researchers regularly discuss “software system safety” and “AI safety” and “agent safety” and more specific topics like “safe reinforcement learning” without provoking negative reactions from people doing regular AI research.

I’m more worried that a term like “safe AGI” could provoke a response of “So you’re trying to make sure that a system which is smarter than humans, and able to operate in arbitrary real-world environments, and able to invent new technologies to achieve its goals, will be safe? Let me save you some time and tell you right now that’s impossible. Your research program is a pipe dream.”

My reply goes something like “Yeah, it’s way beyond our current capabilities, but lots of things that once looked impossible are now feasible because people worked really hard on them for a long time, and we don’t think we can get the whole world to promise never to build AGI just because it’s hard to make safe, so we’re going to give AGI safety a solid try for a few decades and see what can be discovered.” But that’s probably not all that reassuring. [Read more…]

Don’t neglect the fundamentals

My sports coaches always emphasized “the fundamentals.” For example, at basketball practice they spent no time whatsoever teaching us “advanced” moves like behind-the-back passes and alley-oops. They knew that even if advanced moves were memorable, and could allow the team to score 5-15 extra points per game, this effect would be dominated by whether we made our free throws, grabbed our rebounds, and kept our turnovers to a minimum.

When I began my internship at what was then called SIAI, I thought, “Wow. SIAI has implemented few business/non-profit fundamentals, and is surviving almost entirely via advanced moves.” So, Louie Helm and I spent much of our first two years at MIRI mastering the (kinda boring) fundamentals, and my impression is that doing so paid off handsomely in organizational robustness and productivity.

On Less Wrong, some kinds of “advanced moves” are sometimes called “Munchkin ideas”:

A Munchkin is the sort of person who, faced with a role-playing game, reads through the rulebooks over and over until he finds a way to combine three innocuous-seeming magical items into a cycle of infinite wish spells. Or who, in real life, composes a surprisingly effective diet out of drinking a quarter-cup of extra-light olive oil at least one hour before and after tasting anything else. Or combines liquid nitrogen and antifreeze and life-insurance policies into a ridiculously cheap method of defeating the invincible specter of unavoidable Death.

Munchkin ideas are more valuable in life than advanced moves are in a basketball game because the upsides in life are much greater. The outcome of a basketball game is binary (win/lose), and advanced moves can’t increase your odds of winning by that much. But in life in general, a good Munchkin idea might help you find your life partner or make you a billion dollars or maybe even optimize literally everything.

But Munchkin ideas work best when you’ve mastered the fundamentals first. Behind-the-back passes won’t save you if you make lots of turnovers due to poor dribbling skills. Your innovative startup idea won’t do you much good if you sign unusual contracts that make your startup grossly unattractive to investors. And a Munchkin-ish nonprofit can only grow so much without bookkeeping, financial controls, and a donor database.

My guess is that when you’re launching a new startup or organization, the fundamentals can wait. “Do things that don’t scale,” as Paul Graham says. But after you’ve got some momentum then yes, get your shit together, master the fundamentals, and do things in ways that can scale.

This advice is audience-specific. To an audience of Protestant Midwesterners, I would emphasize the importance of Munchkinism. To my actual audience of high-IQ entrepreneurial world-changers, who want to signal their intelligence and Munchkinism to each other, I say “Don’t neglect the fundamentals.” Executing the fundamentals competently doesn’t particularly signal high intelligence, but it’s worth doing anyway.

The Riddle of Being or Nothingness

Jon Ronson’s The Psychopath Test (2011) opens with the strange story of Being or Nothingness:

Last July, Deborah received a strange package in the mail… The package contained a book. It was only forty-two pages long, twenty-one of which—every other page—were completely blank, but everything about it—the paper, the illustrations, the typeface—looked very expensively produced. The cover was a delicate, eerie picture of two disembodied hands drawing each other. Deborah recognized it to be a reproduction of M. C. Escher’s Drawing Hands.

The author was a “Joe K” (a reference to Kafka’s Josef K., maybe, or an anagram of “joke”?) and the title was Being or Nothingness, which was some kind of allusion to Sartre’s 1943 essay, Being and Nothingness. Someone had carefully cut out with scissors the page that would have listed the publishing and copyright details, the ISBN, etc., so there were no clues there. A sticker read: Warning! Please study the letter to Professor Hofstadter before you read the book. Good Luck!

Deborah leafed through it. It was obviously some kind of puzzle waiting to be solved, with cryptic verse and pages where words had been cut out, and so on.

Everyone at MIRI was pretty amused when a copy of Being or Nothingness arrived at our offices last year, addressed to Eliezer.

Everyone except Eliezer, anyway. He just rolled his eyes and said, “Do what you want with it; I’ve been getting crazy stuff like that for years.”

[Read more…]

An onion strategy for AGI discussion

“The stabilization of environments” is a paper about AIs that reshape their environments to make it easier to achieve their goals. This is typically called enforcement, but the authors prefer the term stabilization because it “sounds less hostile.”

“I’ll open the pod bay doors, Dave, but then I’m going to stabilize the ship… ”

Sparrow (2013) takes the opposite approach to plain vs. dramatic language. Rather than using a modest term like iterated embryo selection, Sparrow prefers the phrase in vitro eugenics. Jeepers.

I suppose that’s more likely to provoke public discussion, but… will much good come of that public discussion? The public had a needless freak-out about in vitro fertilization back in the 60s and 70s and then, as soon as the first IVF baby was born in 1978, decided they were in favor of it.

Someone recently suggested I use an “onion strategy” for the discussion of novel technological risks. The outermost layer of the communication onion would be aimed at the general public, and focus on benefits rather than risks, so as not to provoke an unproductive panic. A second layer for a specialist audience could include a more detailed elaboration of the risks. The most complete discussion of risks and mitigation options would be reserved for technical publications that are read only by professionals.

Eric Drexler seems to wish he had more successfully used an onion strategy when writing about nanotechnology. Engines of Creation included frank discussions of both the benefits and risks of nanotechnology, including the “grey goo” scenario that was discussed widely in the media and used as the premise for the bestselling novel Prey.

Ray Kurzweil may be using an onion strategy, or at least keeping his writing in the outermost layer. If you look carefully, chapter 8 of The Singularity Is Near takes technological risks pretty seriously, and yet it’s written in such a way that most people who read the book seem to come away with an overwhelmingly optimistic perspective on technological change.

George Church may be following an onion strategy. Regenesis also contains a chapter on the risks of advanced bioengineering, but it’s presented as an “epilogue” that many readers will skip.

Perhaps those of us writing about AGI for the general public should try to discuss:

  • astronomical stakes rather than existential risk
  • Friendly AI rather than AGI risk or the superintelligence control problem
  • the orthogonality thesis and convergent instrumental values and complexity of values rather than “doom by default”
  • etc.

MIRI doesn’t have any official recommendations on the matter, but these days I find myself leaning toward an onion strategy.