March 2015 links, part 2

It’ll never work! A collection of experts being wrong about what is technologically feasible.

GiveWell update on its investigations into global catastrophic risks. My biggest disagreement is that I think nano-risk deserves more attention, if someone competent can be found to analyze the risks in more detail. GiveWell’s prioritization of biosecurity makes complete sense given their criteria.

Online calibration test with database of 150,000+ questions.

An ambitious Fermi estimate exercise: Estimating the energy cost of artificial evolution.

Nautilus publishes an excellent and wide-ranging interview with Scott Aaronson.

Gelman on a great old paper by Meehl.

 

AI stuff

Video: robot autonomously folds pile of 5 previously unseen towels.

Somehow I had previously missed the Dietterich-Horvitz letter on Benefits and Risks of AI.

Robin Hanson reviews Martin Ford’s new book on tech unemployment.

Heh. That “stop the robots” campaign at SXSW was a marketing stunt for a dating app.

Winfield, Towards an Ethical Robot. They actually bothered to build simple consequentialist robots that obey a kind-of Asimovian rule.

Clarifying the Nathan Collins article on MIRI

FLI now has a News page. One of its first articles is an article on MIRI by Nathan Collins. I’d like to clarify one passage that’s not necessarily incorrect, but which could lead to misunderstanding:

…consider assigning a robot with superhuman intelligence the task of making paper clips. The robot has a great deal of computational power and general intelligence at its disposal, so it ought to have an easy time figuring out how to fulfill its purpose, right?

Not really. Human reasoning is based on an understanding derived from a combination of personal experience and collective knowledge derived over generations, explains MIRI researcher Nate Soares, who trained in computer science in college. For example, you don’t have to tell managers not to risk their employees’ lives or strip mine the planet to make more paper clips. But AI paper-clip makers are vulnerable to making such mistakes because they do not share our wealth of knowledge. Even if they did, there’s no guarantee that human-engineered intelligent systems would process that knowledge the same way we would.

MIRI’s worry is not that a superhuman AI will find it difficult to fulfill its programmed goal of — to use a silly, arbitrary example — making paperclips. Our worry is that a superhuman AI will be very, very good at achieving its programmed goals, and that unfortunately, the best way to make lots of paperclips (or achieve just about any other goal) involves killing all humans, so that we can’t interfere with the AI’s paperclip making, and so that the AI can use the resources on which our lives depend to make more paperclips. See Bostrom’s “The Superintelligent Will” for a primer on this.

Moreover, a superhuman AI may very well share “our wealth of knowledge.” It will likely be able to read and understand all of Wikipedia, and every history book on Google Books, and the Facebook timeline of more than a billion humans, and so on. It may very well realize that when we programmed it with the goal to make paperclips (or whatever), we didn’t intend for it to kill us all as a side effect.

But that doesn’t matter. In this scenario, we didn’t program the AI to do as we intended. We programmed it to make paperclips. The AI knows we don’t want it to use up all our resources, but it doesn’t care, because we didn’t program it to care about what we intended. We only programmed it to make paperclips, so that’s what it does — very effectively.

“Okay, so then just make sure we program the superhuman AI to do what we intend!”

Yes, exactly. That is the entire point of MIRI’s research program. The problem is that the instruction “do what we intend, in every situation including ones we couldn’t have anticipated, and even as you reprogram yourself to improve your ability to achieve your goals” is incredibly difficult to specify in computer code.

Nobody on Earth knows how to do that, not even close. So our attitude is: we’d better get crackin’.

March 2015 links

Cotton-Barratt, Allocating risk mitigation across time.

The new Ian Morris book sounds very Hansonian, which probably means it’ll end up being one of my favorite books of 2015 when I have a chance to read it.

Why do we pay pure mathematicians? A dialogue.

Watch a FiveThirtyEight article get written, keystroke by keystroke. Scott Alexander, will you please record yourself writing one blog post?

Grace, The economy of weirdness.

Kahneman interviews Harari about the future.

On March 14th, there will be wrap parties for Harry Potter and the Methods of Rationality in at least 15 different countries. I’m assuming this is another first for a fanfic.

 

AI Stuff

YC President Sam Altman on superhuman AI: part 1, part 2. I agree with most of what he writes, the biggest exceptions being that I think (1) AGI probably isn’t the Great Filter, (2) AI progress isn’t a double exponential, and (3) I don’t have much of an opinion on the role of regulation, as it’s not something I’ve tried hard to figure out.

Stuart Russell and Rodney Brooks debated the value alignment problem at Davos 2015. (Watch at 2x speed.)

Pretty good coverage of MIRI’s value learning paper at Nautilus.

Books, music, etc. from February 2015

Decent books:

As Bryan Caplan wroteThe Moral Case for Fossil Fuels was surprisingly good. I think the book is factually inaccurate and cherry-picked in several places, and it seems fairly motivated throughout, but nevertheless I think the big picture argument basically goes through, and it’s an enjoyable read.

I didn’t discover any albums or movies I loved in February 2015, but I did finish Breaking Bad, which probably beats out The Sopranos and The Wire as the most consistently great TV drama ever.