Our concern is that in the future, when systems become more and more intelligent and we become more and more dependent on their capabilities, we may find that they get beyond our control, because a system that’s, for example, more intelligent than a human being is very hard to predict. Even though we might think we’re asking it to do something fairly innocuous and helpful, like find a cure for cancer, unless you’re very, very specific, you get the problem of giving the three wishes to the genie, where usually the genie finds a loophole which ends up putting you in the soup, and at best, you get back to where you started. But with a genie that is far more intelligent than the human race put together, it might be very difficult for us to get back to where we started.
What we want to understand is how to develop intelligent technology without losing control over it. Put very simply, what we want to do is to say that artificial intelligence is not just about making systems more intelligent, it’s making them more intelligent and beneficial to the human race, which sounds like apple pie, but is actually something that other fields have taken seriously for a long time.
For example, if you look at fusion research, the original goal of fusion research was: can we generate energy by bashing together small atoms to make big ones? And they figured out in the ’40s that yes, you can, but you get an enormous explosion and you destroy millions of people. Ever since then, fusion research has had the goal of generating energy and benefiting the human race by controlling the generation of energy so that you don’t have a huge explosion. It’s just taken for granted that that’s what fusion research is about.
… There have been people on the fringes of the field, some philosophers, some futurists, who have been warning for quite some time. And you can go back and look at, for example, a paper by I.J. Good. In 1965, he warned that if we did succeed in building superintelligent systems, it would be the last invention that man ever need make. I think in the last few years, the change has been that we’re just seeing this acceleration in the rate of progress and technology and in the level of investment in the commercial sector so that we are starting to feel that it’s time to take ourselves more seriously than we have been.
I think five years ago if you asked most AI people what are they doing, they’d say, “Oh, we’re trying to build systems that are as intelligent as people,” and if you asked them, “When are you going to succeed?” they’d say, “We have no idea. Decades and decades or centuries away.” They just didn’t worry about it, but now I think it seems as if it’s a little bit closer and the fear is moving in from the fringe into the mainstream and the feeling that we do need to grow up is growing.
… I think we have to do, in some sense, the same thing the physicists do, to understand the physics of how we contain this process of intelligent decision making and also, I think, to understand what it is that we want as a human race because if we do succeed in building these kinds of capabilities, then in many ways the human race can have anything it wants. If it wants to have eternal life, if we want to cure all diseases, if we want to resolve all conflicts between countries or groups of people, all of those things might be available to us, so we really do have a genie and we have as many wishes as we want, as long as we don’t make the wrong wish.
That’s really the question. What does it mean to say we made the wrong wish? If you ask the system to cure cancer and nothing else, you don’t ask for any more or any less than that, then you’ve forgotten something very important, which is: come up with a cure for cancer and in so doing, don’t destroy the planet, don’t turn the whole planet into a giant server farm so that you can improve your cancer discovery algorithms. Whenever you think about giving a goal to a superintelligent system, it’s very easy, as all the stories show, to find a loophole whereby achieving that goal to the greatest possible extent ends up having very undesirable side effects. Various philosophers have tried various ideas, and so far we don’t yet have a solution.
To my mind, one of the most promising avenues is not to tell a system what we want it to do because if we don’t tell it what we want, then it can’t take any rash actions because those actions might be deleterious to what we do actually want, and so its only reasonable course of action at that point is to enter into a conversation with the human race to try to figure out what it is we do want, and the better it can figure that out, the more it’s able to take actions to help us, but in the absence of fairly definite knowledge about what it is that humans want, it shouldn’t really be taking any serious actions at all.
… To me, it seems inevitable that the capabilities of intelligent machines will increase not just because of physics but because we’re starting to understand better and better how to get those machines to behave intelligently, so it wouldn’t surprise me if it happened in my lifetime and I could easily imagine it happening in the lifetime of my children, so I think the sooner we start solving this problem, the better because if we don’t solve it and the technology starts to become more and more integral to our society, it’s going to be very difficult to reverse the process of technology development and improvement.
Russell has a small page on his website about the long-term future of AI, with additional links. Of course, this is precisely MIRI’s core research focus.