This backflipping noodle has a lot to teach us about AI safety – The Verge

AI isn't going to be a threat to humanity because it's evil or cruel; AI will be a threat to humanity because we haven't properly explained what it is we want it to do. Consider the classic "paperclip maximizer" thought experiment, in which an all-powerful AI is told, simply, to make paperclips. The AI, not constrained by any human morality or reason, does so, eventually transforming all the resources on Earth into paperclips and wiping out our species in the process. As with any relationship, when talking to our computers, communication is key.

That's why a new piece of research published yesterday by Google's DeepMind and the Elon Musk-funded OpenAI institute is so interesting. It offers a simple way for humans to give feedback to AI systems and, crucially, does so without the instructor needing to know anything about programming or artificial intelligence.

The method is a variation of what's known as reinforcement learning, or RL. With RL systems, a computer learns by trial and error, repeating the same task over and over, while programmers direct its actions by setting certain reward criteria. For example, if you want a computer to learn how to play Atari games (something DeepMind has done in the past), you might make the game's point system the reward criterion. Over time, the algorithm will learn to play in a way that best accrues points, often leading to superhuman performance.
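To make that trial-and-error loop concrete, here is a minimal sketch in Python. It is illustrative, not DeepMind's code: a five-cell "game" hands out a point for reaching the rightmost cell, and a simple Q-learning agent gradually settles on the actions that accrue the most points. The environment, constants, and names are all invented for the example.

    import random

    # A toy stand-in for an Atari game: the agent walks a 5-cell line and
    # scores a point only when it reaches the rightmost cell. The score is
    # the "reward criterion", just as game points were in DeepMind's work.
    N_STATES, GOAL, ACTIONS = 5, 4, (-1, +1)   # actions: step left or right

    def step(state, action):
        nxt = max(0, min(N_STATES - 1, state + action))
        return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    alpha, gamma, epsilon = 0.5, 0.9, 0.1      # learning rate, discount, exploration

    def pick(state):
        if random.random() < epsilon:           # occasionally try something new
            return random.choice(ACTIONS)
        best = max(q[(state, a)] for a in ACTIONS)
        return random.choice([a for a in ACTIONS if q[(state, a)] == best])

    for _ in range(500):                        # trial and error, over and over
        state, done = 0, False
        for _ in range(100):                    # cap the episode length
            action = pick(state)
            nxt, reward, done = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break

    # The learned policy should point right all the way to the goal.
    print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)})

After a few hundred episodes, the printed policy sends the agent rightward toward the goal from every cell: the toy equivalent of point-chasing Atari play.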

What DeepMind and OpenAI's researchers have done is replace this predefined reward criterion with a much simpler feedback system. Humans are shown an AI performing two versions of the same task and simply tell it which is better. This happens again and again, and eventually the system learns what is expected of it. Think of it like an eye test, where you look through different lenses and are asked, over and over: better... or worse? Here's what that looks like when teaching a computer to play the classic Atari game Q*bert:
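The sketch below shows the kind of better-or-worse fitting this describes. Again, it is illustrative Python, not the researchers' code: a deliberately simple linear reward model is fitted from nothing but pairwise preference labels, and the features, weights, and stand-in "human" judge are all invented for the example. (The actual paper fits a neural network and then uses the learned reward to drive a reinforcement-learning agent like the one sketched above.)

    import math, random

    # A linear reward model over two made-up features per timestep; the
    # weights w are what we fit from the human's better/worse answers.
    w = [0.0, 0.0]

    def reward(features):                       # r_hat = w . phi
        return sum(wi * fi for wi, fi in zip(w, features))

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def update(clip_a, clip_b, prefer_a, lr=0.1):
        """One gradient step from a single human comparison of two clips."""
        gap = sum(reward(f) for f in clip_a) - sum(reward(f) for f in clip_b)
        p_a = sigmoid(gap)                      # predicted P(human prefers a)
        err = (1.0 if prefer_a else 0.0) - p_a  # cross-entropy gradient signal
        for i in range(len(w)):
            diff = sum(f[i] for f in clip_a) - sum(f[i] for f in clip_b)
            w[i] += lr * err * diff

    # Stand-in "human" who secretly judges clips by their first feature
    # (imagine: how high the Hopper got). The model recovers that rule
    # purely from better/worse labels, never from an explicit score.
    for _ in range(2000):
        a = [[random.random(), random.random()] for _ in range(10)]
        b = [[random.random(), random.random()] for _ in range(10)]
        update(a, b, prefer_a=sum(f[0] for f in a) > sum(f[0] for f in b))

    print(w)   # first weight ends up clearly positive, second near zero

The key point is that no numeric score is ever supplied: the model infers one from repeated better/worse answers, which is what lets a non-programmer steer the system.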

This method of feedback is surprisingly effective, and the researchers were able to use it to train an AI to play a number of Atari video games, as well as to perform simulated robot tasks (like telling an arm to pick up a ball). This better-or-worse reward function could even be used to program trickier behavior, like teaching a very basic virtual robot how to backflip. That's how we get to the GIF at the top of the page. The behavior you see was created by watching the Hopper bot jump up and down and telling it "well done" whenever it got a bit closer to doing a backflip. Over time, it learns how.

Of course, no one is suggesting this method is a cure-all for teaching AI. There are a number of big downsides and limitations to using this sort of feedback. The first is that although it doesn't take much skill on the part of the human operator, it does take time. For example, in teaching the Hopper bot to backflip, a human was asked to judge its behavior some 900 times, a process that took about an hour. The bot itself had to work through 70 hours of simulated training time, which was sped up artificially.

For some simple tasks, says Oxford Robotics researcher Markus Wulfmeier (who was not involved in this research), it would be quicker for a programmer to simply define what it is they wanted. But, says Wulfmeier, it's increasingly important to "render human supervision more effective" for AI systems, and this paper represents "a small step in the right direction."

DeepMind and OpenAI say pretty much the same: it's a small step, but a promising one, and in the future they're looking to apply it to more and more complex scenarios. Speaking to The Verge over email, DeepMind researcher Jan Leike said: "The setup described in [our paper] already scales from robotic simulations to more complex Atari games, which suggests that the system will scale further." Leike suggests the next step is to test it in more varied 3D environments. You can read the full paper describing the work here.
