
DeepMind AI Masters Games Without Knowing the Rules

In 2016, DeepMind's AlphaGo repeatedly defeated the best Go players. That AI learned the game by watching amateur and professional Go matches. A year later, the company went further: its successor, AlphaGo Zero, mastered the ancient game simply by playing against itself!

Then DeepMind built AlphaZero, which could play Go, chess and shogi with a single algorithm.

The common denominator of all these AIs is that they were given the rules of the games they had to master. DeepMind's latest AI, called MuZero, does not need to be told the rules of Go, chess, shogi and other games to master them. Instead, it learns them on its own and is just as capable as DeepMind's previous algorithms.

How did DeepMind's AI achieve this?

DeepMind's AI solves this problem with a method called lookahead search, in which an algorithm considers possible future states before choosing an action.

The best way to understand this is to think about how you play a strategy game (such as chess or StarCraft II). Before making a move, you consider how your opponent will react and try to plan accordingly.

In much the same way, an AI that uses lookahead search tries to plan several moves in advance. Even in a relatively simple game like chess, it is impossible to consider every possible future situation, so the AI prioritizes those that are most likely to occur.
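To make the idea concrete, here is a minimal sketch of a lookahead search in Python. It is not DeepMind's code: the toy game (players alternately take 1 to 3 stones, and whoever takes the last stone wins), the hand-written `prior` score and the `top_k` cut-off are all assumptions chosen for illustration. AlphaZero and MuZero use a learned neural network and Monte Carlo tree search rather than this simple depth-limited search.

```python
from typing import List, Optional, Tuple

# Toy game (an assumption for illustration): players alternately remove
# 1, 2 or 3 stones from a pile; whoever takes the last stone wins.

def legal_moves(stones: int) -> List[int]:
    return [n for n in (1, 2, 3) if n <= stones]

def prior(stones: int, move: int) -> float:
    """Hypothetical hand-written guess at how promising a move is.
    Moves that leave a multiple of 4 stones are rated highest."""
    return 1.0 if (stones - move) % 4 == 0 else 0.1

def lookahead(stones: int, depth: int, top_k: int = 2) -> Tuple[float, Optional[int]]:
    """Return (value, best_move) for the player about to move.

    value is +1 for a win and -1 for a loss among the lines that were
    searched, and 0 if the depth limit was reached before the game ended.
    """
    if stones == 0:
        return -1.0, None  # the opponent just took the last stone
    if depth == 0:
        return 0.0, None   # out of planning budget: outcome unknown
    # Prioritise: expand only the top_k moves with the highest prior.
    moves = sorted(legal_moves(stones),
                   key=lambda m: prior(stones, m), reverse=True)[:top_k]
    best_value, best_move = -2.0, None
    for move in moves:
        # Imagine the move, then look at the result from the opponent's side.
        child_value, _ = lookahead(stones - move, depth - 1, top_k)
        value = -child_value  # what is good for the opponent is bad for us
        if value > best_value:
            best_value, best_move = value, move
    return best_value, best_move
```

With five stones, `lookahead(5, depth=4)` returns `(1.0, 1)`: taking one stone leaves the opponent with four, a position they cannot win from within the searched lines. Note that the search needs `legal_moves` and the win condition, in other words the rules of the game.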


Problems ahead

The problem with this method is that most real-world situations, and even some games, have no simple rules governing how they work. Some researchers have therefore tried a different approach: first work out how a particular game or environment responds to actions, then use that knowledge to plan.
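As a rough sketch of that model-based idea, the Python below first records how a black-box environment responds to every action and then plans using only those records. The five-cell corridor, the `hidden_rules` function and the reward of 1 for reaching the last cell are hypothetical stand-ins, not anything from DeepMind's work.

```python
from typing import Dict, List, Tuple

ACTIONS = (-1, +1)  # step left or step right
GOAL = 4            # last cell of a hypothetical five-cell corridor

def hidden_rules(state: int, action: int) -> Tuple[int, float]:
    """Black-box environment. The planner never reads this code;
    it only observes what comes back."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL and state != GOAL else 0.0
    return next_state, reward

# Step 1: build a model from experience by trying each action in each state.
model: Dict[Tuple[int, int], Tuple[int, float]] = {}
for state in range(GOAL + 1):
    for action in ACTIONS:
        model[(state, action)] = hidden_rules(state, action)

# Step 2: plan with the learned model only.
def plan(state: int, depth: int) -> Tuple[float, List[int]]:
    """Return (total_reward, actions) of the best plan found within depth steps."""
    if depth == 0:
        return 0.0, []
    best: Tuple[float, List[int]] = (float("-inf"), [])
    for action in ACTIONS:
        next_state, reward = model[(state, action)]
        future, rest = plan(next_state, depth - 1)
        total = reward + future
        if total > best[0]:
            best = (total, [action] + rest)
    return best
```

Starting in the middle cell, `plan(2, depth=3)` returns `(1.0, [1, 1, -1])`: two steps right collect the reward. A lookup table like this is only workable because the toy world has five states and two actions.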

The disadvantage of this approach is that some domains are so complex that modeling every aspect of them is almost impossible. This has proven to be the case for most Atari games, for example.

MuZero takes a different route: instead of modeling everything, it models only the aspects that matter for making a decision. This is what you do as a human being, too. When most people look out the window and see dark clouds forming on the horizon, they generally do not think about things like condensation and pressure fronts. Rather, they think about how they should dress to stay dry if they go out. MuZero does the same thing.

DeepMind artificial intelligence algorithm

MuZero also takes three factors into account whenever it makes a decision: the outcome of its previous decision, its current position, and the best course of action to take next.
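In reinforcement learning terms, these three factors are usually called the reward (how good was the last action), the value (how good is the current position) and the policy (which action is best to take next). The sketch below shows, under heavy simplification, how a MuZero-style model could be organised to predict them. The function names `representation`, `dynamics` and `prediction` follow the terminology of the MuZero paper, but everything else is an assumption: the sizes are made up, and the weights are random and untrained, whereas the real system uses deep neural networks trained from experience and combines them with a tree search.

```python
import math
import random
from typing import List, Tuple

NUM_ACTIONS = 3   # hypothetical number of possible moves
OBS_SIZE = 5      # hypothetical size of a raw observation
HIDDEN_SIZE = 4   # hypothetical size of the internal state

rng = random.Random(0)  # fixed seed so the stand-in weights are repeatable

def random_matrix(rows: int, cols: int) -> List[List[float]]:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Untrained stand-in weights; a real system would learn these.
W_REPR = random_matrix(HIDDEN_SIZE, OBS_SIZE)
W_DYN = random_matrix(HIDDEN_SIZE, HIDDEN_SIZE + NUM_ACTIONS)
W_REWARD = random_matrix(1, HIDDEN_SIZE + NUM_ACTIONS)
W_POLICY = random_matrix(NUM_ACTIONS, HIDDEN_SIZE)
W_VALUE = random_matrix(1, HIDDEN_SIZE)

def representation(observation: List[float]) -> List[float]:
    """Compress a raw observation into an internal (hidden) state."""
    return [math.tanh(x) for x in matvec(W_REPR, observation)]

def dynamics(hidden: List[float], action: int) -> Tuple[float, List[float]]:
    """Imagine taking an action: predict the reward and the next hidden state."""
    one_hot = [1.0 if i == action else 0.0 for i in range(NUM_ACTIONS)]
    inputs = hidden + one_hot
    reward = matvec(W_REWARD, inputs)[0]
    next_hidden = [math.tanh(x) for x in matvec(W_DYN, inputs)]
    return reward, next_hidden

def prediction(hidden: List[float]) -> Tuple[List[float], float]:
    """Predict a policy (move probabilities) and a value for a hidden state."""
    logits = matvec(W_POLICY, hidden)
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    policy = [e / sum(exps) for e in exps]  # softmax
    value = math.tanh(matvec(W_VALUE, hidden)[0])
    return policy, value

def imagine(observation: List[float],
            actions: List[int]) -> Tuple[float, List[float], float]:
    """Roll a sequence of actions forward without consulting any game rules."""
    hidden = representation(observation)
    total_reward = 0.0
    for action in actions:
        reward, hidden = dynamics(hidden, action)
        total_reward += reward
    policy, value = prediction(hidden)
    return total_reward, policy, value
```

Calling `imagine([0.1, 0.2, 0.3, 0.4, 0.5], [0, 2, 1])` plays three imagined moves entirely inside the model. The hidden state never has to reconstruct the full game screen or board. It only has to carry enough information to predict reward, value and policy, which is the sense in which only decision-relevant aspects are modeled.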

This seemingly simple approach has made MuZero DeepMind's most effective algorithm to date. MuZero is as good as AlphaZero at chess, Go and shogi, and better than all of DeepMind's previous algorithms (including Agent57) at Atari games.

What is more, the more time MuZero is given to consider an action, the better the result.

High scores in Atari games are all well and good, but what about the practical applications of DeepMind's latest research? In a word, they could be groundbreaking.

DeepMind suggests that MuZero’s learning abilities could one day help us solve complex problems in areas such as robotics, where there are no simple rules.