4hw.org: Recently, Ke Jie's match against AlphaGo has attracted much attention. AlphaGo swept the world of go last year, defeating the top professional players. But although AlphaGo is well known, many people still don't know what it actually is or how it works. Let's take a look.
AlphaGo is a go AI program. Its core working principle is 'deep learning'. 'Deep learning' refers to multi-layer artificial neural networks and the methods used to train them. A single neural-network layer takes a large matrix of numbers as input, applies learned weights and a nonlinear activation function, and produces another set of numbers as output, much like the working mechanism of a biological brain. By linking an appropriate number of such layers together, a multi-layer organization forms a neural-network 'brain' capable of precise and complex processing, the way people recognize objects and label pictures.
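The single layer described above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea (matrix input, learned weights, nonlinear activation), not AlphaGo's actual code; the sizes and the ReLU activation are assumptions for the example.

```python
import numpy as np

def layer(x, W, b):
    """One neural-network layer: a weighted sum of the inputs
    followed by a nonlinear activation (ReLU here)."""
    return np.maximum(0.0, x @ W + b)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))   # input: a small matrix of numbers
W = rng.normal(size=(4, 3))   # learned weights
b = np.zeros(3)               # learned bias
out = layer(x, W, b)
print(out.shape)              # (1, 3)
```

Stacking many such layers, each feeding the next, is what makes the network "deep".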
AlphaGo uses many techniques new to computer go, such as neural networks, deep learning, and Monte Carlo tree search, which give its strength a substantial leap. Tian Yuandong, developer of Facebook's 'Dark Forest' go software, published an analysis online, saying: 'The AlphaGo system is mainly composed of four parts: 1. a policy network, which, given the current position, predicts/samples the next move; 2. a fast rollout policy, which has the same goal as the policy network but, at some sacrifice in move quality, runs about 1,000 times faster; 3. a value network, which, given the current position, estimates whether white or black will win; 4. Monte Carlo tree search, which connects the above three parts into a complete system.'
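To show how those four parts might fit together, here is a toy search loop in the same spirit: a policy proposes candidate moves, fast rollouts and a value estimate score them, and the search accumulates statistics to pick the best move. Every function here is a hypothetical stand-in (fixed move list, random rollouts, hard-coded values), not AlphaGo's real components.

```python
import random

random.seed(0)

def policy(state):
    """Toy policy: propose candidate moves for this position."""
    return ["A", "B", "C"]

def rollout(state, move):
    """Toy fast rollout: a cheap, noisy win estimate."""
    return random.random()

def value(state, move):
    """Toy value network: a slower, steadier win estimate."""
    return {"A": 0.6, "B": 0.4, "C": 0.5}[move]

def search(state, n_sims=1000):
    """Blend value and rollout scores over many simulations,
    then return the move with the best average score."""
    stats = {m: [] for m in policy(state)}
    for _ in range(n_sims):
        for m in stats:
            stats[m].append(0.5 * value(state, m) + 0.5 * rollout(state, m))
    return max(stats, key=lambda m: sum(stats[m]) / len(stats[m]))

best = search("start")
print(best)
```

The real system organizes these pieces as a tree search over positions rather than a flat loop, but the division of labor is the same.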
The two brains
AlphaGo improves its play through the cooperation of two different neural-network 'brains'. These brains are multi-layer neural networks similar in structure to the image-recognition networks behind Google's image search. They start with several layers of two-dimensional filters that process the positions on the go board, just as an image-classifier network processes an image. After the filtering, 13 fully connected neural-network layers produce a judgment about the position they see. These layers can perform classification and logical reasoning.
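The 'two-dimensional filter' in the first stage works like this sketch: a small kernel slides across the 19x19 board and produces a feature map, exactly as in image classification. The board encoding and the averaging kernel are assumptions for illustration only.

```python
import numpy as np

def conv2d(board, kernel):
    """Slide one 2-D filter over the board (no padding)."""
    kh, kw = kernel.shape
    h, w = board.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(board[i:i + kh, j:j + kw] * kernel)
    return out

board = np.zeros((19, 19))        # empty 19x19 go board
board[3, 3] = 1.0                 # one stone, encoded as 1
kernel = np.ones((3, 3)) / 9.0    # a simple 3x3 averaging filter
feat = conv2d(board, kernel)
print(feat.shape)                 # (17, 17)
```

A real network learns many such kernels and stacks them in layers; the fully connected layers then operate on the resulting feature maps.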
These networks are improved through repeated training: the results are checked, and the parameters are then adjusted so that the next run performs better. The process involves a large random element, so it is impossible to know exactly how the network 'thinks', but more training makes it play better.
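The check-and-adjust loop is gradient descent in miniature. This toy example tunes a single weight toward a target by repeatedly scoring the output and nudging the weight to reduce the squared error; the target and learning rate are made up for the example.

```python
# Toy training loop: score the output, then adjust the parameter
# in the direction that reduces the squared error.
w = 0.0          # the parameter being trained
target = 3.0     # the answer we want the model to produce
lr = 0.1         # learning rate: size of each adjustment

for step in range(100):
    pred = w * 1.0                 # the model's current output
    grad = 2 * (pred - target)     # derivative of (pred - target)**2
    w -= lr * grad                 # nudge w to shrink the error
print(round(w, 3))                 # converges toward 3.0
```

Real training does the same thing across millions of parameters at once, which is where the apparent randomness of the result comes from.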
First brain: move picker
AlphaGo's first neural-network brain is the 'policy network', trained with supervised learning: it observes the layout of the board and tries to find the best next move. More precisely, it predicts the probability of each legal next move, so its first guess is the move with the highest probability. It can be thought of as a 'move picker'.
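Turning raw network scores into a probability for each legal move, then taking the highest, looks like this. The move names and scores are hypothetical; a real policy network scores all several hundred legal points on the board.

```python
import numpy as np

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    e = np.exp(scores - scores.max())  # shift for numerical stability
    return e / e.sum()

# Hypothetical raw scores the policy network assigns to legal moves.
legal_moves = ["D4", "Q16", "C3"]
scores = np.array([2.0, 1.0, 0.1])

probs = softmax(scores)
best = legal_moves[int(np.argmax(probs))]  # first guess: highest probability
print(best)                                # D4
```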
Second brain: position evaluator
AlphaGo's second brain answers a different question from the move picker. Instead of guessing the specific next move, it takes a board position and predicts each player's chance of winning the game. This 'position evaluator' is the 'value network', which assists the move picker by judging the overall situation. The judgment is only approximate, but it greatly improves reading speed. By classifying potential future positions as 'good' or 'bad', AlphaGo can decide whether to read further along a particular variation: if the position evaluator says a variation is no good, the AI skips reading it.
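The skip-the-bad-variations idea reduces to a simple filter over value estimates. The win probabilities and the cutoff below are invented for illustration; the real system prunes inside a tree search rather than over a flat list.

```python
# Hypothetical value-network estimates for candidate variations:
# the probability that the side to move wins after each one.
variations = {"A": 0.62, "B": 0.18, "C": 0.55}
THRESHOLD = 0.3  # assumed cutoff; variations rated below it are skipped

# Read further only into variations the evaluator rates as promising.
to_read = [name for name, win_prob in variations.items()
           if win_prob >= THRESHOLD]
print(sorted(to_read))  # ['A', 'C']
```

Skipping variation "B" here is what lets the search spend its reading time where it matters.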