
OMFG! STOCKFISH NOOO! ( AlphaZero vs Stockfish ) Thoughts

Neural networks are a really fascinating area. This match made me read a book on the basics of neural networks.

Training a neural network is mathematically equivalent to minimizing a cost function ( the cost being, roughly speaking, the difference between the actual output of the network and the desired output ). This comes down to matrix calculations ( gradient descent and backpropagation ) and is well understood.
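
To make that concrete, here is a rough sketch in Python/NumPy of one gradient descent step on a mean squared error cost for a single sigmoid layer. This is only my own toy illustration: the sigmoid activation, the squared error cost and the learning rate of 0.5 are choices I made for the example, nothing from the actual AlphaZero setup.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(W, b, X, Y):
    # mean squared error between the actual output and the desired output Y
    out = sigmoid(X @ W + b)
    return np.mean((out - Y) ** 2)

def gradient_step(W, b, X, Y, lr=0.5):
    # one step of gradient descent on the cost above
    out = sigmoid(X @ W + b)             # actual output of the layer
    err = (out - Y) * out * (1 - out)    # derivative of the cost w.r.t. the pre-activation ( up to a constant factor )
    return W - lr * X.T @ err / len(X), b - lr * err.mean(axis=0)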

Also there are some general results, an important one being the universal approximation theorem: even a network with a single hidden layer can approximate any continuous function to arbitrary accuracy ( on a bounded domain, given enough neurons ).

However, many things about the essence of neural networks are still not understood.

First, with so many free parameters ( every neuron of a layer is connected to every neuron of the next layer, introducing a weight for each connection ), the network could learn only the particularities of the training set ( overfitting ) and fail to recognize samples that are not in the training set ( lack of generalization ). Why neural networks are so good at generalizing is not well understood.
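
Just to show how quickly those free parameters pile up, here is a back-of-the-envelope count for a small fully connected network ( the layer sizes are made up by me, roughly what you might use for handwritten digits ):

# one weight per connection plus one bias per neuron in each layer
layers = [784, 100, 100, 10]
params = sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))
print(params)   # 89610 free parameters for this already quite small network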

It has been observed that if you add a small systematic disturbance ( one that looks very much like white noise ) to an image, then image classification networks that correctly identified the original image give a wildly wrong classification for the disturbed image, a so-called adversarial example. At the same time the two images look exactly the same to a human observer, who could not even tell which is the original and which is the disturbed one. This happens despite the network being a continuous function of its input.
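
A toy version of that effect can be written in a few lines ( on a bare linear classifier instead of a real image network, so take it only as an illustration of the principle ): nudge every input component a tiny bit in the direction that hurts the score most, and the decision flips even though the input has barely changed.

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000)        # "trained" weights of a linear classifier: sign(w . x) is the class
x = rng.normal(size=1000)        # the "image"

score = w @ x
eps = 1.01 * abs(score) / np.abs(w).sum()       # smallest uniform per-component nudge that flips the sign
x_adv = x - eps * np.sign(w) * np.sign(score)

print(np.sign(score), np.sign(w @ x_adv))       # the classification flips...
print(eps)                                      # ...although each component moved by far less than its typical size of about 1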

Then there is the problem of meta parameters ( usually called hyperparameters ). There is no exact rule for how to organize the neurons into layers, what a good learning rate is, or how to set a bunch of other such parameters. So designing a neural network is more of an art than a science right now.

It is unfortunate that the training is so computationally expensive. Nevertheless, I'm currently trying to create a minimal neural network with a graphical representation of the weights, biases and activations, to get a feel for it. I'm not there yet, but the first thing I will try once the machinery is in place is to train the simplest possible function: a logical gate with two binary inputs and one binary output ( probably an XOR gate, since it is not linearly separable ), using a network of two input neurons, a hidden layer of two neurons and one output neuron.
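
For reference, this is roughly what I have in mind, as a bare NumPy sketch with sigmoid activations, a mean squared error cost and plain gradient descent ( the graphical part is still missing, and the learning rate and step count are just guesses ):

import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR truth table: two binary inputs, one binary output
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

# two input neurons -> hidden layer of two neurons -> one output neuron
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)

lr = 1.0
for step in range(20000):
    h = sigmoid(X @ W1 + b1)          # hidden activations
    out = sigmoid(h @ W2 + b2)        # network output

    # backpropagation of the mean squared error
    d_out = (out - Y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    W2 -= lr * h.T @ d_out / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X)
    b1 -= lr * d_h.mean(axis=0)

print(out.ravel().round(3))   # should end up close to [0, 1, 1, 0]

Even this tiny example already illustrates the points above: the learning rate and the number of steps are pure guesswork, and with an unlucky random initialization a 2-2-1 net can get stuck in a local minimum on XOR, in which case you just rerun with another seed.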
