By Daniel K., Age 14
The NEAT neuroevolution algorithm is a more advanced method of machine learning. Rather than creating many organisms and breeding newer versions of them until one succeeds by chance (like in evolution), NEAT attaches a reward value to desirable actions and tries to emulate human learning by slightly altering the neural network so that it performs more of the actions that earn a reward. Unlike evolution-based learning, NEAT only has to simulate one neural network at a time.
To understand how NEAT works, we first need to talk about neural networks. Neural networks are the basis of machine learning and try to simulate a simplified human brain. A neural network consists of an input layer of neurons, one or more hidden layers, and an output layer. The input neurons are connected to the first hidden layer by many connections, each with a different weight (how much of the signal the connection transmits). That layer connects to the next layer, and so on, until the network reaches the output neurons. The input neurons are activated depending on the input data (image HSV values, text codes, and so on). The network then passes the data through its hidden layers until every neuron's value has been computed, and the values of the output neurons are read out as the network's answer. If an output neuron is only partly activated, that partial activation can be interpreted as the network's certainty that its answer is correct.
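The forward pass just described can be sketched in a few lines of plain Python. The layer sizes and weight values below are made up for illustration, and the sigmoid function is one common choice of activation:

```python
import math

def sigmoid(x):
    # Squash any value into (0, 1); close to 1 means strongly activated.
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    # Each hidden neuron sums its weighted inputs, then activates.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    # Each output neuron does the same with the hidden layer's values.
    return [sigmoid(sum(w * h for w, h in zip(ws, hidden)))
            for ws in output_weights]

# Two inputs -> three hidden neurons -> one output neuron.
# These weights are arbitrary example values.
hidden_weights = [[0.5, -0.2], [0.8, 0.1], [-0.4, 0.9]]
output_weights = [[1.0, -0.5, 0.7]]

certainty = forward([1.0, 0.0], hidden_weights, output_weights)[0]
print(certainty)  # a value between 0 and 1
```

Because the sigmoid always lands strictly between 0 and 1, the output can be read as a "certainty" the way the paragraph above describes.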
The evolution method starts out with multiple neural networks, removes those that underperform, and 'breeds' those that succeed. NEAT, however, uses only one neural network and starts out with all of its connections active. Whenever the AI performs an action, it gets a reward or punishment represented by a number: a positive number is a reward, a negative number is a punishment, and the magnitude of the number says how strong it is, kind of like a points system. Usually when humans or animals receive a reward, whatever it might be, they want to do more of the thing that led to the reward. When they get punished, the opposite happens and they do less of the thing that led to the punishment. We can simulate this in NEAT by nudging the connection weights so the network becomes more likely to repeat rewarded actions and less likely to repeat punished ones.
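One simple way to act on that reward number with a single network is hill climbing: slightly perturb a weight, keep the change if the reward goes up, and undo it if the reward goes down. This is only a sketch of the idea, not NEAT's exact update rule, and the reward function here is a made-up toy (reward is highest when the single weight sits near 0.7):

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def reward(weights):
    # Toy reward (an assumption for illustration): the closer the
    # single weight is to 0.7, the bigger the reward.
    return -abs(weights[0] - 0.7)

weights = [0.0]          # one network, one weight, to keep it tiny
best = reward(weights)

for _ in range(200):
    tweak = random.gauss(0, 0.05)   # slightly alter the network
    weights[0] += tweak
    new = reward(weights)
    if new > best:                  # reward went up: keep the change
        best = new
    else:                           # reward went down: undo the change
        weights[0] -= tweak

print(round(weights[0], 2))  # ends up close to 0.7
```

Because a change is only kept when the reward improves, the network drifts toward doing "more of the thing that led to the reward", just like the points-system analogy above.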
Everything I just described happens live, as soon as the AI gets rewarded or punished, which means the AI can get better at doing things without us having to restart it or the simulation. It also means the AI can keep improving at whatever it was made to do while running in production, without having to pause to receive updates.
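A minimal sketch of that "learning while running" idea: the agent acts, gets its reward or punishment, and adjusts itself in the same step, with no separate training phase or restart. The one-number "network" (a tendency to pick action A) and the reward setup are assumptions for illustration:

```python
import random

random.seed(1)
p = 0.5  # the agent's tendency to pick action A (a stand-in for its weights)

for step in range(1000):
    action_a = random.random() < p
    # Assumed toy environment: action A earns +1, anything else earns -1,
    # so both the reward and the punishment push the tendency toward A.
    r = 1 if action_a else -1
    # Nudge the tendency immediately, while the agent keeps running.
    p += 0.01 * r * (1 if action_a else -1)
    p = min(max(p, 0.01), 0.99)     # keep it a valid probability

print(round(p, 2))  # climbs to the 0.99 cap: the agent improved mid-run
```

Each loop iteration is both "doing the job" and "getting better at the job", which is what lets this kind of learner improve in production without pausing for updates.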