1. Introduction
Lucy is a virtual robot and a further development of Emo. She learns through supervised machine learning and expresses emotions through a corresponding emotional agent, which I have already introduced here. I might develop her further by adding an Arduino and a sensor board (gas, temperature, pressure and color sensors).
Lucy has a linear-motion telescopic arm with 2 degrees of freedom (DOF). She can pick and place small colored cubes. The learning target is to place the top cube over the ground cube of the same color. The colors of the 4 ground cubes are randomly permuted after each learning and execution cycle.
2. Variable structure stochastic learning automaton
Supervised machine learning is done by a variable structure stochastic learning automaton (VSLA). You can imagine a VSLA as a kind of dice. When a single fair dice is thrown, there are six possible outcomes: 1, 2, 3, 4, 5, 6. The probability of each of them is 1/6. Let us now map the outcomes to actions the robot should perform. For instance:
if (dice_outcome == 1) action_1();
or
if (dice_outcome == 2) action_2();
Now imagine we could manipulate the dice so that it is no longer fair. Say we manipulate it so that the outcome 1 is more likely than 2, 3, 4, 5 or 6. The probability of throwing a '1' is then, for instance, 1/3, and the combined probability of throwing any of the 5 remaining numbers is 2/3. Split evenly among them, the probability of throwing a '2' is 2/15, the probability of throwing a '3' is 2/15, the probability of throwing a '4' is 2/15 and so on. If we sum up all probabilities, the result is 1: 1/3 + 2/15 + 2/15 + 2/15 + 2/15 + 2/15 = 1, because we are certain to throw one of the 6 numbers.
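To make the manipulated dice concrete, here is a minimal sketch in standard C++ (not taken from Lucy's actual code). The function names biased_dice and pick_outcome are hypothetical. The sketch assumes that a uniform random number u in the range [0, 1) is supplied by the caller, and it maps u to an outcome by walking through the cumulative probabilities.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Probabilities of the manipulated dice from the text:
// 1/3 for outcome 1 and 2/15 for each of the outcomes 2 to 6.
std::array<double, 6> biased_dice() {
    return {{1.0 / 3.0, 2.0 / 15.0, 2.0 / 15.0, 2.0 / 15.0, 2.0 / 15.0, 2.0 / 15.0}};
}

// Hypothetical helper: maps a uniform random number u in [0, 1)
// to an outcome 1..6 according to the probabilities in p.
int pick_outcome(const std::array<double, 6>& p, double u) {
    assert(u >= 0.0 && u < 1.0);
    double cumulative = 0.0;
    for (std::size_t k = 0; k < p.size(); ++k) {
        cumulative += p[k];
        if (u < cumulative) {
            return static_cast<int>(k) + 1;
        }
    }
    return 6; // guards against floating-point rounding in the sum
}
```

With these probabilities, any u below 1/3 yields outcome 1, so a '1' comes up in roughly one third of all throws.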
Let's consider a kind of teacher who observes the process and evaluates after every throw whether the outcome was favorable or unfavorable. If the outcome was favorable, we manipulate the dice further with a special updating rule so that this outcome will be even more likely at the next throw. If the outcome was unfavorable, we manipulate the dice with the same updating rule so that this outcome will be less likely at the next throw. The updating rule also ensures that the sum of all probabilities always remains 1.
After a while, one outcome may have reached a much higher probability than the other 5 if only 1 of the 6 outcomes is favorable. There might also be two or more outcomes with higher probabilities than the rest if more than one outcome is favorable. In any case, the VSLA has learned from its teacher once a defined number of steps or a defined probability threshold is reached.
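The text does not spell out the updating rule, so the following is only a sketch of one common choice, the linear reward-penalty scheme, and not necessarily the rule Lucy uses. The function name update_probabilities and the learning rates kReward and kPenalty are assumptions made for illustration. In both branches the probabilities still sum to 1 after the update.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical learning rates, chosen only for illustration.
constexpr double kReward = 0.1;
constexpr double kPenalty = 0.1;

// Linear reward-penalty update (an assumed rule, see the lead-in).
// p holds the probabilities of all outcomes, chosen is the index of the
// outcome that was just thrown, favorable is the teacher's verdict.
void update_probabilities(std::vector<double>& p, std::size_t chosen, bool favorable) {
    assert(p.size() > 1 && chosen < p.size());
    const double others = static_cast<double>(p.size() - 1);
    for (std::size_t j = 0; j < p.size(); ++j) {
        if (favorable) {
            // Reward: move the chosen outcome towards 1, shrink all others.
            p[j] = (j == chosen) ? p[j] + kReward * (1.0 - p[j])
                                 : (1.0 - kReward) * p[j];
        } else {
            // Penalty: shrink the chosen outcome and spread the
            // removed probability evenly over the other outcomes.
            p[j] = (j == chosen) ? (1.0 - kPenalty) * p[j]
                                 : kPenalty / others + (1.0 - kPenalty) * p[j];
        }
    }
}
```

Starting from a fair dice, a single favorable throw of outcome 1 raises its probability from 1/6 to 0.25 and lowers each of the others to 0.15.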
3. Fisher-Yates shuffle
As mentioned earlier, the colors of the 4 ground cubes are randomly permuted after each cycle. This is done with the so-called Fisher-Yates shuffle. The code looks as follows:
const int n = 4; // number of permutation elements 1 to n (4 ground cubes)
int permutation[n];

// initialize array with 1, 2, ..., n
for (int i = 0; i < n; i++) {
  permutation[i] = i + 1;
}

// walk from the last element down to the second one
for (int i = n - 1; i > 0; i--) {
  // generate a random index from 0 to i (the upper bound of random() is exclusive)
  int j = random(0, i + 1);
  // swap the element at index i with the element at the random index
  int temp = permutation[i];
  permutation[i] = permutation[j];
  permutation[j] = temp;
}

https://www.youtube.com/watch?v=AKVZ2hgF2ZE