I have cleaned up the interactive Stochastic Learning Automaton code I once wrote and attached it to this short blog post. Just send '0' or '1' via the serial monitor to tell the automaton whether its chosen action was favorable or unfavorable, and watch how it learns to choose the right action over time.
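For readers who want the idea in code before opening the attachment, here is a minimal sketch of how such an automaton might work. It is not the attached code. It assumes a linear reward-penalty update with a single learning rate, and the class name `LearningAutomaton` and its methods are made up for illustration. Which serial character counts as "favorable" is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Sketch of a linear reward-penalty automaton. This is an assumed scheme,
// not necessarily the one used in the attached code.
class LearningAutomaton {
public:
    LearningAutomaton(std::size_t numActions, double rate, unsigned seed = 1)
        : probs_(numActions, 1.0 / static_cast<double>(numActions)),
          rate_(rate),
          rng_(seed) {
        assert(numActions >= 2);
        assert(rate > 0.0 && rate < 1.0);
    }

    // Pick an action index at random, weighted by the current probabilities.
    std::size_t chooseAction() {
        std::discrete_distribution<std::size_t> dist(probs_.begin(), probs_.end());
        return dist(rng_);
    }

    // Shift probability toward the chosen action after favorable feedback,
    // and away from it (spread evenly over the others) after unfavorable
    // feedback. Both branches keep the probabilities summing to 1.
    void update(std::size_t action, bool favorable) {
        assert(action < probs_.size());
        const double spread = rate_ / static_cast<double>(probs_.size() - 1);
        for (std::size_t i = 0; i < probs_.size(); ++i) {
            const bool chosen = (i == action);
            if (favorable) {
                probs_[i] = chosen ? probs_[i] + rate_ * (1.0 - probs_[i])
                                   : probs_[i] * (1.0 - rate_);
            } else {
                probs_[i] = chosen ? probs_[i] * (1.0 - rate_)
                                   : spread + probs_[i] * (1.0 - rate_);
            }
        }
    }

    const std::vector<double>& probabilities() const { return probs_; }

private:
    std::vector<double> probs_;  // one selection probability per action
    double rate_;                // learning rate, strictly between 0 and 1
    std::mt19937 rng_;           // seeded so runs are repeatable
};
```

A typical loop would call `chooseAction()`, perform or announce the action, read the feedback character, and pass the result to `update()`. With a rate of 0.1 and two actions, one favorable response moves the chosen action from 0.5 to 0.55.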
I have been working through the reading you posted on the Baby Bot project, and I changed your code to “standard” C++ to run on my PC in Visual Studio. It is a bit easier to figure out what is happening when you can step through it!
I like what you have done here. I was thinking that it would be great if there was a way to use this as a base and map the favorable and unfavorable inputs to it. For instance, if the bot runs into something, it could be taught to read the ultrasonic sensor and figure out the closest sensor reading at which it should stop.
This lends itself to an event-based architecture. Running into something or a change in the ultrasonic sensor readings would be events, and go forward, turn left, turn right, etc. would be the actions that respond to those events. I would attach logic that defines each event (if either limit switch is depressed, we ran into something) and then map actions based on whether the event occurred or not.
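The event-to-action mapping described above could be sketched roughly like this. Everything here is hypothetical: the `SensorState` fields, the `EventRule` struct, the rule names, and the particular actions assigned to each event are placeholders chosen for illustration, and the stop distance is passed in as a parameter because the right value is exactly what the bot would have to learn.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical snapshot of the bot's sensors.
struct SensorState {
    bool leftLimitSwitch;
    bool rightLimitSwitch;
    double ultrasonicCm;  // distance reading in centimeters
};

enum class Action { GoForward, TurnLeft, TurnRight, Stop };

// An event is a named predicate over the sensors plus the action to take
// when the predicate is true.
struct EventRule {
    std::string name;
    std::function<bool(const SensorState&)> occurred;
    Action onOccurred;
};

// Return the action of the first rule whose event occurred, or the
// fallback action if no event fired.
Action selectAction(const std::vector<EventRule>& rules,
                    const SensorState& state, Action fallback) {
    for (const auto& rule : rules) {
        if (rule.occurred(state)) return rule.onOccurred;
    }
    return fallback;
}

// Example rule set. The action choices are arbitrary placeholders.
std::vector<EventRule> makeExampleRules(double stopDistanceCm) {
    assert(stopDistanceCm > 0.0);
    return {
        {"collision",
         [](const SensorState& s) { return s.leftLimitSwitch || s.rightLimitSwitch; },
         Action::TurnRight},
        {"obstacle_near",
         [stopDistanceCm](const SensorState& s) { return s.ultrasonicCm < stopDistanceCm; },
         Action::Stop},
    };
}
```

With `makeExampleRules(15.0)` and a fallback of `Action::GoForward`, a pressed limit switch selects `TurnRight`, a reading under 15 cm selects `Stop`, and anything else keeps the bot moving forward. The fixed `onOccurred` action is the simplest choice; a learning automaton could take its place per event, with the favorable or unfavorable feedback coming from whether a collision followed.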
Anyway, you have fired my imagination on this. This is a hard problem to solve, and you definitely have some building blocks here for solving it. I will post what I come up with. In the meantime, I can tell people I am working on a stochastic learning automaton.