Artificial Intelligence framework

LearningAutomaton.zip (42506Bytes)

MarcusB recently posted his Variable Stochastic Learning Automaton code.

https://www.robotshop.com/letsmakerobots/node/39298

Basically, his code does the math to randomly select an action when an event occurs. The robot then waits for input from the user as to whether it was a good action or not. If it was good, the next time the event occurs that action has a higher chance of being selected, while the chances for the others go down. Eventually, the best course of action for an event bubbles up with the highest probability. The idea is similar to how a baby responds to stimuli in its environment.

I thought the algorithm was fascinating. His code really caught my imagination because it should be relatively easy to fit his algorithm into something more reusable than the straight C it is written in. So I did the easy stuff and created base classes that make the code reentrant, etc. This is a framework that developers can hook into to define their own events and actions. I only give a framework; making something like this work still requires a lot of work and code, but it is a good start on a difficult problem.

Event - abstract base class - must be overridden

bool RunLogic() - overridden method must return true when the event occurs - this signals the framework to call the OnStart() method
bool OnStart() - while RunLogic() returns true, OnStart() is called on each scan until it returns true - success triggers the framework to select a random action and then try it
bool OnComplete() - last method called after an event has been triggered - it runs on each scan until it returns true; the framework then calls CheckIfActionIsFavorable() and updates the action probabilities
bool CheckIfActionIsFavorable() - overridden method determines whether the action was favorable or not - its return value is what is used to update the probability matrix
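
To make the call order above concrete, here is a minimal sketch of how such an interface and a per-scan driver could look. It is one reading of the description, not the code from the zip: EventStepper, BumperEvent, InstantAction and the runAction callback are hypothetical names, and the callback simply stands in for "pick a random action and run it".

```cpp
// Sketch of the Event interface described above.
class Event {
public:
    virtual ~Event() {}
    virtual bool RunLogic() = 0;                  // true when the event occurs
    virtual bool OnStart() { return true; }       // polled each scan until true
    virtual bool OnComplete() { return true; }    // polled each scan until true
    virtual bool CheckIfActionIsFavorable() = 0;  // verdict for the update
};

// Hypothetical driver: advances one event by one non-blocking step per scan.
class EventStepper {
public:
    enum Phase { Waiting, Starting, Acting, Completing };

    // runAction is polled until it returns true, like the other hooks.
    EventStepper(Event& event, bool (*runAction)())
        : event_(event), runAction_(runAction), phase_(Waiting),
          favorable_(0), unfavorable_(0) {}

    void Scan() {
        switch (phase_) {
        case Waiting:
            if (event_.RunLogic()) phase_ = Starting;
            break;
        case Starting:
            if (event_.OnStart()) phase_ = Acting;
            break;
        case Acting:
            if (runAction_()) phase_ = Completing;
            break;
        case Completing:
            if (event_.OnComplete()) {
                // This is where the probabilities would be updated.
                if (event_.CheckIfActionIsFavorable()) ++favorable_;
                else ++unfavorable_;
                phase_ = Waiting;
            }
            break;
        }
    }

    Phase phase() const { return phase_; }
    int favorable() const { return favorable_; }
    int unfavorable() const { return unfavorable_; }

private:
    Event& event_;
    bool (*runAction_)();
    Phase phase_;
    int favorable_;
    int unfavorable_;
};

// Example event: "pressed" stands in for reading the bumper switches.
class BumperEvent : public Event {
public:
    BumperEvent() : pressed(false) {}
    bool RunLogic() { return pressed; }
    bool CheckIfActionIsFavorable() { return !pressed; }  // clear = good
    bool pressed;
};

// Example action callback that finishes on its first poll.
inline bool InstantAction() { return true; }
```

Each call to Scan() does at most one small piece of work and returns, which is what lets several events share one loop() without blocking each other.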

Action - class can be overridden, or the object can be used as-is as a "container" for child actions

bool OnStart() - overridden method is called as the first method on a custom action - this code will be run with every scan until method returns true
bool OnComplete() - last method called after an action is run - this code will be run with every scan until method returns true

Actions can also have child actions, each of which won't start until the previous action in the list is complete. The code is designed for cooperative multitasking, meaning that each method yields if it has nothing to do. There should never be a delay(...) in code that overrides my classes.
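
A minimal sketch of an action that doubles as a container might look like the following. The ordering (parent OnStart(), then each child in turn, then parent OnComplete()) is an assumption, as are the Scan() and AddChild() names and the CountedAction stub; the code in the zip may well do it differently. It uses std::vector, so it targets the PC build.

```cpp
#include <cstddef>
#include <vector>

// Sketch of an Action that can also hold child actions.
class Action {
public:
    Action() : phase_(Starting), current_(0) {}
    virtual ~Action() {}

    virtual bool OnStart() { return true; }     // polled each scan until true
    virtual bool OnComplete() { return true; }  // polled each scan until true

    // The container does not own its children; the caller keeps them alive.
    void AddChild(Action* child) { children_.push_back(child); }

    // One non-blocking step. Returns true once this action and all of its
    // children have finished.
    bool Scan() {
        switch (phase_) {
        case Starting:
            if (OnStart()) phase_ = Children;
            return false;
        case Children:
            if (current_ < children_.size()) {
                // Only the current child is stepped; the next one waits.
                if (children_[current_]->Scan()) ++current_;
            } else {
                phase_ = Completing;
            }
            return false;
        case Completing:
            if (OnComplete()) {
                phase_ = Done;
                return true;
            }
            return false;
        case Done:
            return true;
        }
        return true;
    }

    bool IsDone() const { return phase_ == Done; }

private:
    enum Phase { Starting, Children, Completing, Done };
    Phase phase_;
    std::size_t current_;
    std::vector<Action*> children_;
};

// Stub for a real motor action: OnStart() succeeds after a set number of
// polls, the way a real move would finish after some time or distance.
class CountedAction : public Action {
public:
    explicit CountedAction(int pollsNeeded)
        : pollsNeeded_(pollsNeeded), polls_(0) {}
    bool OnStart() { return ++polls_ >= pollsNeeded_; }
    int polls() const { return polls_; }

private:
    int pollsNeeded_;
    int polls_;
};
```

Something like GoBackwardsAndTurnRightAction could then be a plain Action holding a go-backwards child followed by a turn-right child, with no delay(...) anywhere.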

Example, using a very basic robot with two motors and bumpers on the front: I create a custom class for each event and each potential action on an Arduino. I have an Arduino, so that is what I am testing with, but the code could run on anything really. All code to drive the robot is in the actions' OnStart() methods. The learned probabilities drive the robot's personality. Ultimately, I see a database driving events and actions and saving probabilities on a separate "big" controller, which communicates with another "small" processor that does the fiddly bits like encoders. Perhaps a RasPi with a Dagu Mini Driver doing the IO.

Events:

CollisionEvent (if one or both bumper limit switches are closed)

Actions:

GoBackwardsAction
GoForwardsAction
TurnLeftAction
TurnRightAction
GoBackwardsAndTurnRightAction
GoBackwardsAndTurnLeftAction
StopAction

On each loop() call, I check my list of events to see if an event has occurred. If one has, I randomly select an action, try it, decide whether it was favorable or not, and then update the probabilities. Eventually, GoBackwards, GoBackwardsAndTurnLeft and GoBackwardsAndTurnRight will each be close to 33%, which is what you would expect. I can write the event to wait 5 seconds and then, if I haven't run into anything in those 5 seconds, count the chosen action as favorable (and as unfavorable otherwise). Whatever. At that point the robot has trained itself how best to respond to an event. It is up to the developer to build in the success or failure criteria for an event.
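
The 5 second check can be done without any delay(...) by remembering a start time and comparing against it on every scan. Below is a minimal sketch under that assumption. CollisionJudge and its method names are hypothetical, and it is kept standalone so it compiles on its own; in the framework, Poll() and WasFavorable() would be called from the event's OnComplete() and CheckIfActionIsFavorable(), with millis() and the bumper reading passed in.

```cpp
// Judges an action by whether the bumpers stay clear for a fixed window.
class CollisionJudge {
public:
    explicit CollisionJudge(unsigned long windowMs = 5000UL)
        : windowMs_(windowMs), startMs_(0), bumped_(false) {}

    // Call once when the chosen action has finished.
    void Begin(unsigned long nowMs) {
        startMs_ = nowMs;
        bumped_ = false;
    }

    // Call on every scan. Returns true when the verdict is ready: either a
    // bumper closed, or the window passed without a hit.
    bool Poll(unsigned long nowMs, bool bumperClosed) {
        if (bumperClosed) bumped_ = true;
        // Unsigned subtraction stays correct across a millis() rollover.
        return bumped_ || (nowMs - startMs_ >= windowMs_);
    }

    // Favorable means nothing was hit during the window.
    bool WasFavorable() const { return !bumped_; }

private:
    unsigned long windowMs_;
    unsigned long startMs_;
    bool bumped_;
};
```

Passing the time in as a parameter also makes the logic easy to exercise on a PC with made-up timestamps before it ever runs on the robot.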

Note: this program uses around 11k of the 32k of flash (program memory) and around 300 bytes of SRAM for global variables on my test Arduino. That is actually fairly small and a lot more could be done on an Arduino, but a real world problem is soon going to be way too big.

I was going to show a video, but the bot looks like a drunk moose during mating season as it rams walls, smashes into things and then the video camera ran out of room on its SD card. If people really want a video I can oblige.

These events could be anything. An event could be that the robot sees a human face, hears someone speak, etc. Using this algorithm, a robot could learn how best to keep people engaged, or which statements to say when it sees someone to get that person to spend more time with it. What would be very cool is if a robot could look at its environment, or at patterns in the events that are occurring, and from that create its own custom events and actions.

I also include the class LearningEvent, which basically allows one to dynamically learn the criteria for an event. I haven't thought this through as well yet, so this idea may get trashed, but I like the idea of being able to teach a robot what criteria can be used to generate an event. For instance, if one puts an ultrasonic sensor on a bot that previously only had bumpers, one could use this class to "learn" what the minimum distance reading is just before a bumper is touched. Based on this value, it can then "learn" what the best strategy is to deal with the event when it occurs. It might also be used to tell how successful using a hand to grab a soda can is. After each try which isn't 100% successful, it can retrain by trying different numbers. This could also be used to figure out the P, I and D values for a control loop. I am sure this will change; I appreciate input and ideas on this.

This is a very simple example, and the code works fine with only one event. With larger event queues, there will be problems that have to be dealt with, such as two events happening simultaneously. Whose actions do we choose? The bot will also need emergency events that override all other events (over amperage on motors - stop!) and default actions (ok, no events are occurring, what should the robot be doing?). I am really going to need another class to arbitrate these issues, so more cuts are coming. I also need to change my code to use better random number generation, per what MarcusB had.

Thanks MarcusB for the idea and for doing the math.  I changed your code a little bit, but I think it keeps the spirit if not the exact math of what you were trying to do.  I know this is an interim cut of the code, so any ideas or suggestions are welcome.  I can also do more documentation if people seem interested in pursuing this idea since I only have a brief description of the classes.

I do my development in MS Visual Studio and then download to the Arduino when ready.  So there are some #define NOARDUINO etc which allow me to seamlessly go from PC to Arduino worlds with the same code base. 
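
For anyone who has not used this trick, a minimal sketch of such a switch is below. Only the NOARDUINO name comes from my description above; the fallback on the ARDUINO macro (which the Arduino toolchain defines), the millis() stand-in, and the LogLine() and HasElapsed() helpers are assumptions for illustration and not necessarily what the zip does.

```cpp
// Assumption: the Arduino toolchain defines ARDUINO, so anything else is
// treated as the PC build unless the project already set NOARDUINO.
#if !defined(ARDUINO) && !defined(NOARDUINO)
#define NOARDUINO
#endif

#ifdef NOARDUINO
#include <chrono>
#include <cstdio>

// PC stand-in for Arduino's millis(): milliseconds since the first call.
inline unsigned long millis() {
    static const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    return static_cast<unsigned long>(
        std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - start).count());
}

// Hypothetical logging helper so shared code never touches Serial directly.
inline void LogLine(const char* text) { std::puts(text); }
#else
#include <Arduino.h>

inline void LogLine(const char* text) { Serial.println(text); }
#endif

// Shared code: compiles unchanged on both sides of the switch.
inline bool HasElapsed(unsigned long startMs, unsigned long windowMs) {
    return millis() - startMs >= windowMs;
}
```

The point of the design is that event and action classes only ever call the shared helpers, so the same source files build in Visual Studio for debugging and in the Arduino IDE for the real robot.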

Regards,

Bill

 

 


I am using the algorithm at the moment for a chatbot. Instead of actions, the robot randomly chooses topics. Based on your responses, the robot learns after a while which topics you want to talk about and which not so much.

Thanks for your interest, and thanks for describing the algorithm in an easily understandable, non-mathematical way, which is quite difficult for me.