Self-learning robot

This is a self-learning robot using reinforcement learning (Q-learning, Watkins 1989).

It learns to choose actions from the rewards it receives in each state; no other supporting controller (PID, adaptive system, ...) is required. For a small state space, a table can be used to store the Q values.
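A minimal sketch of the tabular case, assuming a small discrete state and action space. The sizes `kStates` and `kActions`, the names `QTable`, `best_action`, and `q_update`, and the parameter values are illustrative assumptions, not taken from the robot's firmware. It shows the standard Q-learning update rule applied to a plain table:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sizes, chosen only for illustration.
constexpr std::size_t kStates = 128;  // e.g. a single line-position reading
constexpr std::size_t kActions = 3;   // e.g. steer left, go straight, steer right

using QTable = std::array<std::array<float, kActions>, kStates>;

// Greedy choice: index of the action with the highest Q value in state s.
std::size_t best_action(const QTable& q, std::size_t s) {
    assert(s < kStates);
    const auto& row = q[s];
    return static_cast<std::size_t>(
        std::max_element(row.begin(), row.end()) - row.begin());
}

// One Q-learning step:
//   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
void q_update(QTable& q, std::size_t s, std::size_t a, float r,
              std::size_t s_next, float alpha, float gamma) {
    assert(s < kStates && a < kActions && s_next < kStates);
    const float best_next = q[s_next][best_action(q, s_next)];
    q[s][a] += alpha * (r + gamma * best_next - q[s][a]);
}
```

Calling `q_update(q, s, a, r, s_next, alpha, gamma)` after every step moves `Q(s, a)` toward the reward plus the discounted best value of the next state; in practice the greedy choice is usually mixed with some random exploration.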

For a large state space, some approximation is necessary; I am using an associative neural network. The state for the line-following test is the last three line positions, S(n) = (L(n), L(n-1), L(n-2)), where each L can take 128 possible values. A pure table would need 128^3 (about 2.1 million) entries, which is too much ^_^. With the associative neural network it is no problem.
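To make the table size concrete, here is a small sketch of the arithmetic. It assumes Q values are stored as `float` and that one value is kept per action; the names `state_index` and `table_bytes` and the action count are hypothetical and only serve the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kPositions = 128;  // possible line positions per reading
constexpr std::uint32_t kStateCount =
    kPositions * kPositions * kPositions;  // 128^3 distinct states

static_assert(kStateCount == 2097152, "128^3 states");

// Flatten S(n) = (L(n), L(n-1), L(n-2)) into a single table index.
std::uint32_t state_index(std::uint32_t l0, std::uint32_t l1, std::uint32_t l2) {
    assert(l0 < kPositions && l1 < kPositions && l2 < kPositions);
    return (l0 * kPositions + l1) * kPositions + l2;
}

// Bytes needed for a dense table holding one float per (state, action) pair.
constexpr std::size_t table_bytes(std::size_t actions) {
    return static_cast<std::size_t>(kStateCount) * actions * sizeof(float);
}
```

With 4-byte floats, `table_bytes(1)` is already 8 MiB for a single action, which is why a dense table does not fit on a typical small microcontroller and some form of function approximation is needed.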

 

The software is written in the Atom text editor and compiled with arm-none-eabi-g++. For flashing I am using dfu-util (the MCU has a USB bootloader in ROM). The whole program runs on the robot, and result data are transmitted via USART.

 


This is a companion discussion topic for the original entry at https://community.robotshop.com/robots/show/self-learning-robot

Congratulations!

 

Hey, I love this project! I think it’s a great one for introducing people to both AI and robotics, because the robot is a basic one with a simple application.

I see you use Ubuntu, right? Can you detail the software you use for programming the robot? Does all the software run on the robot or on the PC? Please explain the hardware configuration.

Regarding the neural part, which software do you use?

In some contests the lines cross each other, so the robot has to select the correct one. Do you handle that in the algorithm? I guess it’s probably not difficult.

The video is very good. A tip: create a detailed video for those who want to see the neural network in more depth, explaining in slow motion the different calculation steps of the neural network, maybe one or two iterations…

Finally, it would be great to have a comparison between a traditional PID control solution and the neural one, across different variables: resilience of the robot to variability in the environment, speed to complete the circuit, power consumption…

Anyway, a great project and I look forward to seeing more.


It is just a test of an associative neural network for storing Q values; a conventional PID is much better for this application.

The neural network works similarly to a Kohonen self-organizing map, with a few improvements (better weight initialization).
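For readers unfamiliar with Kohonen maps, here is a rough sketch of one textbook training step on a small 1-D map. This is the generic algorithm, not the network used on the robot: the improved weight initialization and the way Q values are attached to units are not described here, and `kDim`, `kUnits`, `best_matching_unit`, and `som_step` are names made up for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kDim = 3;    // input size, e.g. three line positions
constexpr std::size_t kUnits = 4;  // hypothetical number of map units

using Vec = std::array<float, kDim>;
using Map = std::array<Vec, kUnits>;

// Squared Euclidean distance between two vectors.
float dist2(const Vec& a, const Vec& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < kDim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Index of the unit whose weights are closest to the input x.
std::size_t best_matching_unit(const Map& map, const Vec& x) {
    std::size_t best = 0;
    for (std::size_t u = 1; u < kUnits; ++u) {
        if (dist2(map[u], x) < dist2(map[best], x)) best = u;
    }
    return best;
}

// One training step: pull the winner (full strength) and its direct
// neighbours on the chain (half strength) toward x. Returns the winner.
std::size_t som_step(Map& map, const Vec& x, float rate) {
    assert(rate >= 0.0f && rate <= 1.0f);
    const std::size_t win = best_matching_unit(map, x);
    for (std::size_t u = 0; u < kUnits; ++u) {
        const std::size_t gap = (u > win) ? u - win : win - u;
        if (gap > 1) continue;  // units outside the neighbourhood stay put
        const float h = (gap == 0) ? 1.0f : 0.5f;
        for (std::size_t i = 0; i < kDim; ++i) {
            map[u][i] += rate * h * (x[i] - map[u][i]);
        }
    }
    return win;
}
```

The fixed neighbourhood of one unit on each side is a simplification; a full self-organizing map usually shrinks both the neighbourhood and the learning rate over time.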

I am working on a presentation about the neural network I used, and maybe I will write a paper too :slight_smile:

Nice test

 

Well, if it’s just a test, it’s a nice one. I would like to see more details, as mentioned before. And I’m looking forward to your paper too :wink: