Reinforcement Learning for Four-Legged Robots

Hey there,

I am currently working on a project (SpotMicroAI) where I want to use reinforcement learning to make a four-legged robot walk (or maybe just support its motions with RL, not sure yet).

I like the idea of this paper and am trying to implement it with some modifications (different action/observation space, 12 servos instead of 8, and other things).

Some other cool Papers:

I was wondering if some of you are working on similar projects and want to share some experiences. Or maybe you find it interesting and just want to join the project :slight_smile: Go ahead, you’re very welcome :)

See you, guys

Hello @fwilk !
I hope you are well. I haven’t had the opportunity to work with four-legged robots and DRL, but I have implemented several control algorithms. For that, I used ROS and OpenAI. Also, depending on the simulator you use, OpenAI lets you create your own environments and then train agents in them with the main DRL algorithms:
OpenAI legged example
I am particularly interested in these topics, and I’d be glad if we could talk a little more about your ideas.
Hope that helps.
See you :grin:

Hey @RoboCS

Thanks for your reply! I was using ROS (Gazebo), too. Right now I am using PyBullet to verify the kinematics and control, because I get faster dev cycles there. Gazebo hates stops/starts so much :slight_smile: I have been working with ROS for a while now and I really like it. Most of the time :slight_smile:
Regarding the SpotMicroAI project, I want to stay with a slim custom implementation as long as it makes sense. And after working with PyBullet for 3-4 weeks now, I must say… I really like it.
And yes, OpenAI! I am currently building a Gym env for/with SpotMicroAI, but I have had no time to finish it yet :confused:

I have been exchanging emails about action and observation spaces with Erwin Coumans, who is one of the authors of Bullet3 and of one of the arXiv papers (see above). Very interesting, such a smart guy. I guess what is important here is to choose the right representation of the action space.

Right now I use a custom, fully parameterized kinematic motion function to control the legs. This function takes the current time and outputs the angles for all servos.
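A minimal sketch of what such a time-to-angles function could look like. This is not the actual SpotMicroAI code: the leg names, the trot phase offsets, the sinusoidal joint trajectories, and all amplitudes and center angles are assumptions chosen only for illustration.

```python
import math

# Hypothetical leg naming; 4 legs x 3 joints = 12 servos.
LEGS = ("front_left", "front_right", "rear_left", "rear_right")

# Assumed trot pattern: diagonal legs share a phase (offsets in gait cycles).
PHASE = {"front_left": 0.0, "rear_right": 0.0,
         "front_right": 0.5, "rear_left": 0.5}


def motion_function(t, period=1.0, hip_amp=0.0, thigh_amp=0.3, knee_amp=0.5,
                    thigh_center=0.6, knee_center=-1.2):
    """Map time t (seconds) to joint angles in radians, three per leg.

    Returns a dict: leg name -> (hip, thigh, knee).
    All default parameter values are illustrative, not measured.
    """
    angles = {}
    for leg in LEGS:
        phase = 2.0 * math.pi * (t / period + PHASE[leg])
        angles[leg] = (
            hip_amp * math.sin(phase),                   # hip abduction
            thigh_center + thigh_amp * math.sin(phase),  # thigh swing
            knee_center + knee_amp * math.cos(phase),    # knee flexion
        )
    return angles
```

Calling `motion_function(t)` once per control step and sending the 12 values to the servos (or to the simulated joints) would replay the fixed gait; the parameters are the knobs a learning algorithm could later tune.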
One idea would be to have this motion/time index as part of the observation space, together with the IMU readings (roll, pitch) and the ground distance. The action space could then be the roll/pitch/yaw/position of the “main body”, so RL could help the bot balance while it is doing the “static kinematic” stuff.
But I am not sure I like the idea, because the bot would not really “learn how to walk” but only how to “balance while it walks”.
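To make that idea a bit more concrete, here is a rough sketch of the observation and action vectors in plain Python. The ordering of the entries, the helper names, and the action limits are all assumptions; a real Gym env would declare the same bounds through its own space objects.

```python
# Hypothetical bounds for the body-pose action:
# (roll, pitch, yaw) in radians, (x, y, z) offsets in metres.
ACTION_LIMITS = (0.3, 0.3, 0.3, 0.03, 0.03, 0.03)


def make_observation(t, period, roll, pitch, ground_distance):
    """Build the observation: gait phase in [0, 1), IMU roll/pitch, height."""
    gait_phase = (t % period) / period  # the "time index" of the motion function
    return [gait_phase, roll, pitch, ground_distance]


def clip_action(action):
    """Clamp a raw policy output to the allowed body-pose offsets."""
    if len(action) != len(ACTION_LIMITS):
        raise ValueError("expected 6 values: roll, pitch, yaw, x, y, z")
    return [max(-lim, min(lim, a)) for a, lim in zip(action, ACTION_LIMITS)]
```

With this split, the policy only outputs six body-pose corrections per step, while the 12 servo angles still come from the kinematic motion function.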

Not sure…

Regarding the legged examples (OpenAI, PyBullet, etc.):
Yes, they are a good start. I already used PyBullet’s neural network 3D walkers (see PyBullet example) as a template for SpotMicroAI. But all this requires a lot of computation power and/or time.

And I really like the idea of “Using Physics as Model Prior for Deep Learning” as in:

So not just doing “random training” and waiting for “any nice results to come up”, but using physics (the kinematic model of body and legs) to give the network some guidance.

What do you think?

Hello @fwilk !

The action space could then be the roll/pitch/yaw/position of the “main body”, so RL could help the bot balance while it is doing the “static kinematic” stuff.

Yeah, I actually think that’s the hardest part: modeling the action space. I think you could start by trying some “easy” behavior, something like: lower the head, move one leg forward or backward, etc.

The other thing to think about is how to design the simulation so that it knows when and how the goal is achieved, and when to assign a reward or not. I will try to look into this type of robot and how DRL is currently applied to body movements.

Hey @RoboCS,

Very cool. Looking forward to seeing what you find out.
Regarding the rewards, one solution might be to give rewards when

  • pitch+roll < 10° / 20° …
  • distance to the ground is between x and y (makes no sense to try crawling or walking “high up in the air”)
  • time without crash

Then just let it walk using the motion function and let the DRL control the body pose.
I hope I will find some time on the weekend for a first try. I THINK this could produce nice results… but not sure :slight_smile:
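
The reward ideas from the list above could be combined roughly like this. It is only a sketch: the thresholds, the height band, the bonus values, the crash penalty, and the reading of “pitch+roll” as the sum of absolute angles are all assumptions to be tuned.

```python
import math


def step_reward(roll, pitch, ground_distance, crashed,
                max_tilt_deg=10.0, min_height=0.12, max_height=0.22,
                alive_bonus=0.1):
    """Per-step reward from IMU roll/pitch (radians) and body height (metres).

    All default values are placeholders, not measured SpotMicroAI numbers.
    """
    if crashed:
        return -1.0  # assumed penalty; the episode would end here

    reward = alive_bonus  # "time without crash": small bonus for every step

    # Reward a level body: |roll| + |pitch| below the tilt threshold.
    tilt_deg = abs(math.degrees(roll)) + abs(math.degrees(pitch))
    if tilt_deg < max_tilt_deg:
        reward += 1.0

    # Reward a sensible body height (neither crawling nor "up in the air").
    if min_height <= ground_distance <= max_height:
        reward += 1.0

    return reward
```

Summing this value over an episode rewards long, level, correctly elevated walks, which matches the three bullet points.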

Here is a YouTube video of the current state without RL:


Hey @fwilk !
That looks pretty nice already! Did you try to run or test that simulation on your own robot?

Hey @RoboCS

You mean the physical bot? Yes, I tested the kinematics before and it looked pretty cool. But I have not tested the gait yet. I will do that after I have reassembled the bot. I am currently measuring the weights of the parts for more realism in the sim.


pretty cool dance :wink:
