CCSR

byerley · October 5, 2014, 2:23am

Updates Feb 15th '14

Inspired by MarkusB and discussions with mtriplett and DT, added basic emotion simulation and mapping of emotional state on facial and verbal expressions. Positive verbal conversations and doing tasks will increase CCSR's happiness and arousal. An adverse environment and negative conversations (e.g. insults, disapproval) will decrease happiness. Happiness and arousal will naturally degrade over time, simulating sleep and boredom.
Added 2 8x8 matrix displays for facial expression emulation. Eye movements and shapes are updated based on current emotional state.
Added 10mm RGB LED representing CCSR's emotional state. The LED dynamically updates an HSV color value: the 'Hue' is proportional to the current 'happiness' state of CCSR, and the 'Value' is proportional to the current state of 'arousal'.
Added continuous voice-recognition capabilities. CCSR now autonomously listens and reacts to valid verbal communication.

CCSR (pronounced 'Caesar') is a prototype robot to play around with and learn about the main fields of robotics such as computer vision, sensors and actuators, navigation, natural language processing, artificial intelligence and machine learning. All previous CCSR updates, as well as older videos, are tracked on this blog. The source code of the main CCSR application and the Natural Language Processing (NLP) module (nlpxCCSR) can be found on GitHub.

CCSR can understand basic human language, and is integrated with a web-based knowledge engine (Wolfram Alpha), so it can answer questions on various topics. CCSR strives to be a social robot by modeling and expressing basic emotions through speech and facial expressions such as eye shapes and movements and an HSV color representing real-time emotional state.

CCSR is based on a Minnowboard SBC using an Intel Atom processor. It runs Angstrom Linux using the Yocto kernel, and a set of packages such as OpenCV for computer vision, espeak for speech synthesis and mjpeg-streamer for video streaming over IP. Compiling Angstrom for Minnowboard to add desired kernel modules such as wifi is described in detail here. A gyro, accelerometer and compass are used for basic navigation, and a current/voltage sensor keeps track of stalled motors and low battery. IR sensors are used for basic obstacle avoidance, and a sonar range finder is used to map the environment and to provide depth to visual observations.

There's an ambient light sensor, as well as temperature and pressure sensors. USB wifi on the Minnowboard allows telemetry and remote control simply by ssh (e.g. PuTTY) from any other computer; I use a simple command-line interface to interact with the robot process running on the Minnowboard. I use Linux pthreads to model basic independent 'brain' processes such as vision, navigation, hearing, motor and speech. I used this great OpenCV example to implement object tracking using color separation. There's an LCD display and keyboard running through some menu options for control and status.
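To illustrate the idea of independent 'brain' processes, here is a minimal sketch using Python's threading module rather than the Linux pthreads the robot actually uses. Every name in it (SharedState, run_module, run_brain) is hypothetical, and each module just publishes a tick counter where a real module would read a sensor or drive an actuator.

```python
import threading
import time


class SharedState:
    """State shared between 'brain' threads, guarded by a lock (hypothetical)."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.readings = {}              # latest value published by each module
        self.stop = threading.Event()   # set to ask every module to exit


def run_module(name: str, state: SharedState, period_s: float) -> None:
    """Generic module loop: publish a tick counter until asked to stop."""
    ticks = 0
    while True:
        ticks += 1
        with state.lock:
            state.readings[name] = ticks
        # wait() returns True as soon as the stop flag is set
        if state.stop.wait(period_s):
            break


def run_brain(module_names, duration_s: float = 0.05) -> dict:
    """Start one thread per module, let them run briefly, then collect results."""
    state = SharedState()
    threads = [
        threading.Thread(target=run_module, args=(name, state, 0.005))
        for name in module_names
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration_s)
    state.stop.set()
    for thread in threads:
        thread.join()
    with state.lock:
        return dict(state.readings)
```

Calling run_brain(["vision", "navigation", "hearing"]) returns one counter per module. The point of the sketch is only that each loop runs at its own pace and touches shared state under a lock.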

CCSR has a 4-DOF (degree of freedom) robotic arm, built with Hitec servos, Lynxmotion servo brackets and a 'Little Grip Kit'. The shoulder servo is an HS-645MG; the elbow, wrist and hand servos are HS-422s. All servos are driven by an Adafruit I2C Servo Controller. CCSR can use the arm to pick up small objects.
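As a rough sketch of what driving a hobby servo through a PWM controller involves, the snippet below converts a joint angle into a tick count within one PWM frame. It assumes a 12-bit controller (4096 ticks per frame, as on PCA9685-based boards), a 50 Hz frame rate and a 1.0 to 2.0 ms pulse over 0 to 180 degrees. None of these values come from the post, and real servos usually need per-servo calibration.

```python
# Hypothetical sketch: servo angle -> PWM tick count for a 12-bit controller.
# All constants below are assumptions, not values taken from CCSR.

PWM_FREQ_HZ = 50          # assumed servo frame rate
TICKS_PER_FRAME = 4096    # 12-bit resolution (assumed PCA9685-style controller)
MIN_PULSE_MS = 1.0        # assumed pulse width at 0 degrees
MAX_PULSE_MS = 2.0        # assumed pulse width at 180 degrees


def angle_to_ticks(angle_deg: float) -> int:
    """Map an angle in [0, 180] degrees to the pulse length in controller ticks."""
    angle = max(0.0, min(180.0, angle_deg))   # clamp out-of-range requests
    pulse_ms = MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * angle / 180.0
    frame_ms = 1000.0 / PWM_FREQ_HZ           # 20 ms per frame at 50 Hz
    return round(pulse_ms / frame_ms * TICKS_PER_FRAME)
```

With these assumed constants, the mid position (90 degrees) is a 1.5 ms pulse, which is 1.5 / 20 of the 4096-tick frame.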

CCSR is built on a Dagu Rover 5 chassis and a Polycase project box. Most of the sensors are super-convenient Adafruit breakout boards; various other components come from Pololu, Robotshop, Lynxmotion and other places. Details on the components used in CCSR can be found in the diagram below.

Voice recognition capabilities are added using the Google Speech to Text API. A custom Natural Language Processing (NLP) module (nlpxCCSR) in Python gives CCSR some basic capabilities to interact with human language, kind of like Siri, as well as some basic Machine Learning. nlpxCCSR is based on pattern.en from CLiPS, a very cool Python package for sentence analysis. Pattern.en can do POS tagging, chunking, parsing, etc., but also contains stuff like verb conjugation and pluralization, and includes WordNet, a lexical database you can query for word definitions and other useful info.

To interact with human speech, CCSR records a sentence using ALSA and writes it to disk as a .wav file using libsndfile. A bash script posts this file to the Google speech2text API using curl. Google returns a .json file containing (several guesses of) the text. This text is passed to nlpxCCSR, which interprets the sentence and synthesizes a response. The response is passed back to the main CCSR process, which uses espeak to vocalize it. nlpxCCSR can generate a purely verbal answer to voice input (e.g. 'how are you' or 'what is the weather today'), or it can generate a CCSR action if the speech is interpreted as a robot command (e.g. 'turn 180 degrees' or 'pick up blue object'). nlpxCCSR maintains an internal memory and does a simple form of Machine Learning by storing properties that it learns about (e.g. when interpreting 'the cat is in the garden', nlpxCCSR will memorize that the concept 'cat' is located 'in' the 'garden'), and it will try to answer queries based on its own knowledge. If it is unable to do so, it passes the full query to the WolframAlpha API (a cloud service) and passes the most appropriate 'pod' (Wolfram answer) back to the CCSR process.
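The "answer from memory first, otherwise ask an external engine" behaviour can be sketched as follows. This is not the nlpxCCSR code: the real module parses sentences with pattern.en, while this toy version only matches a fixed word order. The class name SimpleMemory and the fallback callable are hypothetical; the callable stands in for the external knowledge engine.

```python
from typing import Callable, Dict, Tuple


class SimpleMemory:
    """Toy fact store that remembers where concepts are located (hypothetical)."""

    def __init__(self, fallback: Callable[[str], str]) -> None:
        self._locations: Dict[str, Tuple[str, str]] = {}
        self._fallback = fallback   # stand-in for an external knowledge engine

    def learn(self, sentence: str) -> bool:
        """Store statements shaped like 'the <concept> is <preposition> the <place>'."""
        words = sentence.lower().strip(" .").split()
        if (len(words) == 6 and words[0] == "the"
                and words[2] == "is" and words[4] == "the"):
            self._locations[words[1]] = (words[3], words[5])
            return True
        return False

    def answer(self, question: str) -> str:
        """Answer 'where is the <concept>' from memory, else defer to the fallback."""
        words = question.lower().strip(" ?").split()
        if len(words) == 4 and words[:3] == ["where", "is", "the"]:
            known = self._locations.get(words[3])
            if known is not None:
                preposition, place = known
                return f"the {words[3]} is {preposition} the {place}"
        return self._fallback(question)
```

After learn("The cat is in the garden"), a question about the cat is answered locally, while a question about anything unknown is handed unchanged to the fallback.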

CCSR models basic emotions using a 2D happiness/arousal space based on this work. CCSR expresses emotional state through facial expressions (eye shape, 'nose' color, head movement, etc) and verbal expressions.

Eye movements and shapes (sad, angry, excited, etc.) are dynamically updated based on current emotional state. An RGB LED (its 'nose') is driven by PWM signals from the I2C servo controller and dynamically updates an HSV color value: the 'Hue' is proportional to the current 'happiness' state of CCSR, and the 'Value' is proportional to the current state of 'arousal'. Positive events such as successfully completing tasks and receiving encouraging verbal feedback will increase CCSR's happiness and arousal. An adverse environment and negative verbal statements (e.g. insults, disapproval) will decrease happiness. Happiness and arousal will naturally degrade over time, simulating sleep, boredom, and an innate urge to initiate activities to increase happiness/arousal.
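A minimal sketch of such a happiness/arousal model and its mapping to an LED color might look like this. The 0 to 1 value range, the linear decay, the decay rate and the red-to-green hue range are all assumptions made for illustration; the post does not specify them, and the names EmotionState, apply_event, decay and nose_rgb are hypothetical.

```python
import colorsys
from dataclasses import dataclass


def _clamp(x: float) -> float:
    """Keep a value inside the assumed [0, 1] range."""
    return max(0.0, min(1.0, x))


@dataclass
class EmotionState:
    happiness: float = 0.5   # assumed scale: 0.0 = unhappy, 1.0 = happy
    arousal: float = 0.5     # assumed scale: 0.0 = asleep, 1.0 = excited

    def apply_event(self, d_happiness: float, d_arousal: float = 0.0) -> None:
        """Positive deltas model praise or a finished task, negative ones insults."""
        self.happiness = _clamp(self.happiness + d_happiness)
        self.arousal = _clamp(self.arousal + d_arousal)

    def decay(self, dt_s: float, rate_per_s: float = 0.01) -> None:
        """Assumed linear decay of both values over dt_s seconds."""
        self.apply_event(-rate_per_s * dt_s, -rate_per_s * dt_s)

    def nose_rgb(self) -> tuple:
        """Hue follows happiness, value follows arousal; returns 8-bit RGB."""
        hue = self.happiness * (120.0 / 360.0)   # assumed: red (sad) to green (happy)
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, self.arousal)
        return tuple(round(c * 255) for c in (r, g, b))
```

In this sketch a fully aroused but unhappy robot shows a bright red nose, and the LED goes dark as arousal decays to zero, whatever the happiness value is.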

A detailed overview of the CCSR software architecture is shown in the picture below:

The CCSR main board is shown in the picture below. I used Adafruit perma-proto PCBs for the custom electronics.

A simple Linux-based robot platform to learn about AI and ML

This is a companion discussion topic for the original entry at https://community.robotshop.com/robots/show/ccsr

silux · October 7, 2014, 1:05pm

So interesting!

How about adding an arm?

tomasp · October 7, 2014, 3:49pm

Dagu Rover has 4 motors, and you are using 2 motor drivers. Do you run the two motors on each side in parallel on a single driver?

dangerousthing · October 7, 2014, 5:12pm

Neat!
The Minnowboard Max looks very nice. They seem to be on pre-order for now though, and I can’t find a page with the differences between the two boards.

Question: what are you using for understanding human speech? I haven’t been able to experiment due to a lack of microphone which I hope to fix shortly.

byerley · October 8, 2014, 12:39am

Definitely, arm is planned!

byerley · October 8, 2014, 12:44am

I actually purchased a Dagu Rover 5 with only 2 motors and no encoders; Pololu sells them in different versions. So I drive only a single motor on each side. This seems fine for tracks, although I am finding out that turning in place on a high-traction surface is getting increasingly difficult as robot weight increases.

byerley · October 8, 2014, 1:10am

The Max is shipping, I hear. Also, Intel is shipping the Intel Edison now. I’m super excited to start messing around with that too: a full Linux PC the size of a stamp. I haven’t tried speech recognition yet. I was planning on something like PocketSphinx, which looks pretty promising, but perhaps a much more powerful option is to offload it to a server, maybe Google speech. Let me know if you make progress, I’m still stuck in OpenCV…

webmaster · October 8, 2014, 2:27am

It’d be awesome if you would edit your post and fill out some more fields. It looks so terribly blank on our front page. Thanks!

dangerousthing · October 8, 2014, 2:54pm

I just ordered an Edison yesterday from Adafruit - they still had a few and SparkFun were out. Maybe next month I can try for a Minnowboard.

navic · October 11, 2014, 8:18am

Excellent work!

The functionality and control are clearly well done but the build itself is just breathtaking! Thanks for sharing.

byerley · October 13, 2014, 1:12pm

Sweet! How’s the Edison working out? I’m really curious how much compute power it has compared to bigger SBCs such as the Raspberry Pi and Minnowboard.

dangerousthing · October 17, 2014, 5:36pm

Sorry, but RL has taken too much of my time, so I haven’t done more than look at it for now. I have to figure out how to put headers on the bottom of the miniature base board so I can access the I/O pins.

mtriplett · November 18, 2014, 8:43pm

Very Impressive Work

Very impressive work, looks like you’ve been making huge progress. I hope you keep posting updates. That pattern.en from CLiPS looks really cool, and seems to go far beyond anything I have found for .NET for NLP.

I’ve been looking over some of your modules, learning what I can, not being a python guy. Great stuff. We are definitely pursuing a lot of the same goals. I find an ocean of possibilities opens up once a bot gets verbal. Dealing with memories is another ocean too.

Can’t wait to see where you take it. Thanks for the mention.

Regards,

Martin

byerley · November 19, 2014, 11:20am

Thanks Martin. Anna has raised the bar very high; so much cool stuff to explore. To me one fascinating thing about this tinkering with AI is that you get to realize how incredibly complicated the human brain really is, as I am struggling to make a machine understand even basic concepts of grammar or shapes.

byerley · January 2, 2015, 5:32pm

You ask, we deliver

markusb · February 17, 2015, 2:12am

Very nice, byerley. This made my day.

roxanna77 · February 17, 2015, 7:31pm

this is amazing!

He is so polite! Really, this is amazing work, very well done!

mtriplett · February 17, 2015, 8:24pm

Nice video…Caesar has skills.

You have been a very busy person with the face, arm, and emotions! Well done. Awesome stuff!

I really like the hand/eye coordination tasks. Sweet.

rpopeye · May 2, 2015, 7:22am

This is a great, complex robot project! I especially liked the detailed description and block diagrams. About the voice interface, have you considered using Apple’s Siri, Microsoft’s Cortana or Google Now? I am not even sure if these can be freely used. Again, great job!

byerley · May 8, 2015, 4:32pm

Thanks! Yes, espeak is not that great, or at least I haven’t been able to configure it to be very clear, so I am planning to use festival instead. That package can be compiled on Linux Yocto as far as I know. I’ll post an update if I get it to work!