MyRobotLab - Template Matching - Bot finds teet

numnum3.png

Logitech C600 Camera - PanTilt Rig with $16 BBB Arduino Clone - not super fancy but it works…

 


Template Matching is now available through MRL.

Template Matching is a process of matching a small sub-image within a larger global image.  As an exercise I chose a wall socket, since this could be the goal for a self-charging robot.
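For reference, this is roughly what the matching step looks like in OpenCV's Python bindings (a minimal sketch; the file names are just placeholders):

```python
import cv2

# Load the full camera frame and the small template (e.g. a crop of the wall socket).
# The file names here are placeholders.
scene = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("socket_template.png", cv2.IMREAD_GRAYSCALE)

# Slide the template across the scene and score every position.
result = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)

# The location with the highest score is the best match;
# max_val is the match quality (1.0 = perfect).
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
h, w = template.shape
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
print("best match score:", max_val, "at", top_left)
```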

When the target is locked and centered, an event will fire off.  If this were a mobile platform and the goal were to mate with the socket, the next behavior would be to move closer to the socket and avoid obstacles.  Since this is not a mobile platform, I have chosen to send the event to a Text To Speech service with the appropriate verbiage.

The interface for creating a template can be programmed with coordinate numbers, or the template can be selected through the video feed.  To select a new template, the MatchTemplate filter should be highlighted; then simply click the top-left and bottom-right corners of the new template's rectangle.  You will see the template image become visible in the Photo Reel section of the OpenCV GUI.

Currently, I am using the Face Tracking service in MRL.  The Face Tracking service will soon be decomposed into a more generalized Tracking service, which can be used to track a target with any sort of appropriate sensor data.  Previously I found tracking problematic: the pan/tilt platform would bounce back and forth and overcompensate (hysteresis).  The lag which video processing incurs makes tracking difficult.  In an attempt to compensate for this issue, I have recently added a PID controller to the Tracking service, and have been very pleased with the results.  The tracking bounces around much less, although there is still room for improvement.

PID is a method (and art form) which allows error correction in complex systems.  Initially a set of values must be chosen for the specific system.  There are 3 major values:

  • Kp = Proportional constant - dependent on present errors
  • Ki = Integral constant - the accumulation of past errors
  • Kd = Derivative constant - attempts to predict future errors
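To illustrate (this is not the actual MRL Tracking/PID code, and the gains below are made up), a bare-bones PID correction for one axis looks like this:

```python
class PID:
    """Minimal PID controller - illustrative only."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.last_error = 0.0

    def compute(self, error, dt):
        p = self.kp * error                            # present error
        self.integral += error * dt
        i = self.ki * self.integral                    # accumulation of past errors
        d = self.kd * (error - self.last_error) / dt   # predicted future error
        self.last_error = error
        return p + i + d

# error = how far the target is from the center of the frame, in pixels.
# These gains are invented; real values must be tuned for the pan/tilt rig.
pan_pid = PID(kp=0.03, ki=0.001, kd=0.01)
correction = pan_pid.compute(error=40, dt=0.1)   # target 40 px right of center
# new_pan_position = current_pan_position + correction
```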

The video will show the initial setup.  This involves connecting an Arduino to a COM port, then connecting 2 servos to the Arduino (1 for pan & another for tilt).  After this is done, I begin selecting different templates to match as the test continues.  The template match value in the upper-left corner represents the quality of the match.

The states which can occur:

  • "I found my num num" - first time tracking
  • "I got my num num" - lock and centered on target
  • "I wan’t dat" - not centered - but tracking to target
  • "boo who who - where is my num num?" - lost tracking

More to Come

In the video you can see that when I switch off the lights the lock is lost.  Template matching is sensitive to scale, rotation, and light changes.  Haar object detection is more robust and less sensitive to scale and lighting changes.  The next step will be taking the template and proceeding with Haar training.

Kinect & Bag Of Words - http://en.wikipedia.org/wiki/Bag_of_words_model_in_computer_vision association


Update 2011.08.08

I was on vacation for a week; however, when I got back I wanted to make sure the latest release (MRL 13) was cross-platform compatible.
I had some problems with the Fedora Core 15 / GNOME 3 desktop / Java & OpenCV combination.
FC15 can install OpenCV version 2.2 from its repositories, but 2.3 is available for download.
I removed the 2.2 version and did a clean install of MRL 13.

The desktop is still acting a little “goofy”, but after:

  • copying *.so’s from the bin directory to /usr/lib (blech) 
  • ldconfig
  • loading the OpenCV service - and refreshing the GUI screen (blech)
  • using the Gray/PyramidDown/MatchTemplate filters (the equivalent script setup is sketched below)

I got template matching on the MAAHR brain going at 78 ms per frame running with debug logging (almost 13 frames per second!).
It says 93 ms in the screenshot because the screen capture slows the process down.

Screenshot-myrobotlab_-_gui.png
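The same Gray/PyramidDown/MatchTemplate chain can be built from an MRL Python script along these lines - addFilter(name, type) is how I recall the OpenCV service call, and it may differ between MRL versions:

```python
from org.myrobotlab.service import Runtime

# Start the OpenCV service and add the filter chain used above.
# addFilter(name, type) is an assumption about the service API for this MRL version.
opencv = Runtime.createAndStart("opencv", "OpenCV")
opencv.addFilter("gray", "Gray")
opencv.addFilter("pyramidDown", "PyramidDown")
opencv.addFilter("matchTemplate", "MatchTemplate")
opencv.capture()
```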

 

MAAHR is currently running on a 12 V SLA battery.
The CPU and power supply are cool.
None of the CPUs are over 60%, and this should drop off significantly if the video feed were not being displayed.
MRL-13 has now been tested on Windows XP & 7 and Fedora Core 13 & 15.  The next version should come with OpenCV binaries compatible with an ARM7, although Ro-Bot-X’s Chumby appears down for some reason…

 

 

https://www.youtube.com/watch?v=UMXrk6EVWfI

Video made my morning

chilled


Nice and fun-num-num!

Grog, I love myrobotlab, but I haven’t figured out how to get two to communicate - could you point me to a clear tutorial, please?

Two of which?

Hi An-Tech,

I’d be glad to help, but I need more info.

Are you trying to connect…

Two Servos?
Two MyRobotLab instances running on separate computers? 
Two Services talking to one another? 

All I want to do is convert text into binary commands and send them to my robot from wherever I am (through a MyRobotLab serial connection).


Sorry about the double post :slight_smile:

Who where what?

What / where is the source of the text?
What does the text represent?
What is expected to happen when the Arduino receives it?


 1. The text is from my keyboard.

 2. The text will be a general command to the IRIS System. Anything from “Anthony’s room - Lights - 75%” to “Butler Bot - Deliver: drink 1 - Kitchen”.

 3. Not Arduino, prop. or picaxe (which do you think is easier to communicate with from MyRobotLab?). When it gets the binary message, it will execute the command.

Ok… A few answers and a few more questions

1. Hmmm… MRL does have keyboard input, but only in the context of previous robots… where the keyboard was used for remote control.  I could make a keyboard service easily enough…

2. Ok, does this IRIS System already exist?  And can it do these things?  e.g. Lights 75%, Butler Bot Deliver drink 1 - Kitchen?

3. Not Arduino, Prop or Picaxe?  You currently have something listening on a serial port?  Arduino is currently supported… in that it has its own GUI in MRL.  MRL currently connects through the serial port to control an Arduino.  This could be done for anything which listens to the serial port.

  1. Sounds good

  2. The Iris system will be a Prop that I am ordering. It will use XBee, IR, cables, and RF to send commands to various things. It just needs to receive a command.

  3. Sorry about that, it was a little misleading. I meant: not an Arduino - either a Prop or a Picaxe. Also, in your opinion, do you think it would be easier to add the computer to the communications cog (which would require adding an extra ID bit to the command and involve waiting until the board is ready, i.e. not already receiving), or to have an 08M monitor the port and take care of all that?

  4. Thanks a lot for the help btw :slight_smile:

Poke

@An-Tech

A simple system that I have used quite a bit is a letter/number combination. For example, you could encode direction and speed as F100 (forward at 100%) or R100 (reverse at 100%). Maybe L90 for left 90 degrees. I also use P5 for “play track number 5” on the mp3 player. This simple system has worked flawlessly for a lot of my projects. Just a thought.
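A tiny sketch of parsing that kind of command on the receiving side (illustrative only):

```python
def parse_command(cmd):
    """Split a command like 'F100', 'R100', 'L90' or 'P5' into (letter, number)."""
    letter = cmd[0].upper()
    number = int(cmd[1:])
    return letter, number

print(parse_command("F100"))   # ('F', 100) -> forward at 100%
print(parse_command("P5"))     # ('P', 5)   -> play track number 5
```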

Now, the real question is: Does Butler Bot have any wheels attached yet, or are we still just “staying alive” and blinking LEDs?


So, the user can select a box in the image and MRL can track it. How can we store several selections? How can we make the selection automated? I mean, I want the robot to be able to find a wall socket, a cup, a beer, a flower, a toy, whatever. To do that, it has to have pre-stored selections that it can try to match with the current video image. If it knows what to look for, it compares just one selection to the image. But if it needs to recognise some object, it needs to be able to make the selection of the object in the image by itself, then compare it to the pre-stored selections. If it can’t find it, then it can ask for a name for that object and store it as a new object. Do you already have a service (function) for this?

Ouch … :slight_smile:

Yeah… gettin’ there… I had to do a bunch of upgrades…  OpenCV came out with a new version of its software, and I’m dealing with broken parts…  I have to get a new mode working for the Kinect too (interleaved - where one frame comes in as depth and the next comes in as the image); this will make for great object isolation, so things in range can be identified…

The keyboard input is easy squeezy, but it would be helpful if you had the hardware set up first…  Maybe string together what you have so we can start testing and developing what you want…

Small steps/changes are usually more productive… 

Good suggestion CTC

I’d like to make a GUI in MRL where you can dynamically create and bind buttons to Strings… I’m guessing you’re doing that with your Android project?  When are you going to start coding for MRL again :slight_smile:

Yep - all good questions…

It’s headed that way…
The next simple (almost trivial) step is to store more than one selection…  I’ll be doing that shortly…

You and I are on the same track …  we know what we want as an end goal…

Here are some things to be aware of though:
Template matching works pretty well if the image’s lighting conditions are the same and the scale is the same…
This “matching” is pretty delicate…  To get more meaningful matching we’ll need to do Haar training / SURF.  What this does is basically de-construct the image into a data “map”.  Imagine taking several sheets of stretchy tracing paper, overlaying the image of interest with them, and tracing it.  On one page you trace the most “obvious” form… and on subsequent pages you fill in more details.

This process is called Haar training, and its results (the sheets of tracing paper) are referred to as Haar cascades.  They are more robust than template matching - since they are traces, they are much less sensitive to light changes…  And since they are done on stretchy paper, you can scale them quickly to attempt to match objects at different ranges…
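For a feel of how a trained cascade is used, here is a short example with OpenCV’s Python bindings - the cascade and image file names are placeholders:

```python
import cv2

# Load a trained Haar cascade (the XML produced by Haar training).
cascade = cv2.CascadeClassifier("socket_cascade.xml")

frame = cv2.imread("frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# detectMultiScale scans the image at several scales, which is why cascades
# tolerate range/scale changes better than a fixed template.
matches = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in matches:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```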

In some ways it’s like getting closer to the “essence” of an object - which is closer to how we objectify things…

When we think of a “table” we build a meta-table image in our head - although it can be paired with a specific table image, the construct or essence of table is universal.  We can save the image and bind it to the construct, but the generalized table is what we find useful to communicate with.

I’m pretty close to this type of thing…  I got a Kinect working with MRL now…

Object isolation is key in object identification…  You’ve got to first isolate the thing you’re interested in.  You can do that with a Kinect since it gives you range info - you just say you’re interested in objects from 30 cm to 60 cm in range.  Qbo does isolation using stereo cameras - it finds “the nearest thing”.
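As a sketch, depth-range isolation is just a threshold on the depth map (assuming depth values in millimetres, as the Kinect reports them):

```python
import numpy as np

def isolate_by_range(depth_mm, near=300, far=600):
    """Mask of pixels whose depth falls between 30 cm and 60 cm.

    depth_mm is a 2-D array of Kinect depth readings in millimetres; the mask
    can then be used to cut the object of interest out of the RGB frame.
    """
    return (depth_mm >= near) & (depth_mm <= far)

# Example with a fake 2x3 depth map (values in mm)
depth = np.array([[250, 450, 900],
                  [500, 320, 700]])
print(isolate_by_range(depth))
# [[False  True False]
#  [ True  True False]]
```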

Predator - does isolation by tracking the object and isolating an arbitrary size around the tracking point.

I like the idea of user selection providing some of the information on how to isolate objects.  It is pretty convenient to look through the robot’s eyes and point/click what you want it to learn.  The really exciting part is the idea that the robot’s vision can be streamed to the internet, and many users could be point/click/teaching the robot what to learn - the processing of these images into Haar cascades is pretty intensive, but could be done by more than one computer.  The idea is that the robot would then have access to a large online library of work done by people to help identify important "objects".

I’d like to start such a library… and I’m making small steps to that end goal…

 

Actually Chris, I think I have a pretty good system: each robot has an ID byte and a pre-programmed list of commands. If a robot is listening and hears its ID, it will listen for the command. This gives me 256 reusable commands for each robot.
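A minimal sketch of framing such a message as an ID byte followed by a command byte - the IDs and command numbers below are made up:

```python
# Illustrative only - robot IDs and command numbers are invented.
ROBOT_IDS = {"butler_bot": 0x01, "iris_lights": 0x02}
COMMANDS  = {"deliver_drink_1": 0x10, "lights_75_percent": 0x4B}

def build_packet(robot, command):
    """Return the two-byte message: ID byte, then command byte."""
    return bytes([ROBOT_IDS[robot], COMMANDS[command]])

packet = build_packet("butler_bot", "deliver_drink_1")
print(packet.hex())   # '0110' -> ID 0x01, command 0x10
```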

I think I will redo Butler Bot in the near future. I will make it out of more lightweight materials and add a mini fridge on the bottom, and a pump going to the top, which will have glasses. Also, I may add a small “snack bar” in the middle.


I know the Kinect is the best thing to use, but I don’t really want to add a Kinect to a Chumby-based robot.

What if we add an IR sensor and an ultrasonic (US) sensor to the webcam to give it range info? The microcontroller that pans/tilts the camera can also read the range sensors.

Or, use a projected laser line and use the height of the line in the image to detect the closest object? 

Which of these alternatives would be better/easier to use with MRL?