Scout - 3D Vision, SLAM, NLP, Neural Nets

Posted on 20/05/2021 by mtriplett

Scout is a robot built to test 3D vision, SLAM, NLP, neural nets, mapping, pathfinding, and related features. I am using this bot as a test bed for features meant for another, larger bot of mine (Ava v2). My goal is to perfect the ability to move intelligently around my house from any point to any other point the robot can reach. I also intend to perfect the 3D perception system and a new spatial 3D memory for everything the bot sees. Scout fuses data from multiple sensors and neural nets to perform these functions. It can be controlled via voice remote, web page, or game controller, and it also has an autonomous mode.

See video and pics below.

I incorporated the following off-the-shelf neural nets into the brain of this robot. Most of these models run in Python on a separate laptop with a graphics card and are accessed through a custom-built Flask API.

Verbal Models

  1. NLP DialoGPT Model (Microsoft)
  2. NLP Text Generation Model
  3. NLP Sentiment Detection Model
  4. NLP Masking Model (Transformer)
  5. NLP Question Answering Model (Transformer)
  6. NLP Entity Recognition Model

Vision Models

  1. YOLO v3 DarkNet Model
  2. AlexNet Model
  3. Gender/Age Detection Model
  4. Face Detection Model
  5. Emotion Detection Model

I incorporated the following libraries and algorithms into the software. This list covers only the major ones.

  1. spaCy NLP Library
  2. OpenCV Vision - Various algos for color, shape detection, etc.
  3. A* Pathfinding (in Python)
  4. 2D Occupancy Grid Map (in Python)
  5. 3D Memory System (in Python)
  6. Various custom-built NLP Algos
  7. Various Graph Algos
  8. Fast Fourier Transforms (for audio spectrum analyzer)
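
As a rough illustration of how items 3 and 4 above can fit together, here is a minimal A* sketch over a 2D occupancy grid. This is not Scout's actual code. The grid encoding (0 = free, 1 = occupied), the 4-connected moves with unit cost, and the names `astar` and `demo_grid` are all assumptions made for the example.

```python
import heapq


def astar(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None.

    Assumed encoding: grid is a list of rows, 0 = free, 1 = occupied.
    Moves are 4-connected (up, down, left, right) with a cost of 1 each.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (f, g, cell)
    came_from = {}
    best_cost = {start: 0}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None  # goal is unreachable


# Hypothetical 4x4 map: a wall with one gap separates the top row from the rest.
demo_grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
demo_path = astar(demo_grid, (0, 0), (3, 0))
```

On a real robot the grid cells would come from the depth and sonar data, and diagonal moves or non-uniform costs might be preferable; the sketch keeps the simplest variant.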

This bot learns on its own whenever it encounters a new word. For this, I use the following data sources.

  1. Word/Thesaurus API
  2. Wikipedia Text and API
  3. RDF Triple Sources (DBpedia and others on the Linked Open Data Web)
  4. Wolfram Alpha API
  5. ConceptNet
  6. GeoNames database
  7. Some custom-built SQL Server databases
  8. Weather API



A black and white image of a depth stream:

A 2D map created from the depth info but with a force field applied around all objects.  This is used for pathfinding.

A portion of a 2D map...before the force field is applied.
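
The difference between the two maps described above resembles what is commonly called obstacle inflation: every occupied cell is grown outward so the planner keeps the robot's body clear of walls and furniture. The sketch below is one simple way to do that, not Scout's actual method. The binary grid encoding (0 = free, 1 = occupied), the square growth pattern, and the names `inflate` and `radius` are assumptions; the real force field may well use graded costs instead of a hard boundary.

```python
def inflate(grid, radius):
    """Return a copy of grid with every occupied cell grown by `radius` cells.

    Assumed encoding: 0 = free, 1 = occupied. Growth is square (all cells
    within `radius` rows and `radius` columns of an obstacle are marked).
    """
    rows, cols = len(grid), len(grid[0])
    inflated = [row[:] for row in grid]  # leave the input map untouched
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        inflated[nr][nc] = 1
    return inflated


# Hypothetical 5x5 map with a single obstacle in the centre.
raw_map = [[0] * 5 for _ in range(5)]
raw_map[2][2] = 1
padded_map = inflate(raw_map, 1)
```

A radius of roughly half the robot's width, measured in grid cells, is a typical starting point, since it lets the pathfinder treat the robot as a single point.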

A representation of the sonar data input.
