MDIBs - Metadata Driven Internet Brains

I'm always asking myself, "What is the logical way to build a smarter bot with today's tech and the tech soon to come?"  

Currently, I have come to the following conclusions.  I realize that each point I make is hugely debatable; I'm just putting out some opinions, not trying to prove anything.  This is the course I am currently on, so I thought it might stimulate some fun discussion.

1.  A bot can't know everything, so at some point a bot will need to "look up something" on the internet.  Likely, a bot will need to look up many things at the same time, or do many things that involve internet resources.

2.  I believe the main "brain" of smarter bots should be physically "off bot" and "on the web" for small and medium-sized household bots that have internet connectivity.  I used to really want everything to be on a bot, but I have come to this conclusion for performance, economic, and reuse reasons.  

 Performance:  A bot can call its "Internet Brain" once, and the "Internet Brain", or IB, can call many other web services/resources as needed, in separate threads, before figuring out what to do.

 Economics:  Bots that have to carry the power and weight of "big brains" will be bigger and more expensive than most people would like.  I’d personally like to have 3 or more bots per household, so they need to be affordable, and smart.

 Reuse:  Should bot brains be custom builds?  I don't think so.  I believe brains should be reused.  Until we figure out how to better share/leverage software agents and develop some common concepts/interfaces/etc, we will all be building bots that aren't as smart and useful as they could be.

3.  Bots should not wait for, or expect, an immediate answer about what to do in any given circumstance.  Basically, things should be asynchronous.  This means bots should make a call to an IB with something like "Is it going to rain on Tuesday?" and then call again a fraction of a second later to see if an answer is waiting.  A mechanism for the server to call the bot back when the answer is ready would obviously be better.
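
To make the ask-then-poll idea concrete, here is a minimal C# sketch from the bot's side.  The endpoint names, ticket idea, and response shapes are invented for illustration; a real IB would define its own API.

```csharp
// Minimal sketch of the ask-then-poll pattern. The endpoints and JSON shape
// are made up for illustration; any real IB would define its own.
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class InternetBrainClient
{
    private static readonly HttpClient http =
        new HttpClient { BaseAddress = new Uri("http://my-ib.example.com/") };

    // Submit a question and get back a ticket id immediately.
    static async Task<string> AskAsync(string question)
    {
        var body = new StringContent($"{{\"question\":\"{question}\"}}", Encoding.UTF8, "application/json");
        var response = await http.PostAsync("api/ask", body);
        return await response.Content.ReadAsStringAsync();   // hypothetical: returns a ticket id
    }

    // Poll until the IB has an answer waiting (or we give up for now).
    static async Task<string> WaitForAnswerAsync(string ticketId, int maxTries = 20)
    {
        for (int i = 0; i < maxTries; i++)
        {
            var response = await http.GetAsync($"api/answer/{ticketId}");
            if (response.StatusCode == System.Net.HttpStatusCode.OK)
                return await response.Content.ReadAsStringAsync();
            await Task.Delay(250);   // check again a fraction of a second later
        }
        return null;   // no answer yet; the bot carries on and can ask again later
    }
}
```

The bot stays responsive either way: if nothing comes back, it keeps doing what it was doing and checks again later.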

4.  Bots will have different sensors, actuators, behavior, etc.  This means Internet Brains (IBs) will need to support many different configurations.  I will refer to this as "Metadata Driven IBs", or MDIBs.  It is logical for this metadata to exist on the internet and be maintainable by robot builders through an app of some kind.  It would be very helpful (but exceedingly unlikely) if standard concepts and structure could emerge for this metadata.  There would be a huge amount of this metadata and many different substructures.  (Instead of waiting for these standards which will never happen, I will probably just come up with some.  Why not?)

5.  People will want to interface with their bots through various devices while not physically around them.  They may want to see an avatar of their bot, onsite images, video, maps, or sensor data through phone, web, tablet, etc.  These maps might be consolidated “views” on multiple bots/sensor data, like home automation data / internet of things stuff.

6.  Bots that are owned by the same person should be able to share data so as to increase their “situational awareness” of a given location.  The internet of things should be tied into as well.  This should be a function of the MDIB.  Any bot in your house should know whether the front door is locked, what the thermostat is set to, whether there is motion at your back door, or a flood in your basement.

7.  Complex rules should be able to be built on the MDIB coordinating the home, its sensors, and one or more bots.

8.  If a MDIB is a “black box” that is configurable, useful, and interoperable, then robot developers do not really need to know or care what technology was used to build it.

9.  While MDIBs should run “on the internet”, they should also be able to be extended and customized back “into the home” by supporting some common interfaces and being able to call internet resources that homeowners put on their own computers.  This means developers/homeowners should be able to build intelligent agents, register them with the MDIB (Metadata Driven Internet Brain), configure them through the MDIB app, write and publish their code on their PC or other device, and then have the MDIB start using their custom agents when appropriate to do so.
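
To make that a bit more concrete, registration could be as little as a row of metadata the MDIB keeps about each home-grown agent.  Something like the sketch below, with all field names invented:

```csharp
// Illustrative only: the metadata an MDIB might store when a homeowner
// registers a custom agent running on their own PC. Field names are invented.
public class AgentRegistration
{
    public string Name { get; set; }          // e.g. "BasementFloodWatcher"
    public string Owner { get; set; }         // who registered it
    public string CallbackUrl { get; set; }   // URL on the home PC the MDIB calls when the agent applies
    public string[] Triggers { get; set; }    // metadata: phrases or sensor events it should be tried on
    public bool Enabled { get; set; }         // configurable through the MDIB app
}
```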

10.  What is the role of microcontrollers in this model of robot building?  Robots still need an on-board brain.  This brain needs to directly handle timing-sensitive sensors (like sonars, gyros, etc.), actuators, motors, etc.  It will need to handle "reflex actions" like obstacle avoidance, and be able to call an MDIB for higher-level, less time-sensitive brain functions.  A unified "Dictionary of Commands" will need to be devised so robots can communicate with MDIBs and implement the commands given to them.
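
For the "Dictionary of Commands", I picture something as plain as a shared list of verbs with a few typed parameters, so any MDIB can emit them and any on-board brain can map them onto its own hardware.  A sketch, with the verbs and fields invented:

```csharp
// Sketch of a shared command vocabulary between an MDIB and a bot's
// on-board brain. The verbs and fields here are invented examples.
public enum CommandVerb { Stop, Drive, Turn, Speak, SetExpression, LookAt }

public class BotCommand
{
    public CommandVerb Verb { get; set; }
    public double Value { get; set; }        // e.g. degrees for Turn, cm/s for Drive
    public string Text { get; set; }         // e.g. the phrase to speak
    public int DurationMs { get; set; }      // how long to apply it
}

// The microcontroller side only has to translate verbs it understands into
// servo/motor actions, and is free to ignore verbs it has no hardware for.
```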

11.  How should data-intensive sensor processing (like video processing) be handled in this model?  That is an open question.  I suspect a hybrid approach, with most of it being done onboard and some "interesting" frames being sent to an MDIB for additional processing (object recognition, localization, face recognition, etc.).

The next question is “How should a brain work?”  

To me, that is an unsolved problem.  I ran into a quote again today that reminded me of my own efforts and deserves repeating: 

What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle. – Marvin Minsky, The Society of Mind, p. 308

My efforts at a brain thus far can basically be summed up as a collection of services and software agents that use different techniques to accomplish general and specific tasks.  A service figures out which agents are applicable to the circumstances and executes them.  When all these agents are done, the service arbitrates any conflicts to determine what desired behavior gets sent back to a robot. 
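
For what it's worth, here is a bare-bones C# sketch of that select/run/arbitrate loop.  The IAgent interface and the "first non-empty proposal wins" arbitration are placeholders; real arbitration would weigh conflicting proposals.

```csharp
// Sketch of the agent-selection and arbitration loop described above.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public interface IAgent
{
    bool CanHandle(string request);              // does this agent apply to the circumstances?
    Task<string> ExecuteAsync(string request);   // run and return a proposed behavior/response
}

public class BrainService
{
    private readonly List<IAgent> agents = new List<IAgent>();

    public void Register(IAgent agent) => agents.Add(agent);

    public async Task<string> HandleAsync(string request)
    {
        // 1. Figure out which agents are applicable to the circumstances.
        var applicable = agents.Where(a => a.CanHandle(request)).ToList();

        // 2. Run them all concurrently, each as its own task.
        var proposals = await Task.WhenAll(applicable.Select(a => a.ExecuteAsync(request)));

        // 3. Arbitrate. For this sketch, the first non-empty proposal wins;
        //    a real brain would score and weigh the conflicting proposals.
        return proposals.FirstOrDefault(p => !string.IsNullOrEmpty(p)) ?? "I don't know.";
    }
}
```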

Given this concept of a brain (which might not be a good one, but let's run with it for the sake of this point), I think it is quite easy to visualize the "Society of Mind" concept as an MDIB.  If a configurable brain is built as a collection of agents running largely independently of one another, with all the configuration stored as metadata and maintainable through an app, many robots would be able to easily share agents/code/metadata/knowledge.

As new agents or new versions of existing agents are written and introduced into the collective MDIB DNA, some robots might use them, others not.  I can only guess that robots would get a lot smarter, a lot faster, at a much lower cost, with much less code.

 

What do you folks think? (other than the obvious, that I just wrote a manifesto)

I agree. Just do it! Me, I’d probably spend too much time studying whereas you went and built and programmed Anna within a year! Doing is better most of the time, once you have some sort of plan. I look forward to seeing what you build.

Will you create a human-level intelligence? I rather doubt it, but that’s not what we need. We need something that can pretend to be human in a conversation, as well as providing a bunch of other services. This is a wonderful idea.

However, I do see some problems.

Mainly these are web connection problems. For instance, Verizon randomly seems to drop my connection or slow it down. Also, despite having the wifi in a central part of the house, some parts of the house seem to lose the connection to my own in-house signal. At least that one has some fixes.

I would like to propose a slightly different emphasis on the layering scheme:

Layer 1: The robot

Layer 2: The home MDIB server

Layer 3: The main MDIB server

I propose that the connection between Layer 2 and Layer 3 be considered intermittent and probably slow.

This would put more of the load on Layer 2, assuming that Layer 2 can handle the load. This would mean that Layer 2 would probably have to handle the speech to text part, as well as at least a mid-sized dictionary.

In addition there is a possible problem if I took a robot outside with me, away from my wifi connection.

One solution for connection problems, at the cost of a slower connection speed, is to use a cell phone modem inside the robot. For example, I have one of the Adafruit Fona breakouts to experiment with in that direction.

In Groucho, I can put a full server and a 3 TB USB disk if I want to. And I can use him for the home server for the smaller bots.

Though to be honest, I might do better by using a separate server.

I have to run now; sleep is catching up with me.

Universal One Table Memory Structure for Atoms

I put forth where I was headed with Version 1 of this idea; what you are describing is what I had in mind for Version 2.

I tend to agree with your assessment of the layering (having a home server), as that's the approach I'm taking at home.  Incidentally, I have used the bot successfully outside the home as well.  

I’d like to put forth a brain model and API where the majority of robot builders could get some experience with other brain functions without having to house their own hardware or understand the code, by getting familiar with the various metadata that controls behavior, through a website.  There will be some learning curve.

I will say I made some huge progress yesterday towards a "Universal Memory Model" that consolidates every table and structure I currently use (or can foresee using) for "Robot Memory" into a single table.  Every memory is some type of "Atom" or some association between "Atoms".  Each Atom (or Association of Atoms, which is an Atom as well) can have some data that rides along with it.  It allows the creation of new memory types (called Atom Types) within the same store.  I am really excited about this for three reasons:

1.  I can build a relatively small number of forms that will “adapt” and let me view and maintain anything that the robot knows or any behavior that is defined, basically any aspect of anything.

2.  This should facilitate "syncing" memories/behavior fairly easily from one server to another, thus setting up for Version 2, the home server you have described.  It could even be a CSV file exchange or something.  Syncing code will be another matter, but I believe I could store the code for the agents inside the brain as well…and then create the classes "on the fly" from the code at runtime.  I have some C# colleagues that have written books on the subject, so I think it's doable.

3.  Because there will be little cost in time changing a DB or creating UIs to build out new brain structures/ideas, I will probably build a lot more, a lot faster.  The holdup will be the one true holdup…thinking of good ideas.

While all this might sound undisciplined (for the record, it is), as I am breaking decades of data modelling conventions, it is my intention to run each "Atom Type" as a separate "In-Memory Database" and use lots of threads/processors in my new brain.  The new brain could end up being a lot faster than the old one.  This means SQL Server (or something else) will essentially only be used to store the permanent record, so I might kick SQL Server out as soon as the idea matures.
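
For anyone curious, here is roughly the shape the single-table idea is taking, sketched as C# classes rather than SQL.  Field names will certainly change, and the per-type in-memory stores are just dictionaries in this sketch.

```csharp
// Sketch of the one-table "everything is an Atom" memory model.
// An association is itself an Atom whose Member1/Member2 point at other Atoms.
using System.Collections.Generic;

public class Atom
{
    public long AtomId { get; set; }
    public int AtomTypeId { get; set; }    // Word, Phrase, Person, Source, Association, ...
    public long? Member1Id { get; set; }   // set when this atom associates two other atoms
    public long? Member2Id { get; set; }
    public string Data { get; set; }       // whatever data rides along with the atom
    public int Usage { get; set; }         // tally of how often this memory has been used
    public double Strength { get; set; }   // placeholder for a truth/confidence value
}

// Each AtomType can be run as its own in-memory store; SQL Server (or whatever)
// only has to hold the permanent record behind this.
public class AtomSpace
{
    private readonly Dictionary<int, Dictionary<long, Atom>> byType =
        new Dictionary<int, Dictionary<long, Atom>>();

    public void Add(Atom atom)
    {
        if (!byType.TryGetValue(atom.AtomTypeId, out var store))
            byType[atom.AtomTypeId] = store = new Dictionary<long, Atom>();
        store[atom.AtomId] = atom;
    }
}
```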

Gotta run, wife calls.

Just a thought
I was thinking about robotic cognition the other day.

One of the things that marks human cognition is remembering similar situations. For a concrete example, think about a robot getting caught in a bathroom. To a robot, there are probably many clues that a room is a bathroom: a small room (mostly: my mom-in-law had a master bathroom + sewing room), towels hanging, tile, books to be read, etc. For a robot of any size, maneuvering in my bathroom would be a pain. In Lee's bathroom there are privacy concerns and perhaps a wheelchair in the room.

It would be nice if the robot could somehow generalize the concept of a bathroom and the special rules regarding it: privacy, and the problems maneuvering.

Without programming the robot for the specific case of bathrooms, I don’t see how a learning robot can associate all of these things together.

I would expect to have to tell the robot about the privacy concerns, but how to generalize this to "don't store images or post them on the net" I have no idea.

I'm sorry, I'm sure there is a thought in this mess somewhere, but I suspect my pain killers have kicked in and my thoughts are heading towards la-la-land. I'll write more on this later when I can rub two thoughts together to make a discussion.

Have a nice day.

You could try PostgreSQL.
And yes, it does run under Windows and runs very quickly with large databases.

It sounds like you've taken something from OpenCog. I think I would use a bit more structure, with other tables for data of different atom types. On the other hand, I have a button that says "Why make something simple and elegant when you can make it complex and wonderful!" (Seriously, I generally try for simple code, but during the middle part of a project, when creature feep is at its maximum, things get complex before I can fold them nicely and refactor the code.)

I look forward to seeing this.

OpenCog: AtomSpace as a language
One of the OpenCog bloggers had an interesting post that claimed their AtomSpace had all that was necessary to be a computer language. He then went on to prove it quickly, without an example though. He compared it to Prolog.

If this is true, it would be easy to write sections of high-level code fairly easily. It would even be reasonable for a robot to write code of its own, because the distinction between a memory and running code is fairly small.

Just another thought.

EDIT: The blog title is “Why Hypergraphs?”

Have you tried sic’ing your text to speech service on Wikipedia?
Martin,

I'm wondering if a specialized version of your language-understanding-service would work on Wikipedia articles.

It would have to be a specialized version in order to either skip or use the metadata available (ToC and the various strings from the Wiki managers such as “need more citations” etc.) Also, most Wikipedia articles are written in the third person and such.

Also, it would be an interesting experiment to run a natural language processor (NLP) with inference and error recovery on the "blogosphere." I think I'd find it interesting from a linguistic standpoint. However, I think you'd almost have to entirely redo the OpenCog project in order to do this. (They have several interesting ways of improving the strength of a given atom.) There would be too many statements of "fact" that are merely opinions. Though, as long as you were working on a copy of your brain it wouldn't matter if it got too messed up. :slight_smile:

What I would look for in the above experiment is to see if the NLP’s vocabulary could improve and how, and also how various opinions/facts are entered and change during the course of the experiment.

This would require an inference engine be built into your MDIB and that language (words, sentence fragments, etc.) be atoms. Unfortunately OpenCog seems to have written a parallel version of AtomSpace just for NLP, which is a huge duplication of effort, and also creates two versions of the code - one for normal atoms, one for NLP.

There are many things that OpenCog did right, though - truth values from 0-1.0 along with a strength value (and this is the simple model, there are more complex models), inference, the hypergraph, and the attention currency model. Unfortunately it isn’t finished now, but it is still being used and there are many full-time developers on this for an open source project.

I’m still up in the air as to whether I will use OpenCog, a derivative, your eventual service, or something totally different. I expect that I’ll try your service, at least, because it has the most immediate promise and you have a version that actually works and isn’t just academic farting around.

**re: Wikipedia**

Jay, I have tried a few variations of your query about Wikipedia.

I map words in my dictionary to terms in Wikipedia.  This allows some questions to be answered by going out to Wikipedia and grabbing particular bits of the XML.

Question Answering:  An example in the videos is when I ask questions like "What is the population of Germany?" or some such.  There are a great many XML tags in the Wikipedia data.  I have also tried tags about people, so I can do things like ask how much a given athlete weighs or what team they play for.
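
A simplified sketch of the general idea (not my actual code): fetch the article through the standard MediaWiki query API and pull an infobox field out of the markup.  The field name used here ("population_estimate") varies by article type, so treat it as an example rather than a guaranteed key.

```csharp
// Hedged sketch of "grab a bit of Wikipedia and answer from it".
// Infobox field names differ per article type; this is crude but illustrates the flow.
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class WikipediaLookup
{
    private static readonly HttpClient http = new HttpClient();

    public static async Task<string> GetInfoboxValueAsync(string article, string field)
    {
        // Fetch the raw article content via the MediaWiki query API (XML format).
        var url = "https://en.wikipedia.org/w/api.php?action=query&prop=revisions" +
                  "&rvprop=content&format=xml&titles=" + Uri.EscapeDataString(article);
        var xml = await http.GetStringAsync(url);

        // Pull the requested infobox field out of the wiki markup.
        var m = Regex.Match(xml, @"\|\s*" + Regex.Escape(field) + @"\s*=\s*(?<v>[^\n]+)");
        return m.Success ? m.Groups["v"].Value.Trim() : null;
    }
}

// e.g. await WikipediaLookup.GetInfoboxValueAsync("Germany", "population_estimate");
```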

Reading and Learning:  I tried writing a routine that grabs given articles, removes HTML/XML and other markup, and then attempts to find sentences in the remaining data that contain a given word.  Of the results, this routine currently generates about 30% garbage.  The rest are decent knowledge about the world that a bot could use in conversation.  Right now I don't set it loose, because I don't want the garbage and the routine could run until my hard drive filled up with excess trivia.

Blogosphere

Sounds interesting.  I'm not really a blogger, so I have trouble imagining the possibilities.  I suppose "inferring/learning" from many different sources (Twitter?) could lead to a lot of robot knowledge or a bunch of fluff, depending on the source.  I would need a good way to filter out the garbage, like the wiki issues, or is some probabilistic inference approach possible?

NLP

I believe words, phrases, etc. should be atoms.  Last night I converted all of Anna's known words, phrases, and associations into Atoms.  This is paramount for word/phrase associations to occur.  Some people call these "triples": "Birds can fly", "Humans are primates", etc.  The next level would be triple+, something like "Birds have 2 wings".  This might sound like a weird exception, but the one thing I am leaving out of atoms currently is the Princeton WordNet data, as I mainly just use it to look up the definition or part of speech of a word.  It has 210,000 rows, and I don't think all of that is particularly needed in the atomspace.  I could change my mind and import them if I decide to decipher their synonym/antonym data, but I can get that sort of stuff from a web service.  I have been teaching those to Anna manually in speech.  Besides, I create a new "word" atom automatically the first time the bot encounters it, so I think I could do without around 190,000 atoms that I will likely never use and can look up in the other table when needed.  My next goal is to convert sentences; this is probably my most complex conversion, as I have a lot of substructure and subtypes of sentences.

It looks like a good bit of OpenCog's triples and other brain data is available.  I could load these as Atoms.  That would be quite a knowledge set.

I include a "Usage" on every atom that tallies the number of times that word has been used.  That alone is probably not enough; I will probably also need a "probability" or "strength" for an atom.  Not worried there, it's easy stuff to create; the question is just how to use it.

Attachment Theory and Trust Level

I am considering some type of robot/human "attachment theory" and "trust level" where the robot is inherently trusting of its creator and close friends (which must be earned), and inherently skeptical of others without multiple sources.  This means I will need to save sources for everything.  While that means a lot of "source" atoms, which are simple associative atoms pointing to people, websites, etc. (which are simply more atoms), it is useful to be able to interact socially and say things like "I heard from Jenn that your mother was sick." or "Wikipedia says birds are modern-day dinosaurs."  I think this is necessary.  Also inherent in this model would be "trust level" atoms to represent the level of trust that the robot has in a given person/source/website/etc.  

B.S. Detector

Ideally, I'd like the robot to evaluate a new statement like "Penguins can fly" against prior statements from other sources like "Penguins can not fly" and determine which statements are credible, which should be forgotten, and which sources might not know what they are talking about (because they seem to give me a lot of B.S.), so it can lower their trust level.
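
A naive first cut could be as simple as summing trust-weighted support for a statement versus its negation.  The class shapes and the penalty number below are just a sketch, not a worked-out design.

```csharp
// Toy sketch of a trust-weighted "B.S. detector". Each remembered statement
// carries the trust level of its source; contradictions are settled by weight.
using System;
using System.Collections.Generic;
using System.Linq;

public class Claim
{
    public string Statement { get; set; }    // e.g. "penguins can fly"
    public string Source { get; set; }       // e.g. "Jenn", "Wikipedia"
    public double SourceTrust { get; set; }  // 0.0 - 1.0
    public bool Negated { get; set; }        // true for "penguins can NOT fly"
}

public static class BsDetector
{
    // Returns a belief score; positive supports the statement, negative refutes it.
    public static double Believe(IEnumerable<Claim> claims)
        => claims.Sum(c => c.Negated ? -c.SourceTrust : c.SourceTrust);

    // Sources that keep landing on the losing side of arbitration lose trust.
    public static double PenalizeSource(double currentTrust, double penalty = 0.1)
        => Math.Max(0.0, currentTrust - penalty);
}
```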

Brain Viewing

I would like to make the brain "viewable" as soon as I can.  The sooner I can make it transparent, the sooner I can get help.  I have to convert all her existing memories to atoms, and write a webapp to view and edit the atomspace.  At this point I could make the memory "viewable" and editable.  The brain would not be usable, though, until I rewrite the various services and agents to use the new memories, and re-test everything.  That's before I make any improvements.

It's a lot of work, but I know it can be done, I know it will be powerful, and I know I could do it if I could sit down for even 2 weeks of straight time and crank it out.  Unfortunately, my days are filled right now with other summer labors.  It will happen though.  I would like to pull you and a few others in at that point and see if I can get a few to try it out and keep contributing ideas.  I very much value your ongoing thought contribution/leadership on this.

I've mellowed in my enthusiasm for OpenCog.  I'm no longer frothing at the mouth; I'm merely a big fan and student of their work. I still lack the computer science and AI background to grok a lot of the techniques they are using.  Perhaps others can help on that.  I like Minsky's ideas too and they seem similar (although I haven't read his stuff, just wiki).  Minsky makes me think that my memory ideas, and having a bunch of fairly simple agents for different purposes, are a good foundation to build on.  We'll see.

A final thought…"Gesture" atoms.  A gesture would represent a pose or a set of synchronized movements.  If robot responses can be associated with gesture nodes, then this facilitates more expressive behavior beyond just facial emotion expression, if a robot has arms, head movement, or other means of movement/expression.  Figuring out how to map that to servo movements will be on my wish list for next year.  More atoms.

The blogosphere thing was mostly a thought experiment
I was thinking about the more prose-like areas of the web, and blogs along with some forums came to the top of my brain. This is something I wouldn't try on the default brain, but on a copy. There are way too many untrustworthy "facts" in blogs and too many opinions masquerading as facts.

I will help you with your brain as much as possible. I can test, I can code, I can build. As long as Lee doesn’t need the time first. I look forward to seeing it.

Be careful about having too many types of atoms. It’s probably very easy to do now. On the other hand, most of the math behind AI assumes typing, so you need different types of atoms. I wish I knew how many were ideal.

I also feel strongly that you should look at the OpenCog TruthValues. They have three different types (though one is just a list of other TruthValues). Reading their glossary and blogs (even though the math/philosophy is beyond me) I’m getting an idea of some of the ways these can be used. The TruthValues are very useful in filtering out crap. And they are essential to making inferences from old data to new data.

There are always going to be problem areas when you’re dealing with the real world and natural language.

The whole “birds can fly; penguins are birds; therefore penguins can fly”.

A real artificial general intelligence (AGI) has to deal with exceptions and also must have some non-Boolean TruthValues in it. For example, the statement “birds can fly” is a bad statement if told as 100% truth. Think of baby birds, think of birds with broken wings, and yes, think of penguins.

As for hard drives, they are inexpensive nowadays. I have a 3 TB USB 3 drive that I have free at the moment and may use for something like this.

re: Truth Values

I love thought experiments.  Einstein thought they were useful, and he turned out ok.

I will look into the truth values and see what they are doing.  I haven’t gotten that far.  If you figure it out and don’t mind summarizing, that would be great too. 

I dealt with the exceptions issue when writing the initial algorithms for Anna.  Anna doesn't assume something is true just because she has one truth statement saying so.  She does look for exceptions if it seems relevant.  I wouldn't be surprised if she could be tripped up, though.  "Can birds fly?" can be answered by a bot in the general sense with "Yes" or "Most can", even though the bot knows of specific exceptions.  If you asked "Can all birds fly?" or "Can penguins fly?", that's getting a bit more specific and deserves a more precise answer.  Anna can handle some of these currently; the others are fairly easily doable with a little more coding.

I had a concept of quantifiers…where words implied a value or probability.  It seemed useful at the start, but I haven't touched it since I wrote it (maybe it is working).  The words "mostly", "some", "a few", "probably", "seldom", "never", "always", etc. strongly imply probabilities.  Ordinals were also quantifiers, so "First" implied 1, and so on.  These could be used to sort things or to predict a truth confidence.
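
The whole thing boils down to a lookup table, something like the sketch below.  The numbers are guesses for illustration, not measured or tuned values.

```csharp
// Sketch of mapping quantifier words to an implied probability.
// The numbers are arbitrary guesses, not measured values.
using System.Collections.Generic;

public static class Quantifiers
{
    private static readonly Dictionary<string, double> Implied = new Dictionary<string, double>
    {
        { "always",   1.0 },
        { "mostly",   0.8 },
        { "probably", 0.7 },
        { "some",     0.5 },
        { "a few",    0.3 },
        { "seldom",   0.1 },
        { "never",    0.0 }
    };

    public static bool TryGetProbability(string word, out double p)
        => Implied.TryGetValue(word.ToLowerInvariant(), out p);
}
```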

The number of atom types might seem artificially large, as anything that has a limited set of values (something that would have a table with an int key and a varchar description), basically a list, could be an atom type.  Not all lists, just lists relevant to the brain's internal functions, like my list of software agents, or a list of emotions.  Each of these could have additional data added to them in the future, though.  Setting these up in the metadata will allow the UI to show English-readable descriptions of the info that is in each atom, and allow data to be changed using dropdown lists.  Otherwise, looking at an atom would mostly mean looking at a bunch of integers and a few strings…a confusing matrix of gibberish.

The biggest foundational "weakness" between what I have thus far and OpenCog and other natural language systems is that I'm not truly using any of the academic theory on NLP, with its parsing of sentences into NP, VP (noun phrase, verb phrase), and all the other symbols they use, into a tree using probabilities.  It may be possible to add this later, but it could be a real mess if I don't figure out how to fit it in now.  I mostly use a set of home-grown techniques (as I have previously described) that amount to "feature extraction" or "question-answer translation", where some degree of implied meaning and structure are derived at the same time through regular expressions or data.  A regular expression of "Where is _____?" (and other similar expressions) can be mapped to an agent that determines location.  Normalizing the sentences first allows the system to handle some variations in the text, like "please" being in the sentence.  The system is very practical and easy to understand, but it is NOT NLP.  The huge weakness of this is exposed by trying to use anything other than the simple sentence structures that the bot understands.  Anything with multiple "phrases" or several adjectives or adverbs will not be understood by this system.  I have read about NLP, but they get so into the trees and symbols that I never get how to extract practical "meaning" from the tree so I can map it to an agent that can do something.  If I could get how this might be done, it would be a huge leap forward.  
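
To illustrate the approach (simplified from what is actually in the code), the mapping is essentially a list of (pattern, agent) pairs applied to the normalized sentence.  The patterns and agent names below are just examples.

```csharp
// Sketch of the regex-to-agent routing used instead of a full NLP parse.
// Patterns and agent names are example values only.
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class PatternRoute
{
    public Regex Pattern { get; set; }
    public string AgentName { get; set; }
}

public static class SentenceRouter
{
    private static readonly List<PatternRoute> Routes = new List<PatternRoute>
    {
        new PatternRoute { Pattern = new Regex(@"^where is (?<thing>.+?)\??$", RegexOptions.IgnoreCase),
                           AgentName = "LocationAgent" },
        new PatternRoute { Pattern = new Regex(@"^what is the population of (?<place>.+?)\??$", RegexOptions.IgnoreCase),
                           AgentName = "WikipediaAgent" }
    };

    // Normalization (dropping "please", extra whitespace, etc.) happens before this.
    public static (string Agent, Match Match)? Route(string sentence)
    {
        foreach (var route in Routes)
        {
            var m = route.Pattern.Match(sentence.Trim());
            if (m.Success) return (route.AgentName, m);
        }
        return null;   // no match: fall back to something else (or fail gracefully)
    }
}
```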

A note on gender differences…Sadly, as soon as humans start forming sentences with multiple phrases, I often lose the point of what they are saying.  Sometimes it is because the speaker lost the point and forgot to include it in the words, hinting, implying, etc., and telling me later that they told me something that was not in the parse.  Some would say it's a Mars/Venus thing.  As a guy I get the Mars thing, but I'm not sure how a robot is ever going to parse through the Venus thing.  Will I need a separate set of "Venusian Parse Algorithms"?  Food for thought.  Maybe they figured it out with Jibo.

Gesture atoms
I really like this idea, though there should be two types of them: robot gestures and human gestures (or meatbeing gestures, because cats and dogs make such things also).

Location and approximate location could be two more types of atoms or one atom with appropriate truth values. And then there is time, which might be an atom, but I think it fits better as an attribute of atoms. Arrrgghhh! Now I’m seeing atoms everywhere!

Thinking about human gestures makes me wonder how best a robot should think abstractly about humans. I think some sort of tree structure that could be easily assembled into a stick figure might be best. I remember reading a long time ago that stick figures are one of the better candidates for how humans think abstractly of humans.

Allowing gestures, which are collections of atoms, as a single atom, means that AtomSpace needs to be a hypergraph.

re: Your thought…Google or Amazon Vision Services Needed

A Google or Amazon Vision Service is badly needed…

Input:  An Image

Output:  An array of objects in the image, and their projected positions in 2D (in the image) and 3D space (relative to camera).

If this service existed, then we might be able to do the stuff you are talking about.

Anna already knows things like "A kitchen has a faucet" and "A bathroom has a toilet", so mapping words like "faucet" to whatever labels Google gave to things should be doable if the service existed.  Inferring location should then be possible.

I would think if it isn’t there already in the new Amazon Phone or Google’s Project Tango or others, that there are teams somewhere working on this right now.  It is just too logical and too important not to do.
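
If the service existed, the inference step could be nearly trivial on top of the "kitchen has a faucet" style knowledge.  A rough sketch, with the object lists invented for illustration:

```csharp
// Sketch of inferring a room from object labels returned by a (hypothetical)
// vision service, using "room has object" knowledge the bot already holds.
using System.Collections.Generic;
using System.Linq;

public static class RoomGuesser
{
    // "X has Y" knowledge, e.g. taught in conversation or pulled from an MDIB.
    private static readonly Dictionary<string, string[]> RoomHas = new Dictionary<string, string[]>
    {
        { "kitchen",  new[] { "faucet", "stove", "refrigerator" } },
        { "bathroom", new[] { "toilet", "towel", "bathtub" } },
        { "bedroom",  new[] { "bed", "dresser" } }
    };

    // detectedLabels would come from the vision service's array of objects.
    public static string GuessRoom(IEnumerable<string> detectedLabels)
    {
        var labels = new HashSet<string>(detectedLabels.Select(l => l.ToLowerInvariant()));
        return RoomHas
            .OrderByDescending(kv => kv.Value.Count(obj => labels.Contains(obj)))
            .FirstOrDefault(kv => kv.Value.Any(obj => labels.Contains(obj)))
            .Key;   // null when nothing recognizable was seen
    }
}
```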

NLP
I just installed Julius on Groucho. Julius is a speech-to-text processor for Linux and also Windows. There is a partial English word model that can be used; as soon as I can find a microphone I'll give it a whirl. There is also Sphinx.

There was a Kickstarter that was trying to produce this service, but I think they didn’t get funded.

It doesn’t have to be from Amazon or Google. It could be from one of us if we had the knowledge to do this.

I could attempt it. In fact I’ll have to attempt something like this if Groucho is to get to where I want him to be.

The trouble is that vision is very much like NLP in that it involves some really deep brain structures and we only know the basic ones. I know that I can do it, but the algorithm I’d have to use is slow, and it wouldn’t work in all cases.

  1. Break the image down into concrete objects.

  2. Kill the background from each object, such that each object exists in an image all its own.

  3. For each image, compare to a database of generic objects in various rotations.

It’s step 3 that is so slow. Comparing images is slow when you’re not expecting an exact match. I’ll have to check on this. Maybe there has been progress. I’ll do some checking around.

Truth values and thinking
One of the things that cognition algorithms are supposed to do in OpenCog is to reward the atoms that “helped” the algorithm come to its conclusion. It does this by adding to some attention value associated with each atom. A cognition algorithm has a limited amount of “attention” it can give to atoms, so it’s a form of currency.
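
As I understand it (and I may be wrong), the mechanism is something like the toy sketch below: each cognition run gets a fixed budget and spends it on the atoms that helped.  This is only an illustration of the currency idea, not OpenCog's actual code.

```csharp
// Toy illustration of attention-as-currency: a cognition run has a fixed
// budget and rewards the atoms that contributed to its conclusion.
using System.Collections.Generic;

public class AttentionBank
{
    public double Budget { get; private set; }
    private readonly Dictionary<long, double> attention = new Dictionary<long, double>();

    public AttentionBank(double budget) { Budget = budget; }

    // Reward an atom that helped, as long as some budget remains.
    public void Reward(long atomId, double amount)
    {
        if (amount > Budget) amount = Budget;
        Budget -= amount;
        attention.TryGetValue(atomId, out var current);
        attention[atomId] = current + amount;
    }
}
```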

Truth values seem to get stronger the more they are used. There are different ideas on how TruthValues should be used.

Both of these are in the BrainWave blog that can be accessed from the main OpenCog page. Unfortunately, without more experience with the system, this is as far as I can get. The programming I understand, but once it starts towards philosophy I can barely follow along, and if I lose my guide I certainly can't find my way back. :frowning:

Some Progress to Address the NLP gap

Made some progress on NLP: someone ported OpenNLP to C#.  I have the source for it and its version of WordNet installed and running.

It has a lot of basic functions that are far better than mine that I plan on incorporating.  I am really hoping that none of these pieces will slow the overall brain down too much.

Split a paragraph into sentences

Split a sentence into tokens (words and symbols)

Determine the part of speech of each word using stats and a maximum entropy algorithm

Find people, places, and dates in the sentences.  (I'll still need to do mine, I think, so I'll do both.)

Determine the structure of sentence(s): a full NLP parse into a tree.  Even if I don't use the full parse for meaning or mapping to agents/actions, this parse is a very useful annotator for determining tense, plurals, etc.

It would also appear that it can interpret the WordNet data and get all kinds of "Is A" and "Has A" relationships, many thousands of them, of the kind I had previously taught Anna through speech or web pages.

There are a lot of other features here; I'm just getting going on real NLP.
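
Roughly how I expect the pieces to chain together is sketched below.  I've hidden the port's actual classes behind made-up wrapper names since I'm still learning its API, so only the order of operations is the point here.

```csharp
// Rough pipeline sketch. The INlpToolkit methods are stand-ins for whatever the
// C# OpenNLP port actually exposes; only the sequence of steps is the point.
public interface INlpToolkit
{
    string[] SplitSentences(string paragraph);
    string[] Tokenize(string sentence);
    string[] TagPartsOfSpeech(string[] tokens);   // maximum-entropy POS tags
    string   Parse(string sentence);              // full parse tree as text
}

public static class NlpPipeline
{
    public static void Process(INlpToolkit nlp, string paragraph)
    {
        foreach (var sentence in nlp.SplitSentences(paragraph))
        {
            var tokens = nlp.Tokenize(sentence);
            var tags   = nlp.TagPartsOfSpeech(tokens);
            var tree   = nlp.Parse(sentence);
            // Next step (mine): decide whether the parse tree, the tags, or the
            // old regex matching gives the best mapping to an agent/action.
        }
    }
}
```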

The full parse trees are impressive; now I'm trying to figure out how to use them to do something useful.  My regular expression stuff was simple and worked, determining structure and meaning in one simple step.  Should I start doing a new type of pattern recognition based on NLP parse trees?  I would think yes; it seems like it would be complex and slow, though, so my thought was to do it only if I didn't get a sufficient match first using my existing techniques.  I am trying to figure out the types of statements that would be useful to a robot and would lend themselves to recognition with "trees" rather than direct sentence matches or regular expressions.  I would have to come up with a way to match parse trees based on search criteria…which might circle back around to regex.

I’ll worry about that later I suppose, uses for this will emerge.  Happy to have some new tools.

Stick Figures w/ Kinect

I can see recognizing human gestures with stick figures using the Kinect.  One of the middleware libraries for it outputs stick-figure representations of some kind.  We do some of the Xbox dance stuff for fun in our house during family gatherings; the newer ones can recognize at least 4 people/stick figures at the same time.

It will probably be a long time before I can take that one on; I like to crawl before I run.  My thought of crawling or walking would be to tackle "robot gestures".  I would start mapping emotional context and gestures to responses.  An example would be: when the robot says "No", a less happy emotion and facial expression, looking slightly down, and a slight horizontal head shake would seem good to synchronize.
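
As a first crawl, the mapping could be as simple as the sketch below: a named gesture atom keyed off the response and the emotional context.  The gesture names and the trigger table are invented examples.

```csharp
// Sketch of mapping a response + emotional context to a named gesture atom.
// Gesture names and the trigger table are invented examples.
using System.Collections.Generic;

public class Gesture
{
    public string Name { get; set; }           // e.g. "HeadShakeNo"
    public string[] ServoMoves { get; set; }   // later: synchronized servo keyframes
}

public static class GesturePicker
{
    private static readonly Dictionary<(string Response, string Emotion), string> Map =
        new Dictionary<(string, string), string>
        {
            { ("no",  "unhappy"), "HeadShakeNo" },   // look slightly down, slight horizontal shake
            { ("yes", "happy"),   "HeadNodYes" }
        };

    public static string Pick(string response, string emotion)
        => Map.TryGetValue((response.ToLowerInvariant(), emotion.ToLowerInvariant()), out var g) ? g : null;
}
```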

Good!
Anna was already impressive. Now she will be more impressive. I can’t wait to see what you develop from all these pieces.

Different Sensors
I know that somebody is doing a database of images from the Kinect sensor, but what about other sensors that are more diffuse such as the various ultrasonic sensors available?

I can picture identifying a room by its sonar fingerprint, but this is something that has to be up to the individual builder. Each robot will have its own set of sensors.

Just some thoughts.