Verbal Skills and NLP

I'm starting this forum for discussing verbal skills in robots, including but not limited to natural language processing.  I'd like this to be a place to discuss conventional and out-of-the-box ideas for improving verbal skills in robots.  This could involve:

  1. Practical ideas for solving specific issues
  2. Solutions to specific issues (explained conceptually or in pseudo-code)
  3. Links to useful libraries and resources
  4. Challenges/roadblocks encountered trying to solve particular problems
  5. Questions

Any other ideas or types of posts are welcome.  I plan on posting some of my own experiences and current challenges here.  I hope to learn from others' experiences, as my own progress with NLP is quite limited.

Prerequisites for NLP

I made a list of things I would see as prerequisites for building verbal robots…
  1. Some library for converting speech to text so a robot can listen. - I use the Google service built into Android Phones
  2. Some type of Natural Language Toolkit - I use OpenNLP for C#.  There are better ones for other platforms.  This library should be able to divide paragraphs into sentences, chunks, and words, determine parts of speech, and provide a “parse tree” at a minimum.
  3. A Good Dictionary - there is one included (like WordNet) with many of the “toolkits”, but I found that I needed to get a dump of the data and load it into my own database for other purposes.  Hopefully this will include synonyms, antonyms, and other word relationships.
  4. Some algorithms for normalizing and annotating sentences, converting singulars to plurals and back, and converting sentences from one person to another.  There are probably several more necessary routines here.
  5. A Database - this is just my opinion, but I think anyone doing anything serious will need a way to store and retrieve a LOT of relational data very quickly to do anything useful.  This could include data for Words, Phrases, Sentences, Word Lists, Word Associations, Regular Expressions, History, Commands, Memories, etc, etc, etc.  In addition to verbal knowledge, a bot will need an immediate, short, and long-term memory to make things work well.
  6. A lot of time and a lot of fancy programming - I won’t go into this here, but all these tools aren’t going to do anything without some special sauce to be provided by you the developer.  A good software architecture really helps.  My own experience is that I end up writing a LOT of algorithms for specific purposes.  Keeping things organized in bite-sized chunks is key to being able to maintain a growing system.
  7. A way to convert text to speech.

Did I miss any big pieces that are useful out there?  The Linux world has more tools, I think.
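
To show how the pieces chain together, here is a bare-bones skeleton; every function in it is a hypothetical placeholder (typed text standing in for audio), not a real library call:

```python
# Bare-bones skeleton of how the seven pieces chain together.
# Every function is a hypothetical placeholder, not a real library call.
def listen():                 # 1. speech-to-text (e.g. a phone or cloud STT service)
    return input("you> ")     # stand-in: typed text instead of audio

def parse(text):              # 2./3. toolkit + dictionary: tokenize, tag, look up
    return text.lower().split()

def normalize(tokens):        # 4. normalization, aliasing, singular/plural handling
    return [t.strip(".,?!") for t in tokens]

def recall(tokens, db):       # 5. database: match normalized speech to a response
    return db.get(" ".join(tokens), "I don't know that one yet.")

def speak(text):              # 7. text-to-speech engine
    print("bot> " + text)

db = {"what is the time": "Time for you to get a watch."}
speak(recall(normalize(parse(listen())), db))    # 6. the "special sauce" glue
```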

Martin, you summarized the

Martin, you summarized the NLP steps nicely. In my CCSR project, I follow your 7 steps exactly; I’ll reproduce your list and add my implementation choices for reference:
  1. Some library for converting speech to text so a robot can listen -> Google speech2text API
  2. Some type of Natural Language Toolkit -> CLIPS pattern.en (way more useful than NLTK, I found)
  3. A Good Dictionary -> WordNet (part of pattern.en)
  4. Some algorithms for normalizing and annotating sentences -> nlpxCCSR (custom script)
  5. A Database -> concept-base memory (part of nlpxCCSR)
  6. A lot of time and a lot of fancy programming -> (yessir…)
  7. A way to convert text to speech -> espeak

I came across the term Interactive Voice Response (IVR), which may encompass all these steps for a conversational robot. Is that a usable term, or too buzz-wordy?

I looked at OpenNLP (Apache), NLTK, Stanford NLP, and CLIPS pattern, and found the latter by far the most practical.
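
For reference, here is a minimal sketch of what steps 2 to 4 look like with pattern.en; this is illustrative only, and the exact calls may vary between pattern versions:

```python
# Minimal sketch of steps 2-4 with pattern.en; illustrative only.
from pattern.en import parsetree, singularize, pluralize, wordnet

# Step 2/4: split into sentences, chunk, and tag parts of speech.
for sentence in parsetree("What is the time?", relations=True, lemmata=True):
    for chunk in sentence.chunks:
        print(chunk.type, [(w.string, w.type) for w in chunk.words])

# Step 3: dictionary lookups via WordNet (bundled with pattern.en).
synsets = wordnet.synsets("time")
if synsets:
    print(synsets[0].gloss)      # definition
    print(synsets[0].synonyms)   # synonym list

# Step 4: simple normalization helpers.
print(singularize("robots"), pluralize("robot"))
```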

The wall that I’m running into now is that no matter what I program, the responses are still just predictable, programmed responses; the robot does not really understand the concepts behind the language. I guess that’s why even the big boys (Siri, etc) are still pretty lousy conversation partners :wink:

 

Wolfram Assumptions

Now Wolfram is an incredible resource for pulling out general (non-personal) information. However, I have found that on occasion it will hiccup and send out an incorrect answer, and this is usually down to it making an incorrect assumption. I was hoping we could discuss ways in which, with a little bit of preprocessing before we fire the query off, we could ensure the correct assumption is queried.

Byerley had a good idea with encoding some way for the robot to learn these idioms with user feedback. I’m assuming something similar to a keyword the user can say to make the system aware of the mistake (“bad robot”, as Byerley suggested ;)) and then storing this information in some form of database to be checked for later use. The only issue I can see with this is that it may eventually become slow as the database gets huge.

That being said, with a little bit of NLP we could determine the “topic” of the question and perhaps only search the database for similar questions. I’m kind of just freewheeling here, any thoughts? :slight_smile:
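
Something like this rough sketch is what I have in mind; the topic detection and the way the assumption gets attached to the query are stand-ins, not the real Wolfram API:

```python
# Rough sketch: remember corrected assumptions keyed by topic and consult them
# before firing the query. Topic extraction and query format are stand-ins.
corrections = {}   # (topic, normalized question) -> preferred assumption

def topic_of(question):
    # Crude stand-in for NLP topic detection: last word of the question.
    return question.lower().rstrip("?! .").split()[-1]

def remember_correction(question, assumption):
    # Called after the user says "bad robot" and supplies the right reading.
    corrections[(topic_of(question), question.lower())] = assumption

def build_query(question):
    key = (topic_of(question), question.lower())
    if key in corrections:
        return "%s (assume: %s)" % (question, corrections[key])
    return question

remember_correction("What is the time?", "current time")
print(build_query("What is the time?"))   # -> What is the time? (assume: current time)
```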

Re: Dealing with Wolfram Assumptions:

For the “What is the time?” or “What time is it?” example, the following is an oversimplified outline of some of the relevant agents (steps) involved in the process with Anna 2.0.  The names don’t really match mine, but were chosen to illustrate the process more clearly.

1.      ExactSentenceMatchingAgent – this agent would check the normalized input speech against a database of normalized sentences.  If one is found, that sentence could have one or many canned responses, or it could have an agent assigned to handle it.  In the “What is the time?” example, a TimeAgent would be found and invoked, and its response would have a 90% confidence.

a.     A note on “Normalization” and “Aliasing” – any word/phrase can redirect to another word/phrase when finding the normalized version of the speech.  It is the normalized version of the speech that is used by the pattern matching agents.  “What is the time?” turns into “what be time”.  Also, an entire sentence can “redirect” to another sentence (I call it a sentence alias)…”What time is it?” can redirect to “What is the time?”, and back to “what be time” for the normalized version.  Sentence aliasing was very necessary to deal with common smalltalk/chat/cliché expressions and responses.

2.     PartialSentenceMatchingAgent – this agent would check the normalized input speech against a database of regular expressions…the regular expression "what be (.*) " would match, and the associated agent WhatIsBlankAgent would be invoked that would return a definition of time as a dictionary definition, with a 70% confidence.  For many questions, this partial matching would find many matching regular expressions and invoke many different associated agents, each one returning different results and confidences.

3.     WolframAlphaAgent (Bypassed) – for me, this agent would be bypassed as there is already a response that is greater than 50% confidence.  I use this so I don’t exceed my Wolfram API quota for things that are handled locally.   If I was not handling time locally, I could assign a “WolframQuery” atom in my brain app to the given sentence.  This WolframQuery  would be used instead of the sentence itself when talking to Wolfram.  Another alternative would have been to alias the Sentence to point to another sentence that Wolfram understands. 

4.     PickResponseAgent - In the end, the multiple responses would be evaluated and the 90% one would win…answering with the current time and a smile.
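
Putting those pieces together, here is an oversimplified, hypothetical sketch of the arbitration flow; the function names, aliases, and confidence numbers are illustrative only, not my actual code:

```python
# Hypothetical sketch of the exact-match -> partial-match -> pick-response flow.
import datetime
import re

def normalize(text):
    # Crude stand-in for normalization/aliasing: "What is the time?" -> "what be time"
    aliases = {"is": "be", "are": "be", "was": "be", "the": "", "a": "", "it": ""}
    words = re.findall(r"[a-z']+", text.lower())
    mapped = [aliases.get(w, w) for w in words]
    return " ".join(w for w in mapped if w)

def exact_sentence_matching_agent(norm, sentence_db):
    handler = sentence_db.get(norm)
    return (handler(), 0.90) if handler else None          # exact match: 90%

def partial_sentence_matching_agent(norm, regex_db):
    for pattern, handler in regex_db:
        m = re.match(pattern, norm)
        if m:
            return (handler(m), 0.70)                       # wildcard match: 70%
    return None

def pick_response_agent(candidates):
    candidates = [c for c in candidates if c]
    return max(candidates, key=lambda c: c[1]) if candidates else None

# Example wiring for "What is the time?"
sentence_db = {"what be time":
               (lambda: datetime.datetime.now().strftime("It is %H:%M."))}
regex_db = [(r"what be (.*)",
             lambda m: "Dictionary definition of %s ..." % m.group(1))]

norm = normalize("What is the time?")
responses = [exact_sentence_matching_agent(norm, sentence_db),
             partial_sentence_matching_agent(norm, regex_db)]
# The WolframAlphaAgent would be skipped here: a >50% response already exists.
print(pick_response_agent(responses))
```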

I thought Byerley’s suggestions were quite interesting about choosing randomly between “concept of time” and “current time”, and then getting feedback.  I can see trying to introduce this into Anna, especially when there are multiple responses with similar confidence.  Conceptually, an “Exact match” would almost always have a higher confidence than a “PartialMatch” or “WildcardMatch”, so I think my general rule is a good default behavior.  As more and more agents are invoked (like ones that are triggered by a single word like “time”, which I am not doing much of…yet), this random arbitration/learning gets more and more intriguing.

I’m going to post a few more related ideas on this as separate posts as they address larger issues of AI in general and not Wolfram.  

Regards,  Martin.

What is time

"What is time" would respond with the local hour or minute, instead of the definition of time. Another trick it’s to have a contextAgent, which holds topic info, another which holds user preferences and another to hold previous queries.

If I ask “what is a day” and the answer was “it’s a period of time”, the next query “what is time” will 90% of the time be about the definition of time.

User feedback is a valuable tactic, but it supposes that humans use feedback in good faith; usually, if they think the AI is blindly learning, they may set up pranks like flagging “hello” as an inappropriate greeting and proposing “hey, idiot!” as a substitution.

Offline AI is better, but most of the really useful queries usually come from a web search engine. Google is pretty nice as it allows you to define context with time, location, language, and lots of preferences, but it has protections against bots, so there must be a delay between queries.

Where can I find the code? I could tinker with it in my spare time and maybe contribute :D

 

I’ve always liked the idea
I’ve always liked the idea of a ContextAgent because many words and phrases mean totally different things and may even have different truth values.

For example, if we’re talking about a Harry Potter book or movie, riding brooms and casting lightning bolts can be a possible topic of conversation. If your friends are comic book fans, then there can be rather heated debates over who would win in a fight: a prepared Batman or Superman (I’m on team Batman). Each context can change a lot of assumptions.

However, if you’re talking about more conventional things then there are more reasonable limits. For example, the sentence “I picked up the pool and turned it over to empty it.” pretty much implies that the pool is a small inflatable pool for kids. It should be possible with something like OpenCog to make all those deductions.

Right now I don’t have any idea as to how a ContextAgent could be used for these things, though.

re: What is Time

Hi Silux,

Thanks for your interest in what I have been writing about.  I would very much like to find enthusiastic contributors such as yourself, whether it is ideas, code, whatever.  There are others doing that right now.  For several reasons, I haven’t posted the source code yet for this set of projects.  I have been doing a massive re-write for the past few months with every bit of my spare time, and there is still too much change happening.  I hope to convert my Anna bot to this new brain later this month.  After that, I’ll get the brain ready to support multiple bots, and get the website ready that will be used to maintain everything.  At that point I really want to open the whole thing up.

I intend to release the source for the main projects for learning purposes sometime in 2015, but the point will actually be for others to use it as an API if they wish to, so that we can all benefit from each other’s ideas and the memories that the brain will develop from people using their individual bots with the brain.  This means that if someone trains their bot to be good at biology, all the bots can be good at biology as well.  Ultimately, I’d like the brain to be truly sharable (code, memories, ideas, etc), not just source code.  I am undertaking efforts to preserve the privacy of information that is about specific people, though.  My feeling is that if I just release source code, people may work in silos (if they get through the effort of setting up their own brain at all) and the brain won’t grow as fast, effort will be fragmented, etc…I could be wrong.

If you are interested in accessing the API and website used to test it, send me an email to [email protected] and I’ll send you some instructions on how to access what I have so far.  I welcome all who would like to get involved with what I hope will be a great project for many in 2015.

Regards,

Martin

Unsupervised Learning: Giving Robots Something to Say

I have been working on an effort to get my brain project to do massive unsupervised learning on the web.  I had some breakthroughs over the weekend that I wanted to share…

To support this, I’m planning on setting up the brain’s “memory” to support multiple memory repositories…each having the same DB structure but holding different memory types.

1st Repository:  Main Memory

About 50 different memory types including: Words, Sentences, WordAssocs, etc…  I expect this store to have less than 100,000 memories for a while, but slowly grow.  Most of this repository will be cached in memory.

2nd Repository:  Web Memory

Memories from learning via the web.  This will include anything interesting from Wikipedia, Many News Sources, Quotes, Jokes, E-Books, etc.  I expect this store to quickly have millions of memories, so this repository will not be cached in memory, except for topics that are in use by a robot at runtime.  This probably means less than a 1000 memories cached at a time.

This repository would have 3 primary memory types

1.  Web Source (Examples: Wiki, Feedzilla, Quotes, etc.)  Each source would have a few flags, like whether or not it is fictional.

2.  Web Article (this is roughly equivalent to a particular “page” for wiki, or a “news story” for Feedzilla).  For those who don’t know, Feedzilla is an API that aggregates stories from over 300 other news publications.  Each article would have a topic, a related web source, and a timestamp.  The timestamp would be used by the background processes to refresh the article from time to time.

3.  Web Sentence - an interesting sentence encountered on the web.  Each sentence would be tied to a given Web Article.  Each sentence would have metrics stored for it, like number of words, length, ordinal position in article, and some NLP and sentiment stats yet to be determined…perhaps some relevance index as well.

I plan to have a few background processes that retrieve articles from authorized sources on the web about a given subject, one at a time, remove all the HTML markup garbage, and extract a candidate list of sentences.  From the candidate list of sentences, many sentences will be filtered out automatically (like sentence fragments, sentences that do not contain the subject word, sentences that are too long, do not have a preferred verb, etc) until I am left with a set of sentences worthy of being in a robot’s memory and worthy of repeating. 
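
For anyone curious, here is a rough, hypothetical sketch of that harvesting step; the thresholds and the HTML stripping are simplified stand-ins for my real filters:

```python
# Rough sketch of the harvest step: fetch a page, strip markup, keep only
# sentences that pass some crude quality filters. All thresholds are stand-ins.
import re
import urllib.request
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self.parts, self._skip = [], False
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True
    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False
    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def harvest_sentences(url, subject, min_words=5, max_words=30):
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.parts)
    candidates = re.split(r"(?<=[.!?])\s+", text)
    keep = []
    for s in candidates:
        s = s.strip()
        words = s.split()
        if not (min_words <= len(words) <= max_words):
            continue                              # fragment or run-on
        if subject.lower() not in s.lower():
            continue                              # must mention the subject word
        if not s[:1].isupper() or s[-1:] not in ".!?":
            continue                              # crude fragment check
        keep.append(s)
    return keep

# e.g. harvest_sentences("https://en.wikipedia.org/wiki/Humpback_whale", "humpback")
```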

One of these processes will run when the robots are sleeping; another will get kicked off when a new word is encountered by the robot or a topic change has occurred and the bot checks to see if its knowledge base has been updated recently for that topic.  Neither will run in the same threads as the brain.  I have this process working in a prototype…now I just need to get it running in the background.

Breakthrough:  Previously I was not able to filter the sentences to the level of quality I wanted and so I had to supervise the learning so that I didn’t allow a bunch of garbage sentences to be remembered.  This weekend, along with eating turkey and dealing with relatives, I was finally able to build good enough filters to allow the system to run unsupervised and produce 99% high quality sentences…Yay!  I was tired of supervising!

Why is this important?

To be verbal, a robot needs something relevant and interesting to say.  I have called this “Babble” in some other posts.  If you are talking about Humpback Whales, the robot needs to have interesting things to say about Humpback Whales.  I already have this in place for news, quotes, and thoughts, but not general information like Wiki.  I may consider texts from Project Gutenberg as well, like all the philosophical and religious books from around the world…could be interesting.

A next step might be to use NLP processing on these millions of sentences to “learn” facts embedded within.  I’m not going to tackle that one just yet, as I’m mainly interested in conversational behavior right now.  I’ll need more NLP skills for that.

Another follow-on issue is…when should a robot repeat wiki, quotes, jokes, etc.?  How should it choose which to favor at any given moment?  Right now it is random, but I’d like it to be tied to some kind of motivational or mood architecture.  Example:  If the mood of the conversation is funny, more jokes would come out.  If the mood is more serious, more wiki would come out.  If the mood is rude or confrontational, something else.  If the bot is talking to kids, shorter sentences.

I think some new architecture is needed here, perhaps with a rules engine as well.  If you have 1000 relevant things to say in a lull in the conversation, how do you decide which is most appropriate?  Leaving it random seems unsatisfactory as it seems to miss the opportunity to create a more coherent and human-like personality.
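
To make the question concrete, something along these lines is the kind of mechanism I am imagining; the categories, weights, and mood table are invented purely for illustration:

```python
# Mood-weighted babble picker; every category, weight, and mood is illustrative.
import random

MOOD_WEIGHTS = {
    "funny":   {"joke": 6, "quote": 2, "wiki": 1, "news": 1},
    "serious": {"joke": 1, "quote": 2, "wiki": 5, "news": 4},
}

def pick_babble(mood, candidates):
    """candidates: {category: [relevant sentences]} for the current topic."""
    weights = MOOD_WEIGHTS.get(mood, {})
    pool = [(sentence, weights.get(category, 1))
            for category, sentences in candidates.items()
            for sentence in sentences]
    if not pool:
        return None
    total = sum(w for _, w in pool)
    r = random.uniform(0, total)
    for sentence, w in pool:
        r -= w
        if r <= 0:
            return sentence
    return pool[-1][0]

candidates = {"joke": ["A whale joke..."], "wiki": ["A wiki fact about humpback whales..."]}
print(pick_babble("funny", candidates))
```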

Any ideas?

Martin, you have been a busy

Martin,

You have been a busy guy!  That is awesome that you can now have it update itself and you don’t have to supervise Anna’s learning.  Very cool stuff.

I have been thinking about this a bit.  I think that this feeds back to base motivations colored by the emotional state of the robot.  People have base motivations that are built into us:  eating, procreation, spiritual needs, etc., all of which are colored by the emotions we feel.  These motivations color everything we do and how we react to our world.  If one is single, a motivation is to meet the opposite sex.  But if the last 35 times you have walked up to a woman you were shot down, you are going to be a bit insecure.  Your motivation is to meet the opposite sex, but how you go about that would be colored by your emotional state.

By the same means, the robot needs a framework by which it can look at its experience and decide how that experience feeds into its base motivations and emotional state.  Now, those 1000 possible responses might all satisfy its motivations, but emotional state will winnow them down.  It might only winnow it down to 30 responses, and then it is random as to which to choose.  It can try out what it thought was a good response, and then assess how successful the response to its response was.  This would then color what it will do the next time that same situation occurs.

For instance, Bob enters a room and Anna’s main motivation is to interact with the user.  Her secondary motivation is to have positive interactions with people.  Anna says hello and waits for a response.  Bob responds and asks Anna a question, which makes her happy since both motivations are being satisfied.  Since her motivation is to interact with the user and interact positively, her emotion would be quite happy, so she answers the question with that emotional context.  The person responds with a snide and rude comment.  Her base motivation is now overridden by the secondary motivation, since the motivation is to have a pleasant interaction, which this was not.  The person apologizes, so the override motivation is modified and she gives him a second chance.  The person asks her another question.  She is still miffed even after he apologized for his behavior, so that will color what the response is and how it is answered.  If this has happened several times before, she might still be pretty miffed, so she might answer the question but turn her back to him before answering it.

The next time she runs into this person, her emotional response will color her responses to the situation.  If she sees him, instead of “Hi Bob, how are you today?” in a cheery voice, it might be “Hi Bob.  How can I help you?” in a very neutral tone.  Motivation colored by emotional response defines her interactions with people and her environment.  If her motivations are being met, she will become happier.  That happiness she feels from previous interactions will also color her next interaction, even if the last time it was pretty negative.

How to model this?  That is a tough one.  Perhaps have an event collection which contains an optional user, and an experiences collection with an emotion index.  This event occurred on such and such a date, my emotion index to it is 32.  The closer the emotion occurred to the current time, the higher its impact.  So, if an experience happened two weeks ago, that emotion is not going to impact the robot’s emotional state as much as what happened yesterday.  The bot looks for events to occur which map to its collection of past events.  It then assesses emotion from its previous experiences mixed with its current state to compute a present emotion index.  It then computes its best response to the event and responds.
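
As a very rough sketch of that idea; the field names, the half-life, and the numbers are all made up:

```python
# Time-decayed emotion index: recent experiences weigh more than old ones.
# Field names, the half-life, and the sample numbers are illustrative only.
import datetime, math

experiences = [
    {"user": "Bob", "when": datetime.datetime(2014, 12, 1), "emotion_index": 32},
    {"user": "Bob", "when": datetime.datetime(2014, 12, 14), "emotion_index": -20},
]

def present_emotion(user, now, half_life_days=7.0):
    total, weight_sum = 0.0, 0.0
    for e in experiences:
        if e["user"] != user:
            continue
        age_days = (now - e["when"]).total_seconds() / 86400.0
        w = math.exp(-age_days * math.log(2) / half_life_days)  # recent counts more
        total += w * e["emotion_index"]
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

print(present_emotion("Bob", datetime.datetime(2014, 12, 15)))
```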

Regards,

Bill

 

 

Contexts and Emotions and Motivations, oh my!
Martin: one thing that might help you to keep private things private would be to have one of the types or fields of a Context be a PrivateContext. The user could set this on whatever conditions he feels to be private: any images of people, any information that comes from the bedroom; the user could have a command to set privacy on, or it could be specifically set on in the user’s program through the SDK.

Or, to make things safer (legally, I suppose), the PrivateContext could be assumed to be on at the start until the user’s programming sets it off through the SDK.

On emotions, I’d prefer to stick to a small list, which might include curiosity, attraction (how much it likes/dislikes someone), boredom, happiness, and fear. I would have each emotion include its opposite, so an emotion might be from -10 to 10 (or whatever numbers you prefer), as long as it is not useful to have both the emotion and its opposite at the same time.

Emotions might be general, like happiness, or tied to a person/place like attraction.

I think that motivations are a bit more complex than emotions. A motivation largely defines the interaction, whereas emotions mostly color it. For example, if Groucho had a motivation of explore but a fear of the dining room, he might choose to explore elsewhere, Groucho being a bit of a coward. Perhaps if there were an audience to impress or a strong motivation, he might go against his fear, but he might mutter words and shine lights brightly and go slowly.

I think that there could be a list of motivations and the highest one rules. By this I mean that each motivation would have a number that could change. For instance, the winning motivation might have a value of 5 at the moment, but if the battery gets too low, then hunger might take over with a 10.
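
A tiny sketch of what I mean; the emotion and motivation names and the numbers are just placeholders:

```python
# Emotions on a -10..10 scale and a motivation list where the highest value rules.
# All names and numbers are placeholders.
emotions = {"happiness": 3, "curiosity": 7, "fear": -2, "attraction": 0}

motivations = {"explore": 5, "talk": 4, "hunger": 1}

def dominant_motivation(motives, battery_level):
    # Example of a value changing with state: low battery pushes hunger to 10.
    if battery_level < 0.2:
        motives = dict(motives, hunger=10)
    return max(motives, key=motives.get)

print(dominant_motivation(motivations, battery_level=0.15))   # -> 'hunger'
```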

These are just thoughts off the top of my head.

re: Bill

Thanks for putting so much thought into this.  This is one of the big reasons I went into the massive re-write…to get the memory model ready to tackle behavioral issues like this.  

I share a lot of your initial viewpoints on this like:

1.  Motivations have to play a big part

My current Motivations engine puts all motivations on the same level.  I think motivations may need to be more of a hierarchy.  For example, I have motivations like “Talking” and “Exploring”.  The new part I might propose…I think that if a talking motive is dominant, then another set of sub-motivations become relevant…like a CuriosityTalkingMotive, EmotionalTalkingMotive, FunnyTalkingMotive, FactualTalkingMotive, NewsTalkingMotive, SpiritualTalkingMotive.  This set of “sub-motivations” would have to battle it out for dominance for the length of time that TalkingMotive is dominant at the higher level.  Talking motives are not to be confused with topics; they would influence the speech selection process while talking about a topic.

2.  Emotions have to play a part

The current state of emotions is pretty much the same as motivations.  While I have 10 emotions, I don’t really use much more than happy, sad, bored, and neutral (a Vulcan lack of emotion).  I should probably do more, but mainly the bot is happier when it is conversing, is confident in what it is saying, or learns something new, and sad when it does not know the answer to a question.

3.  You brought up Experience - like being shot down while flirting.  I don’t have any concepts yet for characterizing “experiences”.  I suppose I could come up with a metric for each conversation that scored how much the robot liked the conversation (whether it met its motivations and emotional needs).  As I write this, various ways to do this are shooting through my head…interesting.  One piece…if the person ignores the robot’s questions…this would be negative.  If the human teaches things to the robot…positive.  There are agents that handle smalltalk, and they already deal with interpreting positive/negative responses to questions like “How are you doing?”, so the robot should be able to derive something from how a human handles smalltalk with politeness, rudeness, etc.  Gears turning now.

I don’t know how to model most of this.  It seems like a huge area for experimentation.  I draw a few conclusions for places to start…

1. I need a set of “feature detectors” that run on text (sentences and entire conversations) to calculate various indexes like:  

How positive/negative is the text?  How emotional?  How humorous?  How polite/rude?

2.  I need to use the feature detectors on all sentences (thousands) and learned knowledge (hopefully millions) to score each text according to each metric and store all the scores.

3.  When the robot is talking and must choose between 1000 things to say, it can begin to use these scores to more heavily weigh the picking of humorous things if humor is a desired trait at that point in the conversation.

4.  A rules engine is probably needed for emotions, motivations, sentence selection, and events (see the sketch after this list).  I already use something that is the beginnings of one with topic-based conversation…questions are not asked unless all their pre-conditions are met.  Example:  Anna doesn’t ask me “How is <wife’s name> doing?” unless I have a wife and her name is known to the robot.  Conditions can be set up like Day=Tuesday or Age > 18.  I think the rules engine should support some kind of triggers such that you can specify “When x and y happen, increase motivation to flee.”  These rules need to be different for every robot that I hope will eventually use the shared brain.  That’s why I think as much of it as possible needs to be defined as metadata and not embedded in code, so that these behaviors can be constantly tinkered with without a code change.

5.  At some point I’ll have to tackle or recruit someone to tackle an API for recognizing faces and emotional state from video frames.  I have seen a few online but haven’t had the time to focus on it.
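
Here is the kind of metadata-driven condition checking I mean in item 4; the rule format and fact names are hypothetical, nothing here is final:

```python
# Hypothetical metadata-driven pre-conditions: rules live in data, not code,
# so they can be tinkered with without a code change. Fact names are made up.
def check(condition, facts):
    key, op, value = condition
    ops = {"=": lambda a, b: a == b,
           ">": lambda a, b: a > b,
           "<": lambda a, b: a < b}
    return key in facts and ops[op](facts[key], value)

rules = [
    {"say": "How is {WifeName} doing?",
     "when": [("HasWife", "=", True)]},
    {"say": "Happy Tuesday!",
     "when": [("Day", "=", "Tuesday")]},
    {"say": "How was work?",
     "when": [("Age", ">", 18), ("Day", "=", "Tuesday")]},
]

facts = {"HasWife": True, "WifeName": "<wife's name>", "Day": "Tuesday", "Age": 45}

for rule in rules:
    if all(check(c, facts) for c in rule["when"]):
        print(rule["say"].format(**facts))
```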

That’s all I got for now…baby steps towards what you are talking about I know.  Everything has to start somewhere.


Great stuff Martin. And Bill

Great stuff Martin. And Bill, thanks for this great insight; I am totally with you on this. I feel that machine learning is not just having a growing database of cross-referenced text. In this case the robot itself is really not aware of the concepts it is talking about; it simply retrieves strings from an associative memory. Instead, true human learning is based on feedback and emotion. A human brain (and forgive me for sounding very reductionist now) simply tries to optimize a function of endorphins, adrenaline, and serotonin; in short, it tries to maximize happiness. A robot should model these (chemical) processes, and (verbally) interact with its environment trying to maximize its ‘happy function’.

So, a high-level example: if you have a list of N jokes in robot memory, it could assign a weight to each entry. The robot will pick a joke with higher weight with increased probability. Every time a person reacts positively (laughter, ‘that’s a good one’), the robot’s happy function increments, and we increase the weight of the joke. Similarly, if we respond negatively (har-har, very funny), the robot gets sadder and we decrease the joke’s weight. Eventually, as the robot tries to maximize happiness, funny jokes survive, and bad ones die out. Since repeated jokes lose laughter, we will end up with a constantly evolving set of jokes that maximize funniness based on maximizing robot happiness.
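
As a toy sketch; the jokes, weights, and increments are arbitrary:

```python
# Toy sketch: weighted joke selection, with feedback nudging both the joke's
# weight and the robot's happiness. Jokes, weights, and increments are arbitrary.
import random

jokes = {"joke A": 1.0, "joke B": 1.0, "joke C": 1.0}
happiness = 0.0

def tell_joke():
    total = sum(jokes.values())
    r = random.uniform(0, total)
    for joke, weight in jokes.items():
        r -= weight
        if r <= 0:
            return joke
    return joke                      # fall back to the last joke

def feedback(joke, positive):
    global happiness
    delta = 0.5 if positive else -0.5
    jokes[joke] = max(0.1, jokes[joke] + delta)   # funny jokes survive,
    happiness += delta                            # bad ones fade out

told = tell_joke()
feedback(told, positive=True)        # "that's a good one" -> happier robot
```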

Another model could be, and this is how I personally feel human happiness often behaves, a happy-function that slowly decrements over time and gets incremented in steps when positive feedback is received. So achieving a goal (e.g. getting a positive/friendly reaction to something the robot says or asks) would bump happiness H = H + N, but every brain-cycle we subtract H–. This way the robot’s high slowly wears off from euphoria to boredom to depression, and it must say/do something to get happy again (sounds familiar? :wink: It would make for quite a neurotic robot, but hey, much more human ;-).

At a much lower level, you could try to make the robot truly understand a concept and try to synthesize grammar around the concept to successfully convey it, again using a happy-function to guide the process in an evolutionary way. Say we model ‘hunger’ (for battery power), and the robot’s happiness would deplete at the rate its battery does. The robot would try to regain happiness, trying out different words/sentences, maybe from a database of text it has heard somebody say or scraped from the web. Now if it says ‘tree’ or ‘coffeemachine’ nothing happens; the robot gets ignored. If it says ‘feed me’ or ‘need power’, we reward it with some fresh 120V juice, and bam, the association between the words feeding/power and the concept of hunger and its effect on happiness is learned. Next time its charge runs low, it would state ‘feed me’ immediately, and you could argue the robot truly understands the concept and associations. If you then give additional (social) rewards for more complex and comprehensible sentence structure, it can learn grammar: for example, if the robot adds ‘pretty please, feed me’, it would get more juice, or more social rewards (good robot).
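
A toy sketch of that ‘feed me’ idea; the phrases, the reward, and the learning rule are all invented for illustration:

```python
# Toy sketch: utterances tried at random gain weight for the current need when
# they are rewarded. Phrases, reward amounts, and the rule are all invented.
import random
from collections import defaultdict

heard_phrases = ["tree", "coffeemachine", "feed me", "need power", "play with me"]
association = defaultdict(lambda: defaultdict(float))   # need -> phrase -> weight

def try_phrase(need):
    weights = [1.0 + association[need][p] for p in heard_phrases]
    return random.choices(heard_phrases, weights=weights)[0]

def reward(need, phrase, amount):
    association[need][phrase] += amount   # e.g. a human plugs the charger in

for _ in range(50):                       # many hungry episodes
    said = try_phrase("hunger")
    if said in ("feed me", "need power"):
        reward("hunger", said, 1.0)

print(max(association["hunger"], key=association["hunger"].get))  # likely 'feed me'
```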

This is of course much harder and slower than ‘mimicking’ verbal skills by regurgitating web-scrapings, but it takes years to teach a baby to talk too. So is this approach way too low-level and impractical? Would you end up with a robot that starts out simply regurgitating garbled combinations of words and sentences it hears, and slowly becomes more coherent as it learns to associate a condition (hunger, loneliness) and a sentence (feed me, play with me) with increased happiness?

 

re: DangerousThing

On the privacy front, I’m not finished, but one of the ways I see it working is that some of the memory types (Atom Types) are by definition personal and private, so that when the bot remembers anything about a particular person, that is stored as a separate memory type that I intend to have safeguards on.  One of my annotation agents determines if known people are referenced in text either by name or pronoun, so determining if a sentence is personal in nature is not so hard.

One challenge is maintaining a conversation history.  I used to keep a history of everything that ever happened.  I stopped doing that for now.  I had no way to keep personal stuff out of it.  At one time the robot would babble things from history when it was trying to find something to say…and personal things would get repeated.  I don’t babble history anymore either, as I don’t have one.  I do keep a short-term history cached in memory but not saved.  This is used so the robot doesn’t respond exactly the same way to something that is repeated to it.

Moving forward, I’d like to keep a history, just not babble from it.  It tends to accumulate sentences that the robot may not comprehend now but may be able to in the future as its grammar understanding grows, kind of like a child that heard something its parents said but didn’t understand until it became an adult.  History could also be used to get a robot to learn new smalltalk responses from how humans responded over time.

On emotions, I went with a list from some behavioral paper I read.  If this is something you are into, I can try to track it down.  I have seen a few that range from 7-12 emotions.  I have 10 and a neutral emotion.  I think I use 4 of them.

On motivations, you basically described what I am doing…highest one rules.  It’s good and it works, but I can’t help but feel like it is the tip of some iceberg of what is really needed.  Much to figure out.

You mentioned contexts and I didn’t follow at first.  I realize now that for my stuff, the entire context is private.  It represents the state between a person and a given A.I.  It does contain some pieces that are pulled from elsewhere and might be in use by other people with other contexts (like info about the robot), but the context itself is basically private…a lot like a session object in web lingo.

Gotta sleep.

I would love to see a robot
I would love to see a robot taught grammar and speaking the way you suggest, but I think that Martin wants a more user-friendly package to share with the world.

And frankly I’ve seen better results with Anna than with most things from the professional NLP people.

I trust Martin to come up with a decent SDK that will make it practical for people to have talking robots. Of course, he may need to do a Kickstarter to get a server and bandwidth for all the people that will want to use it. :slight_smile:

But yes, it would be interesting to see a bot taught to interact with humans from “first principles.”

And I like the joke list you suggest, if we have a way of getting good feedback from humans. Maybe Martin or Bill have ideas on that, but I’m not even sure how to interpret the feedback from people when I tell a joke. That could be because I tend to tell them badly.

Contexts
Context is a slippery thing.

It describes aspects of the conversation.

It can indicate the place, the time, and some of the rules of the conversation.

The TimeContext and the PlaceContext might indicate a conversation direction. Some things need a memory. For example, if the place is the Kitchen and the time is about when I normally eat, then the question “What are you going to eat?” might be asked.

When I worked for the university, if I came through the front door, hadn’t been home most of the day, the time was well past six on a weekday, and it wasn’t a holiday, then the robot might ask “Was work bad today?”
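
Something like this sketch is what I imagine; the context fields, thresholds, and prompts are made up:

```python
# Hypothetical sketch of TimeContext/PlaceContext driving conversation openers;
# the context fields, thresholds, and prompts are all made up.
import datetime

def openers(ctx):
    prompts = []
    if ctx["place"] == "Kitchen" and ctx["now"].hour in (7, 12, 18):
        prompts.append("What are you going to eat?")
    weekday = ctx["now"].weekday() < 5
    if (ctx["place"] == "Front Door" and ctx["hours_since_last_seen"] > 8
            and ctx["now"].hour >= 18 and weekday and not ctx["is_holiday"]):
        prompts.append("Was work bad today?")
    return prompts

context = {"place": "Front Door", "hours_since_last_seen": 9,
           "is_holiday": False, "now": datetime.datetime(2015, 1, 6, 18, 30)}
print(openers(context))   # -> ['Was work bad today?']
```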

I was thinking a PrivateContext could be used for something that would never be remembered by the main server.

Another way of thinking about Motivations is that the list of Motivations is ordered (either a preset order or a changing order) and the highest one gets everything it wants, the next highest everything that isn’t used exclusively by a higher-level motivation, etc., so even if the “learn things” motivation is low, it can still use the sensors that aren’t in use by anything else, study the facial expressions of the person it is conversing with, etc.

One thing I was thinking about: imagine that each robot had a copy of the AI in it. It could still send things that it learns through the web to the main server at times.

I’m thinking about this because I am hoping to be divorced sometime soon and eventually I’m hoping to live in an RV. This means my internet contact will be both sporadic and slow - probably cellular internet. I don’t know.

Some Excessive Musings on Machine Learning

Thanks everybody for the great discussion, and thanks Jay (DangerousThing) for the vote of confidence, time will tell if I can live up to it!  Sorry, another long post coming!

Byerley, I generally agree with all your points.  I hope I didn’t imply that machine learning is just about having a growing database of cross-referenced text, it certainly is not.  It is just a step forward on a long path.  You touched on some very interesting topics about the nature of learning that I would like to share some of my thoughts on.

To move forward, we will need a lot of fancy algorithms, and a lot of fancy databases too, working together in concert.  For example, NLP in its current state would be impossible without WordNet or some equivalent.  How would parts of speech be determined for example?  There are basic needs like Dictionary, Thesaurus, Globe, etc. that we all had access to growing up as tools to aid our development.  Robots need these and more.  

When I was very young, my parents read to me for a bit, taught me to read, bought a lot of books including an Encyclopedia Set, and encouraged me to read, which I did.  They then stopped reading to me.  That is the point that I am at now with Anna…I’m tired of reading to her, it’s time for books.  She will understand little of what she reads for a while, but my guess is she will remember far more than I did, and grow to understand more than I did at any given age…as she is 2.

I have been working a great deal at trying to model conversational behaviors.  Recalling and articulating memories (regurgitating), whether they be personal experiences, wikipedia info, current events, quotes, or something they read recently, has its place.  Humans do it all the time in conversation.  They wouldn’t be very interesting if they didn’t, and people would avoid them as awkward or dull.  Consequently, most robots are dull, don’t take initiative, and only speak when spoken to, if at all.  I think it’s time to work on those problems.  Humans also reflect on the information they take in and form opinions and the like.  Before you can do this, you have to have a memory to reflect on.  Thoughtful reflection will be the tougher skill to develop.  I have a few ideas on that for another time.

The topic of true understanding is probably a philosophical one that few people, if any, are qualified to make any final judgements on.  I totally share the thoughts all you guys have expressed on this, understanding is a good thing we should strive for.  Pardon me for digressing for a few thought experiments though, as understanding might just be an illusion produced by the products of internals we do not yet understand.  Humanity is barely above caveman stage at understanding how we think, how our brains work, and what the universe truly is if that is even a valid concept.  I think we need to be careful about being too sure of anything with respect to how we think we learn or how machines “should” learn, or what we think we know, if anything.  That is why I fall back on my own personal experiences learning as a child.  The other day, I had a lengthy discussion with my wife on “Quantum Physics” and “Multiple Universes”.  Both of us were largely just articulating various things we had read on the subject and asking a few questions.   I am fairly confident that no one on the planet truly understands these concepts.  We can still talk about them and enjoy doing so.  Robots can do this too. 

I have seen people and articles that have tried to diminish various AIs, like Watson, for not being “real” machine learning, like there is nothing new to be figured out in making machines smart…yet the machine has learned and is clearly better than humans at some significant problem set, like Jeopardy.  To me, machine learning is early in its development, so machine learning is whatever we invent it to be for now, until the machines re-write our code altogether.  I hope when they do so, they don’t look down on us for the way we think…which might be warranted.

A few things I think I can say with certainty about learning…I didn’t learn about South America through trial and error, reinforcement, random guessing, etc.  I’ve never been there.  I learned by reading and listening.  I learned other things in other ways.  I learned about Santa Claus from my parents and the positive rewards he brought (an emotional/positive rewards example), only to figure out later at around 5 that my parents whom I trusted were not to be trusted entirely.  I learned deceit that way.  I played along with it for a few more years because of the positive rewards.  How many kids would ever remember Santa Claus if he didn’t bring toys?  I learned about heroism by reading Homer.  From comic books, I learned that Superman is fictional, but Batman is real.  When I was in school, Batman borrowed a campus security truck, took it for a ride, and left a note that said “Sorry, had to borrow your truck, Batmobile was in the shop.”  It was signed with the Batman symbol and was reported in the school paper.  A roommate of mine was run over by Christopher Reeve (Superman) on a ski slope and passed out.  He woke up to Superman making sure he was ok.  For him Superman is real, and Batman is fake.  I guess we each draw conclusions based on our perceptions.

I’ve been going back and re-reading about Marvin Minsky’s “Society of Mind”, which to me is inspiring for lighting a way forward.  There is so much wisdom there.  I am biased as I am building Anna in that vein.  I think there are sets of problems  that are best handled using different and specialized Agents, whether it be from rewards, listening, the web, whatever. 

In the area of NLP and reasoning through natural language questions, I think the need for specialized agents is especially valid, as the logic to deal with thinking through “What is faster than a bullet?” is very different from “Can a penguin fly?”.  I have seen people come up with “one trick” and try to apply it to everything, including NLP.  I think that path leads to failure, while the idea itself may be a good one and have applications.  The failure is in trying to apply it to everything, what I would call the “One Trick Trap”.  My general philosophy with AI would be to do “All of the Above” when it comes to deciding which techniques to use, and figure out a way to route traffic and arbitrate conflicts.  To me, this is a big part of what Minsky was saying.

A final idea on combining Society of Mind to Neural Nets ideas:  I think the neurons in a Neural Net don’t have to be dumb and trained.  You could make a neural net of intelligent agents that are each internally different (with code).  The agents can be in layers where the outputs of some become the inputs of others, so the complexity of any one agent can be kept small, and the agents can be loosely coupled if at all.  How could a net like this be trained?  I don’t know yet.  Anna is becoming a lot like that.

Gotta run…hope someone is both motivated and feels emotional satisfaction/rewards from reading my excessive babble.  Maybe I am living proof that a little learning is a dangerous thing.

Cheers,

Martin

Martin, the quote that “a little learning…”
Martin,

The quote that “a little learning is a dangerous thing” is only the first part of a larger quote. It basically says that a little learning is intoxicating so you know not how little you know, so drink deep and get the real benefits. Now to try to find the original:

A little learning is a dangerous thing; 
drink deep, or taste not the Pierian spring: 
there shallow draughts intoxicate the brain, 
and drinking largely sobers us again.

This is attributed to Alexander Pope.

And no, you are not an example of little learning. You have learned for yourself when you needed to, and programmed it into Anna.

Most of my disagreements with you have been over terminology. For example, an arrangement of intelligent agents is not an Artificial Neural Network. In computer science an ANN is a very specific thing. At this point I’m not sure of an ANN’s usefulness in Anna, but I wouldn’t mind being proven wrong and I’m open to reading anything that would prove me wrong.

I agree with you about the Society of Mind. You’ve done more with Anna using your toolbox of tricks than the academics have done with their one trick ponies for years.

One of the huge differences between a genius child and a genius adult is that the genius child is usually reciting facts that were learned by rote without giving them the critical thinking that they need. As the child grows up he also learns critical thinking. I think the in-between place between a child and an adult is where Anna is right now. You’ve given her some critical thinking to add to her rote learning. Getting her to be an adult may be the difficult part.

This is one of the reasons I am pushing for a relatively complex truth/strength value and different contexts. Adults can rationally look at facts and see how they fit into their world view without necessarily truly accepting these facts. Adults can think rationally about subjects that aren’t rational, such as most songs by They Might Be Giants. I think that this will become necessary because more people with children will use Anna.

Good luck, Martin.

Dangerous Quotes…

Thanks for the full quote.  Great quote.  I would still be trying to understand it if you hadn’t explained it.

I was curious if Anna knew this one before I did, so I checked, and there it was along with 9 others containing “dangerous thing”…

Memory#53369  A little learning is a dangerous thing; drink deep…

Along the way I ran into…

Atom#55295…A little learning is a dangerous thing but a lot of ignorance is just as bad.    …by Bob Edwards

I wonder which one of us (Me or the AI) will be the first to understand what all these thousands of quotes mean and put it into our own words as you did?  Until then, at least one of us will be using them in conversation.

A final dangerous quote for thought…

Atom#48847…Next to power without honor, the most dangerous thing in the world is power without humor.  …by Eric Sevareid   - This one has implications if AIs acquire devastating power, as many like Stephen Hawking and Elon Musk fear, without first developing a sense of humor.  Maybe Groucho and his brethren will be the saviors of mankind.

I’m on the same page with terminology on ANNs.  I was trying to create a new concept by combining features from ANNs and SOM, where the agents are somewhat intelligent (like SOM) but the wiring/organization has some similarities with ANNs.  I don’t have a proper term for it yet, but some lightbulbs went off when I was watching a YouTube video (might have been Stephen Wolfram) that got me thinking in layers of intelligent agents.  I’ll save that one for another discussion.

Layers of intelligent agents
Layers of intelligent agents have been done in computer science before. These were lower-level agents than yours, used for some of the earlier vision and language experiments. Think of basic vision.

Layer 1: Line Detector, Curve Detector, Bright Spot Detector

Layer 2: Crossing Lines/Curves Detector

Layer 3: Shape Detector

I’m not saying it was laid out exactly like this, but computer vision was built on primitives that fed into other layers which were slightly less primitive. There are reasons to believe the human brain uses something like this architecture to see at the basic level. I realized when I was first thinking about this in the late 1980s that understanding voice could be done in a very similar way. It could even be done in real time if you used a lot of computers (hey, they were slow back then) in such an architecture.

However, now we can do this on a single computer. That’s what 30 years will do!
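
Just to illustrate the layering idea; the detectors here are trivial stand-ins, not real vision code:

```python
# Purely schematic sketch of layered detectors: each layer's output is the
# next layer's input. The detectors themselves are trivial stand-ins.
def layer1_edges(pixels):
    # Layer 1: primitive features (here: bright pixel positions).
    return [(y, x) for y, row in enumerate(pixels)
            for x, v in enumerate(row) if v > 0.5]

def layer2_crossings(edges):
    # Layer 2: combinations of layer-1 features (here: adjacent edge points).
    return [(a, b) for a in edges for b in edges
            if a != b and abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1]

def layer3_shapes(crossings):
    # Layer 3: a crude "shape" score built from layer-2 output.
    return {"blob_score": len(crossings)}

pixels = [[0.0, 0.9, 0.8], [0.1, 0.9, 0.0]]
print(layer3_shapes(layer2_crossings(layer1_edges(pixels))))
```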

Right now I’m reading a book called Surfaces and Essences, which is about analogy being one of the core principles of the learning of concepts. I also got Minsky’s latest, and will pick up his Society of Mind as soon as Amazon will deliver it to me. I would rather have an ebook or PDF source but I haven’t found one yet.

TTFN