The goal of conversational agents, also referred to as chatbots, is to mimic and eventually exceed what is possible in human-to-human conversation. An important part of any human-to-human conversation, however, is intent discovery.
Think of it like this: when you approach someone staffing an information desk, their prime objective is to determine the intent of your request.
Once intent is established, the conversation can lead to resolution.
Within a chatbot, the first step in facilitating a conversation is intent recognition.
And herein lies the challenge: most chatbot platforms use a machine learning model of some kind to assign a user utterance to a specific intent.
From there, the intent is tied to a specific point in the state machine (also known as the dialog tree). As you can see from the sequence below, the user input "I am thinking of buying a dog." is matched to the intent Buy Dog, and from there the intents are hardcoded to dialog entry points.
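As a rough illustration, this intent-to-dialog wiring can be sketched in a few lines of Python. The classifier, intent names, confidence scores, and dialog-node identifiers below are all hypothetical stand-ins, not any platform's actual API:

```python
# A minimal sketch of the classic intent-to-dialog wiring.
# The classifier output and node names are invented for illustration.

def classify(utterance: str) -> list:
    """Stand-in for a real NLU model: returns ranked (intent, confidence) pairs."""
    text = utterance.lower()
    if "buy" in text and "dog" in text:
        return [("buy_dog", 0.92), ("buy_cat", 0.05)]
    return [("fallback", 0.0)]

# Each intent is hardcoded to a fixed entry point in the dialog tree.
DIALOG_ENTRY_POINTS = {
    "buy_dog": "node_dog_purchase",
    "buy_cat": "node_cat_purchase",
    "fallback": "node_did_not_understand",
}

def route(utterance: str) -> str:
    """Pick the top intent and jump to its hardcoded dialog node."""
    top_intent, _confidence = classify(utterance)[0]
    return DIALOG_ENTRY_POINTS[top_intent]

print(route("I am thinking of buying a dog."))  # node_dog_purchase
```

The dictionary is the "strait-laced layer" discussed below: every utterance must funnel through one of its fixed keys before the dialog can proceed.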
Below you see a single dialog node from IBM Watson Assistant, where the heading says: "If assistant recognizes". Under this heading, a very static and fixed condition or set of conditions can be defined.
Why Is This A Problem?
This strait-laced layer between the NLU machine learning model and the dialog is intransigent: in essence a conditional if-this-then-that mechanism that manages the conversation.
The list of intents is also a fixed, hardcoded, and pre-defined reference within a chatbot. Any conceivable user input needs to be anticipated and mapped to a single intent.
In turn, each intent in this rigid list is linked to a portion of the pre-defined dialog, as mentioned earlier.
So user input is matched to exactly one intent from a fixed list, and that intent determines which portion of the dialog runs. This poses a problem when a user asks an ambiguous question, or when an utterance cannot be matched to any intent at all.
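This failure mode can be made concrete with a small sketch, assuming a hypothetical confidence threshold of 0.7 and made-up scores; real platforms differ in how they surface and handle low confidence:

```python
# Sketch of why a fixed intent list struggles with ambiguous input.
# The scores and the 0.7 threshold are hypothetical.
CONFIDENCE_THRESHOLD = 0.7

def resolve_intent(ranked: list) -> str:
    """Return the top intent, or a generic fallback when confidence is low."""
    top_intent, confidence = ranked[0]
    if confidence < CONFIDENCE_THRESHOLD:
        # Ambiguous or unmatched input falls through to a generic branch,
        # even though the user had a perfectly clear goal in mind.
        return "fallback"
    return top_intent

print(resolve_intent([("buy_dog", 0.95)]))                     # buy_dog
print(resolve_intent([("buy_dog", 0.41), ("buy_cat", 0.39)]))  # fallback
```

Note that the second utterance is not meaningless; it is simply split between two similar intents, and the rigid layer has no way to express that nuance.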
But what if the layer of intents could be deprecated, and user input mapped directly to the dialog?
This development is crucial in order to move from a messaging bot to a conversational AI interface.
This layer of intents is also a layer of translation which muddies the conversational waters.
How would it look if intents were optional and could be bypassed, with user input directly mapped to a user story?
No Intent Stories
There are two ways of approaching no-intent stories. Below is the simplest: a no-intent conversation living in the same training file as other intent-based stories.
Glaringly, intents and actions are absent; the story consists purely of user and bot turns.
stories:
- story: No Intent Story
  steps:
  - user: "hello"
  - bot: "Hello human!"
  - user: "Where is my nearest branch?"
  - bot: "We are digital! Who needs branches."
  - user: "Thanks anyway"
  - bot: "You are welcome. No to branches!"
Below you can see a conversation which invokes this story, and the deviations from the trained story are obvious.
ML Story defined on the left, and an interactive test conversation on the right. Rasa X and interactive learning are not yet available for no-intent stories.
The next step is to look at a hybrid approach, where no-intent dialogs can be introduced into existing stories.
Looking at the story below, you will see the story name, flowing into an intent and its action, and then user input captured without any intent, followed again by an action.
stories:
- story: account_checking
  steps:
  - intent: tiers
  - action: utter_tiers
  - user: "just give that to me again?"
  - action: utter_tiers
Here is the conversation with the chatbot:
ML Story is defined on the left, and an interactive test conversation on the right. Rasa X and interactive learning are not yet available.
From a user perspective, the context of the conversation exists in the user's mind. Someone might be ordering a pizza, ask for extra cheese, and then say in the next dialog turn, "That is too expensive".
From a user perspective, the message is to cancel the extra cheese. From a dialog and contextual perspective, this is not so obvious. Building truly contextually aware chatbots is not an easy feat.
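To make the problem concrete, here is a small sketch of context-dependent interpretation. The labels and rules are invented for illustration and do not reflect how Rasa actually resolves context:

```python
# Sketch: the same utterance means different things depending on
# what the user said in the previous turn. All labels are hypothetical.

def interpret(utterance: str, last_user_request):
    """Resolve an utterance against the previous user request, if any."""
    if utterance == "That is too expensive":
        if last_user_request == "add extra cheese":
            # In context, the user is most likely retracting the add-on.
            return "cancel: add extra cheese"
        # Without context, the complaint is ambiguous.
        return "unclear: needs clarification"
    return "other"

print(interpret("That is too expensive", "add extra cheese"))
print(interpret("That is too expensive", None))
```

A context-free intent classifier only ever sees the first argument; the second is exactly what a contextually aware dialog policy has to supply.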
Rasa wants the context of the conversation to affect the prediction of the next dialog.
Looking at possible advantages and disadvantages…
- With solutions like Rasa, Microsoft LUIS, Amazon Lex etc., the NLU service is separate from the dialog/state machine component, which means the NLU model can be used as a standalone resource within an organization. With the deprecation of intents, NLU and dialog/state management merge, and it is hard to see how that standalone reuse would remain possible.
- Perhaps this gives rise to a scenario where more user stories need to be created, and where user stories cannot be too similar, seeing that the conversation pivots on and relies heavily on the training data in the ML stories.
- Rasa is the avant-garde of the chatbot world, pushing the boundaries. Intent deprecation is inevitable. If any chatbot will attempt this successfully, it will be Rasa.
- The case is not made for 100% dedication to either intents or no-intents. No-intent user stories can be used as a quick solution, especially for chit-chat and small talk. This is already done to some degree; think of Microsoft's QnA Maker, which is not intent-based but is limited in scope and functionality. Also think of IBM Watson Assistant's new Actions dialogs, which are really a quick and easy way to roll out a simple dialog interface, but cannot serve as a complete solution.