Currently There Are Four Distinct Chatbot Dialog Development Approaches
Chatbot Dialog Design & Development Are Settling Into Four Main Categories
Introduction
When different vendors and platforms converge on the same basic approach and principles, it is reasonable to assume that the approach is an efficient way of solving the problem.
Looking at the large and emerging chatbot platforms, they all converge on two key aspects:
- Intents
- Entities
There are other elements, such as form or slot filling and policies, which are also important. But for the purpose of this story, we will not focus on them.
Intents
Intents are usually defined by a description and a few training or utterance examples.
The utterance examples are what a user is anticipated to say or state in order to have that intent fulfilled.
Some environments have additional features like intent conflict identification and utterance suggestions.
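As a rough illustration of the points above, an intent can be modelled as a name, a description and a handful of example utterances. The sketch below is platform-neutral: the `Intent` class and the `find_conflicts` helper are hypothetical names, and the conflict check is deliberately naive (the same normalised utterance listed under two intents). It is not how any particular vendor implements conflict identification.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """A platform-neutral intent: a name, a description, example utterances."""
    name: str
    description: str
    utterances: list[str] = field(default_factory=list)


def normalise(text: str) -> str:
    # Lower-case and collapse whitespace so trivially different spellings match.
    return " ".join(text.lower().split())


def find_conflicts(intents: list[Intent]) -> list[tuple[str, str, str]]:
    """Return (utterance, first_intent, second_intent) for duplicated examples."""
    seen: dict[str, str] = {}
    conflicts = []
    for intent in intents:
        for utterance in intent.utterances:
            key = normalise(utterance)
            # Remember which intent claimed this utterance first.
            owner = seen.setdefault(key, intent.name)
            if owner != intent.name:
                conflicts.append((key, owner, intent.name))
    return conflicts


intents = [
    Intent("book_table", "User wants to reserve a table",
           ["I want to book a table", "Reserve a table for two"]),
    Intent("cancel_booking", "User wants to cancel a reservation",
           ["Cancel my booking", "I want to book a  table"]),
]
conflicts = find_conflicts(intents)
print(conflicts)
```

Real platforms typically compare utterances by semantic similarity rather than exact match, so treat this only as a picture of the idea.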
Entities
Entities are the nouns the user enters, and often multiple entities are compounded in one user utterance.
The challenge here is to extract the entities when first uttered by the user.
Most environments can extract compound entities as well as contextual entities, and more chatbot platforms are adding this support.
The option to contextually annotate entities is also on the rise.
Entities often have a finite set of defined values. Then there are entities which cannot be represented by a finite list, like cities in the world, names or addresses.
These entity types have too many variations to be listed individually.
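The difference between finite and open-ended entities can be sketched in a few lines. In this hypothetical example, `cuisine` and `price` are looked up in fixed value lists, while `location` is picked up from context (a capitalised phrase after "in", "to" or "from"). The value lists, the pattern and the `extract_entities` function are assumptions made for illustration; production platforms use trained models rather than a regular expression.

```python
import re

# Finite entities: every allowed value can be listed up front.
FINITE_ENTITIES = {
    "cuisine": {"spanish", "italian", "thai"},
    "price": {"cheap", "moderate", "expensive"},
}

# Open-ended entity: cities cannot be enumerated, so this toy rule relies on
# context instead (a capitalised phrase following "in", "to" or "from").
CITY_PATTERN = re.compile(r"\b(?:in|to|from)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)")


def extract_entities(utterance: str) -> dict[str, str]:
    """Extract several (compound) entities from a single utterance."""
    found: dict[str, str] = {}
    tokens = re.findall(r"[a-z]+", utterance.lower())
    for entity, values in FINITE_ENTITIES.items():
        for token in tokens:
            if token in values:
                found[entity] = token
                break
    match = CITY_PATTERN.search(utterance)
    if match:
        found["location"] = match.group(1)
    return found


result = extract_entities("I want a cheap spanish restaurant in Cape Town")
print(result)
```

Here three entities are extracted from one utterance, which is the compound case described above.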
Dialog Creation: Development & Management
The area where there is divergence rather than convergence is that of Dialog Development and Management.
This means the jury is still out on which approach is ideal…
Dialog Creation
The dialog component has the responsibility of managing the state of the conversation, turn-by-turn. This is also referred to as the state machine, dialog flow (in generic terms and not Google’s platform), conversation design or dialog management.
It should be noted that some environments have a clear separation between their NLU component and the core or dialog components. Rasa, Microsoft and AWS fall in this category.
IBM Watson Assistant has the opposite approach where the two are really merged.
Below I look at four distinct approaches currently followed in the chatbot marketplace to dialog creation and management.
1. Design Canvas
A design canvas environment is part and parcel of the new Botsociety design environment.
In the Botsociety environment, designs can be deployed to solutions like Rasa, Microsoft Bot Framework and more.
It is evident that Botsociety is becoming more technical in nature and more complex.
Clearly the product is in the process of morphing from a design, presentation and prototype-only tool into a conversation development tool.
You can choose to export your design to:
- Bot Framework / Azure
- Dialogflow
- Rasa.ai
- Or to your own codebase (API)
Dialogflow CX also leverages this canvas approach where you can map out a complex conversation and expand pages or conversational nodes.
This approach makes it easy to kick off a chatbot project, and designers feel comfortable initially.
But as complexity grows, scaling and management are impacted.
Advantages of this approach are:
- Ease of collaboration
- Panning and viewing of the design
- Zoom in and out to see more or less detail
- Combining of the design and development process.
- Suitable for quick prototyping & cocreation.
Disadvantages of this approach are:
- Complexity of large implementations
- Change management and impact assessments
- Troubleshooting and identifying conversation break points.
- Multiple conditions per dialog node which are impacted when parameters change.
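A canvas design is, underneath, a directed graph of conversational nodes. One way to picture the impact-assessment problem listed above is to ask which nodes can lead to the node you are about to change. The sketch below assumes a hypothetical canvas stored as a dictionary of transitions; the node names and the `nodes_affected_by` function are invented for illustration and do not correspond to any vendor's export format.

```python
from collections import deque

# Hypothetical canvas: each node maps to the nodes it can transition to.
CANVAS = {
    "welcome": ["ask_location", "faq"],
    "ask_location": ["ask_cuisine"],
    "ask_cuisine": ["confirm_booking"],
    "faq": ["welcome"],
    "confirm_booking": [],
}


def nodes_affected_by(changed: str, canvas: dict[str, list[str]]) -> set[str]:
    """Return every node from which the changed node can be reached."""
    # Invert the edges so we can walk backwards from the changed node.
    parents: dict[str, list[str]] = {node: [] for node in canvas}
    for node, targets in canvas.items():
        for target in targets:
            parents[target].append(node)

    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in parents[current]:
            if parent not in affected and parent != changed:
                affected.add(parent)
                queue.append(parent)
    return affected


affected = nodes_affected_by("ask_cuisine", CANVAS)
print(sorted(affected))
```

Even in this five-node canvas, changing one node touches three conversation paths, which hints at why large canvases become hard to manage.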
2. Dialog Configuration
You might ask what the difference is between a design canvas and dialog configuration…
Dialog configuration is an approach where you don’t quite have a canvas to design on, but conversational nodes are defined graphically.
These dialog nodes are arranged in a linear fashion, and the development environment is more rigid and sequential.
Within each dialog node conditions are set, and the conversation can skip up or down within the sequence, which can lead to confusion.
IBM Watson Assistant follows this design principle.
For each dialog node conditions can be set and certain outcomes defined. Dialogflow ES also resembles a dialog configuration approach, as does Microsoft Bot Framework Composer. Microsoft Power Virtual Agents is also based on a dialog configuration approach.
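The idea of sequential nodes with conditions and jumps can be sketched as follows. This is a minimal, vendor-neutral model: the `DialogNode` class, the node names and the `respond` function are hypothetical, and following a jump without re-checking the target's condition is a simplification chosen for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DialogNode:
    name: str
    condition: Callable[[dict], bool]   # evaluated against the conversation context
    response: str
    jump_to: Optional[str] = None       # optional skip to another node in the list


# Nodes are evaluated top to bottom; the first matching condition wins.
NODES = [
    DialogNode("greet", lambda ctx: ctx["intent"] == "greet", "Hello! How can I help?"),
    DialogNode("need_location",
               lambda ctx: ctx["intent"] == "book_table" and "location" not in ctx,
               "Which city?"),
    DialogNode("book",
               lambda ctx: ctx["intent"] == "book_table",
               "Booking your table.", jump_to="goodbye"),
    DialogNode("goodbye", lambda ctx: ctx["intent"] == "bye", "Goodbye!"),
    DialogNode("fallback", lambda ctx: True, "Sorry, I did not understand."),
]


def respond(ctx: dict) -> list[str]:
    """Return the responses produced for one user turn."""
    by_name = {node.name: node for node in NODES}
    for node in NODES:
        if node.condition(ctx):
            replies = [node.response]
            # Follow jumps without re-checking conditions, as a simplification.
            while node.jump_to:
                node = by_name[node.jump_to]
                replies.append(node.response)
            return replies
    return []
```

Calling `respond({"intent": "book_table"})` asks for the city, while adding a `location` to the context reaches the booking node and jumps to the goodbye node. Note that the outcome depends on node order: placing `book` above `need_location` would silently skip the question, which is the kind of cascade this approach is prone to.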
Advantages of this approach are:
- Slightly more condensed presentation of the conversation
- Restrictive nature prohibits impulsive changes.
- More technical in nature with varying levels of configuration.
- Suitable for quick prototyping.
Disadvantages of this approach are:
- Difficult to present and to perform a walk-through.
- For larger conversations there is mounting complexity and cross-referencing.
- Parameter and settings changes can cascade, so constant care is needed.
- Not suited as a conversation design tool.
3. Native Code
Native code makes the case for a highly flexible and scalable environment. Solutions which come to mind in this category are Amazon Lex and, to some extent, Alexa skills.
Microsoft Bot Framework, running on native code, stands out in particular. The advantage here is that non-proprietary code can be used. In the case of Microsoft Bot Framework, C# or Node.js can be used. In the case of Lex or Alexa skills, which rely on Lambda functions, you will most probably use Node.js or Python.
Native code affords you much more agility and flexibility, although there is a chasm between design and implementation. A shared understanding needs to be established, and cocreation is inhibited.
There are also solutions which use proprietary code for the dialog, such as Oracle with their BotML.
Advantages of this approach are:
- Non-proprietary development environment and language.
- Flexible and accommodating to change in scaling (in principle).
- No dedicated or platform-specific skills or knowledge required.
- Porting of code, or even re-use.
Disadvantages of this approach are:
- Design and implementation are far removed from each other.
- Design interpretation might be a challenge.
- Another, most probably dedicated, design tool will be required.
- The complexity of managing different permutations in the dialog does not disappear; it still has to be handled within the code.
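To make the native-code approach concrete, here is a minimal sketch of a hand-written turn handler. It is plain Python and does not use the Lex, Alexa or Bot Framework SDKs; the `handle_turn` function, the slot names and the prompts are all hypothetical. The point is that every permutation of the dialog becomes ordinary program logic that the developer must write and maintain.

```python
REQUIRED_SLOTS = ["location", "cuisine", "num_people"]

PROMPTS = {
    "location": "Which city are you looking in?",
    "cuisine": "What kind of cuisine would you like?",
    "num_people": "For how many people?",
}


def handle_turn(state: dict, intent: str, entities: dict) -> tuple[dict, str]:
    """Process one user turn and return (new_state, bot_reply)."""
    # Merge newly extracted entities into the slots collected so far.
    slots = {**state.get("slots", {}), **entities}
    new_state = {**state, "slots": slots}

    if intent == "cancel":
        return {"slots": {}}, "No problem, I have cancelled the booking request."

    # Ask for the first required slot that is still missing.
    for slot in REQUIRED_SLOTS:
        if slot not in slots:
            return new_state, PROMPTS[slot]

    reply = (f"Booking a {slots['cuisine']} restaurant in {slots['location']} "
             f"for {slots['num_people']}.")
    return new_state, reply


state: dict = {}
state, reply1 = handle_turn(state, "inform", {"location": "rome"})
state, reply2 = handle_turn(state, "inform", {"cuisine": "spanish"})
state, reply3 = handle_turn(state, "inform", {"num_people": 4})
```

Each new requirement (corrections, digressions, confirmation steps) adds another branch to this function, which is where the hidden complexity of the approach accumulates.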
4. ML Stories
Rasa finds itself alone in this category, which it invented and pioneered. Rasa applies ML, and the framework calculates the probable next conversational node from a basis of user stories.
stories:
- story: collect restaurant booking info  # name of the story - just for debugging
  steps:
  - intent: greet                         # user message with no entities
  - action: utter_ask_howcanhelp
  - intent: inform                        # user message with entities
    entities:
    - location: "rome"
    - price: "cheap"
  - action: utter_on_it                   # action that the bot should execute
  - action: utter_ask_cuisine
  - intent: inform
    entities:
    - cuisine: "spanish"
  - action: utter_ask_num_people
Stories example from: https://rasa.com/docs/rasa/stories 👆
Rasa’s approach seems quite counterintuitive: instead of defining conditions and rules for each node, the chatbot is presented with real conversations. The chatbot then learns from these conversational sequences to manage future conversations.
These different conversations, referred to as Rasa Stories, are the training data employed for creating the dialog management models.
Slots and forms can be incorporated, and the idea of CDD (Conversation-Driven Development) underpins continuous improvement of the models.
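To give a feel for learning the next step from example conversations, the toy sketch below counts which bot action followed each short window of preceding events and predicts the most frequent one. This is emphatically not Rasa's implementation; Rasa's policies are trained ML models. The `train` and `predict_next_action` functions, the event strings and the second story (with `bye` and `utter_goodbye`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Each story is a sequence of events: user intents and bot actions.
STORIES = [
    ["intent:greet", "action:utter_ask_howcanhelp", "intent:inform",
     "action:utter_on_it", "action:utter_ask_cuisine", "intent:inform",
     "action:utter_ask_num_people"],
    ["intent:greet", "action:utter_ask_howcanhelp", "intent:bye",
     "action:utter_goodbye"],
]


def train(stories: list[list[str]], history: int = 2) -> dict:
    """Count which action followed each window of preceding events."""
    counts: dict[tuple, Counter] = defaultdict(Counter)
    for story in stories:
        for i, event in enumerate(story):
            if event.startswith("action:"):
                context = tuple(story[max(0, i - history):i])
                counts[context][event] += 1
    return counts


def predict_next_action(counts: dict, recent_events: list[str], history: int = 2):
    """Return the most frequent action seen after the recent events, if any."""
    context = tuple(recent_events[-history:])
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]


model = train(STORIES)
next_action = predict_next_action(
    model, ["intent:greet", "action:utter_ask_howcanhelp", "intent:inform"])
print(next_action)
```

A lookup table like this cannot generalise to conversations it has never seen, which is precisely the gap a trained model is meant to close. Adding a new behaviour still means adding a story rather than editing conditions on a node.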
Advantages of this approach are:
- It moves away from the hand-crafted state machine, which many practitioners feel needs to be deprecated.
- Training time is reasonable.
- No dedicated or specific hardware required.
- No dedicated ML experts and data scientists required…AI for the masses.
- Complexity is hidden and presented in a simple way.
Disadvantages of this approach are:
- This approach may seem abstract and intangible to some.
- Apprehension in instances where mandatory data needs to be collected, or where legislation dictates conditions. However, here Form Policies come into play.
Conclusion
There is so much talk of chatbots with or without machine learning…
Some degree of machine learning is always involved when training is performed on intents and entities. Unfortunately, in most cases this is not extended to the dialog flow and management.