“The medium is the message.” ― Marshall McLuhan
For Starters, What is Fine-Tuning?
In general, the aim of most chatbot development frameworks is to create an environment which makes onboarding easy for people with a medium level of technical skill.
In most cases, performing only Natural Language Processing (NLP) tasks demands a simpler data-in-data-out environment, without any dialog state management, disambiguation, response text management and the like.
As a conversational agent grows and evolves over time, more complexity is introduced. This complexity is constituted by elements like dialog management, maintaining context, slots, contextual awareness, disambiguation, digression etc.
Hence a need arises for flexibility in the interface used to develop and manage the chatbot, and this is especially the case with the dialog state.
The challenge is to have a natural and adaptive dialog which is also predictable and manageable. These two goals pull in opposite directions.
Think of a natural and adaptive dialog in terms of the OpenAI language API (based on GPT-3) as opposed to IBM Watson’s Dialog State Management, for instance.
The more no-code or low-code the solution becomes, the more the fine-tuning capability diminishes. Examples here are OpenAI Language API, Design Canvases etc.
The more pro-code the environment, for instance, Microsoft Bot Framework or Cisco MindMeld, the more fine-tuning is possible. But complexity is introduced in the management and development of the Conversational AI environment.
Low-code interfaces are usually made available via a single tool or a collection of tools which are very graphic in nature and initially intuitive to use.
Thus they initially give the impression of rapid onboarding and of speeding up the delivery of solutions to production.
However, as with many approaches of this nature, once functionality, complexity and scaling start playing a role, huge impediments are encountered.
When someone refers to the ability, or the extent to which fine-tuning can be performed, what exactly are they referring to?
A few common elements which constitute fine-tuning are:
- Forms & Slots for capturing user input.
- Natural Language Generation (NLG)
- Dialog State Management
- Chatbot Message/Text Management
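The first of these, forms and slots, can be illustrated with a minimal Python sketch. The `Form` class and the slot names (`city`, `date`, `guests`) are hypothetical and not taken from any particular framework; the sketch only shows the underlying idea of collecting required values before a dialog can proceed.

```python
from dataclasses import dataclass, field

@dataclass
class Form:
    """A form collects a fixed set of slots before the dialog can proceed."""
    required_slots: list
    slots: dict = field(default_factory=dict)

    def next_missing_slot(self):
        # Return the first required slot not yet filled, or None when complete.
        for name in self.required_slots:
            if name not in self.slots:
                return name
        return None

    def fill(self, name, value):
        self.slots[name] = value

# Hypothetical booking form: slot names are illustrative only.
booking = Form(required_slots=["city", "date", "guests"])
booking.fill("city", "London")
print(booking.next_missing_slot())  # -> date
```

A real framework would layer validation, re-prompting and interruption handling on top of this loop, which is exactly where the fine-tuning ability of a platform shows itself.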
General Chatbot Trends
For starters, there are six general chatbot trends emerging…
1️⃣ VoiceBots / Speech Interfaces. There has been growing activity in voice/speech interfaces, particularly access via a phone call, and not necessarily a dedicated voice assistant device.
IBM Watson Voice Agent was launched in 2018, but from March 2021 it will be deprecated and fully integrated into Watson Assistant as the newly released phone integration.
There is however a loss of fine-tuning ability, hence how this will play out in practice remains to be seen.
2️⃣ Intent Deprecation. One scenario is that intent-less skills are built by business units and not technical teams, and these skills act as an extension of an existing assistant.
Examples of intent deprecation implementations are IBM Watson Actions, Microsoft Power Virtual Agents and Amazon Alexa Conversations.
3️⃣ Merging Intents & Entities. Intents and Entities continue to merge and contextual annotation of entities within the intent or utterance is becoming commonplace & very necessary.
Compound entities are also becoming more important. The merging of intents and entities is a process where entities are tightly coupled to a certain context within a certain intent, resulting in an efficient feedback loop.
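The merging can be pictured as training data in which entities are annotated in context, inside the intent's own example utterances, rather than maintained in a separate list. The structure below is a hypothetical, simplified representation, not the schema of any specific framework.

```python
# Hypothetical NLU training examples: entities are annotated in context,
# inside the intent's own utterances, rather than kept in a separate list.
training_data = [
    {
        "intent": "book_flight",
        "text": "Fly from Berlin to London on Friday",
        "entities": [
            {"value": "Berlin", "entity": "city", "start": 9, "end": 15},
            {"value": "London", "entity": "city", "start": 19, "end": 25},
            {"value": "Friday", "entity": "date", "start": 29, "end": 35},
        ],
    },
]

def entities_for_intent(data, intent):
    """Collect every annotated entity type under a given intent."""
    return [e["entity"] for ex in data
            if ex["intent"] == intent
            for e in ex["entities"]]

print(entities_for_intent(training_data, "book_flight"))
# -> ['city', 'city', 'date']
```

Because each entity is anchored to a span inside an intent example, corrections to one annotation feed back into both intent and entity models at once, which is the feedback loop referred to above.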
4️⃣ Entity Data Structures. Data structures are introduced to Entities… This trend is visible with Rasa, Alexa Conversations tools and especially Microsoft LUIS. Rasa calls it Entities Roles & Groups. AWS calls it Slots with Properties.
And Microsoft LUIS has machine-learned entities which can be decomposed. Cisco MindMeld has also spent time building entities out.
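The value of these entity data structures shows most clearly when one utterance contains two entities of the same type. The sketch below mirrors the style of Rasa's entity roles in plain Python; the parsed structure is a toy stand-in, not the output of a real NLU model.

```python
# Two "city" entities in one utterance are only distinguishable by role.
# The roles ("departure", "destination") mirror the idea behind Rasa's
# entity roles; the parsed structure itself is a hypothetical sketch.
utterance = {
    "text": "I want to fly from Berlin to London",
    "entities": [
        {"value": "Berlin", "entity": "city", "role": "departure"},
        {"value": "London", "entity": "city", "role": "destination"},
    ],
}

def slot_for_role(parsed, role):
    """Pick the entity value that carries a given role."""
    for e in parsed["entities"]:
        if e.get("role") == role:
            return e["value"]
    return None

print(slot_for_role(utterance, "destination"))  # -> London
```

Without roles, both values would land in the same `city` bucket and the dialog layer would have to guess which city is which; with roles, the slot assignment is deterministic.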
5️⃣ Edge Installations. Edge installations are becoming more important… NVIDIA Riva and Rasa come to mind for install-anywhere deployment, for instance in autonomous vehicles.
6️⃣ State Machine Deprecation. Deprecation of the State Machine is inevitable, and Rasa is leading the charge here. IBM is introducing automation to its Dialog Management system with customer effort scores and auto-disambiguation menus. Watson Actions also need to be mentioned.
Most frameworks converge on ideas like intents, entities and dialog messaging, and similar approaches are followed. When it comes to dialog state development and management, however, the approaches diverge significantly.
NVIDIA is working on Riva Studio which will most probably include dialog state development. Something which is not part of Riva now. The current Riva demos make use of Rasa and Google Dialogflow for dialog management. This illustrates the versatility of NVIDIA Riva.
Initially the technical & design decisions are easy. However, as technology grows and the chatbot scales, those design & technical decisions become harder and loaded with ramifications. Hence careful initial considerations are necessary. Especially if an investment is made; otherwise a prototype/test approach can be followed.
Cross-Industry & Technology Trends
Chatbot trends which are being seen across industries and technology platforms are:
- Intent deprecation
- Intent Disambiguation with Auto Learning Menus
- The merging of intents and entities
- Deprecation of the State Machine; or at least, of a rigid, conditional state machine.
- Complex entities; the introduction of entities with properties, groups, roles etc.
There is both horizontal and vertical growth in chatbot technology.
From the diagram above it is clear where this growth is taking place:
Vertical — Technology
The Conversational UI is moving away from a structured, preset menu and keyword-driven interface, towards unstructured natural language input and longer conversational input.
Users are allowed to disambiguate when two or three intents are close in score, and this is used as a mechanism for auto-learning.
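The disambiguation-plus-auto-learning idea can be sketched in a few lines of Python. The function, the score values and the `margin` threshold are all illustrative assumptions, not taken from any framework.

```python
# Toy disambiguation: when the top intent scores are too close, ask the user
# instead of guessing. The margin threshold is an illustrative assumption.
def disambiguate(scores, margin=0.1):
    """Return one intent, or a menu of near-tied intents for the user to pick."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top, runner_up = ranked[0], ranked[1]
    if top[1] - runner_up[1] < margin:
        # Close call: surface a menu. The user's choice can later be logged
        # and fed back as training data -- the auto-learning loop.
        return {"menu": [name for name, s in ranked[:3] if top[1] - s < margin]}
    return {"intent": top[0]}

print(disambiguate({"pay_bill": 0.52, "view_bill": 0.49, "cancel": 0.20}))
# -> {'menu': ['pay_bill', 'view_bill']}
```

Each menu selection is effectively a free, user-supplied label for an ambiguous utterance, which is why close-score disambiguation doubles as a training-data pipeline.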
Horizontal — User Experience
In this dimension the bot is transforming from a messaging bot to a truly conversational interface. Away from click navigation to eventual unrestricted compound natural language input from the user.
In Closing, The Digital Employee
The end-game is where the digital employee, emerging from the chatbot environment, has evolved into areas of text and speech.
With contextual awareness on four levels:
- Within the Current Conversation
- From Previous Conversations
- Context gleaned from CRM & Other Customer/User Related Data Sources
- And across different mediums
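The four levels above could be modeled as a single context object the digital employee consults on every turn. Everything here — the class, its field names and the fallback order — is a hypothetical sketch of the idea, not a real framework API.

```python
from dataclasses import dataclass, field

# Hypothetical container for the four levels of contextual awareness.
@dataclass
class ConversationContext:
    current_turns: list = field(default_factory=list)      # within the current conversation
    previous_sessions: list = field(default_factory=list)  # from previous conversations
    crm_profile: dict = field(default_factory=dict)        # CRM & other customer data
    medium: str = "text"                                   # the medium currently in use

    def resolve(self, key):
        """Look in the nearest context first, then fall back outward to CRM data."""
        for turn in reversed(self.current_turns):
            if key in turn:
                return turn[key]
        return self.crm_profile.get(key)

ctx = ConversationContext(crm_profile={"name": "Ada"})
ctx.current_turns.append({"city": "London"})
print(ctx.resolve("city"), ctx.resolve("name"))  # -> London Ada
```

The nearest-first fallback order matters: a value stated in the current conversation should always win over stale CRM data, while CRM data fills the gaps the conversation has not covered.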
The digital employee will grow across different mediums and modalities, mastering languages with detection and translation, gauging tone and sentiment, and automatically categorizing conversations.
Mediums will include devices like Google Home, Amazon Echo, traditional IVR and more. Just as we humans can converse in text or voice, so the digital employee will be able to converse in text or voice.