Photo by Lina Loos on Unsplash

Five Approaches To Managing Conversational Dialog

And Which Elements Can Play A Supporting Role

Cobus Greyling
8 min read · Jun 6, 2022

Introduction

When building a chatbot, developing, managing and fine-tuning the dialog flow state is important. Certain conversational AI elements are intended to be used as the backbone of the dialog; others should be used in a supporting capacity, or only for specific use-cases.

There are 5 broad approaches to managing a conversation:

  1. Intents, Entities, Dialog State Systems & Bot Messages
  2. ML Stories
  3. Chitchat
  4. Large Language Models
  5. Knowledge Base Management

1️⃣ Intents, Entities, Dialog State Systems & Bot Messages

This is the stock-standard approach to digital assistants or chatbots. Intents sit at the frontline, sometimes fronted by a high-pass Natural Language Processing (NLP) layer for sentence boundary detection, summarisation, classification, keyword extraction, etc. This is followed by entity extraction and state-based or condition-based dialog management.
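As a rough illustration of this pipeline, the sketch below chains a keyword-based intent classifier, a regular-expression entity extractor and a small condition-based state machine. Everything in it is an assumption made for illustration: the intent names, keywords, messages and the `next_turn` function are hypothetical, and a production system would use a trained NLU model rather than keyword matching.

```python
import re
from dataclasses import dataclass, field

# Hypothetical intents with keyword triggers (a stand-in for a trained NLU model).
INTENT_KEYWORDS = {
    "check_balance": ["balance"],
    "transfer_money": ["transfer", "send"],
}

# Hypothetical bot messages, kept apart from the dialog logic.
MESSAGES = {
    "ask_amount": "How much would you like to transfer?",
    "confirm_transfer": "Transferring {amount}. Anything else?",
    "show_balance": "Your balance is available in the app. Anything else?",
    "fallback": "Sorry, I did not understand that.",
}


def classify_intent(text):
    """Return the first intent whose keyword occurs in the text, else None."""
    lowered = text.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return None


def extract_amount(text):
    """Extract the first number in the text as the 'amount' entity."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return match.group(0) if match else None


@dataclass
class DialogState:
    node: str = "start"
    slots: dict = field(default_factory=dict)


def next_turn(state, text):
    """Condition-based dialog management: intent + entities + current node."""
    intent = classify_intent(text)
    if intent == "transfer_money" or state.node == "awaiting_amount":
        amount = extract_amount(text)
        if amount is None:
            state.node = "awaiting_amount"
            return MESSAGES["ask_amount"]
        state.slots["amount"] = amount
        state.node = "start"
        return MESSAGES["confirm_transfer"].format(amount=amount)
    if intent == "check_balance":
        return MESSAGES["show_balance"]
    return MESSAGES["fallback"]
```

Calling `next_turn(DialogState(), "I want to transfer money")` asks for the amount and moves the state to `awaiting_amount`. While in that node every input is treated as part of the transfer, which also hints at the rigidity of a pure state machine.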

🟢 Advantages:

  1. Scales well as functionality and scope are added.
  2. Easy to manage, and the standard that the Gartner leaders are leaning towards.
  3. Design and development of conversational experiences are merging.
  4. The Dialog State machine makes it easy to have a message abstraction layer for management of bot responses based on the conversational medium.
  5. Due to the ubiquitous nature of this architecture, skilled and experienced professionals are available.
  6. This architecture makes it easier to segment the development process for parallel work, to some degree.
  7. Leads towards a more controlled and predictable user experience.
  8. Fine-tuning is baked into the architecture.
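The message abstraction layer mentioned in point 4 can be pictured as a lookup from a message identifier and a conversational medium to the wording suited to that medium. The sketch below is a minimal illustration only; the `RESPONSES` table, the medium names and the `render` function are hypothetical and not taken from any specific framework.

```python
# Hypothetical response catalogue: one message id, several medium-specific variants.
RESPONSES = {
    "greet": {
        "default": "Hello! How can I help you today?",
        "sms": "Hi! How can I help?",
        "voice": "Hello. How can I help you today?",
    },
    "handover": {
        "default": "Let me connect you to a colleague.",
    },
}


def render(message_id, medium):
    """Pick the variant for the medium, falling back to the default wording."""
    variants = RESPONSES[message_id]
    return variants.get(medium, variants["default"])
```

The dialog state machine then only refers to message ids such as `greet`, while the wording per medium can be maintained separately.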

🔴 Disadvantages:

  1. The conversational experience is not flexible, as it still relies on the underlying state machine. A probabilistic classifier for predicting the next dialog turn would help, and could add flexibility to some areas of the conversation.
  2. Large knowledge bases cannot be natively absorbed without much work in segmentation and data preparation.
  3. Incorporation of Large Language Models (LLMs) is not yet commonplace from a dialog perspective. Regarding LLMs, Oracle Digital Assistant (ODA) does have a trainable auto-complete feature.

2️⃣ Machine Learning Stories

The ideal would be to have a probabilistic classifier of sorts, observing the user input and replying with the most appropriate message. And where a rigid set of steps is required, like opening an account, a sequential, set-state approach can be followed. Rasa, alongside its ML Stories, also has a Rules approach. Rules are a type of training data describing short pieces of conversation which should always follow the same sequence.
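To make the combination of rules and a probabilistic predictor more concrete, here is a toy sketch. It is not Rasa's API or file format, and Rasa's actual policies are trained models; a simple frequency count over example stories stands in for the classifier. The story contents, the `RULES` table and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training stories: user intents followed by bot actions ("utter_*").
STORIES = [
    ["greet", "utter_greet", "ask_balance", "utter_balance"],
    ["greet", "utter_greet", "ask_hours", "utter_hours"],
    ["ask_hours", "utter_hours", "ask_balance", "utter_balance"],
]

# Hypothetical rules: steps that must always be answered in the same way.
RULES = {
    "open_account": "utter_ask_id",
    "give_id": "utter_account_opened",
}


def train(stories):
    """Count which bot action follows each user intent in the stories."""
    counts = defaultdict(Counter)
    for story in stories:
        for event, following in zip(story, story[1:]):
            if following.startswith("utter_"):
                counts[event][following] += 1
    return counts


def predict_next_action(history, counts, threshold=0.5):
    """Rules take priority; otherwise pick the most likely learned action."""
    latest = history[-1]
    if latest in RULES:
        return RULES[latest], 1.0
    seen = counts.get(latest)
    if not seen:
        return "utter_fallback", 0.0
    action, hits = seen.most_common(1)[0]
    confidence = hits / sum(seen.values())
    if confidence < threshold:
        return "utter_fallback", confidence
    return action, confidence
```

Because the learned part only looks at what tends to follow an intent, it can still respond on a path that never appeared in the stories, such as a user asking about hours and then greeting, while the account-opening steps stay fixed by the rules.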

🟢 Advantages:

  1. ML Stories can generalise to unseen conversation paths.
  2. A natural conversation can be followed; users can digress and move from one conversation to another.
  3. The conversational agent’s whole domain is leveraged for the conversational experience, where different user conversations can be submitted for training with more possible subsequent scenarios.
  4. Conversational designs or flows can be seen as disposable. If a flow is not used, it is not developed further, and user stories which are in demand can be built out.
  5. There is an element of control and fine-tuning with stories, which is not completely the case with the zero or few shot chatbots of LLMs.
  6. For enterprises, the vast majority of use-cases for chatbots are really limited to a few examples. ML Stories can focus on these whilst catering for all the deviations.

🔴 Disadvantages:

  1. The probabilistic approach is seen as risky by enterprises and large organisations.
  2. It is a new way of working for teams. Currently much emphasis is placed on the conversational design and development canvas approach and graphic collaboration.
  3. Chatbot testers need to understand the context of stories; misalignment might be a challenge.
  4. Steep learning curve.

3️⃣ Chitchat / Smalltalk

Many chatbot implementations cater for chitchat.

This is where smalltalk is incorporated into the chatbot, which includes basic courtesy, and the handling of errors and exceptions gracefully and in a highly conversational manner.

Simple ways of improving general chitchat are to avoid repeating messages and to avoid any form of fallback proliferation. Identifying users can help in creating context for conversations.
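Both ideas, not repeating a message and capping consecutive fallbacks, can be sketched in a few lines. The class below is a hypothetical helper rather than part of any chatbot framework; the `max_fallbacks` limit and the handover wording are assumptions.

```python
import random


class FallbackManager:
    """Rotates fallback wording and hands over after too many misses in a row."""

    def __init__(self, variants, max_fallbacks=2,
                 handover="Let me connect you to a human agent."):
        self.variants = list(variants)
        self.max_fallbacks = max_fallbacks
        self.handover = handover
        self.consecutive = 0  # fallbacks since the last understood input
        self.last = None      # last fallback wording that was sent

    def on_understood(self):
        """Reset the counter whenever the user input was understood."""
        self.consecutive = 0

    def on_fallback(self):
        """Return a fallback message, never the same wording twice in a row."""
        self.consecutive += 1
        if self.consecutive > self.max_fallbacks:
            return self.handover
        choices = [v for v in self.variants if v != self.last] or self.variants
        self.last = random.choice(choices)
        return self.last
```

With `max_fallbacks=2`, the third consecutive miss returns the handover message instead of yet another "I did not understand".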

Of course chitchat is just an aid in managing the conversation, in most cases. There are instances where smalltalk can support the whole conversation, but these are general, companion, non-domain specific chatbots.

If a chatbot is developed in a specific minority language, then out-of-the-box chitchat will most probably not be available.

Chitchat making use of OpenAI.

🟢 Advantages:

  1. Chitchat makes the conversational agent more natural.
  2. Also, chitchat can be used to make the conversation more personalised in instances where users can be identified.
  3. A large part of the first interactions with any conversational agent is people just exploring the interface and asking general and often random questions. If these can be fielded and the conversation managed, the first impression can be positive.

🔴 Disadvantages:

  1. Setting the boundaries of chitchat can be challenging; can users ask for the weather, the time, etc.?
  2. Domain specific conversational implementations must not be confused by users with general, broad domain implementations like Google Assistant, Siri, Alexa and the like.
  3. Care must be taken to leverage chitchat in such a way that the conversation gains traction in order to extract definite intents.
  4. Developing chitchat is time consuming, and defining what will most probably count as chitchat even more so. Leveraging chitchat from LLMs is a possibility.

4️⃣ Large Language Models (LLMs)

Large Language Models (LLMs) offer a whole array of implementations with which the dialog of a conversational agent can be created. LLMs give a large degree of flexibility, with zero to few shot training. In other words, much can be achieved with no to very little training data or effort. This flexibility is astounding at first and the implementation possibilities flood one’s mind.

However, flexibility without control does not scale well and fine-tuning is crucial for any implementation. LLMs do have a degree of fine-tuning available, but this does not include fine-tuned dialog management.

LLM offerings include OpenAI’s Language API, co:here, AI21 Labs and Hugging Face.

LLMs are exceptional at language tasks, including Natural Language Generation. LLMs can also play a vital role in supporting a chatbot implementation.
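Few shot use of an LLM in a supporting role usually comes down to building a prompt with a handful of worked examples and parsing the completion. The sketch below shows only those two steps for entity extraction; it deliberately makes no API call, and the example sentences, the entity names and the `key: value; key: value` answer format are assumptions chosen for illustration.

```python
# Hypothetical few shot examples: (sentence, entities in "key: value; ..." form).
FEW_SHOT_EXAMPLES = [
    ("Book a flight from Cape Town to London on Friday.",
     "from_city: Cape Town; to_city: London; date: Friday"),
    ("I need a ticket to Paris tomorrow.",
     "to_city: Paris; date: tomorrow"),
]


def build_prompt(user_text, examples=FEW_SHOT_EXAMPLES):
    """Build a few shot prompt that ends where the model should continue."""
    lines = ["Extract the travel entities from each sentence.", ""]
    for sentence, entities in examples:
        lines += [f"Sentence: {sentence}", f"Entities: {entities}", ""]
    lines += [f"Sentence: {user_text}", "Entities:"]
    return "\n".join(lines)


def parse_entities(completion):
    """Parse a 'key: value; key: value' completion into a dictionary."""
    entities = {}
    for part in completion.split(";"):
        if ":" in part:
            key, value = part.split(":", 1)
            entities[key.strip()] = value.strip()
    return entities
```

The string returned by `build_prompt` would be sent to whichever completion endpoint is in use, and the returned text passed to `parse_entities` before being handed to the dialog manager as ordinary entities.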

🟢 Advantages:

  1. Zero to few shot training.
  2. Multiple language related tasks can be performed.
  3. LLMs can play a vital supporting role in chatbots, for instance extracting named entities, summarisation, classification of text and NLG.
  4. Democratising access to Large Language Models and powerful processing.
  5. No-code, cloud based approach, which can grow into low-code and eventually pro-code.
  6. Training data does not have to be extremely large or in a specific format.

🔴 Disadvantages:

  1. Cost is both an advantage and a disadvantage. Initially cost will be low, but as volume and functionality are added, cost will escalate.
  2. Fine-tuning, in other words training on custom and own data, will be a challenge.
  3. LLMs can maintain a coherent and general dialog, maintaining context, with NLG and no repetition, etc., but programming the dialog is not possible.

5️⃣ Knowledge Base Management

Firstly, question answering can be addressed via an approach other than a traditional knowledge base: by using the traditional chatbot development affordances of intents, entities, dialog trees and response messages.

🟢 Advantages:

  • The process and approach form part of the existing chatbot development process.
  • Ease of integration with existing chatbot journeys, acting as an extension of current conversational functionality.
  • QnA experiences can be transformed into an integrated journey.

🔴 Disadvantages:

  • Maintenance intensive in terms of NLU (intents & entities), dialog state management and dialog management.
  • Does not scale well with large amounts of dynamic data.
  • Semantic search is more adept at finding one or more matching answers.

There are three levels of knowledge base implementation, each focussing on a different use-case, depending on the knowledge you want to represent.

A second level is where a custom knowledge base is set up. This can be done via various means: Elasticsearch, Watson Discovery, Rasa knowledge base actions, OpenAI Language API with fine-tuning, or Pinecone.

Second-level knowledge bases are aimed at domain-specific search, and at loading data or otherwise making it searchable.

A challenge with this level 2 knowledge base is to have an effective message abstraction layer. Response messages should be flexible: a portion of a response might be more appropriate for a specific question, or there might be a need for two or more messages to be merged for a more accurate response.
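The retrieval and merging step can be illustrated with a very small search over passages. This is a sketch only: word-count vectors with cosine similarity stand in for the semantic search or vector database a real level 2 knowledge base would use, and the passages, `top_k` and `min_score` values are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical domain passages; a real knowledge base would index documents.
PASSAGES = [
    "You can reset your password from the login page by choosing Forgot password.",
    "Password reset links expire after 24 hours.",
    "Our branches are open from 9am to 5pm on weekdays.",
]


def vectorise(text):
    """Turn text into a bag-of-words count vector (a stand-in for embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer(question, passages=PASSAGES, top_k=2, min_score=0.2):
    """Merge the best matching passages into one response, or return None."""
    query = vectorise(question)
    scored = sorted(((cosine(query, vectorise(p)), p) for p in passages),
                    reverse=True)
    selected = [p for score, p in scored[:top_k] if score >= min_score]
    return " ".join(selected) if selected else None
```

For a question about resetting a password, the two password passages score above the threshold and are merged into a single response, while an unrelated question returns `None` so that the dialog manager can fall back to another approach.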

🟢 Advantages:

  • Scales well with large bodies of data which change continuously.
  • Lower maintenance, as the incorporated search options take care of data retrieval.
  • Benefits from advances in semantic search, vector databases and more.
  • Knowledge bases negate chatbot fallback proliferation by most probably having a domain related answer to the question.

🔴 Disadvantages:

  • More demanding in terms of technical skills.
  • Cost might be a consideration.
  • An additional dimension is added to the Conversational AI landscape to manage.

Lastly, a third level could be seen as instances where general, non-domain specific questions can be asked, and where a vast general knowledge base needs to be leveraged. This can be Wikipedia, GPT-3, etc.

OpenAI’s Language API does a good job at fielding any general knowledge questions in a very natural way with no dialog or messaging management. In the image below general random questions are fielded in short, well-formed sentences.

The Q&A bot of OpenAI’s Language API.

NVIDIA Riva has a general Question and Answer chatbot where Wikipedia is leveraged, as seen from the Notebook example below.

The NVIDIA Riva Notebook showing how Wikipedia can be leveraged for a general question and answer conversational AI.

Cobus Greyling

I explore and write about all things at the intersection of AI & language; LLMs/NLP/NLU, Chat/Voicebots, CCAI. www.cobusgreyling.com