Advances In Conversational Dialog State Management

Any Conversational User Interface needs to perform dialog state management: determining what the next dialog state should be, and what response the system should give to the user.

Cobus Greyling
6 min read · May 16, 2023


The Problem With Fixed Dialog Flows

The holy grail of Conversational UIs is to have maximum Flexibility together with maximum Predictability in terms of conversation dialog state development & management.

The trade-off for high flexibility is usually low predictability; and for high predictability, low flexibility.

Dialog State Development & Management: Flexibility vs Predictability

The traditional approach to chatbot and Conversational UI development is a dialog-flow approach. The dialog-flow starts with intent detection and branches out into further sub-tasks or sub-flows.

This approach is rigid and fixed. It is well suited to fine-tuning at a granular level, but it lacks flexibility.

The example below is a dialog-flow development interface of Cognigy.

The challenge has always been the division between the designed user experience and the user’s desired experience.

Below is an overview of eight dialog management options for Conversational UIs. Some of these options have only recently become available.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

LLM Agents

LLM Agents have a very high level of autonomy.

Chain-of-thought reasoning is used to decompose the user question into sub-tasks, from which a chain of execution (analogous to dialog and process flows) is created. Hence chains are created on the fly, based on the user input.

Upon receiving a request, Agents leverage LLMs to make a decision on which Action to take. After an Action is completed, the Agent enters an observation step. From the observation step, the Agent shares a thought; if a final answer is not reached, the Agent cycles back to another Action in order to move closer to a Final Answer.
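The Thought → Action → Observation cycle described above can be sketched in a few lines. This is a minimal, illustrative loop: `mock_llm` stands in for a real LLM call, and the calculator tool is a toy example, not part of any specific Agent framework.

```python
# Minimal sketch of the agent loop: the LLM decides on an Action, the
# Agent observes the result, and cycles until a Final Answer is reached.

def mock_llm(scratchpad: str) -> dict:
    """Stand-in for an LLM call: decides the next step from the scratchpad."""
    if "Observation: 22" in scratchpad:
        return {"thought": "I now know the answer.", "final_answer": "22"}
    return {"thought": "I should use the calculator.",
            "action": "calculator", "action_input": "15 + 7"}

# Toy tool registry; a real Agent would expose search, APIs, etc.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def run_agent(question: str, max_steps: int = 5) -> str:
    scratchpad = f"Question: {question}"
    for _ in range(max_steps):
        step = mock_llm(scratchpad)              # LLM decides: act or answer
        scratchpad += f"\nThought: {step['thought']}"
        if "final_answer" in step:               # Final Answer reached
            return step["final_answer"]
        observation = TOOLS[step["action"]](step["action_input"])
        scratchpad += f"\nObservation: {observation}"  # cycle back to an Action
    return "No answer within step budget."
```

The key point is that the sequence of steps is not designed ahead of time; the LLM chooses the next Action at every turn of the loop.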

Prompt Chaining

Where Agents form an LLM chain on the fly, prompt chaining is the process of creating a predetermined chain for an anticipated use-case.

The advantage of Agents is that an Agent can address user requests that were not envisaged.

In the development process of prompt chaining, pre-determined chains are created based on expected use-cases. Even so, prompt chaining still has a higher level of flexibility than traditional dialog flows.
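A predetermined chain can be sketched as a fixed sequence of prompt templates, where each step's output becomes the next step's input. The chain below (summarise → classify → route) and the `call_llm` stub are illustrative, not from a specific framework.

```python
# Minimal prompt-chaining sketch: a fixed, designed-ahead-of-time sequence
# of prompt templates, each step feeding the next.

def call_llm(prompt: str) -> str:
    """Stand-in for an LLM call, with canned responses per task."""
    if prompt.startswith("Summarise"):
        return "refund request for order 1234"
    if prompt.startswith("Classify"):
        return "billing"
    return "Routed to billing team."

CHAIN = [                                   # predetermined, unlike an Agent
    "Summarise this customer message: {input}",
    "Classify this summary into a department: {input}",
    "Write a routing note for department '{input}'.",
]

def run_chain(user_message: str) -> str:
    text = user_message
    for template in CHAIN:                  # fixed order of execution
        text = call_llm(template.format(input=text))
    return text
```

Contrast this with the Agent loop: here the designer fixes the order of steps at development time, which trades some flexibility for predictability.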

Prompt Pipelines

Prompt Pipelines extend prompt templates by automatically injecting contextual reference data for each prompt.

Prompt Pipelines can also be described as an intelligent extension to prompt templates.

As a request is received, the prompt pipeline has access to tools like knowledge and document stores and semantic search, to populate the prompt template.

This granular and specific composed prompt is submitted to the LLM.
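The injection step can be sketched as follows: on each request, contextual reference data is retrieved from a document store and merged into the prompt template before submission to the LLM. The document store, template, and keyword-based retrieval below are all illustrative; a real pipeline would use semantic search.

```python
# Minimal prompt-pipeline sketch: contextual reference data is injected
# into a prompt template automatically, per request.

DOC_STORE = {
    "returns": "Items may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

TEMPLATE = (
    "Answer the question using only the context below.\n"
    "Context: {context}\n"
    "Question: {question}"
)

def retrieve(question: str) -> str:
    """Toy keyword lookup; a real pipeline would use semantic search."""
    for topic, passage in DOC_STORE.items():
        if topic in question.lower():
            return passage
    return "No relevant context found."

def build_prompt(question: str) -> str:
    # The granular, composed prompt that would be submitted to the LLM.
    return TEMPLATE.format(context=retrieve(question), question=question)
```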

Few Shot Prompts

By making use of a few-shot prompting approach, as seen below, contextual data can be added to a prompt, and dialog state can be maintained by including that contextual data and a few dialog turns in the prompt as a reference.

Below is more detail on how a complete chatbot can be bootstrapped by making use of LLMs.
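The few-shot approach can be sketched as a prompt builder: example turns and the conversation so far are assembled into a single prompt, so the LLM sees the dialog state directly. The examples and context strings below are purely illustrative.

```python
# Minimal few-shot prompt sketch: contextual data plus a few example
# dialog turns are prepended, and recent history is appended, so the
# dialog state is carried inside the prompt itself.

FEW_SHOT_EXAMPLES = [
    ("User: What time do you open?", "Bot: We open at 09:00 on weekdays."),
    ("User: Do you ship abroad?", "Bot: Yes, we ship to most countries."),
]

def build_few_shot_prompt(context: str, history: list, user_turn: str) -> str:
    lines = [f"Context: {context}", "Examples:"]
    for user, bot in FEW_SHOT_EXAMPLES:     # the few-shot examples
        lines += [user, bot]
    lines.append("Conversation so far:")
    lines += history                        # dialog state kept in the prompt
    lines.append(f"User: {user_turn}")
    lines.append("Bot:")                    # the LLM completes from here
    return "\n".join(lines)
```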

Quick Reply Intents

Quick reply intents, or intents with embedded answers, are a self-contained approach through which QnA and other single-turn dialog requirements are serviced.

The user utterance is assigned to an intent, and the response is embedded within the intent. This can be seen as an example where the lines between intents, dialog flow and bot messages are blurred.

The overhead of segmenting the functionality between intents, dialog flow sections and bot messages is avoided.

The example above of Quick Reply Intents is from Oracle Digital Assistant; read more about it here.

Knowledge Base / Semantic Search

A knowledge base is a repository where documents and other data are uploaded and processed.

The knowledge base can then be queried via natural language, and the response is usually contextual and in well-formed natural language.

This approach is not well suited for longer dialogs and is seen as a supplementary aid to more formal and granular dialog developments.
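Semantic search over a knowledge base can be sketched with toy bag-of-words vectors and cosine similarity standing in for real embeddings. The knowledge base entries below are illustrative.

```python
# Minimal semantic-search sketch: the query and each document are turned
# into vectors, and the most similar document is returned.

import math
import re
from collections import Counter

KNOWLEDGE_BASE = [
    "Invoices are emailed on the first day of each month.",
    "Password resets can be done from the account settings page.",
    "Support is available 24/7 via live chat.",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def query(question: str) -> str:
    q = embed(question)
    return max(KNOWLEDGE_BASE, key=lambda doc: cosine(q, embed(doc)))
```

In practice the retrieved passage would then be handed to an LLM to phrase the final answer, which is why this works well for single questions but not for longer dialogs.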

Traditional NLU & Dialog State Management

Dialog flow managed via a state machine scales well, as functionality and scope are added. There is a level of standardisation and this approach remains the mainstay of all the Gartner leaders.

Due to the non-technical nature of developing a flow in such a logical and visual manner, design and development are merging. Conversation designers are designing their conversations in the run-time environment. Hence there is no translation required between design and development.

Also, due to the ubiquitous nature of this architecture, skilled and experienced professionals are available.

Elements like fine-tuning, collaboration and parallel work are enabled.
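Dialog flow managed via a state machine can be sketched as states with a bot message and explicit transitions triggered by user replies. The flow and messages below are illustrative, not from a specific framework.

```python
# Minimal state-machine dialog sketch: each state names its bot message
# and the transitions that user replies trigger; '*' matches any reply.

FLOW = {
    "start":    {"message": "Do you want to open an account?",
                 "next": {"yes": "ask_name", "no": "goodbye"}},
    "ask_name": {"message": "What is your name?",
                 "next": {"*": "confirm"}},
    "confirm":  {"message": "Thanks, your account request is logged.",
                 "next": {}},
    "goodbye":  {"message": "Okay, goodbye!",
                 "next": {}},
}

def step(state: str, user_reply: str) -> str:
    """Return the next state for a user reply; stay put if nothing matches."""
    transitions = FLOW[state]["next"]
    return transitions.get(user_reply.lower(), transitions.get("*", state))
```

This is exactly the predictability/flexibility trade-off from earlier: every path is known and testable in advance, but the user cannot deviate from the designed flow.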

ML Stories

The ideal would be to have a probabilistic classifier of sorts, observing the user input and replying with the most appropriate message. And where a rigid set of steps are required, like opening an account, a sequential pre-set-states approach can be followed.

Rasa, with their ML Stories, also have a Rules approach: rules describe short pieces of conversation that should always follow the same sequence.

The principle of ML Stories was ahead of its time, but ML Stories seem to be superseded by LLM-based applications like Agents and Chaining.
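The idea of predicting the next action from example conversations can be sketched as follows. "Stories" are sequences of intents and bot actions, and the most appropriate next action is chosen by matching the current dialog history against them; the stories and the voting heuristic below are illustrative, not Rasa's actual training procedure.

```python
# Minimal ML-Stories sketch: example conversations serve as training data,
# and the next bot action is predicted by matching the dialog history
# against them.

STORIES = [
    ["greet", "utter_greet", "ask_balance", "utter_balance"],
    ["greet", "utter_greet", "goodbye", "utter_goodbye"],
    ["ask_balance", "utter_balance", "goodbye", "utter_goodbye"],
]

def predict_next_action(history: list) -> str:
    """Pick the action that most often follows this history in the stories."""
    votes = {}
    n = len(history)
    for story in STORIES:
        for i in range(len(story) - n):
            if story[i:i + n] == history:        # history matches here
                nxt = story[i + n]
                votes[nxt] = votes.get(nxt, 0) + 1
    return max(votes, key=votes.get) if votes else "utter_default"
```

A real implementation learns a probabilistic policy over featurised dialog states rather than exact-matching sequences, but the principle is the same: the next action is predicted, not hard-wired.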


I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

https://www.linkedin.com/in/cobusgreyling
