Photo by Sandro Katalina on Unsplash

Is Chatbot Dialog State Machine Deprecation Inevitable?

And How Might The Best Approach Look…


This example shows how IBM Watson Assistant attempts to disambiguate user input. Based on the user’s response, the chatbot learns automatically.

The three chatbot architecture elements which need to be deprecated at some stage are state machine dialog management, intents and bot responses.

The current chatbot status quo sits between keyword recognition and structured intent and entity matching.

General Observations on Level 4 & 5 Chatbots

A Level 4 assistant will know you in much more detail. It doesn’t need to ask about every detail, and instead quickly checks a few final things before giving you a quote tailored to your actual situation.

Elements of autonomy have been introduced to IBM Watson Assistant with the Autolearning function, together with the tracking of Customer Effort.

The three impediments to chatbots becoming true AI agents are intents, state machines and bot responses.

The three areas of rigidity are indicated by the arrows, as discussed here.

So it is clear that, with these three levels of rigidity, progress to levels 4 and 5 is severely impeded.

The aim is to break down the rigidity of the current architecture, where machine learning exists only in matching user input to an intent and entities.

1. Conversation State Management

The dialog node development interface of Microsoft Bot Framework Composer
The development canvas of Google Dialogflow CX

The user conversation is dictated by this rigid and pre-determined flow, with conditions and logic activating a dialog node.

stories:
- story: collect restaurant booking info  # name of the story - just for debugging
  steps:
  - intent: greet  # user message with no entities
  - action: utter_ask_howcanhelp
  - intent: inform  # user message with entities
    entities:
    - location: "rome"
    - price: "cheap"
  - action: utter_on_it  # action that the bot should execute
  - action: utter_ask_cuisine
  - intent: inform
    entities:
    - cuisine: "spanish"
  - action: utter_ask_num_people

Stories example from: # 👆
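To make the rigidity concrete, the pre-determined flow described above can be sketched as a small state machine. This is only a minimal illustration, not how any of the named products work internally: the `DialogNode` class, the `FLOW` table and the node names are assumptions, loosely modelled on the restaurant story.

```python
from dataclasses import dataclass, field


@dataclass
class DialogNode:
    """One state in a hand-authored dialog flow (hypothetical structure)."""
    name: str
    # Condition table: recognized intent -> name of the next node.
    transitions: dict = field(default_factory=dict)


# Hypothetical flow, loosely mirroring the restaurant booking story.
FLOW = {
    "start": DialogNode("start", {"greet": "ask_how_can_help"}),
    "ask_how_can_help": DialogNode("ask_how_can_help", {"inform": "ask_cuisine"}),
    "ask_cuisine": DialogNode("ask_cuisine", {"inform": "ask_num_people"}),
    "ask_num_people": DialogNode("ask_num_people"),
}


def step(current: str, intent: str) -> str:
    """Return the next node, or stay on the current one if no condition matches."""
    return FLOW[current].transitions.get(intent, current)


def run(intents: list, start: str = "start") -> list:
    """Walk the flow for a sequence of intents and record the visited nodes."""
    path = [start]
    for intent in intents:
        path.append(step(path[-1], intent))
    return path
```

Calling `run(["greet", "inform", "inform"])` walks the scripted path node by node. An intent that arrives out of order, for example `inform` while still at `start`, simply leaves the conversation where it was, which is the kind of rigidity the section describes.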

2. Intents

Intents are also a rigid layer within a chatbot. The reason is that a finite list of intents is usually defined, so any conceivable user input needs to be anticipated and mapped to a single intent.

The user utterance is assigned to an intent. In turn the intent is linked to a particular point in the state machine.
User input is matched to one intent. The identified intent is part of a fixed list of intents. In turn, each intent is assigned to a portion of the dialog.
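The chain from utterance to single intent to dialog portion can be sketched in a few lines. This is a rough illustration under stated assumptions: keyword overlap stands in for the machine learning classifier a real framework would use, and every intent name, keyword set and node name below is hypothetical.

```python
from typing import Optional

# Hypothetical fixed intent list, with a few trigger keywords per intent.
INTENT_KEYWORDS = {
    "greet": {"hello", "hi", "hey"},
    "book_table": {"book", "table", "reservation"},
    "opening_hours": {"open", "hours", "close"},
}

# Each intent is wired to exactly one portion of the dialog.
INTENT_TO_DIALOG = {
    "greet": "welcome_node",
    "book_table": "booking_node",
    "opening_hours": "hours_node",
}

FALLBACK_NODE = "fallback_node"


def match_intent(utterance: str) -> Optional[str]:
    """Pick the single intent with the most keyword hits, or None if none hit."""
    words = set(utterance.lower().split())
    scores = {name: len(words & kws) for name, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def route(utterance: str) -> str:
    """Map an utterance to a dialog node via its one matched intent."""
    return INTENT_TO_DIALOG.get(match_intent(utterance), FALLBACK_NODE)
```

Because only one intent wins, a compound request such as "book a table and tell me your hours" is routed to the booking node alone and the question about hours is dropped. Anything outside the fixed list falls through to the fallback node.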

3. Chatbot Text or Return Dialog (NLG)

Restaurant review is created from a few key words and the restaurant name.

The wording returned by the chatbot is linked one-to-one to a specific dialog state node.

An apple pie review based on four generic words.
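By contrast with generated text, the one-to-one link between a dialog state node and its wording amounts to a lookup table. The sketch below is only an illustration; the node names and response strings are made up for the example.

```python
# Hypothetical node-to-text table: one fixed response per dialog node.
RESPONSES = {
    "ask_cuisine": "What kind of cuisine would you like?",
    "ask_num_people": "For how many people?",
}


def respond(node: str, context: dict) -> str:
    """Return the scripted text for a node; the context is accepted but never used."""
    return RESPONSES[node]
```

Whatever the user has already said, `respond("ask_cuisine", ...)` returns the same sentence, because the wording belongs to the node rather than to the conversation.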