The IBM Watson Assistant Architecture Should Look Like This
And Why Multiple Orchestrated Skills Make Sense
Introduction
IBM Watson Assistant is currently focused on Action Skills, with Watson Discovery acting as a Search Skill to back up user intents not covered by Actions.
There are signs that Dialog Skills will be introduced at some stage in the future, but the current moratorium on creating new instances with a Dialog Skill does not help the IBM cause.
In the assistant settings, Dialog Skills are displayed as a future, coming-soon feature. The duration of this moratorium on Dialog Skills is not known, and neither is the shape or form future Dialog Skills will take.
I certainly hope there will be continuity in Dialog Skills’ functionality; otherwise the inevitable rework of existing development will add overhead.
It does seem like the IBM Watson Assistant team is focusing on Action Skills and seems convinced that this is sufficient, for now at least.
The approach from IBM should not be a search for a single silver bullet that solves all conversational AI challenges. Rather, the problem should be attacked on multiple fronts, with a digital assistant to which complexity can be added.
Orchestration
The ideal scenario would be one where users can create an assistant by implementing and orchestrating multiple skills within a single digital assistant.
Orchestration can be rules-based, driven by user input and NLU results. Certain combinations of intents and entities, contextual awareness based on user entry points, profiles gleaned from API calls and so on can all play a part.
There is also an opportunity to create an NLU model specifically for orchestration and leverage it within the NLU sections of the dialog skills. A rough sketch of such rules-based routing follows.
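To make this concrete, here is a minimal sketch of what rules-based orchestration could look like, assuming a simple NLU result containing intents and entities. The skill names, confidence threshold and entry-point logic are all illustrative assumptions, not Watson Assistant functionality.

```python
# Hypothetical sketch of a rules-based skill orchestrator. The skill names,
# the 0.7 threshold and the shape of the NLU result are assumptions made
# for illustration, not part of the Watson Assistant API.

def route_to_skill(nlu_result: dict, entry_point: str) -> str:
    """Pick a skill based on intents, entities and the user's entry point."""
    intents = nlu_result.get("intents", [])
    entities = {e["entity"] for e in nlu_result.get("entities", [])}
    top = intents[0] if intents else None

    # Contextual awareness: users arriving from the billing page with an
    # invoice number are routed to the billing dialog skill first.
    if entry_point == "billing_page" and "invoice_number" in entities:
        return "billing_dialog_skill"

    # High-confidence intent: hand over to the matching dialog skill
    # (the naming convention here is an assumption).
    if top and top["confidence"] >= 0.7:
        return f"{top['intent']}_dialog_skill"

    # No confident intent: fall back to the search skill.
    return "search_skill"
```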
Digression and disambiguation between skills would be a huge plus. Search and Action Skills should be used primarily as extensions of Dialog Skills.
There are instances where Action and Search Skills can be used in stand-alone mode, but these will be limited in scope and chatbot functionality.
Imagine an Assistant constituted by one or more Dialog Skills, each addressing a different part of the business. These Dialog Skills could be orchestrated within the assistant: switched on or off, prioritized or deprioritized, and tweaked in terms of when they are invoked, as sketched below.
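As a thought experiment, this is roughly how such a skill registry could be expressed. Watson Assistant does not expose this configuration today, so every name and field below is hypothetical.

```python
# Hypothetical sketch of how Dialog Skills could be registered, toggled and
# prioritized within an assistant. The structure is an assumption made for
# illustration only.

SKILL_REGISTRY = [
    {"name": "claims_dialog_skill",  "enabled": True,  "priority": 1},
    {"name": "billing_dialog_skill", "enabled": True,  "priority": 2},
    {"name": "hr_dialog_skill",      "enabled": False, "priority": 3},  # switched off
]

def eligible_skills():
    """Return enabled skills, highest priority (lowest number) first."""
    active = [s for s in SKILL_REGISTRY if s["enabled"]]
    return sorted(active, key=lambda s: s["priority"])
```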
Smaller assistants can be completely developed using one or more Action Skills.
The ideal would be an Assistant made up of multiple orchestrated Dialog Skills, with smaller dialogs handled as extensions built with Action Skills, and one or more Search Skills employed alongside them.
A further consideration is adding connectors for popular databases like MongoDB, SQL Server, Cloudant and so on. In the meantime, a dialog skill webhook pointed at a small service is one way to get similar behavior, as in the sketch below.
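Below is a minimal sketch of such a database “connector” using Flask and pymongo. The endpoint path, payload shape and collection names are my own assumptions, not an IBM-provided connector.

```python
# A minimal sketch of a database lookup exposed to a dialog skill as a
# webhook. The /order-lookup path, the payload fields and the "shop.orders"
# collection are assumptions for illustration.

from flask import Flask, request, jsonify
from pymongo import MongoClient

app = Flask(__name__)
client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
orders = client["shop"]["orders"]

@app.route("/order-lookup", methods=["POST"])
def order_lookup():
    payload = request.get_json()
    order_id = payload.get("order_id")  # passed from the dialog skill's webhook call
    order = orders.find_one({"order_id": order_id}, {"_id": 0})
    return jsonify(order or {"error": "order not found"})

if __name__ == "__main__":
    app.run()
```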
The Search Skill can be used in a standalone scenario where you want to create a searchable knowledge base, but this will run into scaling impediments when conversations need to be specific.
Action Skills can be used for a quick survey or a slot-filling chatbot. The fact that Action Skills are Watson Assistant’s first foray into end-to-end intent-less conversations is exciting, but these skills cannot handle complex dialog configurations, digression, disambiguation, auto-learning and the like.
Dialog Skills should be the backbone of any conversation, augmented and complemented by Search and Action Skills.
Dialog Skills
This should be the main skill in the assistant. All conversational agents should be anchored by one or more Dialog Skills.
The Dialog Skill allows for defining intents and entities (the NLU structure), while conversations are defined by a dialog tree.
A graphical dialog editor is available, and scripting can also be used. Dialog responses are defined here as well.
Key benefits of the Dialog Skill are:
- Disambiguation
- Self-Learning
- Intent Recommendations (NLU)
- Intent Conflicts (NLU)
- Compound Intents (NLU)
- Irrelevance Detection
Below is a simple example of a dialog configuration for multiple intents, illustrating how conditions are used in a Dialog Skill.
I went with the simplest dialog structure possible for this example; the conditions are shown in the sketch that follows. The idea is for the conversation to skip through the initial dialog nodes and evaluate the conditions on each.
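Represented as Python dicts, the node configuration behind an example like this looks roughly as follows. The intent names and response texts are placeholders; the node and condition structure mirrors the shape of a dialog skill’s JSON export.

```python
# A minimal sketch of dialog nodes as they appear in a Dialog Skill's JSON,
# written here as Python dicts. The intents (#book_flight, #check_weather)
# and response texts are placeholders.

dialog_nodes = [
    {
        "dialog_node": "welcome_node",
        "conditions": "welcome",
        "output": {"generic": [{"response_type": "text",
                                "values": [{"text": "Hi! How can I help?"}]}]},
    },
    {
        # Condition matching two intents in one user utterance; evaluated
        # before the single-intent node below.
        "dialog_node": "flight_and_weather",
        "conditions": "#book_flight && #check_weather",
        "output": {"generic": [{"response_type": "text",
                                "values": [{"text": "Let's book your flight and check the weather."}]}]},
    },
    {
        "dialog_node": "flight_only",
        "conditions": "#book_flight",
        "output": {"generic": [{"response_type": "text",
                                "values": [{"text": "Let's book your flight."}]}]},
    },
    {
        # Catch-all evaluated last, once the nodes above have been skipped.
        "dialog_node": "fallback",
        "conditions": "anything_else",
        "output": {"generic": [{"response_type": "text",
                                "values": [{"text": "Could you rephrase that?"}]}]},
    },
]
```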
Watson Assistant’s dialog creation and management web environment is powerful and feature rich. It is continuously evolving with new functionality visible every so often.
Actions
From 9 February 2022, all new instances of IBM Watson Assistant (WA) point to the new interface. This new interface or experience is Actions-based rather than based on independent NLU/Dialog Skills.
For new instances, the previous interface is seemingly inaccessible. For existing implementations in the previous interface, however, no work will be lost when switching between the two environments.
For existing implementations, you can switch back to the classic/previous experience at any time by clicking Switch to classic experience from the account menu.
Firstly, Actions should be seen as another type of skill to complement the two existing skills:
- Dialog Skills and
- Search Skills.
Actions cannot be seen as a replacement for dialogs.
Secondly, actions can be used as a standalone implementation for very simple applications. Such implementations may include customer satisfaction surveys, customer or user registration and the like: short, specific conversations.
Thirdly, and most importantly, actions can be used as a plugin or supporting element to dialog skills.
Of course, your assistant can run 100% on Actions, but this is highly unlikely, or at least not advisable.
The best implementation scenario is one where the backbone of your assistant is constituted by one or more dialog skills, with Actions used to enhance certain functionality within the dialog, alongside something like a search skill.
This approach allows business units to develop their own Actions, thanks to the friendly interface. These Actions can then be plugged into a dialog.
This approach is convenient if you have a module which changes on a regular basis, but you want to minimize impact on a complex dialog environment.
Within a dialog node, a specific action that is linked to the same Assistant as this dialog skill can be invoked. The dialog skill is paused until the action is completed.
An action can also be seen as a module which can be used and reused from multiple dialog threads.
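Conceptually, the pause-and-resume behavior can be pictured like this. None of these names are Watson Assistant APIs; it is purely an illustration of the flow.

```python
# A conceptual sketch only: how an orchestrator might pause a dialog skill,
# run an action to completion, and resume. DialogState and run_action are
# hypothetical names, not the Watson Assistant API.

from dataclasses import dataclass, field

@dataclass
class DialogState:
    node: str
    context: dict = field(default_factory=dict)

def invoke_action_from_dialog(state: DialogState, action_name: str) -> DialogState:
    """Pause the dialog, run the named action, then resume with its result."""
    paused_at = state.node                            # remember where the dialog stopped
    result = run_action(action_name, state.context)   # hypothetical action runner
    state.context[f"{action_name}_result"] = result   # make result available to the dialog
    state.node = paused_at                            # resume at the paused node
    return state

def run_action(action_name: str, context: dict) -> dict:
    # Stand-in for the action skill executing its steps to completion.
    return {"status": "completed", "action": action_name}
```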
When adding actions to a dialog skill, consideration needs to be given to the invocation priority.
If you add only an actions skill to the assistant, the actions skill starts the conversation. If you add both a dialog skill and an actions skill, the dialog skill starts the conversation, and actions are recognized only if you configure the dialog skill to call them.
Fourthly, if you are looking for a tool to develop prototypes, demos or proof of concepts, Actions can stand you in good stead.
Mention must be made of the built-in constrained user input, where options are presented to the user. Creating more structured input plays to the strengths of Actions.
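For reference, a constrained-input (options) response has roughly this shape in a skill’s JSON, shown here as a Python dict. The title, labels and values are placeholders.

```python
# A minimal sketch of an options response as it appears in a skill's JSON,
# written as a Python dict. The survey wording is a placeholder.

options_response = {
    "response_type": "option",
    "title": "How would you rate our service?",
    "options": [
        {"label": "Good",    "value": {"input": {"text": "good"}}},
        {"label": "Average", "value": {"input": {"text": "average"}}},
        {"label": "Poor",    "value": {"input": {"text": "poor"}}},
    ],
}
```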
Disambiguation between Actions within an Action Skill is possible and can be toggled on or off. This is very handy functionality and should address intent conflicts to a large extent.
System actions are available and these are bound to grow.
How NOT To Use Actions
It does not seem sensible to build a complete digital assistant or chatbot with Actions alone, at least not as a standalone conversational interface. There is an allure of rapid initial progress and having something to show, but there are a few problems you are bound to encounter.
Conversations within an action are segmented or grouped according to intents. Should there be intent conflicts or overlaps, inconsistencies can be introduced into the chatbot.
Entity management is not as strong within Actions as it is within Dialog Skills. Collecting entities with a slot-filling approach is fine.
But for more advanced conversations, where entities need to be defined and detected contextually, Actions will not suffice. Compound entities within a single user utterance will also pose a challenge.
Compound intents, i.e. multiple intents per user utterance, are problematic as well.
If you are used to implementing conversational digression, Actions will not suffice.
Search
Among others, there have been two general notions within the chatbot framework ecosystem.
The first is the deprecation of intents; there are four emerging approaches to this.
The second is the deprecation of the state machine. This is necessary to introduce a more flexible conversational flow. The leader in this space is currently Rasa.
But there is another way to introduce more flexibility to a state-machine-driven dialog management environment where all conversational paths and responses are pre-defined.
This is by introducing a feature where, if no intent is detected with high confidence, the dialog defaults to searching a knowledge base and responds with the result.
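In rough Python, the fallback pattern looks something like this. The threshold and the search helper are assumptions; in Watson Assistant this role is typically played by a Search Skill backed by Watson Discovery.

```python
# A sketch of the low-confidence fallback pattern. The 0.5 threshold and the
# search_knowledge_base helper are assumptions made for illustration.

CONFIDENCE_THRESHOLD = 0.5

def respond(nlu_output: dict, user_text: str) -> str:
    intents = nlu_output.get("intents", [])
    if intents and intents[0]["confidence"] >= CONFIDENCE_THRESHOLD:
        # High confidence: let the dialog tree handle the turn.
        return handle_with_dialog(intents[0]["intent"])
    # No confident intent: default to searching the knowledge base.
    return search_knowledge_base(user_text)

def handle_with_dialog(intent: str) -> str:
    return f"Routing to dialog node for #{intent}"

def search_knowledge_base(query: str) -> str:
    # Stand-in for a document search (e.g. Watson Discovery, Wikipedia).
    return f"Top search result for: {query}"
```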
This is not unique to any one chatbot framework. NVIDIA Riva, which was released recently, has integration examples that use Wikipedia as a searchable knowledge base. Other platforms like MindMeld, Rasa and Microsoft’s also make provision for such functionality. Obviously these systems vary in complexity and implementation steps.
Conclusion
From the examples above you should have a good idea of how these three skills can be orchestrated. The Search Skill can be used in a standalone scenario where you want to create a searchable knowledge base, but this will run into scaling impediments when conversations need to be specific.
Action Skills can be used for a quick survey or a slot-filling chatbot. The fact that Actions are Watson Assistant’s first foray into end-to-end intent-less conversations is exciting, but these skills cannot handle complex dialog configurations, digression, disambiguation, auto-learning and the like.
Dialog Skills should be the backbone of any conversation, augmented and complemented by Search and Action Skills.