How To Orchestrate The Three IBM Watson Assistant Skill Types

And Using Each One to Its Strengths

Cobus Greyling
15 min read · Apr 16, 2021


Introduction

While recently exploring the tight integration between Watson Assistant & Watson Discovery, I came to realize something. The three skill types within Watson Assistant, namely Dialog, Search & Action skills, complement each other well. But these skills need to be orchestrated correctly, with the Dialog skill acting as the backbone of the conversational agent and facilitating disambiguation, auto-learning, digression and the negation of fallback proliferation, and with Search & Action skills facilitating flexibility, extensibility and intent deprecation.

IBM Watson Assistant is constituted by two main components: an Assistant and one or more skills. This story is about how to orchestrate multiple skills, and multiple skill types, within an Assistant.

The basic components of Watson Assistant: the architecture consists of two main parts, skills and an assistant.

Each skill type, of which there are three, has a specific use-case; this can be extended, of course. Used out of place, however, a skill can seriously impede the scaling of your chatbot.

Let’s start with the difference between an Assistant and Skills

The assistant can be seen as the container of the conversational agent.

The assistant also houses the skills and facilitates the connectors to the integration mediums.

The Assistant directs requests down the optimal path for solving a customer problem.

By adding skills, your assistant can provide a direct answer to an in-domain question, or reference more generalized search results for more complex requests.

Here are a few key characteristics of an Assistant:

  • The assistant integrates with various mediums; Facebook, Slack, Twitter etc.
  • The assistant also houses the different skills.
  • An assistant can have a single or multiple skills.
  • You can also think of skills as different elements representing different parts of an organization.

Watson Assistant makes provision for three types of skills: Dialog, Search and Action skills.

Orchestrating Skills

The three skill types mentioned need to be used not only in the way they were intended to be employed; skills can also be used in such a way that they complement each other.

The three skills within Watson Assistant

The main skill is the dialog skill. All conversational agents will be anchored by one or more Dialog Skills.

The dialog skill allows for defining intents and entities; conversations are defined by a dialog tree.

A graphic dialog editor is available, and scripting can also be used. The response dialog is also defined here.

An Actions skill is really not intended to be used as a standalone skill. The actions skill is a quick way to augment and build extensions to an existing dialog skill.

If you add both a dialog skill and an actions skill to your assistant, the dialog skill is used. You can configure your dialog skill to process individual actions from your actions skill.

When no response is available from a dialog or action skill, the conversation can default to a Search Skill, which searches a body of data and retrieves a portion of information to present to the user.

Using Skill Types In The Right Way

In the following three sections we will dive into how these three skills should be used.

The best way to understand the specific implementation strategy is to create practical examples.

Dialog Skill

A Simple Approach Using IBM Watson Assistant

This example looks at creating a dialog skill which can handle multiple user intents.

Like all cloud-based chatbot development environments, Watson Assistant lets you create a list of expected user intents.

Always Wait for Watson To Complete Training Prior To Testing

These intents are categories used to manage the conversation. Think of intents as the intention of the user contacting your chatbot. Intents can also be seen as verbs: the action the user wants to have performed.

Hence the user utterance needs to be assigned to one of these predefined intents. You can think of this as the domain of the chatbot. Below you can see an example of a list of defined intents and a list of user examples per intent.

IBM Watson Assistant: Defining Intents & User Examples

Typically the user utterance is tagged with one of these intents, even if what the user says stretches over two or more intents.

Most chatbots will take the intent with the highest score and take the conversation down that avenue.

Already here you should see the problem: when a user utters two intents in one sentence.

“Switch the lights on and turn the music down.” Most chatbots will settle on one of the two intents in this sentence.

Our Approach

In most cases intent confidence is expressed as a decimal percentage.

Confidence Rating With Decimal Percentages

A decimal percentage that represents your assistant’s confidence in the recognized intents.

From the example here you can see that the meeting intent scores 81% and the time intent 79%. Very close, and clearly both need to be addressed.

And in other cases there might be more. Yet most conversational environments will take the highest score to address, leaving the user with no other option than to retype the second intent, hopefully with no other intents this time.
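
To make this concrete, here is a minimal sketch (Python, with hypothetical intent names) of the ranked intents list Watson Assistant returns, and of why taking only the top-scoring entry loses the second intent:

# Hypothetical ranked intents, as returned in a Watson Assistant
# message response; the intent names are illustrative only.
intents = [
    {"intent": "schedule_meeting", "confidence": 0.81},
    {"intent": "ask_time", "confidence": 0.79},
    {"intent": "small_talk", "confidence": 0.12},
]

# Settling on the single highest score drops #ask_time, even
# though it is nearly as confident as #schedule_meeting.
top_intent = max(intents, key=lambda i: i["confidence"])
print(top_intent["intent"])  # schedule_meeting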

Dialog Configuration For Multiple Intents

There are simple ways of addressing this problem and helping your chatbot to be more resilient. Here I will show you a simple way of achieving this within the IBM Watson Assistant environment.

The Dialog Structure

I went with the simplest dialog structure possible to create this example. Here you can see some of the conditions within the image. The idea is for the conversation to skip through the initial dialog nodes and evaluate the conditions.

Complete Dialog Structure

Watson Assistant’s dialog creation and management web environment is powerful and feature rich. It is continuously evolving with new functionality visible every so often.

Setting The Threshold

Within the second node we create the context variable named $intents and set it to zero. We will use this to capture all the intents gleaned from the user input.

Setting the Confidence Threshold

The Intents we capture with this contextual variable later in the dialog will include all the intents.

You see we also create a contextual variable called $confidence_threshold.

This is set to 0.5. The idea is to discard intents with a confidence lower than 50%.

This threshold can be tweaked based on the results you achieve within your application.

In general you will see a clearly segregated top grouping and then the rest.
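
As a rough sketch, the context portion of that dialog node's underlying JSON would look something like this (shown here as a Python dict; the variable names are the ones used in our example):

# Approximate context set by the second dialog node: $intents
# starts at zero, and $confidence_threshold is used later to
# discard intents scoring below 50%.
node_context = {
    "context": {
        "intents": 0,
        "confidence_threshold": 0.5,
    }
}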

Getting The Intent Values

In the third dialog node we define three more context variables and assign values to them. Firstly we define a variable with the name $intents. Then we use the Value field to enter the following:

"<? intents.filter('intent', 'intent.confidence >= $confidence_threshold') ?>"

To learn more about expression language methods, take a look at IBM’s documentation. Here we are keeping only the intents with a confidence equal to or greater than the 50% threshold we set.

From here…

We are going to extract only the first two intents, as those are the ones we are interested in. For the first intent we define the variable first_intent and for value we use:

"<? intents.get(0).intent ?>"

This extracts the first intent value from the list of intents. Then we create a context variable with the name second_intent and assign the second listed intent value:

"<? intents.get(1).intent ?>"

You can see the pattern here, and so you can go down the list. You can also create a loop to go through the list.

Three Context Variables Are Defined And Values Set

Now our values will be captured in context variables during the course of the conversation.

These values can now be used to direct the dialog and support decisions on what is presented to the users.
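
The same multi-intent handling can be mirrored on the application side. Below is a minimal sketch using the ibm-watson Python SDK and the v2 Assistant API; the API key, service URL and assistant ID are placeholders, and alternate_intents asks Watson to return the full ranked intent list rather than only the winner:

from ibm_watson import AssistantV2
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials; substitute your own service details.
authenticator = IAMAuthenticator("YOUR_API_KEY")
assistant = AssistantV2(version="2020-04-01", authenticator=authenticator)
assistant.set_service_url("https://api.us-south.assistant.watson.cloud.ibm.com")

ASSISTANT_ID = "YOUR_ASSISTANT_ID"
session = assistant.create_session(assistant_id=ASSISTANT_ID).get_result()

response = assistant.message(
    assistant_id=ASSISTANT_ID,
    session_id=session["session_id"],
    input={
        "message_type": "text",
        "text": "Switch the lights on and turn the music down",
        # Return all recognized intents, not only the top one.
        "options": {"alternate_intents": True},
    },
).get_result()

# Client-side equivalent of the dialog expression above: keep
# intents at or above the 0.5 confidence threshold, then take
# the first two.
CONFIDENCE_THRESHOLD = 0.5
kept = [i for i in response["output"]["intents"]
        if i["confidence"] >= CONFIDENCE_THRESHOLD]
first_intent = kept[0]["intent"] if kept else None
second_intent = kept[1]["intent"] if len(kept) > 1 else None
print(first_intent, second_intent)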

Dialog Decision

This is one example of where we create a condition within a dialog: if the assistant recognizes these two intents, the dialog node is visited.

If The Assistant Recognizes These Two Intents The Dialog Is Visited

This is a mere illustration in the simplest form possible. For a production environment, the best solution would be to handle the intents separately and not in one dialog node, thus minimizing the options to make provision for.

Testing Our Prototype

Testing Our Prototype In The Test Pane

Testing our prototype within the test pane shows how, with a multi-intent utterance, the intents are captured as context variables and used within the dialog, allowing the bot to respond accordingly.

Action Skill

How To Use Actions

Firstly, Actions should be seen as another type of skill, complementing the two existing skill types:

  • dialog skills and
  • search skills.

Options when creating a skill for an assistant: search skill, dialog skill or actions skill.

Actions must not be seen as a replacement for dialogs.

Secondly, actions can be used as a standalone implementation for very simple applications. Such simple implementations may include customer satisfaction surveys, customer or user registration etc. Short and specific conversations.

Thirdly, and most importantly, actions can be used as a plugin or supporting element to dialog skills.

Of course, your assistant can run 100% on Actions, but this is highly unlikely, or at the very least not advisable.

The best implementation scenario is where the backbone of your assistant is constituted by one or more dialog skills, and Actions are used to enhance certain functionality within the dialog, together with something like a search skill.

This approach allows business units to develop their own actions, due to the friendly interface. Subsequently, these Actions can be plugged into a dialog.

Setting up a dialog node to call an Action skill.

This approach is convenient if you have a module which changes on a regular basis, but you want to minimize impact on a complex dialog environment.

Within a dialog node, a specific action that is linked to the same Assistant as this dialog skill can be invoked. The dialog skill is paused until the action is completed.

An action can also be seen as a module which can be used and reused from multiple dialog threads.

When adding actions to a dialog skill, consideration needs to be given to the invocation priority.

Within the dialog, if the dialog skill’s intent is #Balance, an action skill is invoked with a return variable.

If you add only an actions skill to the assistant, the action skill starts the conversation. If you add both a dialog skill and actions skill to an assistant, the dialog skill starts the conversation. And actions are recognized only if you configure the dialog skill to call them.

A conversation scenario where both Dialog and Actions Skills are employed.

Fourthly, if you are looking for a tool to develop prototypes, demos or proof of concepts, Actions can stand you in good stead.

Mention needs to be made of the built-in constrained user input, where options are presented to the user. Creating more structured input supports the capabilities of Actions.

Disambiguation between Actions within an Action Skill is possible and can be toggled on or off. This is very handy functionality, and should address intent conflicts to a large extent.

System actions are available and these are bound to grow.

How NOT To Use Actions

It does not seem sensible to build a complete digital assistant / chatbot with actions. Or at least not as a standalone conversational interface. There is this allure of rapid initial progress and having something to show. However, there are a few problems you are bound to encounter.

A conversation built making use of Actions with conditional checks and re-prompts where the condition fails.

Conversations within an action are segmented or grouped according to intents. Should there be intent conflicts or overlaps, inconsistencies can be introduced to the chatbot.

Entity management is not as strong within Actions as it is within Dialog skills. Collecting entities with a slot-filling approach is fine.

But for more advanced conversations, where entities need to be defined and detected contextually, Actions will not suffice. Compound entities per user utterance will also pose a challenge.

Compound intents, or multiple intents per user utterance, are problematic.

If you are used to implementing conversational digression, actions will not suffice.

Positives

  • Conversational topics can be addressed in a modular fashion.
  • Conversational steps can be dynamically reordered by drag and drop.
  • Collaboration is supported.
  • Variable management is easy and conversational from a design perspective.
  • Conditions can be set.
  • Complexity is masked and simplicity is surfaced.
  • Design and development are combined.
  • Integration with current solutions and developed products is possible.
  • Conversational presentation can be formatted.

Negatives

  • If used in isolation, scaling impediments will be encountered.
  • Still a state-machine approach.
  • Linear design interface.

Search Skill

Search Skills in Watson Assistant with Discovery

Here is a step-by-step implementation of a search skill, and how to add it to an assistant, using search functionality within the IBM Cloud offering. In this example we are going to make use of IBM Watson Assistant; this, in essence, will constitute our chatbot.

Added to this we will also make use of IBM Watson Discovery.

Uploaded PDF Document Annotated with Questions (green) & Answers (orange)

In short, Discovery is an IBM Cloud service allowing you to upload data, which then becomes a searchable body of data. It is a very convenient and fast way to convert existing data, in virtually any format, into a searchable form.

Documents like PDF, CSV, Word etc. can be uploaded and annotated. Custom tags can be created for specific annotation. Very conveniently, as you annotate, Discovery uses your annotation to make live predictions in the document. So for this document, I only had to manually annotate about 10 pages. This obviously depends on how standard your document is.

For an organization and a production environment, it is advisable to organize the data in a JSON format, preferably with a heading, body and URL reference. This will make the results yielded by Discovery more predictable and tidy in all instances.
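
As an illustrative sketch, each document in such a collection could be shaped like this (the field names are my own, following the heading, body and URL recommendation above):

# Illustrative Discovery document shape, so that search results
# map cleanly onto a title, a body and a URL reference.
document = {
    "title": "What is a byte?",
    "body": "A byte is a unit of digital information that most "
            "commonly consists of eight bits.",
    "url": "https://example.com/glossary#byte",
}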

Discovery Demo

Here, I took the Dictionary of IBM and Computing Terminology (PDF, 313KB) document.

For the following reasons:

  • It is not too much data to upload; only 95 pages.
  • The format of the document is very structured throughout, and the live predictions of the ML function made the process even faster.

Hence it makes for a convenient question-and-answer model without having to rework the data into a specific format.

Getting Started With Discovery

There is an existing data collection in Discovery: Watson Discovery News. But we want to create our own. So, click on Upload your own data…

The First Step is to Upload your Data

The data-upload interface could not be simpler: you can drag and drop your files, or select them.

Name The Data Collection & Specify Document Language

You can see here the formats for which provision is made: PDF, HTML, JSON, Word, Excel, PowerPoint, PNG, TIFF, JPG and more. The language of the document needs to be defined, and this list is not vast.

In the cases where your documents are in another language, translation will be necessary.

Simply Drag-And-Drop or Select Documents.

When you upload large files, the process takes quite a while. If the JSON structure is too complex or nested, Discovery fails. So try to simplify your JSON as much as possible.
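
Uploading can also be scripted. Here is a minimal sketch using the ibm-watson Python SDK and the Discovery v1 API; the credentials, IDs and file name are placeholders:

import json
from ibm_watson import DiscoveryV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials; substitute your own service details.
authenticator = IAMAuthenticator("YOUR_API_KEY")
discovery = DiscoveryV1(version="2019-04-30", authenticator=authenticator)
discovery.set_service_url("https://api.us-south.discovery.watson.cloud.ibm.com")

# Upload a PDF into the collection created earlier; Discovery
# also accepts JSON, Word, HTML and other formats.
with open("dictionary-of-ibm-terminology.pdf", "rb") as pdf:
    result = discovery.add_document(
        environment_id="YOUR_ENVIRONMENT_ID",
        collection_id="YOUR_COLLECTION_ID",
        file=pdf,
        filename="dictionary-of-ibm-terminology.pdf",
        file_content_type="application/pdf",
    ).get_result()

print(json.dumps(result, indent=2))  # document_id and processing status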

Data Being Automatically Processed

Processing of data can take a while; keep an eye on the Errors and warnings to identify any problem areas in your data.

Data Annotation

The annotation of data is crucial to achieving accurate results. This is a manual process, but it is simplified by predictive annotation.

Predictive Annotation

This is a machine learning model which, in real time, learns from your manual annotation and propagates it forward in the document. You will find yourself going from annotating, to reviewing, to just skipping through the pages.

Field Labels & Annotated Page

Test Search Your Document In Discovery

Lastly, search your document and test the results. The beauty of this is that you can search your data and documents making use of natural language.

Search Data Model with Natural Language

Already in Discovery you can test your data making use of natural language understanding.
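
The same natural language test can be run programmatically; a minimal sketch, again with placeholder credentials and IDs:

from ibm_watson import DiscoveryV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

discovery = DiscoveryV1(
    version="2019-04-30",
    authenticator=IAMAuthenticator("YOUR_API_KEY"),
)
discovery.set_service_url("https://api.us-south.discovery.watson.cloud.ibm.com")

# Query the collection in natural language, as in the Discovery UI.
results = discovery.query(
    environment_id="YOUR_ENVIRONMENT_ID",
    collection_id="YOUR_COLLECTION_ID",
    natural_language_query="What is a byte?",
    count=3,
).get_result()

for doc in results["results"]:
    # The available fields depend on how the data was annotated.
    print(doc.get("title"), doc.get("url"))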

Watson Assistant ~ The Chatbot

Now we move to the chatbot portion making use of IBM Watson Assistant (WA).

WA allows for an assistant to be created. Within this assistant, one or more skills can exist. Skills can be seen as different components, or smaller chatbots, which can be combined into one larger assistant.

Adding Skills To An Assistant

Within a larger organization, you can have different departments working on different skills and then these skills can be combined into one larger assistant.

These skills can be dialog or search skills.

Adding Search Skill

We have an existing customer care skill in our assistant. And now we are adding this additional search skill.

Adding A Search Skill

After adding a name and description to the search skill, the next window loads the Discovery instances available to WA. This can also take a while.

List of Discovery Instances Available

We do not want to create a new collection; however, the fact that we can launch one from WA is convenient. Instead, we choose the collection we created earlier, called Custom Data.

Configuring Data Presentation

Chatbot Data Presentation

The next window allows you to set the data which will be presented in the:

  • Title
  • Body
  • and the URL

You can also define a message which will inform the user about the source of the data, and that it is indeed a search result.

Define a message for when no data could be found, or should there be connectivity issues.

Chatbot Data Presentation Configuration

It is best practice to be as transparent as possible with a user.

Always announce that it is indeed a bot, and not a human. Announce when you return search result data which was not directly curated for that particular point in the conversation.

And state when there is a connectivity issue, or when no results are returned.

Watson Assistant Components

Once you have launched WA, there is an option to create an assistant. Define the name of your assistant, and a description. The Preview Link we discuss later in this article.

Preview Link allows a preview URL to be created and distributed for previews and testing. Changes to the underlying chatbot are reflected on the preview interface.

Window to Create an Assistant

Going back to the assistant, you will see there are two skills which constitute this assistant: a Customer Care skill and the search skill.

Skills Part of the Assistant

The idea here is, if the Customer Care skill cannot address the user intent, then WA will automatically fail over to the search skill and yield an answer.
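
At the API level this fail-over is visible in the message response. Here is a minimal sketch, assuming the v2 API, where a search skill answer arrives as a generic response of type "search"; credentials and IDs are placeholders:

from ibm_watson import AssistantV2
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials; substitute your own service details.
authenticator = IAMAuthenticator("YOUR_API_KEY")
assistant = AssistantV2(version="2020-04-01", authenticator=authenticator)
assistant.set_service_url("https://api.us-south.assistant.watson.cloud.ibm.com")

ASSISTANT_ID = "YOUR_ASSISTANT_ID"
session = assistant.create_session(assistant_id=ASSISTANT_ID).get_result()

reply = assistant.message(
    assistant_id=ASSISTANT_ID,
    session_id=session["session_id"],
    input={"message_type": "text", "text": "What is a byte?"},
).get_result()

for item in reply["output"]["generic"]:
    if item["response_type"] == "search":
        # The dialog skill could not answer; these results come
        # from the search skill backed by Discovery.
        for res in item.get("results", []):
            print(res.get("title"), res.get("url"))
    elif item["response_type"] == "text":
        # Answered directly by the dialog skill.
        print(item["text"])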

Testing the Search Skill

Here is a view of our Customer Care and Search assistant. Watson Assistant will serve the chat session from the Customer Care Sample Skill. If this skill cannot address the query, WA will fail over to the search skill.

Assistant View with Two Skills Added

There are numerous options to deploy the assistant, like Facebook Messenger, Slack and Web Chat. For the purposes of illustration, we are going to use the preview link.

Choose a Channel to Deploy your Assistant

There are a few basic configuration options available to the assistant: toggling the availability of the search skill, the inactivity timeout, API details and naming.

Available Assistant Settings

In this preview interface, you can see that “Where are your offices located?” is addressed by the default customer care skill. The technical questions are addressed by the search skill.

Combination of Default Skill & Search Skill

Conclusion

From the examples you should have a good idea of how these three skills can be employed. The search skill can be used in a standalone scenario where you want to create a searchable knowledge base. But this will run into scaling impediments when conversations need to be specific.

Action skills can be used for a quick survey, or a slot-filling chatbot. The fact that actions are Watson Assistant’s first foray into end-to-end intent-less conversations is exciting. But these skills cannot handle complex dialog configurations, digression, disambiguation, auto-learning etc.

Dialog skills should be the backbone of any conversation, augmented and complemented by search and action skills.
