Photo by Johannes Plenio on Unsplash

Defining What’s Irrelevant In Your Chatbot

IBM Watson Assistant Now Has Enhanced Irrelevance Detection


On 1 March 2022, IBM enhanced Watson Assistant’s irrelevance detection.

The detection classification algorithm was enhanced to use any provided counter examples as part of the training.

The assurance is given that existing workspaces sans counter examples will not be affected.

The update applies to English, French, Spanish and Italian and, very importantly, to the universal language model.

A few key considerations:

  • Irrelevance detection is a key distinguishing feature of IBM Watson Assistant. Enhancements to it matter, and this development is surely a good sign.
  • This feature is only available in classic IBM Watson Assistant.
  • All new instances of Watson Assistant point to the new WA. No new instance can be created in the classic WA (with Dialog Skills).
  • Hence this feature is only available in existing classic WA instances.
  • Hopefully soon Dialog Skills will be available in new WA. Or access to new classic instances granted.

Currently the button under settings that would enable this is grayed out.

Utterances with an assigned intent can be marked as irrelevant and saved as counter examples in the workspace, where they are included as part of the training data. This teaches the chatbot to explicitly not answer utterances of this nature.

Develop For User Input Not Relevant To Your Design

In general chatbots are designed and developed for a specific domain. These domains are narrow and tied to the concerns of the organization they serve. Hence chatbots are custom and purpose-built as an extension of the organization’s operation, usually to allow customers to self-service.

As an added element to make the chatbot more interactive and lifelike, and to anthropomorphize the interface, small talk (also referred to as chitchat) is introduced.

But what happens if a user utterance falls outside this narrow domain? With most implementations the highest scoring intent is assigned to the user’s utterance, in a frantic attempt to field the query.

Negate False Intent Assignment

Often, instead of stating that the input is out of scope, the chatbot makes a desperate attempt to handle the user utterance and assigns the best-fit intent, which is often wrong.

Alternatively the chatbot keeps informing the user that it does not understand, leaving the user to rephrase the input over and over, instead of merely stating that the question is not part of its domain.

A handy design element is to have two or three sentences serve as an intro for first-time users; sketching the gist of the chatbot domain.

The traditional approaches are:

  • Many “out-of-scope” examples are dreamed up and entered, which is hardly ever successful.
  • Attempts are made to disambiguate the user input.

But actually, the chatbot should merely state that the query is outside of its domain and give the user guidance.


So, user input can broadly be divided into two groups: In-Domain (ID) and Out-Of-Domain (OOD) inputs. ID inputs are those where you can attach the user’s input to an intent based on existing training data. OOD detection refers to the process of tagging input which does not match any label (intent) in the training set.
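The ID/OOD split can be sketched as a simple confidence cut-off. This is a minimal illustration and not how Watson Assistant is implemented: the threshold value, the reply text and the function names are all assumptions, and the input is assumed to be a list of intent/confidence pairs such as an NLU engine typically returns.

```python
from typing import Dict, List

# Hypothetical cut-off; tune it against your own conversation logs.
CONFIDENCE_THRESHOLD = 0.2

# Hypothetical reply that states the boundary and offers guidance.
OUT_OF_DOMAIN_REPLY = (
    "That is outside what I can help with. "
    "I can assist with balances, payments and card queries."
)


def classify_domain(intents: List[Dict], threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the top intent name for ID input, or 'irrelevant' for OOD input.

    `intents` is assumed to be a list of {"intent": str, "confidence": float}
    dictionaries, in any order.
    """
    if not intents:
        return "irrelevant"
    top = max(intents, key=lambda item: item["confidence"])
    if top["confidence"] < threshold:
        return "irrelevant"
    return top["intent"]


def reply_for(intents: List[Dict]) -> str:
    """State that the query is out of domain instead of forcing a best-fit intent."""
    label = classify_domain(intents)
    if label == "irrelevant":
        return OUT_OF_DOMAIN_REPLY
    return f"Routing to #{label}"
```

A fixed cut-off like this is crude; it only helps when the classifier actually returns low confidence for off-topic input, which is exactly what counter examples are meant to encourage.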

An example of the IBM Watson Assistant testing and training interface. User utterances can be assigned to an existing intent, or marked as irrelevant.

Traditionally OOD training requires large amounts of training data, which is why OOD detection does not perform well in current chatbot environments.

An advantage of most chatbot development environments is that only a very limited amount of training data is required; often 15 to 20 utterance examples per intent.

No one wants developers spending vast amounts of time on an element that is not part of the bot’s core.

The challenge is that as a developer, you need to provide training data and examples. The OOD or irrelevant input covers a possibly infinite number of scenarios, as there is no boundary defining irrelevance.

The ideal is to build a model that can detect OOD inputs from a very limited set of data defining the intents, or with no OOD training data at all.

The second option being the ideal…

Defining Irrelevance With Watson Assistant

In IBM Watson Assistant you can teach your dialog skill to recognize when input is about topics which are OOD.

Switch Enhanced Irrelevance Detection On For A Skill

User conversations can be reviewed and marked as off-topic subjects and thus irrelevant.

These user utterances marked as irrelevant are saved as counterexamples and included as part of the training data.

Hence training the assistant to explicitly not answer utterances of this type.
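Counterexamples can also be added programmatically. The sketch below assumes the client is an ibm_watson.AssistantV1 instance from the ibm-watson Python SDK and calls its create_counterexample method; the helper name add_counterexamples and every placeholder value are hypothetical.

```python
from typing import Iterable, List


def add_counterexamples(assistant, workspace_id: str, utterances: Iterable[str]) -> List[str]:
    """Save each utterance as a counterexample and return the texts that were sent.

    `assistant` is assumed to be an ibm_watson.AssistantV1 client, or any
    object exposing the same create_counterexample method.
    """
    sent: List[str] = []
    for text in utterances:
        cleaned = text.strip()
        if not cleaned or cleaned in sent:
            continue  # skip blanks and duplicates within this batch
        assistant.create_counterexample(workspace_id=workspace_id, text=cleaned)
        sent.append(cleaned)
    return sent


# Usage sketch (needs the ibm-watson package and real credentials;
# every upper-case value is a placeholder you must supply):
# from ibm_watson import AssistantV1
# from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
# assistant = AssistantV1(version="YYYY-MM-DD", authenticator=IAMAuthenticator("YOUR_API_KEY"))
# assistant.set_service_url("YOUR_SERVICE_URL")
# add_counterexamples(assistant, "YOUR_WORKSPACE_ID", ["What is the weather like today?"])
```

Keeping the client as a parameter means the helper can be exercised against a stand-in object before it touches a live workspace.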

While testing your dialog, you can mark an intent, based on a user input, as irrelevant directly from the Try it out pane.

Care should be taken during this process:

  • There is no way to access or change the inputs from the user interface later.
  • The only way to reverse the identification of an input as being irrelevant is to use the same input in a test integration channel, and then explicitly assign it to an intent.

When you set Irrelevance Detection to enabled, an alternative method for evaluating the relevance of a newly submitted utterance is triggered in addition to the standard method.

To switch this feature on for IBM Watson Assistant:

  1. From the Skills page, open your skill.
  2. From the skill menu, click Options.
  3. On the Irrelevance detection page, choose Enhanced.

This supplemental method examines the structure of the new utterance and compares it to the structure of the user example utterances in your training data.

This approach helps chatbots that have few or no counterexamples to recognize irrelevant utterances.
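IBM does not publish the details of this supplemental method, so the following is only a toy stand-in for the general idea of comparing a new utterance with the training examples. It swaps in plain token overlap (Jaccard similarity); the function names and the threshold are assumptions made for illustration.

```python
import re
from typing import Iterable, Set


def tokens(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def max_similarity(utterance: str, examples: Iterable[str]) -> float:
    """Highest overlap between the utterance and any training example."""
    query = tokens(utterance)
    return max((jaccard(query, tokens(example)) for example in examples), default=0.0)


def looks_irrelevant(utterance: str, examples: Iterable[str], threshold: float = 0.2) -> bool:
    """Flag the utterance when it resembles none of the examples closely enough.

    The 0.2 threshold is a hypothetical starting point, not an IBM value.
    """
    return max_similarity(utterance, examples) < threshold
```

Lexical overlap is a weak signal: an utterance that shares most of its words with a training example but asks for something else would still pass, which is where explicit counterexamples remain useful.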

The same user utterance with enhanced irrelevance detection switched on and off.

Looking at the image above, the skill has no intent for account status. Hence with irrelevance detection switched off, the utterance defaults to the intent #Balances.

With the feature switched on, the utterance rightly goes to an Irrelevant status.

To build a chatbot that provides a more customized experience, you want it to use information derived from within the application’s domain, and to add your own counterexamples, even if only a few.

How Does it Work?

Understanding what your users say is based on two pillars:

  • Intents you need the chatbot to address. Examples of this for a courier company might be order tracking, package collection, etc. Training takes place by defining intents and adding example user utterances. These user utterances are grouped according to intent based on what users might say.
  • Counter examples, which should be deemed irrelevant or which need to be ignored.

Time can be spent to understand the target audience’s domain and specific intentions. And subsequently craft or source training data accordingly.

Counter examples should be part of this training data.

The aim of enhanced irrelevance detection is to mitigate any vulnerability in counter examples.

According to IBM:

When enabled, an alternative method for evaluating the relevance of a newly submitted utterance is triggered in addition to the standard method.

The best approach is to add in-domain examples, and also counter examples, in an iterative fashion with continued monitoring.
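One way to make that monitoring concrete is to score each iteration on both in-domain and out-of-domain test utterances. The sketch below is an assumption about how such a check could look: `predict` stands for any hypothetical callable that returns an intent name, or None when the input is judged irrelevant.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def evaluate(
    predict: Callable[[str], Optional[str]],
    test_set: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, float]:
    """Score a classifier on a mixed test set.

    Each test item is (utterance, expected), where expected is an intent
    name for in-domain input or None for out-of-domain input.
    """
    in_total = in_correct = ood_total = ood_rejected = 0
    for utterance, expected in test_set:
        predicted = predict(utterance)
        if expected is None:
            ood_total += 1
            ood_rejected += predicted is None  # correctly refused to answer
        else:
            in_total += 1
            in_correct += predicted == expected
    return {
        "in_domain_accuracy": in_correct / in_total if in_total else 0.0,
        "ood_rejection_rate": ood_rejected / ood_total if ood_total else 0.0,
    }
```

Tracking both numbers after each round of added examples shows whether new counter examples improve rejection without eroding in-domain accuracy.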


This is a subject which does not enjoy the attention it deserves, and is often overlooked as testing is in most cases narrowly modeled on in-domain intents.

Communicating to the user in a clear fashion where the domain boundaries are can save the user from frustration and the chatbot from abuse. 🙂



Cobus Greyling

Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; NLP/NLU/LLM, Chat/Voicebots, CCAI.