AI: Build Your First Machine Learning Model (Part 2 of 2)

Make use of Free Tools to create your very first ML Model!

Cobus Greyling

3 min read · Aug 9, 2019

Why are we doing this?

Read Part 1 here…

Text categorization is the task of assigning predefined categories to free-text dialogs. It can provide conceptual views of conversations and has important applications in the real world. Just as news stories are categorized by subject, a conversation or a single dialog can be categorized.

Unstructured data, sometimes referred to as dark data, is often in the form of video, images and text. The source of text can be emails, chats, web pages, social media, support tickets, survey responses, and more. We will focus on chats. Extracting insights from dialogs or text can be hard and time-consuming because the data is unstructured. Organisations can employ text categorization to structure dialogs in a fast and inexpensive way and so enhance automated conversations with customers.

Most NLU/NLP interfaces have built-in functionality to categorize a dialog. This categorization is often fast, but the categories are predefined. So, introducing a custom category or creating subcategories can be daunting, or even impossible.

Say, for instance, you need to write a chatbot interface for a multi-national organisation. Doing the initial intent discovery will be daunting. But what if you could perform an initial categorization of the conversation, and then direct the conversation to a specific skill within your assistant to deal with the query?
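To make the idea of "categorize first, then route" concrete, here is a minimal sketch in Python. It assumes category labels are hierarchical paths (the format mirrors the label shown later in this article), and the skill names and the mapping are hypothetical, made up purely for illustration. It is not how any particular assistant platform implements routing.

```python
from typing import Dict

# Hypothetical mapping from category prefixes to assistant skills.
# Both the skill names and the choice of prefixes are assumptions.
SKILL_BY_CATEGORY: Dict[str, str] = {
    "/Sports": "general_sports_skill",
    "/Sports/Sporting Goods/Baseball Equipment": "baseball_skill",
}

# Hypothetical skill that handles anything we cannot categorize.
FALLBACK_SKILL = "intent_discovery_skill"


def route_to_skill(label: str) -> str:
    """Return the skill for the most specific category prefix that matches."""
    parts = [p for p in label.split("/") if p]
    # Try the full label first, then drop one level at a time.
    for depth in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:depth])
        if prefix in SKILL_BY_CATEGORY:
            return SKILL_BY_CATEGORY[prefix]
    return FALLBACK_SKILL
```

Matching on the longest prefix means a very specific category can have its own skill, while anything else under the same parent still lands somewhere sensible.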

In Part 1 I created a simple data structure for my custom category. This data structure (as seen in the image below) was converted into a model with Watson Knowledge Studio.

Simple Data Structure for Custom Categories

These custom categories are then deployed to the IBM Watson NLU API. On deployment in Knowledge Studio, a model ID is created. The custom categories can be referenced via this model ID from Watson NLU. As seen below, the text is sent along with a reference to the categories model, which provides the context.

Example 1: Custom Category is referenced from Watson NLU API payload
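For readers who cannot see the image, the request body can be sketched roughly as follows. This is an approximation, not a copy of the payload in Example 1: the field names (`text`, `features`, `categories`, `model`) reflect my understanding of the NLU analyze request and should be checked against the current API reference. The sample sentence and the `YOUR_MODEL_ID` placeholder are assumptions. The sketch only builds the JSON and does not call the service.

```python
import json
from typing import Any, Dict


def build_analyze_payload(text: str, model_id: str) -> Dict[str, Any]:
    """Build a request body that asks for categories from a custom model.

    The key names are assumed from the public NLU analyze request shape;
    verify them against the API reference before relying on them.
    """
    return {
        "text": text,
        "features": {
            "categories": {
                # The model ID created when the custom model is deployed.
                "model": model_id,
            }
        },
    }


# Hypothetical user utterance and placeholder model ID.
payload = build_analyze_payload("I am looking for a glove.", "YOUR_MODEL_ID")
print(json.dumps(payload, indent=2))
```

The resulting JSON would be sent to the NLU endpoint with your own credentials and API version.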

In Example 1 the user states a requirement for a specific glove; singular, not plural. According to our training data, this refers to baseball. This kind of granular and specific categorization would otherwise not be possible. Have a look at Example 2 below, where the label “/Sports/Sporting Goods/Baseball Equipment” is returned.

Example 3 shows the user referring to another item, in this case a helmet, and the conversation is categorized as being about American Football.
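Once a response comes back, the application still has to pick a label to act on. A minimal sketch of that step is shown below. It assumes the response contains a `categories` list whose items carry a `label` and a `score`, which is my understanding of the NLU response shape and worth verifying. The scores, the second label, and the `min_score` threshold are made up for illustration and are not results from the examples above.

```python
from typing import Any, Dict, Optional


def top_category(response: Dict[str, Any], min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if nothing is confident enough."""
    categories = response.get("categories", [])
    if not categories:
        return None
    best = max(categories, key=lambda c: c["score"])
    return best["label"] if best["score"] >= min_score else None


# Illustrative response only: the scores and the second label are invented.
sample_response = {
    "categories": [
        {"label": "/Sports/Sporting Goods/Baseball Equipment", "score": 0.91},
        {"label": "/Sports", "score": 0.07},
    ]
}

print(top_category(sample_response))
```

Returning `None` below the threshold gives the assistant a natural point to ask a clarifying question instead of guessing.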