An Overview Of Microsoft LUIS Machine Learned Entities

This Is Important Because In General Speech Nested Entities & Sub-Types Are Used

Cobus Greyling

--

Introduction

Detecting entities embedded within user utterances remains a challenge, especially if you want to capture those entities in an unstructured, truly conversational way.

Microsoft LUIS & Machine Learned Entities (Decomposition of Entities)

Intents can be seen as verbs, the intention of the user. You can think of Google Search as the biggest intent detection machine in the world.

Entities can be seen as nouns. Should a user say, “I am taking the train from Paris to Lisbon”, then the entities are train, Paris and Lisbon.
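To make that concrete, the structured result an NLU layer is expected to derive from such an utterance could be pictured as follows. The field names here are purely illustrative and do not mirror a specific LUIS response format.

```python
# Illustrative only: the structured meaning an NLU layer should extract from
# "I am taking the train from Paris to Lisbon". Field names are hypothetical.
parsed_utterance = {
    "intent": "Travel",               # the verb-like intention
    "entities": {                     # the noun-like details
        "mode": "train",
        "from_city": "Paris",
        "to_city": "Lisbon",
    },
}
```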

Of course, rudimentary methods can be employed to extract entities from one or more sentences. These include:

  • Prompt the user for each entity individually, one after the other, regardless of whether the user has already mentioned it.
  • Use word spotting or regular expressions to extract specific words. As the data grows, this becomes increasingly infeasible, as the sketch below illustrates.
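A minimal sketch of the word-spotting approach, assuming a hand-maintained city list (the list and utterance below are purely illustrative, not part of LUIS), shows why it breaks down:

```python
import re

# A naive word-spotting approach: match only cities we have explicitly listed.
KNOWN_CITIES = ["Paris", "Lisbon", "Berlin"]
CITY_PATTERN = re.compile(r"\b(" + "|".join(KNOWN_CITIES) + r")\b")

utterance = "I am taking the train from Paris to Petropavlovsk-Kamchatsky"
print(CITY_PATTERN.findall(utterance))
# ['Paris'] -- the unlisted city is missed, and the pattern cannot tell a
# departure city from a destination city. Every new city means a code change.
```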

But What Do We As Humans Do?

Whenever we have a conversation, we are able to naturally and intuitively extract entities from an utterance.

We typically use two methods:

  • Contextual Awareness
  • Decomposition
https://www.merriam-webster.com/dictionary/decomposability

The context of the utterance helps us find the entity. Should a customer say: “I want to travel to Petropavlovsk-Kamchatsky.”

Even though we may never have heard of that particular city or town, we realize it is an entity (a noun) of type city, or at least a place.

So we use the context of the specific word in the sentence to know what it represents.

Secondly, we decompose the entity. Not only do we know it is a city, but we also know it is of the sub-type destination city, as opposed to departure city.

Entity Annotations

The process of annotating is a way of identifying entities by their context within a sentence.

Often entities have a finite set of defined values. Then there are entities which cannot be represented by a finite list, like the cities of the world, names, or addresses. These entity types have too many variations to be listed individually.

For these entities you must use annotations: entities defined by their contextual use.
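As a rough illustration, an annotated utterance pairs the raw text with a labelled character span, so the model learns the entity from its context rather than from a closed list of values. The structure below is a simplification of the idea, not the exact LUIS authoring payload.

```python
# Simplified illustration of a labelled utterance: the entity is defined by
# the character span it occupies in context, not by a fixed list of values.
text = "I want to travel to Petropavlovsk-Kamchatsky"

labeled_example = {
    "text": text,
    "intent": "Travel",
    "entity_labels": [
        {
            "entity": "City",
            "start": 20,
            "end": 44,  # exclusive, so text[20:44] == "Petropavlovsk-Kamchatsky"
        }
    ],
}

assert text[20:44] == "Petropavlovsk-Kamchatsky"
```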

LUIS: Annotating Entities in Intent Examples

And this is where Microsoft’s LUIS really comes to the fore. LUIS makes provision for four entity types:

  • Patterns
  • Regex
  • List
  • Machine Learned

Entity Types Within LUIS.ai

Machine Learned Entities are the focus of this article.

Below is a view of our single intent called Travel with the example utterances. You will see that these utterances are relatively complex with multiple entities per utterance.

We could break these up into multiple intents to make each intent simpler. However, LUIS allows us to create complex entities, which keeps the intent structure simple.

The Intent “Travel” is defined with Example Utterances

Here you can see the annotated sentences with the contextually defined entity called Travel Detail.

Annotating Entities in Intent Examples

You can also see sub-types defined for each entity; this speaks to entity decomposition.

Decomposition

Machine Learned Entities were introduced to LUIS in November 2019. Entity decomposition is important both for intent prediction and for data extraction with the entity.

When annotating entities within an intent example, entities and sub-types unfold, each of which can be assigned to a single word or to multiple words.

We start by defining a single entity named:

  • Travel Detail

Within this entity, we define three sub-entities. You can think of these as nested entities or sub-types. The three sub-types are:

  • Time Frame
  • Mode
  • City

From here, City has two sub-sub-types:

  • From City
  • To City

Defining an Entity With Sub-Types which can be Decomposed
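One way to picture the resulting hierarchy is as a nested definition in which each sub-entity can in turn carry its own children. This is a plain data-structure sketch of the concept, not the LUIS authoring API itself.

```python
# A plain-data sketch of the decomposed entity hierarchy described above.
# It mirrors the structure conceptually; it is not the LUIS authoring payload.
travel_detail_entity = {
    "name": "Travel Detail",
    "children": [
        {"name": "Time Frame", "children": []},
        {"name": "Mode", "children": []},
        {
            "name": "City",
            "children": [
                {"name": "From City", "children": []},
                {"name": "To City", "children": []},
            ],
        },
    ],
}
```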

This might sound confusing, but the process is extremely intuitive and allows for the natural expansion of conversational elements.

Data is presented in an easily understandable format, and managing your conversational environment becomes easier than before.

Adding Sub-Entities: ML Entity Composed of Smaller Sub-Entities

Now we can go back to our intent and annotate a new utterance. Only the From City still needs to be defined.

Annotating Utterance Example with Entity Elements

Here are the intent examples used to train the model, with the entity, sub-types, and sub-sub-types fully contextualized.

Annotated Intent Examples
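Conceptually, each annotated example now carries nested labels that mirror the entity hierarchy. The sketch below is an assumed, simplified shape of such an example (the utterance, keys and helper function are illustrative, not the structure LUIS stores internally).

```python
# Assumed, simplified shape of one fully annotated training example.
# Spans follow Python slicing (end index exclusive); keys are illustrative.
text = "i am taking the train from paris to lisbon tomorrow"

def span(phrase: str) -> dict:
    """Return a start/end span (end exclusive) for the first occurrence of phrase."""
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase)}

annotated_example = {
    "text": text,
    "intent": "Travel",
    "labels": [
        {
            "entity": "Travel Detail",
            **span(text),  # the whole utterance
            "children": [
                {"entity": "Mode", **span("train")},
                {"entity": "City", **span("paris"),
                 "children": [{"entity": "From City", **span("paris")}]},
                {"entity": "City", **span("lisbon"),
                 "children": [{"entity": "To City", **span("lisbon")}]},
                {"entity": "Time Frame", **span("tomorrow")},
            ],
        }
    ],
}
```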

Testing our NLU Interface

Now that our intent and entities are defined, train the model. Training takes only a few seconds, after which the prototype can be tested.

Click on Train and Test

Below, the entered sentence is “on 3 august i am leaving paris for lisbon by train”. I chose this sentence because it is slightly different from the examples I added to the training data.

The results are clearly marked below.

Test Interface with Intent and ML Entities
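Once the app is trained and published, the same test can be run programmatically. The sketch below assumes the app has been published to the production slot and uses the LUIS v3 prediction REST endpoint; the endpoint host, app ID and key are placeholders you would replace with your own values.

```python
import requests

# Placeholders: substitute your own Azure resource endpoint, LUIS app ID and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
APP_ID = "<your-app-id>"
PREDICTION_KEY = "<your-prediction-key>"

query = "on 3 august i am leaving paris for lisbon by train"

# LUIS v3 prediction call against the production slot (assumes the app is published).
url = f"{ENDPOINT}/luis/prediction/v3.0/apps/{APP_ID}/slots/production/predict"
response = requests.get(
    url,
    params={
        "query": query,
        "subscription-key": PREDICTION_KEY,
        "show-all-intents": "false",
        "verbose": "true",
    },
)
response.raise_for_status()
prediction = response.json()["prediction"]

print(prediction["topIntent"])   # expected: Travel
print(prediction["entities"])    # nested Travel Detail entity with its sub-types
```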

Conclusion

One clear trend across commercial NLU environments is that intents and entities are merging, or at least moving closer together. These two elements can no longer exist in complete separation, and annotating entities in intent training data is a perfect example of this.

Being able to organize entities and group conversational components with nested entities adds immense leverage.

--

Cobus Greyling

I explore and write about all things at the intersection of AI & language; LLMs/NLP/NLU, Chat/Voicebots, CCAI. www.cobusgreyling.com