Photo by Josh Harvey on Unsplash

Using A Large Language Model For Entity Extraction

Can LLMs Extract Entities Better Than Traditional NLP Methods?

Cobus Greyling
7 min read · Jul 12, 2022


Introduction To Entities

For the purposes of this demo, the Co:here Large Language Model was used.

Entities can be thought of as nouns in a sentence or user input. With conversation design, there are two approaches to entity extraction…

The first is a more rudimentary, sequential slot-filling process, where the chatbot prompts the user for each entity one after the other and the user has to follow this highly structured approach.

For example, in the case of flight booking, the bot prompts the user in the following way to capture the entities.

A framework like AWS Lex V2 leans heavily on slot filling: the interface is structured rather than conversational, and the framework centres around filling slots.
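The sequential slot-filling flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slot names, the prompts and the `fill_slots` helper are hypothetical and are not taken from AWS Lex V2 or any other framework.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical slots and prompts for a flight-booking bot (illustrative only).
FLIGHT_SLOTS: List[Tuple[str, str]] = [
    ("origin", "Where are you flying from?"),
    ("destination", "Where are you flying to?"),
    ("date", "On what date would you like to travel?"),
]


def fill_slots(ask: Callable[[str], str]) -> Dict[str, str]:
    """Prompt for each slot in a fixed order and collect the answers."""
    filled: Dict[str, str] = {}
    for slot, prompt in FLIGHT_SLOTS:
        answer = ask(prompt).strip()
        # Re-prompt until the user supplies a non-empty value for this slot.
        while not answer:
            answer = ask(prompt).strip()
        filled[slot] = answer
    return filled


if __name__ == "__main__":
    # Scripted answers stand in for a real user so the sketch runs unattended.
    scripted = iter(["Cape Town", "", "London", "next Friday"])
    print(fill_slots(lambda prompt: next(scripted)))
```

The user can only answer the question currently being asked. Anything else volunteered in the same message is ignored, which is what makes this approach feel rigid.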

The second, more sophisticated approach is to design for compound and contextual entity types. Microsoft LUIS, for example, is pioneering machine-learning nested entities; you can read more about nested entities here.

A hallmark of this approach is that the chatbot mines the user input thoroughly for entities. The chatbot does not re-prompt the user for any input the user has already provided. The user is also not forced to adhere to a predefined structure or format for their input.

This approach is illustrated by the image below: the user input contains compound entities, which are extracted contextually from the user utterance.

There is also a tendency amongst the Gartner leaders to associate entities with specific intents. Hence, once an intent is detected, the NLU has a smaller pool of possible entities to expect, namely those associated with the identified intent.
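As a rough illustration of the idea, scoping entities to an intent amounts to a lookup from the detected intent to the entity types worth searching for. The intent names, the entity types and the `candidate_entities` helper below are hypothetical and do not come from any specific framework.

```python
from typing import Dict, List

# Hypothetical intent names and entity types, for illustration only.
INTENT_ENTITIES: Dict[str, List[str]] = {
    "book_flight": ["origin", "destination", "date"],
    "check_weather": ["city", "date"],
}


def candidate_entities(intent: str) -> List[str]:
    """Return only the entity types associated with the detected intent."""
    # An unknown intent yields an empty pool rather than every entity type.
    return INTENT_ENTITIES.get(intent, [])


if __name__ == "__main__":
    print(candidate_entities("check_weather"))
```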

Three Types Of Entities

One could argue that there are three approaches to entity extraction…

NLU Defined Entities

These entities are custom entities, predominantly defined within a chatbot development framework. Read more about the emergence of entity structures in chatbots, and why it is important for capturing unstructured data accurately and efficiently, here.

Named Entities

In NLP, a named entity is a real-world object, such as a person, place, company or product. In most cases named entities do not require training or any process to define them; NLP / NLU systems detect them automatically. The only impediment is the availability of named entity functionality for a specific human language.

These named entities can be abstract or have a physical existence. Below are examples of named entities being detected by Riva NLU.

Named Entities code block in the Jupyter Notebook

Example Input:

Example Output:

spaCy also has a very efficient named entity detection system, which assigns labels as well. The default model identifies a host of named and numeric entities. These can include places, companies, products and the like.

Detail On Each Named Entity Detected
  • Text: The original entity text.
  • Start: Index of start of entity in the doc
  • End: Index of end of entity in the doc
  • Label: Entity label, i.e. type
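To make the four fields concrete, here is a minimal sketch of a record holding them, filled by a toy lookup-based matcher. It is not spaCy and not a trained model: the `Entity` class, the `GAZETTEER` table and the `find_entities` helper are hypothetical stand-ins. In spaCy itself, the equivalent values are exposed on each entity span as `ent.text`, `ent.start_char`, `ent.end_char` and `ent.label_`.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entity:
    text: str   # the original entity text
    start: int  # character index where the entity starts in the doc
    end: int    # character index just past the end of the entity
    label: str  # entity label, i.e. type


# Hypothetical lookup table; a real NER model learns this from data instead.
GAZETTEER: Dict[str, str] = {"Apple": "ORG", "London": "GPE", "Paris": "GPE"}


def find_entities(doc: str) -> List[Entity]:
    """Return every gazetteer match in the text, ordered by position."""
    found: List[Entity] = []
    for term, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", doc):
            found.append(Entity(m.group(), m.start(), m.end(), label))
    return sorted(found, key=lambda e: e.start)


if __name__ == "__main__":
    for ent in find_entities("Apple is opening an office in London."):
        print(ent.text, ent.start, ent.end, ent.label)
```

Slicing the document with `doc[ent.start:ent.end]` returns the entity text again, which is a handy sanity check on the offsets.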

Back To Large Language Models

Before we get to LLMs and entities… The functionality of LLMs can be divided into two broad implementations: Generation and Representation.

In this article you can read more on how a generative & representation model can be used to bootstrap a chatbot making use of semantic search, language generation and a concept I like to call intent-documents.

Representation language models are used for classification and semantic search.

Entity Extraction With LLMs

For entity extraction we will be using Co:here’s Generation Language Model which can be used for Completion, Text Summarisation and Entity Extraction.

Extracting entities with a large language model like Co:here differs from training a traditional model in the following ways:

  • A small amount of training data is required for a few-shot training approach.
  • The accuracy with highly varying data was astounding.
  • Managing an environment with multiple training samples and multiple entities can become complex. A graphic management studio environment would be ideal for managing the entities visually via a no-code interface.
  • I did not test entity extraction with compound entities or with multiple entities per utterance or sentence. The system did well at detecting multi-word entities, something traditional entity extraction often fails at.
  • The utterances from which the entities were extracted were in some instances quite long, which made the LLM performance all the more impressive.
  • This type of extraction is interesting because it doesn’t just blindly look at the text. The model has picked up on movie information during its pretraining process and that helps it understand the task from only a few examples.
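The few-shot approach in the points above comes down to formatting a handful of labelled examples into a prompt and letting the generation model complete the last, unlabelled one. The sketch below shows only that formatting step and a simple way to clean up the completion. The example sentences are hypothetical and are not the training data used in the demo; the `Text:` / `Movie title:` layout, the `--` separator and both helper functions are my assumptions. The call to the generation endpoint is deliberately left out.

```python
from typing import List, Tuple

# Hypothetical labelled examples; these are NOT the training data from the demo.
EXAMPLES: List[Tuple[str, str]] = [
    ("I finally watched The Godfather last night, what a film.", "The Godfather"),
    ("Is Spirited Away suitable for young kids?", "Spirited Away"),
]


def build_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format labelled examples plus the new text as a completion prompt."""
    blocks = [f"Text: {text}\nMovie title: {title}" for text, title in examples]
    # The final block leaves the label empty for the model to complete.
    blocks.append(f"Text: {query}\nMovie title:")
    return "\n--\n".join(blocks)


def parse_completion(completion: str) -> str:
    """Keep only the first line of the generated text as the entity."""
    stripped = completion.strip()
    return stripped.splitlines()[0].strip() if stripped else ""


if __name__ == "__main__":
    print(build_prompt(EXAMPLES, "Has anyone here seen Inception?"))
```

Keeping only the first generated line guards against the model continuing the pattern and inventing further examples after the answer.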

Below is the training data used, in JSON format…

And next we get the data to analyze:

Here are the results:

  • The model got nine out of ten correct.
  • Number four (4) in the set was missed.
  • Experimentation is required to detect edge cases along the way. For instance, what if someone mentions two movie titles? The more examples we can add to the prompt that address these cases, the more resilient the results will be.
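A tally like the one above is easy to automate once the expected answers are written down. The sketch below is one possible way to do it, assuming a case-insensitive exact match; the `score` helper and the placeholder titles are hypothetical and do not reproduce the data from the demo.

```python
from typing import List, Tuple


def score(predicted: List[str], expected: List[str]) -> Tuple[int, List[int]]:
    """Count exact matches and return the 1-based positions that were missed."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must be the same length")
    missed = [
        i
        for i, (p, e) in enumerate(zip(predicted, expected), start=1)
        # Compare case-insensitively and ignore surrounding whitespace.
        if p.strip().lower() != e.strip().lower()
    ]
    return len(expected) - len(missed), missed


if __name__ == "__main__":
    # Placeholder labels only; substitute the real expected and predicted titles.
    expected = [f"title {i}" for i in range(1, 11)]
    predicted = list(expected)
    predicted[3] = "a wrong guess"
    correct, missed = score(predicted, expected)
    print(f"{correct} out of {len(expected)} correct, missed: {missed}")
```

Re-running the same check after each change to the prompt shows whether an added example fixed an edge case or broke one that used to work.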

Conclusion

A few observations from working through the notebook:

  • The few-shot training approach is indeed a more flexible and exciting prospect for entity extraction.
  • A chatbot can be bootstrapped to some degree, and entities can be added to the intent-document approach I discuss here.
  • With only a few training examples, it does seem like a broader base of potential user utterances is covered.
  • I see a use-case emerging where LLM entity extraction can be implemented within a chatbot as an extension, or as an avenue to bootstrap entity extraction. This is something I would like to explore in the near future.
  • And lastly, there is a dire need for a no-code studio approach through which users can access LLM functionality, create and submit training data and build entity extraction functionality.
