Photo by Luca Bravo on Unsplash

What Is KI-NLP And How Can It Be Used For Conversational AI?

Knowledge-Intensive Natural Language Processing (KI-NLP) is well suited to answering questions, as an alternative to searching the web or leveraging black-box LLMs. KI-NLP accesses a digital archive for relevant information, and the more comprehensive the archive, the more effective the result.

Cobus Greyling
5 min read · Aug 5, 2022

Introduction

Before diving into KI-NLP, I would like to consider a few ideas. First, the idea that NLU has always been central to chatbot development frameworks, and that this is in the process of changing.

Second, the idea that language-related models (LLMs, KI-NLP) are being open-sourced, each covering different use cases.

Third, the idea that these models will become part and parcel of chatbot frameworks in the not-too-distant future.

And lastly, the idea that new user interfaces will emerge to access, manage, fine-tune and orchestrate these models.

These models, which are being open-sourced and made available, cover the four basic pillars of any Conversational AI solution, as listed below.

What is KI-NLP? It serves as a very broad-domain, knowledge-intensive interface for question-answering or fact-checking tasks, known collectively as knowledge-intensive natural language processing (KI-NLP). The AI models underpinning a KI-NLP framework search through a digital archive for relevant information. The more comprehensive the digital archive, the broader and more accurate the answers.
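To make the idea of searching an archive for relevant information concrete, here is a minimal sketch in Python. It is a toy illustration only: the `tokenize` and `retrieve` functions, the three-sentence archive and the word-overlap scoring are all assumptions made for this example, and are not how Sphere or any commercial KI-NLP system is implemented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, archive, top_k=2):
    """Rank passages by the number of tokens shared with the question.

    Word overlap is a deliberately naive stand-in for a real retriever.
    """
    q_tokens = Counter(tokenize(question))
    scored = []
    for passage in archive:
        p_tokens = Counter(tokenize(passage))
        # Counter intersection keeps the minimum count of each shared token.
        overlap = sum((q_tokens & p_tokens).values())
        if overlap > 0:
            scored.append((overlap, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


# Hypothetical miniature "digital archive".
archive = [
    "The Eiffel Tower is located in Paris, France.",
    "Mount Everest is the highest mountain on Earth.",
    "Paris is the capital of France.",
]

results = retrieve("Where is the Eiffel Tower located?", archive, top_k=1)
print(results)
```

The same structure hints at why a more comprehensive archive helps: the retriever can only return passages that exist in the archive in the first place.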

Below is an example of KI-NLP from the AI21 Studio playground, where a general question is asked and a response is returned in natural language.

KI-NLP has always been seen as exclusive to the likes of Google, AWS, Meta, etc. But not anymore, with the advent of Sphere.

The keys to language technology like Large Language Models (LLMs) used to be held by only a few…

This is changing fast with open-sourced models like BLOOM and NLLB and, in the Conversational AI space, GODEL, and now Sphere.

Knowledge-Intensive Natural Language Processing (KI-NLP) allows users to ask wide-ranging general knowledge questions, and the KI-NLP system is able to respond with an accurate, succinct and well-formed natural language response.

Below is an example from OpenAI where KI-NLP is used to generate answers to virtually any question. The answers are also in natural language and brief in form. So we are using KI-NLP systems without even realising it.

Up until now, broad, general-domain KI-NLP systems were a black box, or largely dependent on Wikipedia and the like. With Sphere, an organisation can create its own KI-NLP system with full control over the results.

The Problem

Until very recently, KI-NLP faced a few challenges.

1️⃣ KI-NLP is dependent on commercial black-box implementations, like the example from OpenAI shown below.

2️⃣ Another challenge is that digital agents are dependent on Wikipedia or other KI-NLP related APIs. An example is an NVIDIA Riva question-and-answer implementation. As you can see in the example below, a question is asked via the NVIDIA notebook and a related response is returned.

3️⃣ Commercial search engines can be leveraged, but an abstraction layer with some logic will be required. Is the information relevant? How does the ranking work? How many results are returned? Will the response be in natural language?
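As a rough illustration of the kind of logic such an abstraction layer would need, the sketch below filters, ranks and caps a list of search results before turning the best one into a reply. Everything here is hypothetical: the `SearchHit` structure, the assumption that each hit carries a relevance score between 0 and 1, and the threshold values are invented for the example and do not correspond to any particular search engine's API.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    """Hypothetical shape of one search result."""
    title: str
    snippet: str
    score: float  # assumed relevance score in the range [0, 1]


def select_hits(hits, min_score=0.5, max_results=3):
    """Keep only relevant hits, rank them, and cap how many are returned."""
    relevant = [hit for hit in hits if hit.score >= min_score]
    ranked = sorted(relevant, key=lambda hit: hit.score, reverse=True)
    return ranked[:max_results]


def to_answer(hits):
    """Turn the selected hits into a reply for the user."""
    if not hits:
        return "Sorry, I could not find a relevant answer."
    return hits[0].snippet


# Made-up results standing in for a search engine response.
raw_hits = [
    SearchHit("Unrelated page", "Buy cheap flights today.", 0.10),
    SearchHit("Paris facts", "Paris is the capital of France.", 0.92),
    SearchHit("France overview", "France is a country in Europe.", 0.61),
]

selected = select_hits(raw_hits, min_score=0.5, max_results=2)
answer = to_answer(selected)
print(answer)
```

Note that the reply here is just the top snippet. Producing a genuinely natural language response would need a further generation step, which is part of why this layer is non-trivial.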

Meta AI’s Approach

Meta AI claims that, with Sphere, they have created the first white-box information retrieval solution that leverages information from the web.

Why the web? As a source of knowledge, it is:

  1. universal
  2. uncurated
  3. unstructured

At Meta AI, we’re creating new advancements toward more intelligent AI systems that better leverage real-world knowledge.
~ Meta

Sphere, as a knowledge source, uses data from the open web…

Sphere contains 134 million documents — split into 906 million passages of 100 tokens each — representing orders of magnitude more data than the knowledge sources considered in current KI-NLP research.
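To get a feel for these numbers, here is a small sketch of splitting a document into fixed-length passages. It is an assumption-laden illustration: tokens are approximated by whitespace-separated words, the `split_into_passages` function is made up for this example, and how Sphere actually tokenises documents or handles a shorter final passage is not stated here.

```python
def split_into_passages(text, passage_len=100):
    """Split text into consecutive passages of at most passage_len tokens.

    Tokens are approximated by whitespace-separated words (an assumption).
    """
    tokens = text.split()
    return [
        " ".join(tokens[start:start + passage_len])
        for start in range(0, len(tokens), passage_len)
    ]


# A synthetic 250-token document.
document = " ".join(f"word{i}" for i in range(250))
passages = split_into_passages(document, passage_len=100)
print(len(passages))  # 3 passages: 100, 100 and 50 tokens

# Average passages per document implied by the figures quoted above.
avg_passages_per_document = 906_000_000 / 134_000_000
print(round(avg_passages_per_document, 2))  # 6.76
```

So, going by the quoted figures, a document in Sphere is split into roughly seven passages on average.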

Conclusion

As I alluded to in the introduction, we are seeing Large Language Models (LLMs) being open-sourced, and we are also seeing the open-sourcing of other language technologies, like search and answer, now with KI-NLP.

Access is being democratised and this bodes well for the future.

These different elements will become pivotal in creating a complete conversational AI experience.

Cobus Greyling

I explore and write about all things at the intersection of AI & language; LLMs/NLP/NLU, Chat/Voicebots, CCAI. www.cobusgreyling.com