Analysis Of The New OpenBook NLU API

Botpress Recently Announced The Release Of OpenBook, A “Next-Generation NLU Engine”

7 min read · Jun 23, 2022

In this article I cover how OpenBook works, how it will be used, and finally an analysis of the solution.

Introduction

I was fortunate to receive early access to a new product from Botpress called OpenBook. Botpress is positioning OpenBook as a knowledge-based chatbot framework, not a knowledge base per se.

I do get the impression that OpenBook is a thin abstraction layer which sits on top of an LLM, but more about that later…

Openbook is still in development, and elements like entities will come to the fore soon.

In the words of Botpress…

Openbook aims to solve four elements in chatbots:

Build faster & easier
Predictability & control
Scalability
Creating conversations that feel authentically human

How does Openbook work?

OpenBook is sold as a knowledge-based chatbot development framework, not a knowledge base per se.
However, OpenBook exposes an API, and since it will not be bundled with the Botpress offering, I expect it to be leveraged mostly by existing chatbot frameworks. OpenBook will most likely be made available on a pay-per-use basis.
Considering the image below, the test chat window has a strict and a loose setting for Natural Language Generation (NLG). The strict setting returns the details of the training data verbatim.
Also from the image below, the history level is quite reminiscent of the conversation context management of OpenAI.

Looking at the image below, the green bot response is generated from the strict answer-level, while the yellow response is from the loose answer-level. Strict is verbatim from the training data, whilst Loose is generated.

The two most impressive features of OpenBook are the NLG and the ability to maintain context within the conversation, for up to 5 dialog turns back into the conversation history. Below is an example where a question is asked about children, and the context of the pool is maintained. This feature is quite reminiscent of context management within OpenAI.
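To make the five-turn history concrete, here is a minimal Python sketch of a sliding window over dialog turns. This is only my illustration of the idea, not how OpenBook implements it; the class name ConversationHistory and its methods are hypothetical.

```python
from collections import deque


class ConversationHistory:
    """Keeps only the most recent dialog turns, oldest first."""

    def __init__(self, max_turns: int = 5) -> None:
        # A deque with maxlen drops the oldest turn once it is full.
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, user_text: str, bot_text: str) -> None:
        self._turns.append((user_text, bot_text))

    def as_context(self) -> str:
        """Render the retained turns as plain text for a downstream model."""
        return "\n".join(f"User: {u}\nBot: {b}" for u, b in self._turns)

    def __len__(self) -> int:
        return len(self._turns)
```

With max_turns set to 5, a sixth turn pushes the first one out of the window, so a reference such as "it" can only be resolved against the five most recent exchanges.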

Below is the sum total of the user interface. An OpenBook application consists of a set of MD files (purple); each file can be considered a section of the conversation flow, or a skill within a larger digital assistant. Within each .md file are labels (green), which are analogous to intents. Each label in turn has facts (red), and some facts can be supplemented with contextual questions (yellow).
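The hierarchy of files, labels, facts and contextual questions can be pictured as a small data model. The Python sketch below is my own reading of that structure; the class names are hypothetical, and the pool fact and question are invented placeholders rather than content from the Hotel Demo.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    text: str
    # Optional questions that supply extra context for the fact.
    contextual_questions: list[str] = field(default_factory=list)


@dataclass
class Label:
    name: str  # plays a role similar to an intent
    facts: list[Fact] = field(default_factory=list)


@dataclass
class KnowledgeFile:
    filename: str  # one .md file, for example one skill
    labels: list[Label] = field(default_factory=list)

    def all_facts(self) -> list[str]:
        """Flatten every fact in the file into a list of strings."""
        return [fact.text for label in self.labels for fact in label.facts]


# Invented sample content, for illustration only.
amenities = KnowledgeFile(
    filename="amenities.md",
    labels=[
        Label(
            name="pool",
            facts=[
                Fact(
                    "The pool is open to hotel guests.",
                    contextual_questions=["Can children use the pool?"],
                )
            ],
        )
    ],
)
```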

Entities will be introduced shortly by Botpress. Below is an example of the JSON format which will define an entity. Judging from the JSON format, entities will be fairly advanced, with synonyms and patterns. The presence of the patterns element probably denotes a way of defining context for the entities within utterances.

{
  "lists": [
    {
      "name": "Montreal Hotel & Suites",
      "synonyms": [
        "MHS",
        "Montreal Hotel"
      ]
    }
  ],
  "patterns": []
}
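As a rough sketch of how a framework might consume an entity definition like the one above, the Python below loads the JSON and maps synonyms back to the canonical name. The function names are my own, the whole-word matching rule is an assumption, and the patterns element is ignored because its semantics have not been published.

```python
import json
import re

ENTITY_JSON = """
{
  "lists": [
    {
      "name": "Montreal Hotel & Suites",
      "synonyms": ["MHS", "Montreal Hotel"]
    }
  ],
  "patterns": []
}
"""


def build_synonym_index(raw: str) -> dict[str, str]:
    """Map every lower-cased name and synonym to its canonical entity name."""
    index: dict[str, str] = {}
    for entry in json.loads(raw)["lists"]:
        canonical = entry["name"]
        for surface in [canonical, *entry.get("synonyms", [])]:
            index[surface.lower()] = canonical
    return index


def find_entities(utterance: str, index: dict[str, str]) -> set[str]:
    """Return canonical names whose name or synonym appears as whole words."""
    text = utterance.lower()
    found: set[str] = set()
    for surface, canonical in index.items():
        # Assumed rule: case-insensitive match that is not inside a longer word.
        if re.search(r"(?<!\w)" + re.escape(surface) + r"(?!\w)", text):
            found.add(canonical)
    return found
```

Resolving "MHS" and "Montreal Hotel" to one canonical name is what lets the rest of the bot treat them as the same hotel.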

How Will Openbook Be Used?

Openbook can be used in standalone mode for smaller implementations.

However, for larger enterprise-grade implementations which need to scale well, Openbook will be used as a supporting API.

The lack of entities, and of any way to fine-tune the flow, will make it difficult to implement OpenBook for longer conversations with multiple dialog turns.

Openbook will also require a chatbot framework to orchestrate the conversation and in turn the framework will have to know when to call on Openbook.
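One way a framework could decide when to hand over to a knowledge API is a simple confidence threshold on its own intent classifier. The Python sketch below is purely an assumption on my part: the route function, the 0.7 threshold, the toy classifier and the placeholder knowledge call are all hypothetical, and no real OpenBook endpoint is used.

```python
from typing import Callable, Dict, Tuple


def route(
    user_text: str,
    classify: Callable[[str], Tuple[str, float]],
    flow_handlers: Dict[str, Callable[[str], str]],
    ask_knowledge_api: Callable[[str], str],
    threshold: float = 0.7,
) -> str:
    """Use a scripted flow when the intent is confident, else the knowledge API."""
    intent, confidence = classify(user_text)
    if confidence >= threshold and intent in flow_handlers:
        return flow_handlers[intent](user_text)
    return ask_knowledge_api(user_text)


def toy_classifier(text: str) -> Tuple[str, float]:
    # Stand-in for the framework's own NLU.
    if "book" in text.lower():
        return "book_room", 0.9
    return "unknown", 0.2


def toy_knowledge_api(text: str) -> str:
    # Placeholder for a call to a knowledge API; nothing is sent anywhere.
    return f"[knowledge answer for: {text}]"


handlers = {"book_room": lambda text: "Starting the booking flow."}
```

Calling route("can children use the pool?", toy_classifier, handlers, toy_knowledge_api) falls through to the knowledge call, while a booking request stays in the scripted flow.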

Analysis

Knowledge Bases

Openbook is competing with established knowledge base platforms like IBM Watson Discovery, Oracle Digital Assistant and general QnA systems like NVIDIA Riva and other LLM offerings.

Botpress announced OpenBook as the “next-generation NLU engine”. A bold statement considering advances made by Co:here, OpenAI, AI21labs, etc.

Add to this the semantic search capability delivered by Pinecone, and it is clear that the competition is fierce.

That leads me to the following point, Large Language Models…

LLMs

I always ask myself, what’s under the hood?

It really does seem like Openbook is built on top of a Large Language Model.

The elements constituting OpenBook give me this impression, for instance:

The short training time
The robust NLG element
The way in which the dialog is presented and managed

Allow me to illustrate this with a few examples…

Below is the knowledge file for the amenities skill within the Hotel Demo application. The knowledge file is used to train Openbook.

When OpenBook is queried with the question, “can children use the pool?”, it yields the result below:

When the exact same training data is entered into the Co:here completion API, very much the same result is returned, and the level of verbosity and NLG creativity can be set.

The same training data from amenities.md is entered into OpenAI, with comparable results. Again, the NLG can be tweaked in terms of creativity.

And AI21labs studio…

Lastly, keep in mind there are free open source AI initiatives like EleutherAI, shown below, where the same approach is followed with good results. Deepset also has quite a few open source solutions available for semantic search, question answering and Conversational AI.

EleutherAI is a grassroots collective of researchers working to open source AI research.

Conclusion

I do get the impression that OpenBook is a thin abstraction layer which sits on top of an LLM.

The likely architecture is one where the user input (step 1) is sent to an LLM zero-shot semantic search API, to determine which MD file the query falls under.

From here the contents of the MD file are sent, with the user input, to an LLM generation API (step 2) for the bot response generation. In instances where context needs to be maintained (step 3), previous user inputs are included for generation.

Step 3 is for instances where the user asks something like, “can children use the hotel’s pool?”, followed by a question like “how big is it?”. LLM Generation does quite well in detecting this contextual reference (pool) and responding accurately, in the case of this example, with the size of the pool.
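The three steps can be sketched in a few lines of Python. To be clear, this is my guess at the flow, not OpenBook's code: simple word overlap stands in for the LLM semantic search, the function names are hypothetical, the file contents are invented placeholders, and the generation call is left out so the sketch only assembles the prompt that would be sent.

```python
import re

# Invented placeholder content, for illustration only.
FILES = {
    "amenities.md": "The pool is 25 meters long. "
                    "Children can use the pool with an adult.",
    "rooms.md": "Every room has a king bed and free wifi.",
}


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_file(query: str, files: dict[str, str]) -> str:
    """Step 1 stand-in: pick the file sharing the most words with the query."""
    query_tokens = tokens(query)
    return max(files, key=lambda name: len(query_tokens & tokens(files[name])))


def build_prompt(query: str, files: dict[str, str],
                 previous_inputs: tuple[str, ...] = ()) -> str:
    # Step 3: earlier user inputs join the retrieval text, so a follow-up
    # like "how big is it?" can still land in the right file.
    retrieval_text = " ".join([*previous_inputs, query])
    name = select_file(retrieval_text, files)
    history = "".join(f"Earlier question: {p}\n" for p in previous_inputs)
    # Step 2: file contents plus user input form the generation prompt.
    return (f"Facts from {name}:\n{files[name]}\n\n"
            f"{history}Question: {query}\nAnswer:")
```

Passing the earlier pool question as previous_inputs is what lets the follow-up "how big is it?" select amenities.md even though it shares almost no words with that file on its own.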

This is a logical conclusion considering the generated answers, which are based on a few-shot training scenario, and the short training time. The NLG and the ability to set the temperature on generated answers point the same way.

If this is the case, there are a few considerations… One is the possibility that OpenBook is dependent on underlying licensed software; another is the value add of OpenBook in the greater scheme of things.

OpenBook has been quite mute on what lies under the hood. Due to the time I’ve spent recently with LLMs, I’ve grown to understand how they can be used, and I can definitely see how this approach would work. I will be very interested in being proven right or wrong 🙂.

Cobus Greyling - City of Johannesburg, Gauteng, South Africa | Professional Profile | LinkedIn

Rasa Hero. NLP/NLU, Chatbots, Voice, Conversational UI/UX, CX Designer, Developer, Ubiquitous User Interfaces…

www.linkedin.com

Cobus Greyling — Medium

Read writing from Cobus Greyling on Medium. NLP/NLU, Chatbots, Voice, Conversational UI/UX, CX Designer, Developer…

cobusgreyling.medium.com

Eliza Language Technology Community - Language Technology: Conversational AI, NLP/NLP, CCAI…

ELIZA - Where language technology enthusiasts unite.

eliza.community

AI Playground

A grassroots collective of researchers working to open source AI research.

20b.eleuther.ai

Botpress Is Releasing OpenBook Private Beta Within The Next Few Weeks

Botpress Recently Announced The Release Of OpenBook, A “Next-Generation NLU Engine”

cobusgreyling.medium.com

Build Chatbots | Chatbot for Developers

Botpress is a modern developer stack to build enterprise and open-source chatbots. Learn about our conversational AI…

botpress.io

OpenBook by Botpress - Chatbot Q&A Data

Textual Question and Answering dataset annotated across 12 quality parameters

www.kaggle.com

Read the Botpress OpenBook whitepaper.

NLP Solutions to Streamline Semantic Search | deepset

deepset is the company behind Haystack NLP framework. We help you build production-ready NLP.

www.deepset.ai

GitHub - yandex/YaLM-100B: Pretrained language model with 100B parameters

YaLM 100B is a GPT-like neural network for generating and processing text. It can be used freely by developers and…

github.com


Written by Cobus Greyling
