Meta AI’s Blender Bot 3.0 Is An Open Source Chatbot With Long-Term Memory & Internet Search
On 5 August 2022 Meta AI announced Blender Bot 3, the first publicly available 175B-parameter chatbot, approximately 58 times the size of Blender Bot 2. But how does it address the most common model mistakes: contradiction, repetition and hallucinated knowledge?
Before diving into Blender Bot Version 3, there is a concept I would like us to explore. Large Language Models (LLMs) are emerging, and these models address different conversational disciplines.
The conversational disciplines addressed by these LLMs can be segmented into the five groups below.
- Embeddings: the clustering of utterances and sentences is analogous to intent detection, but in an unsupervised and automated fashion. An example of this is the proof of concept (POC) that HumanFirst and Cohere performed.
- Dialog Management: GODEL and Blender Bot are exploring ways to determine the most probable next dialog turn (more on this later in this article).
- Generation not only produces bot responses but also maintains dialog, contextual awareness and session context.
- Question & Answer is being addressed by models like KI-NLP (Knowledge Intensive NLP), where broad-domain and general questions can be answered without querying an API or leveraging a traditional knowledge base.
- Language Translation is available on various platforms, one example being Meta AI’s NLLB.
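To make the Embeddings item above more concrete, here is a minimal sketch of grouping utterances by vector similarity without any labelled intents. It is only an illustration: the bag-of-words `embed` function is a toy stand-in for a real sentence-embedding model, and the names `embed`, `cosine`, `cluster` and the 0.3 threshold are assumptions of this sketch, not part of any product mentioned in this article.

```python
import math
from collections import Counter


def embed(utterance):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(utterance.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(utterances, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed utterance is similar enough."""
    clusters = []  # each cluster is a list of utterances; element 0 is the seed
    for utterance in utterances:
        vector = embed(utterance)
        for group in clusters:
            if cosine(vector, embed(group[0])) >= threshold:
                group.append(utterance)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([utterance])
    return clusters


utterances = [
    "what is my account balance",
    "show my account balance please",
    "I want to cancel my order",
    "please cancel my order now",
]
for group in cluster(utterances):
    print(group)
```

With these four utterances the sketch yields two groups, one about account balances and one about cancelling an order, which is the kind of grouping an intent designer would otherwise label by hand. A production setup would swap the toy `embed` for a learned embedding model and a more robust clustering algorithm.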
There are obvious concerns and questions about specialised hosting and processing requirements, as well as about fine-tuning and about pinpointing and addressing specific areas of the user experience.
It seems like there is a need for an abstraction layer when managing these large models. The details below on the Blender Bot do lend some insights.
How do these large models match up with conventional chatbot development frameworks?
Below is a diagram with the large language models on the left and the traditional chatbot development frameworks on the right. Also listed on the right is the matching functionality from the development frameworks.
Is it a case of one or the other? No. Chatbot development frameworks need to start looking at how the large models can be implemented and put to use in order to deliver a better Conversational AI development framework. Integration with LLMs can also be used as an avenue for differentiation.
Blender Bot — Version 1 to Version 3
Here is an overview of Blender Bot versions 1, 2 and 3.
Version 1 (9.4 billion parameters)
Blender Bot version 1 was announced on 29 April 2020 as an open-sourced open-domain chatbot. The aim was to have an ever-ready conversational interface which feels human and manages context within dialogs.
The chatbot must also make use of NLG and hold context for the lifetime of the conversation session.
As alluded to in the introduction, these skills or functions are manageable to some extent in isolation, but blending (or orchestrating) them is much harder. Performing the blending / orchestration via a rigid state / conditions machine is not the answer. Hence the ability to blend these layers makes Blender Bot unique.
As seen below, according to Meta AI, the elements blended together are personality, knowledge and empathy. Considering an older technology like DialoGPT, empathy and personality are definitely missing.
This ability to blend / orchestrate different elements is to be seen as one of Blender Bot’s key differentiators.
- Engaging use of personality (PersonaChat)
- Engaging use of knowledge (Wizard of Wikipedia)
- Display of empathy (Empathetic Dialogues)
- Ability to blend all three seamlessly (BST)
Meta AI makes an interesting statement, that the most common model mistakes are contradiction, repetition and hallucinating knowledge.
I found hallucination a bit annoying with some of the LLM implementations, where knowledge was made up.
True progress in the field depends on reproducibility— the opportunity to build upon the best technology possible.
~ Meta AI
Version 2
Blender Bot 2.0 was introduced on 16 July 2021 with the aim of enabling longer, more knowledgeable and factually consistent conversations that span multiple sessions.
The model takes relevant information gleaned from the conversation and stores it in long-term memory. This is done in order to leverage the knowledge in future conversations in the coming days, weeks or months.
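The idea of carrying knowledge across sessions can be sketched as follows. This is not how Blender Bot 2.0 implements its memory; it is a minimal illustration that assumes a JSON file as the store, and the `LongTermMemory` class, its methods, the user id and the file name are all hypothetical.

```python
import json
import os
import tempfile


class LongTermMemory:
    """Hypothetical per-user fact store that survives between chat sessions."""

    def __init__(self, path):
        self.path = path
        self.facts = {}
        # Reload whatever earlier sessions wrote to disk, if anything.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                self.facts = json.load(handle)

    def remember(self, user_id, fact):
        """Store a fact gleaned from the conversation and persist it to disk."""
        user_facts = self.facts.setdefault(user_id, [])
        if fact not in user_facts:
            user_facts.append(fact)
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.facts, handle)

    def recall(self, user_id):
        """Return everything remembered about this user, oldest first."""
        return list(self.facts.get(user_id, []))


store_path = os.path.join(tempfile.mkdtemp(), "memory.json")

# Session 1: the bot notes something the user said.
session_one = LongTermMemory(store_path)
session_one.remember("user-42", "likes to write about Conversational AI")

# Session 2, days later: a fresh object reloads the stored facts.
session_two = LongTermMemory(store_path)
print(session_two.recall("user-42"))
```

The point of the sketch is the separation between the session object, which is short-lived, and the store, which outlives it. A real system would also need to decide which statements are worth remembering and how to retrieve the relevant ones for a given turn.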
Version 3 (175 billion parameters)
On 5 August 2022 Meta AI announced that it has built and released BlenderBot 3, the first 175B-parameter, publicly available chatbot, complete with model weights, code, datasets, and model cards. It is deployed in a live interactive conversational AI demo which can be found here.
BlenderBot 3 delivers superior performance because it’s built from Meta AI’s publicly available OPT-175B language model — approximately 58 times the size of BlenderBot 2.
~ Meta AI
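As a rough check on the "58 times" figure, dividing the parameter counts quoted in this article gives the size that ratio implies for BlenderBot 2. The variable names are my own, and the result is only as precise as the rounded numbers in the quote.

```python
# Figures quoted in this article (rounded).
blenderbot3_params = 175e9   # BlenderBot 3, built on OPT-175B
size_ratio = 58              # "approximately 58 times the size of BlenderBot 2"
blenderbot1_params = 9.4e9   # Blender Bot version 1

# Size of BlenderBot 2 implied by the quoted ratio.
implied_bb2_params = blenderbot3_params / size_ratio
print(f"Implied BlenderBot 2 size: {implied_bb2_params / 1e9:.1f}B parameters")

# For comparison, how much larger version 3 is than version 1.
print(f"BlenderBot 3 vs version 1: {blenderbot3_params / blenderbot1_params:.1f}x")
```

The quoted ratio therefore implies a BlenderBot 2 of roughly 3 billion parameters, and version 3 is a little under 19 times the size of version 1.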
Try Blender Bot
Blender Bot Version 3 can be accessed via the URL: https://blenderbot.ai. As seen below, Blender Bot is only available in the US at this stage.
There are two alternatives for accessing Blender Bot versions prior to version 3. One is the 🤗 Hugging Face hosted inference API. The model card can be accessed here. Below is a short interaction with the Blender Bot.
Alternatively, Blender Bot can be accessed via a Colab notebook. Below is Python code for the simplest interaction:
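# Install PyTorch (CUDA 11.1 build) and Hugging Face Transformers in the Colab runtime
!pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio===0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
!pip install transformers

from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

# Load the distilled 400M-parameter Blender Bot checkpoint and its tokenizer
tokenizer = BlenderbotTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
model = BlenderbotForConditionalGeneration.from_pretrained("facebook/blenderbot-400M-distill")

# Tokenize the user utterance into PyTorch tensors
inputs = tokenizer("I like to write about Conversational AI", return_tensors="pt")
inputs

# Generate the bot's reply as token ids, then decode them back to text
res = model.generate(**inputs)
print(tokenizer.decode(res[0], skip_special_tokens=True))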
And a screen print of the input and output.
With an increasing number of models being open sourced, there are opportunities to look at alternative avenues to approach Conversational AI.
At some stage chatbot development frameworks will have to be re-considered, and an array of new language tools and environments will emerge to manage these changes.
Cobus Greyling - City of Johannesburg, Gauteng, South Africa | Professional Profile | LinkedIn
Rasa Hero. NLP/NLU, Chatbots, Voice, Conversational UI/UX, CX Designer, Developer, Ubiquitous User Interfaces…
A state-of-the-art open source chatbot
Facebook AI has built and open-sourced BlenderBot, the largest-ever open-domain chatbot. It outperforms others in terms of…
Blender Bot 2.0: An open source chatbot that builds long-term memory and searches the internet
Facebook AI Research has built and open-sourced BlenderBot 2.0, the first chatbot that can simultaneously build…