
What Is GODEL (Large-Scale Pre-Training for Goal-Directed Dialog)?

In May 2022, Microsoft announced GODEL. GODEL is designed for general-domain conversation and is fully open-sourced.

Cobus Greyling
4 min read · Aug 3, 2022


Overview

A key element of Large Language Models is generation. Generation can take various forms, depending on how the few-shot learning prompt is cast.

We have seen generation used in creating a chatbot via Large Language Models.

One area where generation is being considered is the creation and management of dialog flows: their design, development and maintenance. The notion is to create a dialog flow from example conversations and take a machine learning approach to it, where the system determines the next dialog to present to the user based on probability.

Interactive Learning

In 2019 I considered Rasa's approach, which they call Interactive Learning. With Interactive Learning you write dialogs (machine learning stories) while you are talking to your bot, and as you talk to your bot, the dialog flow is mapped in a browser.

Kore.ai also has an innovative, conversation-first approach to building dialogs with their Conversation Driven Dialog Builder. This is somewhat reminiscent of Rasa's Interactive Learning and conversation visualisation.

According to Kore.ai, the conversation-driven Dialog Builder automatically converts storyboard scenes into a Dialog Task, so designers can focus on visualising the end-user conversation before building the dialog.


Nuance Mix is another case in point: an imminent feature of Nuance Mix allows a dialog flow to be reverse engineered from a written transcript.

According to Nuance, Mix can automatically convert your conversation paths into a dialog tree as you work. This sounds very much in line with what Kore.ai and Rasa are doing.


Considering The GODEL Architecture

In 2019, the Deep Learning and Natural Language Processing groups at Microsoft Research released DialoGPT.

Then, in May 2022, Microsoft announced GODEL: Large-Scale Pre-Training for Goal-Directed Dialog. GODEL is designed for general-domain conversation and is fully open-sourced.

The pre-trained model of GODEL can be fine-tuned and adapted to be applied to new dialog tasks.

The aim of GODEL is to remove a long-standing impediment to general-purpose, open-ended conversational models. These conversational models can be task-oriented or merely open-ended, hence non-domain-specific small talk.

GODEL strives to deliver human-like conversations that attain a high level of utility. The model should also be able to generate responses based not just on the context of the conversation, but also on external information: content that was not part of the dataset when the model was trained.

I think a key takeaway of the GODEL architecture should be the following…

Consider the JSON format the training data needs to be in:

{
"Context": "Please remind me of calling to Jessie at 2PM.",
"Knowledge": "reminder_contact_name is Jessie, reminder_time is 2PM",
"Response": "Sure, set the reminder: call to Jessie at 2PM"
},
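A record in this shape is straightforward to load and sanity-check. The snippet below is a minimal illustration using the three field names from the example above; it is not taken verbatim from the actual GODEL training files:

```python
import json

# One training record in the shape shown above (illustrative only,
# not copied verbatim from the GODEL training data).
record_json = """
{
  "Context": "Please remind me of calling to Jessie at 2PM.",
  "Knowledge": "reminder_contact_name is Jessie, reminder_time is 2PM",
  "Response": "Sure, set the reminder: call to Jessie at 2PM"
}
"""

record = json.loads(record_json)

# Minimal sanity check: every record must carry all three fields.
REQUIRED_FIELDS = ("Context", "Knowledge", "Response")
missing = [field for field in REQUIRED_FIELDS if field not in record]
assert not missing, f"record is missing fields: {missing}"
```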

The fields catered for are:

Context — The context of the current conversation, from start to current turn.

Knowledge — External or environment state represented in plain text.

Response — The virtual agent's response message or text. This can be a template, an API call or natural language generation.
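Before generation, these fields are flattened into a single input string for the sequence-to-sequence model. A minimal sketch of that step follows; the `EOS` turn separator and the `[CONTEXT]`/`[KNOWLEDGE]` markers follow Microsoft's published GODEL examples, but treat this as an illustration of the idea rather than the authoritative preprocessing code:

```python
def build_model_input(instruction, context_turns, knowledge):
    """Flatten one dialog record into a single seq2seq input string.

    The "EOS" separator and [CONTEXT]/[KNOWLEDGE] markers mirror
    Microsoft's published GODEL examples (a sketch, not the
    authoritative format).
    """
    dialog = " EOS ".join(context_turns)  # context: all turns so far
    query = f"{instruction} [CONTEXT] {dialog}"
    if knowledge:  # knowledge is optional plain text
        query += f" [KNOWLEDGE] {knowledge}"
    return query

# Usage: a reminder request grounded on slot values captured earlier.
model_input = build_model_input(
    instruction="Instruction: given a dialog context, respond helpfully.",
    context_turns=["Please remind me of calling to Jessie at 2PM."],
    knowledge="reminder_contact_name is Jessie, reminder_time is 2PM",
)
```

The grounded response is then whatever the fine-tuned model generates from this string.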

In Conclusion

The three elements mentioned above are vital to a well-managed conversational experience.

The context of the conversation needs to be maintained, or at least a brief overview of the conversational session.

Knowledge could be entities or slots captured during the conversation.

The response should also be contextual, but the hard part is text generation: generating a response that is contextual, linguistically coherent and correct, with the right information embedded, gleaned from knowledge.

https://www.microsoft.com/en-us/research/uploads/prod/2022/05/2206.11309.pdf
