An Overview Of Chatbots, Codex, GPT-3 & Fine-Tuning

And The Importance Of Implementation Efficiency For Each Task

Cobus Greyling
15 min read · Sep 12, 2021


Introduction

There has been much talk about the low-code approach to software development, how it acts as a catalyst for rapid development, and how it can serve as a vehicle for delivering solutions with minimal bespoke hand-coding.

A JavaScript Game created with the OpenAI Codex interface using only Natural Language input.

Low-code interfaces are made available via a single tool or a collection of tools which are highly graphical in nature and, initially, intuitive to use.

This creates the impression of rapid onboarding and of speeding up the process of delivering solutions to production.

As with many approaches of this nature, it initially seems like a very good idea. However, as functionality, complexity and scaling start playing a role, huge impediments are encountered.

JavaScript demo application created using natural language via OpenAI’s Codex.

In this article I want to explore:

  • What exactly fine-tuning refers to in chatbots, and why a low-code approach cannot accommodate it.
  • Looking at fine-tuning, it is clear that GPT-3 is not ready for this level of configuration; when a low-code approach is implemented, it should be an extension of a more complex environment, in order to allow scaling into that environment.
  • What Codex means for low-code implementations, and how it can be applied.

OpenAI, OpenAI API, GPT-3 & Codex

OpenAI

OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence benefits all of humanity.

GPT-3

GPT-3, in simple terms, is a language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by the company OpenAI.

OpenAI API

The OpenAI API makes use of GPT-3 and is a text-in, text-out API. Submit any text prompt to the API, and the API returns a text completion. A pattern is detected in the input text and matched in the output.

You can program the API by adding a few example dialog turns illustrating what you are trying to achieve.
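
To make that concrete, here is a minimal sketch of programming the API with a few example dialog turns, assuming the openai Python package; the engine name, prompt text and parameters are illustrative choices, not a prescribed recipe.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: supplied via your own configuration

# A few example dialog turns "program" the model by demonstrating the pattern.
prompt = (
    "The following is a conversation with a friendly travel assistant.\n"
    "User: I need a flight to Cape Town next Friday.\n"
    "Assistant: Sure, which city will you be departing from?\n"
    "User: From Johannesburg.\n"
    "Assistant:"
)

# Text in, text out: the API returns a completion that continues the pattern.
response = openai.Completion.create(
    engine="davinci",        # engine name is illustrative
    prompt=prompt,
    max_tokens=60,
    temperature=0.7,
    stop=["\n", "User:"],    # stop before the model invents the next user turn
)

print(response.choices[0].text.strip())
```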

Fine-tuning options are available for the OpenAI API, but the feature is currently in beta and limited.
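
As a rough, hedged sketch of what that beta fine-tuning flow involves: training data is supplied as JSONL prompt/completion pairs and then referenced when creating a fine-tune job. The example utterances, file name and model name below are placeholders.

```python
import json

# Hypothetical training examples in the prompt/completion format the
# beta fine-tuning endpoint expects; the content is purely illustrative.
examples = [
    {"prompt": "User: Book a flight\nBot:", "completion": " Which city are you flying to?"},
    {"prompt": "User: Book a hotel\nBot:", "completion": " Which city will you be staying in?"},
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# The file is then uploaded and a fine-tune job started, for example via the
# CLI that ships with the openai package:
#   openai api fine_tunes.create -t train.jsonl -m curie
```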

API Positives

  • GPT-3 has quite a bit of functionality which can serve to augment a current chatbot.
  • Dialog can be diversified with the NLG capability.
  • General chit-chat can easily be created.
  • Copywriting is made easy for slogans, headlines, reviews etc.
  • Text transformation
  • Text generation
  • Creating a general purpose bot to chat to.
  • With their underlying processing power and data, creating flexible Machine Learning stories should be a good fit.

Not-so Positives

  • The API is cloud hosted
  • Cost
  • Social media bot content generation
  • Not yet a framework for sustainable chatbot scaling.
  • Possible over- and under-steering with training data.

Codex

In essence, OpenAI Codex is an AI system that translates natural language into code.

Codex powers GitHub Copilot, which OpenAI built and launched in partnership with GitHub. Codex can interpret simple commands in natural language and create and execute code: natural language understanding taken straight to applications.

One could argue this is a new implementation of natural language. Code can also be submitted and explained in natural language, and code can be debugged.

There are some definite niche applications; these can include:

  • Solving coding challenges and problems in certain routines.
  • Establishing best practice.
  • Quality assurance.
  • Interactive Learning
  • Generating specific components for subsequent human review.

What Codex Is & What Codex Is Not

What is Codex?

In its essence, OpenAI Codex translates natural language into code. It can also translate code (back) into natural language, or at least explain what the software does.

How might Codex be used?

There are a few practical implementations for Codex…

A date calculator that calculates the difference in days between two dates selected with a date picker. Created with Codex in JavaScript using natural language.
  1. As a coding tutor, helping developers in their coding endeavors. Software principles and functionality can be illustrated & explained. Functions and applications can be created from natural language input.
  2. Codex can be implemented as an AI-powered Stack Overflow where users can ask questions in natural language, and Codex responds with code examples.
  3. Or code can be explained.
  4. Codex can act as a general resource for a company’s internal development team. Code can be generated or data insights can be gleaned etc.
  5. For individual use, of course. For any developer this is the ultimate Google for anything code related.
  6. Quality assessment.
  7. Automation of code validation.

What will Codex not be used for?

The brilliance of Codex is really evident and indisputable. The code generated is accurate and works. But…

  1. Codex will not replace developers and software engineers. Not now, at least. This is AI assisting & augmenting developers, not replacing developers.
  2. Codex does well in affording you various options and anticipating what your next question might be. Context is maintained, questions from users need not always be explicit, and typos are absorbed. But it is not autonomous and does not perform orchestration: problems need to be broken down into smaller pieces and those presented to Codex. You need to know what you want to achieve and how to get there; basically, define your algorithm.

The best approach to take when building an application using Codex…

The time difference in seconds between two times is calculated. The times are selected with time pickers. Created with Codex in JavaScript using natural language.
  • Break a problem down into simpler problems or modules, and…
  • …then convert those simpler problems into code segments which can be combined and executed.
  • As a user, you need to know what you want to achieve and have an idea of what the best software approach is for the application.
  • You need to be able to break your algorithm down into smaller tasks or modules.
  • If you have an idea, you can ask Codex for best practice. For instance, when a data frame is loaded via Python, you can ask for visualization suggestions, or ask how Crosstab can be implemented, and Codex takes the initiative; see the sketch after this list.
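
To illustrate that modular, step-by-step style of prompting, here is a minimal sketch of the crosstab example mentioned above. The natural-language comments play the role of the requests given to Codex, and the code underneath is the kind of output it might produce; the file name and column names are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Step 1: load the data frame (file and columns are hypothetical).
df = pd.read_csv("bookings.csv")

# Step 2: "How can Crosstab be implemented?" -- a quick categorical summary.
summary = pd.crosstab(df["travel_mode"], df["destination"])

# Step 3: "Give me visualizing suggestions" -- a stacked bar chart of the summary.
summary.plot(kind="bar", stacked=True, title="Bookings by travel mode and destination")
plt.show()
```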

More about Codex…

Codex is the model that powers GitHub Copilot, which OpenAI built and launched in partnership with GitHub.

Codex is proficient in more than a dozen programming languages. Codex takes simple commands in natural language and executes them on the user’s behalf.

OpenAI Codex is based on GPT-3. According to OpenAI, Codex’s training data contains both natural language and billions of lines of source code from publicly available sources, including code in public GitHub repositories.

OpenAI Codex is most proficient in Python, but also incorporates languages like:

  • JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, Shell.

What surprised me about Codex?

  • How natural and coherent the comments added to the code are. The Natural Language Generation (NLG) is obviously on par with GPT-3. The cadence of comments in code is also tight.
  • Context is maintained within the conversation and where user input is not comprehensive or explicit, accurate implicit assumptions are made by Codex.
  • The code works. Users can copy it out of the Codex preview, paste it into a Notebook and execute. I did not have an instance where the code did not execute.
  • The process of generating code is very modular, and a request is broken up into separate sequential steps. Even if your request is quite encompassing, Codex will break it down and present the code in a modular fashion.
  • Codex is AI with which users most probably will interact on a daily basis.
  • Mistakes in spelling and grammar are absorbed by Codex with surprising resilience.

Other Current Examples of Low-Code

Two low-code implementations have been successful of late: IBM Watson Assistant Actions and Microsoft Power Virtual Agents.

The reason these have been successful is that the low-code component can be used as a stand-alone approach or, for complex implementations, act as an extension.

Fine-Tuning

When someone refers to the ability to fine-tune, or the extent to which fine-tuning can be performed, what exactly are they referring to? In this section we step through seven elements which constitute fine-tuning.

  • Forms & Slots
  • Intents
  • Entities
  • Natural Language Generation (NLG)
  • Dialog Management
  • Digression
  • Disambiguation

Forms & Slots

An Intent is the user’s intention with their utterance, or engagement with your bot. Think of intents as verbs, or working words. An utterance or single dialog from a user needs to be distilled into an intent.

NVIDIA Riva Jupyter Notebook. Here the domain is not provided, the intent and slot are shown with the score.

Entities can be seen as nouns; often they are referred to as slots. These are usually things like dates, times, cities, names, brands etc. Capturing these entities is crucial for taking action based on the user’s intent.

NVIDIA Riva Weather App with contextual entities.

Think of a travel bot: capturing the cities of departure and destination, together with travel mode, costs, dates and times etc., is at the foundation of the interface. Yet this is the hardest part of the NLU process.

Keep in mind that the user enters data randomly and unstructured, in no particular order.

We as humans identify entities based on the context we detect, and hence we know where to pick out a city name, even though we have never previously heard that city name.

Make sure the vocabulary for an intent is specific to the intent it is meant for. Avoid having intents which overlap.

For example, if you have a chatbot which handles travel arrangements such as flights and hotels, you can choose:

  • To have these two user utterances and ideas as separate intents
  • Or use the same intent with two entities for specific data inside the utterance; be it flights or hotels.

If the vocabulary between two intents is the same, combine the intents and use entities.

Take a look at the following two user utterances:

  • Book a flight
  • Book a hotel

Both use the same wording, “book a”. The format is the same, so it should be the same intent with different entities: one entity being flight and the other hotel.
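
A minimal sketch of what that looks like in practice, using hypothetical utterances and an invented booking_type entity rather than any specific framework’s training format:

```python
# One "book" intent with a booking_type entity, instead of two overlapping intents.
training_examples = [
    {"text": "Book a flight", "intent": "book", "entities": {"booking_type": "flight"}},
    {"text": "Book a hotel", "intent": "book", "entities": {"booking_type": "hotel"}},
    {"text": "Book a flight to Cape Town", "intent": "book",
     "entities": {"booking_type": "flight", "destination": "Cape Town"}},
]

# Downstream, the dialog logic branches on the entity rather than on the intent.
def next_action(intent: str, entities: dict) -> str:
    if intent == "book" and entities.get("booking_type") == "flight":
        return "ask_departure_city"
    if intent == "book" and entities.get("booking_type") == "hotel":
        return "ask_checkin_date"
    return "ask_clarification"

print(next_action("book", {"booking_type": "hotel"}))  # -> ask_checkin_date
```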

Intents

In most chatbot design endeavors, the process starts with intents. But what are intents? Think of it like this: a large part of this thing we call the human experience is intent discovery. If a clerk or general assistant is behind a desk and a customer walks up to them, the first action from the assistant is intent discovery: trying to discover the intention of the person entering the store, bank, company etc.

We perform intent discovery dozens of times a day, without even thinking of it.

Google is the biggest intent discovery machine in the world!

The Google search engine can be considered a single dialog-turn chatbot. The main aim of Google is to discover your intent and then return relevant information based on the discovered intent. Even the way we search has inadvertently changed: we no longer search with keywords, but in natural language.

Intents can be seen as purposes or goals expressed in a customer’s dialog input. By recognizing the intent expressed in a customer’s input, the assistant can select an applicable next action.

Current customer conversations can be instrumental in compiling a list of possible user intents. These conversations can come from speech analytics data (call recordings) or live agent chat conversations. Lastly, think of intents as the verb.

Entities

Entities can be seen as the nouns.

Entities are the information in the user input that is relevant to the user’s intentions.

Intents can be seen as verbs (the action a user wants to execute); entities represent nouns (for example the city, the date, the time, the brand, the product). Consider this: when the intent is to get a weather forecast, the relevant location and date entities are required before the application can return an accurate forecast.

Recognizing entities in the user’s input helps you to craft more useful, targeted responses. For example, you might have a #buy_something intent. When a user makes a request that triggers the #buy_something intent, the assistant’s response should reflect an understanding of what the something is that the customer wants to buy. You can add a product entity, and then use it to extract information from the user input about the product that the customer is interested in.
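
A toy illustration of that idea, with an invented product list and a naive keyword matcher standing in for a real entity extractor:

```python
# Hypothetical catalogue; a real assistant would use a trained entity extractor.
KNOWN_PRODUCTS = {"headphones", "laptop", "coffee machine"}

def extract_product(utterance: str):
    """Return the product entity mentioned in the utterance, if any."""
    text = utterance.lower()
    for product in KNOWN_PRODUCTS:
        if product in text:
            return product
    return None

# The #buy_something intent has been triggered; tailor the response to the entity.
product = extract_product("I'd like to buy a laptop for work")
if product:
    print(f"Great, which {product} model are you interested in?")
else:
    print("Sure, what would you like to buy?")
```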

NLG

NLG is a software process where structured data is transformed into natural conversational language for output to the user. In other words, structured data is presented in an unstructured manner to the user. Think of NLG as the inverse of NLU.

With NLU we are taking the unstructured conversational input from the user (natural language) and structuring it for our software process. With NLG, we are taking structured data from backends and state machines and turning it into unstructured data: conversational output in human language.
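
As a toy sketch of that structured-to-unstructured direction, here is a template-based rendering step; commercial NLG engines are far more sophisticated, and the forecast data below is invented.

```python
# Structured data as it might come back from a backend or state machine.
forecast = {"city": "Cape Town", "date": "Friday", "condition": "sunny", "max_temp": 24}

def render_forecast(data: dict) -> str:
    """Turn structured forecast data into a conversational sentence."""
    return (f"It looks like {data['date']} in {data['city']} will be "
            f"{data['condition']}, with a high of about {data['max_temp']} degrees.")

print(render_forecast(forecast))
# -> It looks like Friday in Cape Town will be sunny, with a high of about 24 degrees.
```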

Commercial NLG is emerging, and forward-looking solution providers are looking at incorporating it into their solutions.

Dialog Management

Most often in conversational journeys where the user makes use of a voice smart assistant, the dialog is constituted by one or two dialog turns: questions like what the weather is, or checking travel time.

With text-based conversations, like chatbots, multiple dialog turns are involved, hence management of the dialog becomes critical. For instance, if a user wants to make a travel booking or a restaurant reservation, the dialog will be longer.

RASA Interactive Learning and Conversation Visualization

Your chatbot typically has a domain, a specific area of concern, be it travel, banking, utilities etc.

Grounding is important to establish a shared understanding of the conversation scope.

You will see many chatbot conversations start with a number of dialogs initiated by the chatbot. The sole purpose of these messages is to ground the conversation going forward.

Secondly, the initiative can sit with the user or with the system; the latter is system-directed initiative. In human conversations the initiative is exchanged between the two parties in a natural way.

Ideally the initiative sits with the user, and once the intent is discovered, the system-directed initiative takes over to fulfill the intent.

If the initiative is not managed, the flow of dialog can end up being brittle, where the user struggles to inject intent and further the dialog, or, even worse, the chatbot drops out of the dialog.
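
A highly simplified sketch of system-directed initiative once the intent has been discovered: the system keeps asking for whichever required slot is still missing until the journey can be fulfilled. Slot names and prompts are invented for illustration.

```python
# Required slots for a hypothetical flight-booking journey.
REQUIRED_SLOTS = ["departure_city", "destination_city", "travel_date"]

PROMPTS = {
    "departure_city": "Which city are you departing from?",
    "destination_city": "Where would you like to fly to?",
    "travel_date": "On which date would you like to travel?",
}

def next_system_turn(state: dict) -> str:
    """With the intent known, the system takes the initiative and asks for
    the next missing slot until the form is complete."""
    for slot in REQUIRED_SLOTS:
        if slot not in state:
            return PROMPTS[slot]
    return "Great, searching for flights..."

state = {"departure_city": "Johannesburg"}
print(next_system_turn(state))  # -> "Where would you like to fly to?"
```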

Digression

Digression is a common and natural part of most conversations. The speaker introduces a topic, can subsequently introduce a story that seems to be unrelated, and then return to the original topic.

Digression can also be explained in the following way: a user is in the middle of a dialog (also referred to as a customer journey, topic or user story) that is designed to achieve a single goal, but the user decides to abruptly switch topics and initiate a dialog flow that is designed to address a different goal.

Hence the user wants to jump midstream from one journey or story to another. This is usually not possible within a chatbot; once a user has committed to a journey or topic, they have to see it through. Normally the dialog does not support this ability for a user to change subjects.

Often an attempt to digress by the user ends in an “I am sorry” from the chatbot and breaks the current journey.

Hence the chatbot framework you are using should allow for this: popping out of and back into a conversation.
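
One way to picture that pop-out-and-return behaviour is a stack of active topics; this is a minimal sketch with invented topic names, not any particular framework’s mechanism.

```python
class DialogManager:
    """Keeps a stack of active topics so a digression can interrupt the
    current journey and the original topic can be resumed afterwards."""

    def __init__(self):
        self.topic_stack = []

    def start_topic(self, topic: str):
        self.topic_stack.append(topic)

    def digress(self, new_topic: str):
        # Pop out of the current journey without discarding it.
        self.topic_stack.append(new_topic)

    def finish_current(self):
        # Resume the interrupted journey, if there is one.
        self.topic_stack.pop()
        return self.topic_stack[-1] if self.topic_stack else None

dm = DialogManager()
dm.start_topic("book_flight")
dm.digress("check_baggage_allowance")  # the user changes the subject midstream
print(dm.finish_current())             # -> book_flight is resumed
```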

Disambiguation

Ambiguity is when we hear something which is open to more than one interpretation. Instead of going off on a tangent not intended by the utterance, I perform the act of disambiguation by asking a follow-up question. Simply put, this is removing ambiguity from a statement or dialog.

https://www.dictionary.com/browse/disambiguate

Ambiguity makes sentences confusing. For example, “I saw my friend John with binoculars”. Does this mean John was carrying a pair of binoculars? Or that I could only see John by using a pair of binoculars?

Hence, I need to perform disambiguation and ask for clarification. A chatbot encounters the same issue: where the user’s utterance is ambiguous, instead of the chatbot going off on one assumed intent, it could ask the user to clarify their input. The chatbot can present a few options based on a certain context, which the user can use to select and confirm the most appropriate option.
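
A small sketch of that clarify-rather-than-assume behaviour: if the top two intent scores are too close, present both options instead of committing to one. The scores, intent names and threshold are invented.

```python
def respond(intent_scores: dict, margin: float = 0.15) -> str:
    """Ask the user to choose when the top two intents score too closely."""
    ranked = sorted(intent_scores.items(), key=lambda kv: kv[1], reverse=True)
    (top_intent, top_score), (second_intent, second_score) = ranked[0], ranked[1]

    if top_score - second_score < margin:
        return (f"Did you mean '{top_intent}' or '{second_intent}'? "
                "Please pick one so I can help.")
    return f"Proceeding with '{top_intent}'."

# Illustrative scores, e.g. from an NLU engine's intent classifier.
print(respond({"report_water_damage": 0.46, "order_phone_accessory": 0.41}))
```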

Just to illustrate how effective we as humans are at disambiguating and detecting subtle nuances, have a look at the following two sentences:

  • A drop of water on my mobile phone.
  • I drop my mobile phone in the water.

These two sentences have vastly different meanings; compared to each other there is no real ambiguity, but for a conversational interface this will be hard to detect and separate.

Conclusion

Often, technology in its infancy and at its inception seems rudimentary, awkward and redundant. Invariably, discussions ensue on the new tech’s viability and right to existence, comparing it to technologies steeped in history and innumerable iterations.

When thinking in terms of low-code, or even NLU in this case, it is not an all-or-nothing scenario.

Some comments on low-code in general…

The Good:

  • Low-code on its own is not a solution to all problems.
  • Smaller applications and utilities are well suited for low-code.
  • Low-code is good for prototyping, experimenting and wireframes.
  • Low-code is well suited as an extension to existing larger implementations, enabling business units to create their own extensions and customization.
  • Examples of good low-code implementations are IBM Watson Assistant Actions, Microsoft Power Virtual Agents, some of the Amazon Alexa Development Console functionality etc.

Impediments:

  • Fine tuning is problematic with low-code.
  • Scaling and integration.
  • Optimization
  • Performance management
  • Invariably you would want to include functions and extensions not available in your authoring environment.

And the same holds true for Codex. Will enterprise systems be built this way? Most probably not. Will Fortune 500 companies go the Codex route in principle? No.

But there are some definite niche applications; these can include:

  • Solving coding challenges and problems in certain routines.
  • Establishing best practice.
  • Quality assurance.
  • Interactive Learning
  • Generating specific components for subsequent human review.


Cobus Greyling

I explore and write about all things at the intersection of AI & language; LLMs/NLP/NLU, Chat/Voicebots, CCAI. www.cobusgreyling.com