OpenAI Codex & Conversational AI

Creating Code Using Natural Language Understanding


Conversational AI has manifested in a number of ways.

One is higher-level Natural Language Processing (NLP).

NLP addresses higher-level tasks such as summarization, translation, named entity extraction and categorization. 🤗HuggingFace is one of the market leaders, and the most recent addition is OpenAI’s API based on GPT-3. The big cloud platform providers, such as AWS, IBM and Microsoft, also have their own NLP environments.

Another part of Conversational AI is chatbot development frameworks.

There have been numerous approaches to implementing development frameworks.

Microsoft’s ecosystem and IBM Watson Assistant come to mind, and there are a number of other environments. The avant-garde is Rasa, with its more ML-focused approach.

It needs to be mentioned that, with the advent of GPT-3 via OpenAI’s API, a new category of Conversational AI has been introduced: a Text-In-Text-Out API.

This Text-In-Text-Out approach allows for a low-code framework with minimal fine-tuning, which encapsulates:

  • dialog turn state management,
  • Natural Language Generation (NLG),
  • Natural Language Understanding (NLU) in terms of Intents and Entities.
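To make the Text-In-Text-Out idea a bit more concrete, here is a minimal sketch of how intent detection could be phrased as plain text completion. Everything in it is an assumption for illustration: the example utterances, the intent names, the function names and the `fake_model` stand-in. No real OpenAI API call is made; a real completion endpoint would take the place of `fake_model`.

```python
from typing import Callable

# Hypothetical labelled examples; a handful of these in the prompt
# stand in for fine-tuning.
EXAMPLES = [
    ("I want to book a flight to Paris", "book_flight"),
    ("What is the weather like tomorrow?", "get_weather"),
]


def build_intent_prompt(utterance: str) -> str:
    """Render the examples and the new utterance as one block of text."""
    lines = []
    for text, intent in EXAMPLES:
        lines.append(f"User: {text}")
        lines.append(f"Intent: {intent}")
    lines.append(f"User: {utterance}")
    lines.append("Intent:")
    return "\n".join(lines)


def detect_intent(utterance: str, model: Callable[[str], str]) -> str:
    """Send text in, get text out, and keep only the first line of the reply."""
    reply = model(build_intent_prompt(utterance))
    reply_lines = reply.strip().splitlines()
    return reply_lines[0].strip() if reply_lines else ""


def fake_model(prompt: str) -> str:
    """Stand-in for a completion endpoint (assumption: a real model goes here)."""
    user_lines = [line for line in prompt.splitlines() if line.startswith("User: ")]
    return " get_weather" if "weather" in user_lines[-1].lower() else " book_flight"


if __name__ == "__main__":
    print(detect_intent("Will the weather be nice on Sunday?", fake_model))
```

The point of the sketch is that the "framework" is just text: the examples, the user input and the answer all travel through the same prompt.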

Conversational User Interface

Conversational User Interface implementations use two primary mediums:

  • text/chat and
  • speech/voice.

The conversation between the human and the chatbot emulates a conversation between two humans. These conversations have been primarily task based, with a focus on customer experience, healthcare, entertainment and self-help, and with a small portion covering the companion/friendship aspect of a chatbot.

Codex ushers in a new era where conversational input is used to write code and create an application.

Structured & Unstructured Data

A chatbot conversation is where unstructured data (human conversation) is structured for processing and for extracting meaning and intent. When the chatbot replies to the user, the structured data needs to be unstructured again into natural language.

This unstructuring and structuring of data demands overhead and detailed attention. The degree to which the data can be entered unstructured determines the degree of complexity: the more unstructured the input, the more overhead is needed to structure it for processing.

Some chatbots simplify the process by presenting the user with buttons, menus and other design affordances, hence structuring the user interface to some degree.

Similarly, the degree to which data is unstructured in the chatbot’s return dialog can be limited with cards and similar elements.
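The round trip described above, from unstructured text to structured data and back to natural language, can be sketched in a few lines. This is a toy illustration only: the keyword rules, the `book_appointment` intent and the reply templates are all made up for the example and are nowhere near real NLU or NLG.

```python
import re

DAY_PATTERN = r"\b(monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b"


def structure(utterance: str) -> dict:
    """Turn free text into an intent plus entities (toy rules, not real NLU)."""
    text = utterance.lower()
    wants_booking = "appointment" in text or "book" in text
    intent = "book_appointment" if wants_booking else "unknown"
    match = re.search(DAY_PATTERN, text)
    entities = {"day": match.group(1)} if match else {}
    return {"intent": intent, "entities": entities}


def unstructure(frame: dict) -> str:
    """Turn the structured result back into a natural language reply."""
    if frame["intent"] != "book_appointment":
        return "Sorry, I did not understand that."
    day = frame["entities"].get("day")
    if day is None:
        return "Which day would you like to come in?"
    return f"Your appointment is booked for {day.capitalize()}."


if __name__ == "__main__":
    frame = structure("Can I book an appointment on Friday?")
    print(frame)
    print(unstructure(frame))
```

Both halves carry overhead: `structure` has to cope with whatever the user types, and `unstructure` has to phrase every possible outcome.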

What makes Codex interesting is that natural language is structured at input, but there is no subsequent unstructuring required. Natural language input is structured and code is derived from this data.

This form was created in JavaScript by making use of the sentences listed below.

/* create a header saying Book your appointment */
/* make the font arial */
/* add a horizontal line */
/* Add Text in Arial saying "Select your representative:" */
/* change the font to Arial */
/* next to the text, add a dropdown box with 10 random names */
/* Add text saying "Select the date" */
/* Change the font to Arial */
/* add a date picker */
/* add a time picker */
/* make the background orange */
/* create a horizontal line */

Context is maintained within Codex and not every instruction needs to be explicit.
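How Codex keeps this context internally is not covered here. One simple way a client could get a similar effect, assuming the model only sees the text it is sent, is to resend the earlier instructions and the code generated so far with every new instruction. The `CodexSession` class below is a hypothetical helper written for this sketch, and the JavaScript line used in the usage example is a placeholder rather than actual Codex output.

```python
from typing import List, Tuple


class CodexSession:
    """Hypothetical helper that accumulates instructions and generated code."""

    def __init__(self) -> None:
        # Each entry is (instruction, code generated for that instruction).
        self.history: List[Tuple[str, str]] = []

    def build_prompt(self, instruction: str) -> str:
        """Replay the history, then append the new instruction as a comment."""
        parts = []
        for past_instruction, past_code in self.history:
            parts.append(f"/* {past_instruction} */")
            parts.append(past_code)
        parts.append(f"/* {instruction} */")
        return "\n".join(parts)

    def record(self, instruction: str, code: str) -> None:
        """Store an instruction together with the code that was generated for it."""
        self.history.append((instruction, code))


if __name__ == "__main__":
    session = CodexSession()
    session.record(
        "create a header saying Book your appointment",
        "var header = document.createElement('h1');  // placeholder code",
    )
    # "the font" is only meaningful because the header is part of the prompt.
    print(session.build_prompt("make the font arial"))
```

With the earlier header instruction in the prompt, a short follow-up like "make the font arial" has something to refer to.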

More On Codex

OpenAI Codex translates natural language into code.

Interesting fact: Codex is the model that powers GitHub Copilot, which OpenAI built and launched in partnership with GitHub.

Codex is proficient in more than a dozen programming languages. Codex takes simple commands in natural language and executes them on the user’s behalf.

OpenAI Codex is based on GPT-3. According to OpenAI, Codex’s training data contains both natural language and billions of lines of source code from publicly available sources, including code in public GitHub repositories.

A JavaScript application in which two balls bounce around the screen and the background changes to a random color when they overlap. This application was created with the sentences listed here.

/* make a red ball bounce on the screen */
/* make it go faster */
/* crop the ball circular */
/* disable scrollbars */
/* make a blue ball bounce on the screen */
/* crop the ball circular */
/* make the blue ball move faster */
/* make both balls larger in size */
/* move the red ball faster */
/* When the blue ball and the red ball overlap, change the background color to a random color */

OpenAI Codex is most proficient in Python, but also incorporates languages like:

  • JavaScript,
  • Go,
  • Perl,
  • PHP,
  • Ruby,
  • Swift,
  • TypeScript and
  • Shell.

It has a memory of 14KB for Python code, compared to only 4KB for GPT-3, so it can take into account over 3x as much contextual information while performing any task.
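As a quick check on the "over 3x" claim, the ratio follows directly from the two figures quoted above. The function name is my own, used only for this small calculation.

```python
def context_ratio(codex_kb: float, gpt3_kb: float) -> float:
    """Return how many times larger the first context size is than the second."""
    return codex_kb / gpt3_kb


if __name__ == "__main__":
    ratio = context_ratio(14, 4)
    print(f"Codex context is {ratio:.1f}x the size of GPT-3's")
    # prints: Codex context is 3.5x the size of GPT-3's
```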

The best approach to take when building an application:

  • Break a problem down into simpler problems, and
  • Map those simple problems to existing code (libraries, APIs, or functions).

The latter activity is probably the least fun part of programming (and the highest barrier to entry), and it’s where OpenAI Codex excels most.
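As an illustration of the two steps above, take a small problem of my own choosing: finding the most frequent words in a piece of text. It breaks down into normalizing the text, splitting it into words, counting them and picking the top few, and each of those maps onto something the Python standard library already provides. This is a hand-written sketch, not Codex output.

```python
import re
from collections import Counter
from typing import List, Tuple


def most_common_words(text: str, n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent words in text with their counts."""
    # Step 1: normalize case (maps to str.lower).
    lowered = text.lower()
    # Step 2: split into words (maps to re.findall).
    words = re.findall(r"[a-z']+", lowered)
    # Steps 3 and 4: count and pick the top n (maps to collections.Counter).
    return Counter(words).most_common(n)


if __name__ == "__main__":
    print(most_common_words("The cat and the hat and the bat", 2))
    # prints: [('the', 3), ('and', 2)]
```

Knowing that `Counter.most_common` exists is exactly the kind of mapping step that takes experience, and it is the part a model trained on a lot of code can help with.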

There is quite a bit of discussion on the legitimacy of a low-code approach. Just a few thoughts on low-code…

The Good:

  • Low-code on its own is not a solution to all problems.
  • Smaller applications and utilities are well suited for low-code.
  • Low-code is good for prototyping, experimenting and wireframes.
  • Low-code is well suited as an extension to larger existing implementations, enabling business units to create their own extensions and customizations.
  • Examples of good low-code implementations are IBM Watson Assistant Actions, Microsoft Power Virtual Agents, some of the Amazon Alexa Development Console functionality etc.


The Impediments:

  • Fine tuning is problematic with low-code.
  • Scaling and integration.
  • Optimization
  • Performance management
  • Invariably you would want to include functions and extensions not available in your authoring environment.

Coding In Natural Language

Getting started with any programming language can be daunting, and establishing good habits is important. Say, for instance, you want to start with data analytics in Python via a Notebook…


In Codex you can create a sample data frame by saying: “Create a dataframe of 12 random US cities with averate yearly temperature, humidity, rainfall an wind”.

The spelling mistake is accommodated. The code is generated to create a data frame with test data.

Create a dataframe of 12 random US cities with averate yearly temperature, humidity, rainfall an wind

The data frame created:


Here is the full list of code. The red block is the instruction I wrote, the blue block is the code generated with comments by Codex.

Merely asking Codex to create three graphs produces the result shown below. The spelling mistake is again tolerated, and four graphs are actually created instead of three.

Create three grahps based on df

The input is in the red block and the Codex output is in the blue.

This is a common trait of OpenAI GPT-3 and Codex: extending the initial request and returning an augmented response.

Three of the graphs generated within Google’s Colab notebook.

The request and full code generated…

Create a dataframe of 12 random US cities with averate yearly temperature, humidity, rainfall an wind
import pandas as pd
import numpy as np
import random
# Create a dataframe of 12 random US cities with averate yearly temperature, humidity, rainfall an wind
# Create a list of random US cities
cities = ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix', 'Philadelphia', 'San Antonio', 'San Diego', 'Dallas', 'San Jose', 'Austin', 'Jacksonville']
# Create a list of random average yearly temperatures
temps = [random.randint(50, 100) for i in range(len(cities))]
# Create a list of random average yearly humidity
humidity = [random.randint(50, 100) for i in range(len(cities))]
# Create a list of random average yearly rainfall
rainfall = [random.randint(50, 100) for i in range(len(cities))]
# Create a list of random average yearly wind
wind = [random.randint(50, 100) for i in range(len(cities))]
# Create a dictionary of the lists
data = {'City': cities, 'Temperature': temps, 'Humidity': humidity, 'Rainfall': rainfall, 'Wind': wind}
# Create a dataframe from the dictionary
df = pd.DataFrame(data)
# Display the dataframe
df
Create three grahps based on df
# Create a bar graph of the average yearly temperature for each city
df.plot.bar(x='City', y='Temperature', title='Average Yearly Temperature')
# Create a bar graph of the average yearly humidity for each city
df.plot.bar(x='City', y='Humidity', title='Average Yearly Humidity')
# Create a bar graph of the average yearly rainfall for each city
df.plot.bar(x='City', y='Rainfall', title='Average Yearly Rainfall')
# Create a bar graph of the average yearly wind for each city
df.plot.bar(x='City', y='Wind', title='Average Yearly Wind')


There has been much discussion on low-code versus handcrafted or traditional coding. It is not a case of one or the other. There is a place and an application for both, each with advantages and disadvantages.

I discuss that topic in detail here.

And the same holds true for Codex. Will enterprise systems be built this way? Most probably not. Will Fortune 500 companies go the Codex route in principle? No.

But there are some definite niche applications. These can include:

  • Solving coding challenges and problems in certain routines.
  • Establishing best practice.
  • Quality assurance.
  • Interactive Learning
  • Generating specific components for subsequent human review.


