How To Fine-Tune GPT-3 For Custom Intent Classification

Intent recognition of user utterances or conversations is the front line of almost all chatbots. Here I show you, step by step, how to leverage and fine-tune an OpenAI GPT-3 model with your own data!

Cobus Greyling
7 min read · Jan 6, 2023


I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

Virtually all chatbots have intent detection as their bedrock. In essence, intents/classifications are predefined using example training data, and a model is trained which detects and recognises those intents/classifications in real time from user input.

This is obviously using an LLM in a predictive fashion, as opposed to the more commonly known generative scenarios.

Fine-tuning is often required for domain-specific use-cases and for increasing accuracy for a specific implementation in terms of jargon, industry-specific terms, company-specific products and services, etc.

Let’s get started…it is easier than you think! 🙂

For this example we will make use of a public dataset of sports-related mailing-list emails, with 1,197 examples in total. These examples are split between 597 baseball examples and 600 hockey examples.

In this example the GPT-3 ada model is fine-tuned/trained as a classifier to distinguish between the two sports: Baseball and Hockey.

The ada model forms part of the original, base GPT-3 series.

You can see these two sports as two basic intents, one intent being “baseball” and the other “hockey”.

Total examples: 1197, Baseball examples: 597, Hockey examples: 600

I ran the whole fine-tuning process from start to finish in a Colab notebook. One prerequisite is an OpenAI API key: you will need to register and generate one via the OpenAI console, as seen below.

https://beta.openai.com/account/api-keys
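Once generated, the key can be set in the notebook. A minimal sketch, assuming the key is stored in an environment variable and the pre-1.0 openai Python library is used:

import os
import openai

# Read the API key from an environment variable rather than hard-coding it in the notebook.
openai.api_key = os.getenv("OPENAI_API_KEY")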

⭐️ Please follow me on LinkedIn for updates. 🙂

Getting The Data

Below is a view of my Colab notebook used for this demo. ⬇️

The newsgroup dataset can be loaded using sklearn, and you can examine the dataset as seen below:

from sklearn.datasets import fetch_20newsgroups
import pandas as pd
import openai

categories = ['rec.sport.baseball', 'rec.sport.hockey']
sports_dataset = fetch_20newsgroups(subset='train', shuffle=True, random_state=42, categories=categories)

print(sports_dataset['data'][10])
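The totals quoted earlier can be verified with a quick count over the target labels; a small sketch, assuming label 0 maps to baseball and label 1 to hockey given the category order above:

# Count the examples per category to confirm the totals.
len_all = len(sports_dataset.data)
len_baseball = len([e for e in sports_dataset.target if e == 0])
len_hockey = len([e for e in sports_dataset.target if e == 1])
print(f"Total examples: {len_all}, Baseball examples: {len_baseball}, Hockey examples: {len_hockey}")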


Data Transformation

With this snippet of code the data is transformed into a pandas dataframe.

import pandas as pd

labels = [sports_dataset.target_names[x].split('.')[-1] for x in sports_dataset['target']]
texts = [text.strip() for text in sports_dataset['data']]
df = pd.DataFrame(zip(texts, labels), columns = ['prompt','completion']) #[:300]
df.head()

df.head() prints out the first few records of the dataframe. It is clear that the data is divided into two columns, prompt and completion.

With this command the dataframe is saved as a JSONL file…⬇️

df.to_json("sport2.jsonl", orient='records', lines=True)
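As an optional sanity check, the exported file can be read straight back into pandas:

pd.read_json("sport2.jsonl", lines=True).head()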

In this article I detail the most basic method of creating a fine-tuned model. Below is the basic structure of the training data in JSONL format, with the defined prompt and completion.

For a production scenario a no-code NLU/NLG Design tool will be required to efficiently ingest, process and structure the NLU/NLG Design data.

{"prompt":"he MLB team with the most Hall of Famers is the New York Yankees with 27.",
"completion":"Baseball"}
{"prompt":"The first organized indoor game was played in Montreal in 1875.",
"completion":"Hockey"}

The OpenAI data preparation step is also really simple:

!pip install --upgrade openai
!openai tools fine_tunes.prepare_data -f sport2.jsonl -q

With the -q flag, all suggestions and corrections to the JSONL file are auto-accepted: the utility appends a separator to each prompt, prefixes each completion with a space, and splits the data into training and validation files, printing detailed feedback as it goes.


Training The Model

Below is the line of code to initiate the training process…notice the placement of the API key within the command:

!openai --api-key 'xxxxxxxxxxxxxxxxxxxxxxxx' api fine_tunes.create -t "sport2_prepared_train.jsonl" -v "sport2_prepared_valid.jsonl" --compute_classification_metrics --classification_positive_class " baseball" -m ada
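For reference, roughly the same job can be created directly from Python with the legacy (pre-1.0) openai library; this is a sketch of the equivalent calls, not the exact code from the notebook:

import openai

# Upload the prepared training and validation files.
train_file = openai.File.create(file=open("sport2_prepared_train.jsonl", "rb"), purpose="fine-tune")
valid_file = openai.File.create(file=open("sport2_prepared_valid.jsonl", "rb"), purpose="fine-tune")

# Create the fine-tune job on the ada base model with classification metrics enabled.
fine_tune = openai.FineTune.create(
    training_file=train_file["id"],
    validation_file=valid_file["id"],
    model="ada",
    compute_classification_metrics=True,
    classification_positive_class=" baseball",
)
print(fine_tune["id"])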

As seen below, detailed feedback is given by OpenAI on the progress of the fine-tuning. The fine-tune cost is also given; this is especially important to ensure that experimentation and testing do not generate exorbitant costs.

If the stream is interrupted, you can restart it with this command:

!openai --api-key 'xxxxxxxxxxxxxxxxxxxxxxxx' api fine_tunes.follow -i ft-KGbV68gqwGwmfEqnVMMM13FU

The training job ran successfully; the timestamps below give a good indication of how long the training process took.

[2023-01-06 08:53:57] Created fine-tune: ft-KGbV68gqwGwmfEqnVMMM13FU
[2023-01-06 08:54:54] Fine-tune costs $0.78
[2023-01-06 08:54:54] Fine-tune enqueued. Queue number: 1
[2023-01-06 09:11:24] Fine-tune is in the queue. Queue number: 0
[2023-01-06 09:12:00] Fine-tune started
[2023-01-06 09:14:37] Completed epoch 1/4
[2023-01-06 09:17:11] Completed epoch 2/4
[2023-01-06 09:19:42] Completed epoch 3/4
[2023-01-06 09:22:14] Completed epoch 4/4
[2023-01-06 09:22:46] Uploaded model: ada:ft-personal-2023-01-06-09-22-45
[2023-01-06 09:22:47] Uploaded result file: file-kX8n4tm6DU7s5AFImIxChUAR
[2023-01-06 09:22:47] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai api completions.create -m ada:ft-personal-2023-01-06-09-22-45 -p <YOUR_PROMPT>
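Because --compute_classification_metrics and a validation file were passed, the job also produces a results file with validation accuracy. A sketch for retrieving it, assuming the fine_tunes.results sub-command of the same CLI and the standard classification/accuracy column in the results CSV:

!openai --api-key 'xxxxxxxxxxxxxxxxxxxxxxxx' api fine_tunes.results -i ft-KGbV68gqwGwmfEqnVMMM13FU > result.csv

import pandas as pd
results = pd.read_csv('result.csv')
# The last row carrying classification metrics holds the final validation accuracy.
results[results['classification/accuracy'].notnull()].tail(1)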


Testing The Model

The model can be tested with the code below…the fine-tuned classifier is very versatile. Even though it was trained on emails, other types of input can work well too: tweets, user utterances, etc.

ft_model = 'ada:ft-personal-2023-01-06-09-22-45'
sample_baseball_tweet="""BREAKING: The Tampa Bay Rays are finalizing a deal to acquire slugger Nelson Cruz from the Minnesota Twins, sources tell ESPN."""
res = openai.Completion.create(model=ft_model, prompt=sample_baseball_tweet + '\n\n###\n\n', max_tokens=1, temperature=0, logprobs=2)
res['choices'][0]['text']

And the result:

 baseball

Another example…

ft_model = 'ada:ft-personal-2023-01-06-09-22-45'
sample_baseball_tweet="""The ice is not well maintained so that is something the Caps will have to attend to."""
res = openai.Completion.create(model=ft_model, prompt=sample_baseball_tweet + '\n\n###\n\n', max_tokens=1, temperature=0, logprobs=2)
res['choices'][0]['text']

And the result:

hockey
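Since logprobs=2 was requested in the completion call, the response also carries the log probabilities of the top two candidate tokens, which can act as a rough confidence score for the predicted intent:

# Top-2 token log probabilities for the single predicted token.
res['choices'][0]['logprobs']['top_logprobs'][0]
# e.g. {' baseball': -7.6, ' hockey': -0.0006}  (illustrative values only)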

And the view from the Colab Notebook:


The OpenAI Playground

As seen below, the fine-tuned model is immediately available within the OpenAI playground.


In Conclusion

For me, going through an exercise like this gives good insight into the demands a production environment will pose.

The missing link in this whole process is an NLU/NLG design tool which can be used to ingest unstructured data and convert it into NLU/NLG Design data in a no-code fashion.

As can be seen from the code examples, the training data has a fixed format, and manually processing training data with something like Notepad++ or MS Excel is not a feasible and sustainable solution.

A no-code NLU/NLG Design Studio like HumanFirst completes this process of data preparation, structuring, curation and LLM integration.

I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

https://www.linkedin.com/in/cobusgreyling
