How To Fine-Tune A Large Language Model Using HumanFirst & Cohere

In this article I consider converting unstructured data into NLU Design data for training and fine-tuning a Large Language Model (LLM).

Cobus Greyling
6 min read · Dec 8, 2022


Intent recognition is an important part of any digital assistant. In a previous article I detailed a few-shot learning approach to intent detection using an LLM.

In this article I detail the process of creating a fine-tuned (custom) large language model by making use of two technologies, HumanFirst Studio and Cohere.

To train and fine-tune a large language model (LLM) for classification (intent recognition), a corpus of labeled user utterances or sentences is required.

There is currently direct integration between HumanFirst Studio and the Cohere Large Language Models. Read more about it here.

Creating a fine-tuned (custom) model in Cohere requires a CSV file consisting of two columns: sample text in the first column and its associated label in the second.

Also, at least five unique examples must exist per label, with at least 250 unique examples in the total training set.
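These minimums are easy to verify locally before uploading. Below is a minimal sketch in Python; the filename training_data.csv is illustrative, and the check simply mirrors the two requirements above.

```python
import csv
from collections import defaultdict

# Check a two-column CSV (text, label) against Cohere's stated minimums:
# at least 5 unique examples per label and at least 250 unique examples overall.
with open("training_data.csv", newline="", encoding="utf-8") as f:
    rows = [row for row in csv.reader(f) if len(row) >= 2]

examples_per_label = defaultdict(set)
for text, label in ((r[0], r[1]) for r in rows):
    examples_per_label[label].add(text)

unique_total = len({r[0] for r in rows})
too_small = {l: len(s) for l, s in examples_per_label.items() if len(s) < 5}

print(f"Unique examples: {unique_total}")
if unique_total < 250:
    print("Warning: fewer than 250 unique examples in total.")
if too_small:
    print(f"Labels with fewer than 5 unique examples: {too_small}")
```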

HumanFirst Studio

The discipline of NLU Design is the process of converting unstructured conversational data into structured NLU Design data. Using HumanFirst Studio, I imported more than 3,000 user utterances related to the banking domain.

Within minutes I had clusters set up, with refined granularity, adjustable cluster sizes and defined intent names, as seen below.

Within HumanFirst Studio, all intents can be selected or deselected, or a subset of intents can be chosen, depending on the nature of the custom model you are creating.

The export can be refined further, particularly by simplifying it and omitting data fields that are not supported by the destination system.

The more than 3,000 user utterances and close to 100 intents were exported in mere seconds, after which a confirmation message is displayed.

Below you see the CSV file export, with the two columns of utterances and intent labels.
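The rows take the following shape. These are illustrative banking-domain examples, not the actual export:

```
how do i activate my new card,activate_card
i think my card was stolen,lost_or_stolen_card
what is my current account balance,check_balance
can you raise my overdraft limit,overdraft_limit
why was my card payment declined,declined_payment
```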


Cohere Dashboard

Via the Cohere dashboard (which is a different environment from the playground) users can create a custom model. Custom models are segmented into three areas of NLP:

  1. Generation
  2. Classification
  3. Embedding

For the purposes of this example, we will focus on the classification model type. The idea is to create an intent classification model based on our labeled training data.

Hence the Classify model is selected. I find it curious that the base-model size cannot be selected during this process. Cohere has small, medium, large and extra large base models.

The dashboard guides the user on the format of the training data, which can be uploaded via the dashboard following a no-code approach. The training data format could hardly be simpler; the absence of any requirement for a highly structured JSON format streamlines the process considerably.

Once the data is imported, a preview of the data is displayed within the Cohere dashboard, from where training and validation of the model can be performed.
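The dashboard handles the training and validation split for you. For readers who want to inspect a split locally first, a minimal Python sketch using the same hypothetical filename might look like this:

```python
import csv
import random

# Shuffle the exported CSV and write a hypothetical 90/10
# train/validation split; Cohere's dashboard does this for you,
# so this is only useful for local inspection.
random.seed(42)  # reproducible shuffle

with open("training_data.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))

random.shuffle(rows)
cut = int(len(rows) * 0.9)

for name, subset in (("train.csv", rows[:cut]), ("validation.csv", rows[cut:])):
    with open(name, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(subset)
```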

Once the process starts, checking back under the models section, specifically the “your models” tab, shows my four other custom models that are already trained, with the latest model still in the process of training.

Lastly, once the custom model is trained, Cohere sends you an email and your custom, fine-tuned intent classification model is ready to be tested!
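Testing can be done in the playground, or programmatically. A minimal sketch using the Cohere Python SDK of the time might look as follows; the API key, model ID and example utterance are all placeholders:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

# Classify a new utterance with the custom fine-tuned model;
# the model ID comes from the Cohere dashboard.
response = co.classify(
    model="your-custom-model-id",
    inputs=["how do i reset my online banking password"],
)

for classification in response.classifications:
    print(classification.input, "->", classification.prediction)
```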

Conclusion

In closing, the two elements I wanted to highlight with this demonstration are:

  1. The ease with which NLU Design and training data can be created using a tool like HumanFirst Studio.
  2. The importance of creating a custom fine-tuned model; fine-tuning is a necessity for any production enterprise implementation.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

https://www.linkedin.com/in/cobusgreyling

