Photo by Rodolfo Flores on Unsplash

Fine-Tuned Text Classification With co:here

And How To Use The Model For Embeddings Or Classification

Cobus Greyling
6 min read · May 19, 2022

Introduction

Currently co:here has fine-tuning options for Natural Language Generation & Representation Models. A previous article covered the fine-tuning options for generation.

Below is the playground view. The custom trained models are listed on the top right, and any appropriate model can be selected. From the text “the termination fee”, a list of appropriate sentences is generated. These sentences can be used for disambiguation or establishing context.

Back to text classification…

In NLP and language understanding in general, text classification is a very common task.

Training a custom model via fine-tuning for a specific domain or implementation can help with:

  • Establishing sentiment for utterances: not only whether an utterance is positive or negative, but to what degree.
  • Creating wider classifications, not only positive and negative, but different classes like accounts, cancelations, upgrades, deliveries etc.
  • Acting as an initial high-pass in a conversational agent to determine the user’s intent.

Brief Overview Of Classification Fine-Tuning

The baseline embeddings do perform really well in many tasks, but fine-tuning becomes relevant when:

  • The performance of the baseline models needs a boost.
  • Domain-specific or product-specific text is not classified accurately.

Fine-tuning should yield the best results when classifying text, as it specialises the co:here LLM for the intended task and so makes effective use of the LLM.

The representation fine-tuned model can be used for Embeddings or Classification. In the example below, the fine-tuned sentiment model is used to visually represent the sentiment of sentences.

Based on a trained model with different sentiments, utterances are grouped in positive and negative clusters, ranging in degrees.

A selection of sentences is entered on the right and run against the fine-tuned custom model. When examined, both the positive and negative comments range in relevance. There are a few outliers with really no sentiment attached.
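To make the idea of grouping utterances by their embeddings more concrete, here is a minimal sketch of one way embeddings can drive classification: a nearest-centroid classifier using cosine similarity. This is an illustration of the general technique, not how co:here classifies internally. The three-dimensional vectors are invented toy values standing in for real embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def classify(embedding, centroids):
    """Return the label whose centroid is most similar, plus that similarity."""
    scores = {
        label: cosine_similarity(embedding, center)
        for label, center in centroids.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy embeddings (assumption): real ones would come from the embedding model.
labelled = {
    "positive": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "negative": [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]],
}
centroids = {label: centroid(vectors) for label, vectors in labelled.items()}

label, score = classify([0.7, 0.3, 0.0], centroids)
print(label, round(score, 3))
```

The similarity score can serve as a rough proxy for degree: an utterance that sits close to a centroid is more clearly positive or negative than one lying between the clusters.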

Fine-Tuning Process

Data needs to be uploaded in CSV format. Below is a selection from the CSV file. Sentiment is labelled 1 (positive) or 0 (negative). The selection of data was 1000 records in size.
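As a rough illustration of preparing such a file, the sketch below writes text and label pairs to CSV with Python's standard library. The two example utterances are invented, and the layout (text in the first column, label in the second, no header row) is an assumption; the upload screen should be checked for the exact format it expects.

```python
import csv
import io

# Hypothetical sample rows: (utterance, label) with 1 = positive, 0 = negative.
rows = [
    ("The delivery arrived early and the packaging was great.", 1),
    ("I was charged twice, and nobody answered my emails.", 0),
]


def write_training_csv(rows, handle):
    """Write (text, label) pairs as two-column CSV rows."""
    writer = csv.writer(handle)
    for text, label in rows:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        # csv.writer quotes any text that contains commas or quotes.
        writer.writerow([text.strip(), label])


buffer = io.StringIO()
write_training_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of a buffer, open the target file with `newline=""` and pass the file handle to the same function.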

From the co:here playground, select the option to create a fine-tuned model. The small baseline model was selected, merely to speed up the training process.

This time a representation model is selected instead of a generation model. For the uploaded CSV, the minimum number of records is 250, with 500 as the recommended minimum. For this demo, 1000 records were used.
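A quick local check before uploading can catch a file that is too small. The sketch below counts records per label and compares the total against the 250 and 500 figures mentioned above. The two-column layout (text, label) is an assumption, the function name is hypothetical, and the 1000 rows are synthetic stand-ins for real data.

```python
import csv
import io
from collections import Counter

MIN_RECORDS = 250          # minimum stated for the upload
RECOMMENDED_RECORDS = 500  # suggested minimum for better results


def check_training_csv(handle):
    """Count rows per label and report whether the file is large enough."""
    counts = Counter()
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        counts[row[1].strip()] += 1

    total = sum(counts.values())
    if total < MIN_RECORDS:
        status = "too few"
    elif total < RECOMMENDED_RECORDS:
        status = "below recommended"
    else:
        status = "ok"
    return total, dict(counts), status


# Synthetic data (assumption): 1000 rows with alternating labels.
sample = io.StringIO()
writer = csv.writer(sample)
for i in range(1000):
    writer.writerow([f"example utterance {i}", i % 2])
sample.seek(0)

total, per_label, status = check_training_csv(sample)
print(total, per_label, status)
```

The per-label counts are also worth a glance, since a heavily skewed split between 1 and 0 may affect how well the fine-tuned model separates the two classes.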

A small model size trains faster and might be a good option for testing and prototyping iterations.

Below, the file is uploaded and pre-processing is performed on it.

The CSV file is uploaded and the preview button can be selected. The model type and the baseline model are confirmed, and the name of the model can be defined. As you use the framework, the number of fine-tuned models will grow, hence naming each model in a descriptive manner is vital.

Naming the fine-tuned model is important when models increase in number.

When the Start finetuning button is clicked, the process kicks off. The Representation model trains much faster than the Generation model, which can take 3 to 6 hours to train. Preparing the training data for a Generation model also takes significantly more time and effort.

Below, the models within my playground are listed with base model size, model type (one of two available) and status. Once the process has moved past Queued and Finetuning, the rest of the process happens quite quickly.

The Generation models train longer than Representation models and require more data.

Below, the sequence of events and progress tracker of creating a fine-tuned model…

Once training is done, metrics are available and the model is ready for testing in the playground.

Metrics can be viewed and the playground can be launched from here.

The final step is the code that runs the classification model on input and shows how the model is referenced.
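As a hedged sketch of what such a call might look like, the snippet below builds (but does not send) an HTTP request to a classification endpoint using only the Python standard library. The endpoint URL, the JSON field names (`model`, `inputs`) and the bearer-token header are assumptions based on the general shape of the co:here API and may differ from the version used in this article. The model ID and API key are placeholders.

```python
import json
import urllib.request

# All three values are assumptions or placeholders; check the current API docs.
API_URL = "https://api.cohere.ai/v1/classify"
MODEL_ID = "your-finetuned-model-id"
API_KEY = "YOUR_API_KEY"


def build_classify_request(inputs, model_id=MODEL_ID, api_key=API_KEY):
    """Build a POST request that references the fine-tuned model by its ID."""
    payload = {"model": model_id, "inputs": list(inputs)}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_classify_request(["The termination fee is far too high."])
print(request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (needs a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The point to note is that the fine-tuned model is selected purely by the identifier passed in the request, so a descriptive model name makes it easier to pick the right one.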

Conclusion

Fine-tuning will definitely grow as an avenue to leverage the large language models of co:here. Thinking of conversational AI implementations, co:here can act as a supporting technology to:

  • Assist as an initial NLP high-pass on user input.
  • Analyse unstructured user input by clustering and detecting new intents.
  • Detect areas of conversation not covered by existing defined intents.
  • Generate disambiguation menu sentences or bot response messages.
