Fine-Tuned Text Classification With co:here
And How To Use The Model For Embeddings Or Classification
Introduction
co:here currently offers fine-tuning for both Natural Language Generation and Representation models. A previous article covered the fine-tuning options for generation.
Below is the playground view; the custom trained models are listed on the top right, and any appropriate model can be selected. From the text “the termination fee”, a list of appropriate sentences is generated. These sentences can be used for disambiguation or establishing context.
Back to text classification…
In NLP and language understanding in general, text classification is a very common task.
Training a custom model via fine-tuning for a specific domain or implementation can help with:
- Establishing sentiment for utterances; not only whether the utterance is positive or negative, but to what degree.
- Wider classifications can be created, not only for positive and negative, but for different classes like accounts, cancelations, upgrades, deliveries, etc.
- Classification can also act as an initial high-pass in a conversational agent to determine the user’s intent, as sketched below.
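To illustrate the high-pass idea, the classify endpoint can be called with a handful of labelled examples against a baseline model. Below is a minimal sketch using the co:here Python SDK; the intent labels and example utterances are hypothetical and merely illustrative.
import cohere
from cohere.classify import Example

co = cohere.Client('{apiKey}')

# Hypothetical intent examples for a support assistant; replace
# these with labelled utterances from your own domain.
response = co.classify(
    model='small',
    inputs=["I want to close my account."],
    examples=[
        Example("Please cancel my subscription.", "cancelations"),
        Example("I no longer want this service.", "cancelations"),
        Example("Move me to the premium plan.", "upgrades"),
        Example("Can I get a bigger data package?", "upgrades"),
        Example("Where is my parcel?", "deliveries"),
        Example("My order has not arrived yet.", "deliveries")])

# The predicted label can route the utterance to the right flow.
print(response.classifications[0].prediction)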
Brief Overview Of Classification Fine-Tuning
The baseline embeddings perform well on many tasks, but fine-tuning becomes relevant when:
- The performance of the baseline models needs a boost.
- Domain-specific or product-specific text is not classified accurately.
Fine-tuning should yield the best results when classifying text, and it specialises the co:here LLM for the intended task, thus effectively leveraging the LLM.
The fine-tuned representation model can be used for Embeddings or Classification. In the example below, the fine-tuned sentiment model is used to visually represent the sentiment of sentences.
A selection of sentences is entered on the right and run against the fine-tuned custom model. When examined, both the positive and negative comments range in relevance, with a few outliers that carry virtually no sentiment.
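To use such a fine-tuned representation model for embeddings rather than classification, the embed endpoint is called with the fine-tuned model’s ID. A minimal sketch, assuming a placeholder model ID and a few test sentences:
import cohere

co = cohere.Client('{apiKey}')

# The fine-tuned representation model is referenced by its model ID.
response = co.embed(
    model='xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xx',
    texts=["Worked perfectly!", "Don't waste your money!"])

# One vector per input sentence; after a 2D projection these can be
# plotted to visualise sentiment clusters, as in the example above.
print(len(response.embeddings), len(response.embeddings[0]))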
Fine-Tuning Process
Data needs to be uploaded in CSV format; below is a selection from the CSV file. Sentiment is divided between 1 and 0, positive and negative. The data selection was 1,000 records in size.
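As an illustration, a few hypothetical rows in this shape, with the utterance in the first column and the 1/0 sentiment label in the second (these rows are illustrative, not taken from the actual training file):
Worked perfectly!,1
I am also very happy with the price.,1
Do not make the same mistake as me.,0
Don't waste your money!,0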
From the co:here playground, select the option to create a fine-tuned model. The baseline model selected was small, merely to speed up the training process.
This time, instead of a generation model, a representation model is selected. For the uploaded CSV the minimum number of records is 250, with 500 recommended as a better minimum. For this demo, 1,000 records were used.
Below, the file is uploaded and pre-processing is performed on it.
Once the CSV file is uploaded, the preview button can be selected. The model type and the baseline model are confirmed, and the name of the model can be defined. As you use the framework, the number of fine-tuned models will grow, hence naming each model descriptively is vital.
When the Start finetuning button is clicked, the process kicks off. The Representation model trains much faster than the Generation model, which can take 3 to 6 hours to train and demands significantly more time and effort to prepare the training data.
Below, the models within my playground are listed with base model size, model type (one of the two available) and status. Once the process has moved past Queued and Finetuning, the rest of the process happens quite quickly.
Below, the sequence of events and the progress tracker for creating a fine-tuned model…
When training is done, metrics are available and the model is ready for testing in the playground.
And here is the code to run the classification model with input, showing how the fine-tuned model is referenced by its ID.
import cohere

co = cohere.Client('{apiKey}')

# The fine-tuned classification model is referenced by its model ID.
classifications = co.classify(
    model='xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xx',
    inputs=["I'm still infatuated with this phone.", "Freezes frequently.", "best bluetooth on the market.", "Do not make the same mistake as me.", "Don't waste your money!.", "Worked perfectly!", "I am also very happy with the price.", "Rip off---- Over charge shipping.", "Better than you'd expect.", "Price is good too."])

print('The confidence levels of the labels are: {}'.format(
    classifications.classifications))
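To inspect the result per input rather than the raw response object, the returned classifications can be iterated; each entry should carry the input, the predicted label and a confidence score (attribute names as per the Python SDK at the time of writing):
# Print each input with its predicted label and confidence score.
for classification in classifications.classifications:
    print(classification.input, classification.prediction, classification.confidence)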
Conclusion
Fine-tuning will definitely grow as an avenue to leverage the large language models of co:here. Thinking of conversational AI implementations, co:here can act as a supporting technology to:
- Assist as an initial NLP high-pass on user input.
- Analyse unstructured user input to cluster and detect new intents.
- Detect areas of conversation not covered by existing defined intents.
- Generate disambiguation menu sentences or bot response messages.