NLP, Large Language Models & AI21labs

And How Does AI21labs Surface Their Language Models

Cobus Greyling
6 min read · Jun 1, 2022

Introduction

When it comes to Large Language Models (LLMs), AI21labs, co:here, the OpenAI Language API & HuggingFace find themselves in the same league.

  • These LLMs focus on keeping complexity under the hood and surfacing functionality in a very simple way.
  • Playgrounds are available with example use-cases to get off to a running start.
  • Zero-shot learning is employed to make accurate predictions and analyses without any prior training. This in and of itself is astounding for question answering, general chatbots, sentence completion, summarisation, keyword extraction and, most of all, Natural Language Generation.
  • Few-shot learning is available, where a very small number of examples or a short description is given as training data.
  • Where traditional NLU/chatbot frameworks focus on control, LLMs are all about flexibility. However, control needs a degree of flexibility, and flexibility requires a degree of control.
  • Flexibility is available via fine-tuning functionality. Read more about OpenAI Language API fine-tuning here. co:here also has fine-tuning capability; the NLG fine-tuning in particular is impressive. Text classification fine-tuning is also available.
  • These services are available on a pay-per-use basis, and AI21labs has a few practical implementation products.
  • These LLMs can be used as supporting services or APIs within a chatbot implementation for NLG, sentence boundary detection, named entity recognition, question answering, etc.
  • Or the LLMs can be used in the process of creating the chatbot: supporting good copywriting, summarising bot responses in a knowledge base scenario, etc.
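As a sketch of the few-shot pattern described above, a prompt can be assembled from a handful of example pairs followed by the new input. The `##` separator mirrors the one used in the API call later in this article, but the example questions and task description below are illustrative assumptions, not taken from any AI21labs playground preset:

```python
# Build a few-shot prompt: a task description plus a handful of
# example pairs, ending with the new input the model must complete.
examples = [
    ("The capital of France", "Paris"),
    ("The capital of Japan", "Tokyo"),
]

def build_prompt(task, examples, query):
    """Assemble a few-shot prompt with '##' separators between blocks."""
    parts = [task]
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    parts.append(f"Q: {query}\nA:")
    return "\n\n##\n\n".join(parts)

prompt = build_prompt("Answer general knowledge questions.", examples,
                      "The capital of Italy")
print(prompt)
```

The resulting string is what gets sent as the `prompt` field of a completion request; the model is expected to continue the final `A:` line.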

Two AI21labs Products

The objective of AI21labs is to build state-of-the-art language models while focusing on understanding meaning.

One of AI21labs’ products is wordtune. This interface allows users to enter a sentence, which the interface then rewrites.

As seen above, a short and abrupt sentence is transformed into five possible rewrites where the sentence is improved while retaining the meaning.

From the menu shown above, the writing style can be tweaked to be casual, formal, longer or shorter.

Wordtune read summarises and segments a complete document. Not only is the document summarised, but relevant segments are created.

There is a use-case where, for a knowledge-base implementation, a document can be segmented into messages. These messages can be presented to the user one at a time and continued on request.
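A minimal sketch of that segmentation step, assuming a simple greedy sentence-packing approach (this is not AI21labs’ actual algorithm, just an illustration of the use-case):

```python
# Sketch: split a long document into chat-sized messages that a bot
# can deliver one at a time, continuing on user request.
def segment_document(text, max_chars=200):
    """Greedily pack sentences into segments of at most max_chars."""
    sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
    segments, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            segments.append(current)
            current = sentence
        else:
            current = (current + " " + sentence).strip()
    if current:
        segments.append(current)
    return segments

doc = ("Large language models generate text. They can summarise documents. "
       "Segments can be sent to a user as separate chat messages.")
for i, seg in enumerate(segment_document(doc, max_chars=80), 1):
    print(f"Message {i}: {seg}")
```

In practice an LLM-backed service would segment on meaning rather than character count, but the delivery pattern to the user is the same.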

OpenAI has a similar implementation where highly technical text is simplified.

Playground

The AI21labs playground is very similar in look and feel to the OpenAI playground. A distinguishing factor of the co:here playground is its visual representation of clusters.

The list of examples in the AI21labs playground is:

  • Chatbot,
  • Twitter agent,
  • Ads copywriter,
  • Instruction to SQL,
  • Product description generator,
  • Idea to title,
  • Outline creator,
  • Summarise restaurant reviews,
  • Classify news topics,
  • Table question answering,
  • Generate code,
  • Python to Javascript,
  • Predict the outcome,
  • De-Jargonizer.

Below is an example of going from an idea to a title. A few training examples are given and the last sentence is the actual implementation.

As an application is built in the playground, the API call is generated and built out. This API code can then be copied into a notebook or incorporated into code via an IDE.

Below is the code for a general question and answer chatbot…

import requests

requests.post(
    "https://api.ai21.com/studio/v1/j1-jumbo/complete",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "The following is a general question and answer chatbot\n\n##\n\nUser: What is the longest river in the world",
        "numResults": 1,
        "maxTokens": 260,
        "temperature": 0.58,
        "topKReturn": 0,
        "topP": 1,
        "countPenalty": {
            "scale": 0,
            "applyToNumbers": False,
            "applyToPunctuations": False,
            "applyToStopwords": False,
            "applyToWhitespaces": False,
            "applyToEmojis": False
        },
        "frequencyPenalty": {
            "scale": 0,
            "applyToNumbers": False,
            "applyToPunctuations": False,
            "applyToStopwords": False,
            "applyToWhitespaces": False,
            "applyToEmojis": False
        },
        "presencePenalty": {
            "scale": 0,
            "applyToNumbers": False,
            "applyToPunctuations": False,
            "applyToStopwords": False,
            "applyToWhitespaces": False,
            "applyToEmojis": False
        },
        "stopSequences": ["##", "User:"]
    }
)
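Once the call returns, the completion text has to be pulled out of the response JSON. The shape used below, a `completions` list with nested `data.text`, is an assumption about the response format; verify it against the current AI21 Studio API reference before relying on it:

```python
# Parse the completion text out of an AI21-style response. The JSON
# shape here (a "completions" list with nested "data.text") is an
# assumption; check the current API reference before relying on it.
sample_response = {
    "completions": [
        {"data": {"text": "\nChatbot: The Nile is the longest river in the world."}}
    ]
}

def extract_text(response_json):
    """Return the first completion's text, stripped of whitespace."""
    return response_json["completions"][0]["data"]["text"].strip()

print(extract_text(sample_response))
```

In a live call, `response_json` would come from `requests.post(...).json()`.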

Models

Models can be trained on top of the LLMs; this creates a customised model for a specific implementation, addressing a certain domain. The custom-trained models are deployed instantly and available to you on demand.

Below is an example training file, a CSV with prompt and completion examples.
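A training file of that shape can be produced with the standard library. The column names `prompt` and `completion` match the description above, while the rows themselves are made-up examples:

```python
import csv

# Write a minimal prompt/completion training file in CSV form.
# The two rows are illustrative examples only.
rows = [
    {"prompt": "Summarise: The meeting covered Q3 results.",
     "completion": "Q3 results were discussed."},
    {"prompt": "Summarise: The new feature launches next week.",
     "completion": "A feature launch is planned for next week."},
]

with open("training.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["prompt", "completion"])
    writer.writeheader()
    writer.writerows(rows)
```

A real training set would need at least the 50 examples mentioned below, but the file layout stays the same.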

When importing the CSV, no details are given on why an import has failed, and a minimum of 50 examples is required.

Models can be configured with a learning rate and a number of epochs, but these settings influence the cost of training the model.
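Since training cost scales with how much data the model sees, a rough estimate is simply examples × epochs × a per-example price. The price used below is purely hypothetical, chosen only to show why increasing the epoch count raises the bill:

```python
# Back-of-the-envelope training cost: cost grows linearly with the
# number of epochs. The per-example price here is purely hypothetical.
examples = 100            # training examples in the dataset
epochs = 4                # passes over the data
price_per_example = 0.01  # hypothetical cost per example per epoch

cost = examples * epochs * price_per_example
print(f"Estimated training cost: ${cost:.2f}")
```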

There is no real barrier to entry, but other platforms like co:here and OpenAI do allow for fine-tuning and model training free of charge, to some degree. This does not seem to be the case with AI21labs.

This is a pity, as giving free access to the training of smaller models can lead to improved adoption.

Datasets

AI21labs has a clean and minimalistic interface to manage datasets via the GUI. A preview pane is available to see the head of the data.

Custom models in AI21 Studio can be trained on very small datasets with as few as 50–100 examples.

These custom models offer better precision and latency, and can be scaled up more economically to serve production traffic.

Conclusion

Fine-tuning will definitely grow and expand as an avenue to leverage the large language models.

Thinking of conversational AI implementations, AI21Labs can act as a supporting technology to:

  • Assist as an initial NLP high-pass on user input.
  • Support copywriting and ideation.
  • Summarise documents for knowledge base implementations.
  • Act as a general conversational agent.
  • Perform classification.
  • Etc.


Cobus Greyling

I explore and write about all things at the intersection of AI & language; LLMs/NLP/NLU, Chat/Voicebots, CCAI. www.cobusgreyling.com