The Introduction Of Chat Markup Language (ChatML) Is Important For A Number Of Reasons

On 1 March 2023 OpenAI introduced the ChatGPT and Whisper APIs. Part of this announcement was Chat Markup Langauge which seems to have gone largely unnoticed. Here I discuss why ChatML is an important development…

Cobus Greyling
6 min readMar 2, 2023

--

I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

Short Recap…

The OpenAI announcement centred around a few main points:

🚀 The significant drop in price for a hosted API, there has been a 90% cost reduction for ChatGPT since December 2022.

🚀 The APIs hosted via Azure will most probably come with very granular management, and regional and geographic availability zones. This speaks to significant potential value-add to the APIs.

🚀 The pressure on ASR suppliers are mounting, differentiation will have to be established via stellar and personal support, granular fine-tuning, support for niche minority languages, etc.

🚀 The Whisper and ChatGPT APIs are allowing for ease of implementation and experimentation. Ease of access to Whisper enable expanded use of ChatGPT in terms of including voice data and not only text.

🚀 Allowing you to access a specific model version and then upgrade when required exposes changes and updates to models. This introduces stability for production implementations.

🚀 These changes are indicative of the increasing maturity of the LLM environments.

Back to Chat Markup Langauge (ChatML)

I believe the introduction of ChatML is extremely significant and important for the following reasons…

⚙️ The main security vulnerability and avenue of abuse for LLMs has been prompt injection attacks. ChatML is going to allow for protection against these types of attacks.

⚙️ To negate prompt injection attacks, the conversation is segregated into the layers or roles of:

  • System
  • assistant
  • user, etc.

⚙️ This is only version 0 of ChatML, and significant development is promised for this language.

⚙️The payload accommodated for in ChatML is currently only text. OpenAI foresee the introduction of other datatypes. This is in keeping with the notion of Large Foundation Models to soon start combining text, images, sound, etc.

Users can still use the unsafe raw string format. But again, this format inherently allows injections.

⚙️ OpenAI is in the ideal position to steer and manage the LLM landscape in a responsible manner. Laying down foundational standards for creating applications.

ChatML makes explicit to the model the source of each piece of text, and particularly shows the boundary between human and AI text.

This gives an opportunity to mitigate and eventually solve injections, as the model can tell which instructions come from the developer, the user, or its own input. ~ OpenAI

ChatML Example Code

Below is a ChatML example JSON file with the roles defined of system, user and assistant.

[{"role": "system", 
"content" : "You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-02"},
{"role": "user",
"content" : "How are you?"},
{"role": "assistant",
"content" : "I am doing well"},
{"role": "user",
"content" : "What is the mission of the company OpenAI?"}]

And the working Python code snippet:

pip install openai

import os
import openai
openai.api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

completion = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages = [{"role": "system", "content" : "You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-02"},
{"role": "user", "content" : "How are you?"},
{"role": "assistant", "content" : "I am doing well"},
{"role": "user", "content" : "What is the mission of the company OpenAI?"}]
)
#print(completion)
print(completion)

With the output below, notice the role which is defined, the model detail which is gpt-3.5-turbo-0301 and other detail.

{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "The mission of OpenAI is to ensure that artificial intelligence (AI) benefits humanity as a whole, by developing and promoting friendly AI for everyone, researching and mitigating risks associated with AI, and helping shape the policy and discourse around AI.",
"role": "assistant"
}
}
],
"created": 1677751157,
"id": "chatcmpl-6pa0TlU1OFiTKpSrTRBbiGYFIl0x3",
"model": "gpt-3.5-turbo-0301",
"object": "chat.completion",
"usage": {
"completion_tokens": 50,
"prompt_tokens": 84,
"total_tokens": 134
}
}

In Closing

One of the challenges of building a conversational interface based on LLMs, is the notion sequencing prompt nodes into chains.

The edges, which sits between the nodes, is hard to manage due to the unstructured nature of the input. And the input is usually in natural langauge or conversational, which is inherently unstructured.

ChatML will greatly assist in creating a standard target for data transformation for submission to a chain.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

https://www.linkedin.com/in/cobusgreyling
https://www.linkedin.com/in/cobusgreyling

--

--