Now You Can Set GPT Output To JSON Mode

This feature is separate from function calling. When calling the models "gpt-4-1106-preview" or "gpt-3.5-turbo-1106", the model response can be set to JSON. However, there are a number of considerations…

5 min read · Nov 21, 2023


Some Considerations

When using function calling, JSON mode is always on. With Chat Completions, however, the JSON flag needs to be set explicitly:

response_format={ "type": "json_object" }

With function calling, the user creates a JSON schema (or structure) against which the generated response is matched, and the JSON fields are populated.

Hence, with function calling, a predefined template guides the model on the structure of the JSON document, and the model then assigns entities from the user input to the JSON fields, as in the sketch below.
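To make the contrast concrete, here is a minimal sketch of function calling with the OpenAI Python SDK. The function name record_medals and its fields are illustrative assumptions, not something from the article or the OpenAI documentation.

from openai import OpenAI

client = OpenAI()

# The user-defined JSON schema guides the structure of the output.
# The function name and fields are hypothetical, for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "record_medals",
        "description": "Record an athlete's Olympic medal tally.",
        "parameters": {
            "type": "object",
            "properties": {
                "athlete": {"type": "string"},
                "total_medals": {"type": "integer"}
            },
            "required": ["athlete", "total_medals"]
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": "How many Olympic medals has Usain Bolt won?"}],
    tools=tools
)

# The arguments come back as a JSON string that follows the schema above.
print(response.choices[0].message.tool_calls[0].function.arguments)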

When JSON mode is enabled, the model is constrained to only generate strings that parse into a valid JSON object.

JSON mode does not guarantee that the output matches any specific schema, only that it is valid JSON and parses without errors.

A challenge with the new JSON mode is that, because a schema cannot be defined, the JSON output from the model can vary considerably from one query to the next.
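Because the schema can drift between queries, it is worth validating the shape of the parsed document before using it. Below is a minimal sketch, assuming the response object produced by the complete code example further down; the expected top-level "medals" key is an assumption for illustration.

import json

raw = response.choices[0].message.content

try:
    document = json.loads(raw)  # JSON mode guarantees syntactically valid JSON
except json.JSONDecodeError:
    document = None  # should be rare with JSON mode enabled

# The expected key is an assumption for illustration; the model may
# legitimately choose a different schema on another run.
if document is not None and "medals" not in document:
    print("Unexpected schema, keys returned:", list(document.keys()))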

Below you will see two vastly different JSON documents generated by the model for essentially the same question.

One way to create consistency in JSON schemas is to make use of the seed parameter, as you will see in the code example below.

For a fairly similar input, if the same seed parameter is passed, the same JSON schema tends to be repeated.

Also visible is the newly added system_fingerprint field; here is an example fingerprint returned: system_fingerprint='fp_eeff13170a'. The fingerprint can be saved and checked against each response: a change in the fingerprint indicates that backend changes have been made that might impact determinism.
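One simple way to act on this is to persist the fingerprint from a first call and compare it on subsequent calls. The sketch below is an illustrative assumption, not an official pattern from OpenAI.

stored_fingerprint = None

def check_fingerprint(response):
    """Warn when the backend fingerprint changes between responses."""
    global stored_fingerprint
    if stored_fingerprint is None:
        stored_fingerprint = response.system_fingerprint
    elif response.system_fingerprint != stored_fingerprint:
        # A changed fingerprint signals backend changes that may affect determinism.
        print("system_fingerprint changed:", response.system_fingerprint)
        stored_fingerprint = response.system_fingerprint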

General Guidance From OpenAI

You should always instruct the model to produce JSON via some message in the conversation, for example via your system message. Setting the response format flag alone is not enough.

OpenAI warns that if an explicit instruction to generate JSON is not included, the model may generate an unending stream of whitespace, and the request may run continually until it reaches the token limit.

The finish_reason field of the response should be checked; a value of stop means the generation completed normally.

If it says length, the generation exceeded max_tokens or the conversation exceeded the token limit, and the JSON output may be truncated. To guard against this, check finish_reason before parsing the response.
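In code, the guard might look like the sketch below, assuming the response object returned by the complete example that follows.

import json

choice = response.choices[0]

if choice.finish_reason == "stop":
    # Generation completed normally; safe to parse.
    data = json.loads(choice.message.content)
elif choice.finish_reason == "length":
    # Hit max_tokens or the context limit; the JSON is likely truncated.
    raise ValueError("Truncated response; increase max_tokens or shorten the prompt.")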

Below is the complete code, which you can copy and paste into a notebook; notice how the model is instructed in the system message to generate a JSON document.

Notice that the response format is set to json_object and that the seed parameter is set.

pip install openai

import os
from openai import OpenAI

# Set the API key before creating the client.
os.environ['OPENAI_API_KEY'] = "Your API Key goes here"

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    # Constrain the model to emit syntactically valid JSON.
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "How many Olympic medals has Usain Bolt won, and from which games?"}
    ],
    temperature=1,
    max_tokens=250,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    # A fixed seed makes the JSON schema more repeatable across runs.
    seed=1001
)

print(response.choices[0].message.content)
print("###################")
print(response)

Below is one version of the JSON document generated when the seed parameter was not set…

{
  "medals": {
    "gold": 8,
    "silver": 0,
    "bronze": 0
  },
  "games": [
    "Beijing 2008",
    "London 2012",
    "Rio 2016"
  ]
}

And here, for the same question, an entirely different schema is generated; hence the earlier point that setting the seed parameter makes sense if you want predictable JSON schemas.

{
  "athlete": "Usain Bolt",
  "total_medals": 8,
  "medals": [
    {
      "games": "2008 Beijing",
      "medal": "Gold",
      "event": "100m"
    },
    {
      "games": "2008 Beijing",
      "medal": "Gold",
      "event": "200m"
    },
    {
      "games": "2008 Beijing",
      "medal": "Gold",
      "event": "4x100m relay"
    },
    {
      "games": "2012 London",
      "medal": "Gold",
      "event": "100m"
    },
    {
      "games": "2012 London",
      "medal": "Gold",
      "event": "200m"
    },
    {
      "games": "2012 London",
      "medal": "Gold",
      "event": "4x100m relay"
    },
    {
      "games": "2016 Rio",
      "medal": "Gold",
      "event": "100m"
    },
    {
      "games": "2016 Rio",
      "medal": "Gold",
      "event": "200m"
    }
  ]
}

If JSON mode is disabled, a plain-text response like the one below is generated instead.

Usain Bolt has won a total of 8 Olympic medals. Here are the details of his medals from each Olympic Games:

2008 Beijing Olympics:
- Gold in 100m
- Gold in 200m
- Gold in 4x100m relay

2012 London Olympics:
- Gold in 100m
- Gold in 200m
- Gold in 4x100m relay

2016 Rio Olympics:
- Gold in 100m
- Gold in 200m

And lastly, the complete model response; in this case the system_fingerprint is returned.

ChatCompletion(
    id='chatcmpl-8N0qThwbrN5e0tRa0j6u8GBrtSEIM',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            message=ChatCompletionMessage(
                content='\n{\n "athlete": "Usain Bolt",\n "total_medals": 8,\n "medals_by_game": {\n "Beijing 2008": {\n "gold": 3\n },\n "London 2012": {\n "gold": 3\n },\n "Rio 2016": {\n "gold": 3\n }\n }\n}',
                role='assistant',
                function_call=None,
                tool_calls=None
            )
        )
    ],
    created=1700495485,
    model='gpt-3.5-turbo-1106',
    object='chat.completion',
    system_fingerprint='fp_eeff13170a',
    usage=CompletionUsage(
        completion_tokens=83,
        prompt_tokens=35,
        total_tokens=118
    )
)


I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.


https://platform.openai.com/docs/guides/text-generation/json-mode
