What Are Realistic GPT-4 Size Expectations?

This article aims to cut through the hype by considering historic scaling and current trends of existing LLM models.

6 min readJan 16, 2023

An August 2021 article from Wired first rumoured the possibility of 100 Trillion parameter GPT-4 model…

More recent astute commentaries concluded that everything is in the air regarding GPT-4’s size…but what would be a reasonable expectation in terms of GPT-4?

The short answer is that OpenAI’s GPT-4 will be closer to 1 Trillion parameters, as shown in the image below.

This is in stark contrast with the widely shared big circle graphic of a 100 Trillion Parameter GPT-4 model next to the small dot of a 175 Billion Parameter GPT-3…

I say this for 3️⃣ reasons…

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

1️⃣ The LLM Trends point to a one trillion model, or close to it.

A billion = 1,000,000,000 & a trillion = 1,000,000,000,000

The graph below shows how a one trillion GPT-4 model compares to existing Large Language Models (LLMs).

The graph also include models which are not yet generally available, like Google’s PaLM model and LaMDA. The graph clearly shows the progression and relative scale of models in sheer parameters.

From this graphic it should be evident how model sizes are scaling and what realistic expectations should resemble.

The data constituting the graph:

GPT-1           117 000 000
GPT-Neo       1 300 000 000
GPT-2         1 500 000 000
GPT-J         6 000 000 000
Fairseq      13 000 000 000
GPT-NeoX     20 000 000 000
Chinchilla   70 000 000 000
LaMDA       137 000 000 000
GTP-3       175 000 000 000
BLOOM       176 000 000 000
Gopher      280 000 000 000
MT-NLG      530 000 000 000
PaLM        540 000 000 000
GPT-4     1 000 000 000 000

Considering the same graph with GPT-4 at 100 Trillion Parameters below…it is evident that there is no correlate with the natural progression of LLMs.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

2️⃣ The Experts point to a scenario of less parameters, not more

I recently posed the question on LinkedIn, regarding the GPT-4 parameter count.

Below is an authoritative and astute comment from Gregory Whiteside on that post:

I was fortunate to be invited to the Scale X event where Greg Brockman was speaking (among other people who obviously know a lot more than us where all of this is going) — size of model isn’t the only thing that will allow performance to continue improving, but what they’re seeing is that size / number of parameters is directly correlated with ability to generalize / be applied to newer and more powerful tasks.

The quality / type of data being used to train these models is also going to keep improving (my guess is that Whisper was released as an open model to accelerate the creation of high-quality voice transcripts to feed the next generation of LLMs with a completely new source of high-value data: voice).

A second quote by Gregory Whiteside leads me to the third point…

My hunch would be that OpenAI released Whisper as an open model (instead of monetizing it like their other APIs) to accelerate this high-quality STT data availability — if every single publicly available video / audio on the net comes with high-quality transcriptions (without OpenAI paying for it), makes OpenAI’s job cheaper / faster.

There also other authoritative sources citing caution in exaggerating the GPT-4 parameter count.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

3️⃣ There is only so much quality data

A recent study found that stock of high-quality language data will be exhausted in all likelihood by 2026. In contrast, low-quality language and image data will be exhausted only between 2030 and 2050.

The study also found that the current growth of LLM size will slow down if the data efficiency does not improve or new sources of data become available.

Language datasets have grown exponentially by more than 50% per year, and contain up to 2e12 words as of October 2022.
The stock of language data currently grows by 7% yearly, but our model predicts a slowdown to 1% by 2100.

Each graph shows a representation of data usage with past trends and projections.

Thanks to Julien Simon for pointing out this study.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

In Conclusion

There are two elements which capture the imagination of the general public when it comes to AI.

The first is the sheer number of parameters which make for really captivating graphics. 🙂

The second is the ability of a LLM to have a natural, general conversation without any form of training, Like ChatGPT.

For enterprise adoption however, there is a requirement to fine-tune LLMs and train these vast models on specific use-case related data.

From a Chatbot / Digital Assistant perspective, intent classification is front of mind.

I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.

NLU design tooling

“Conversation Designer, Retail, 10k+ employees The tool that turned conversation designers, into NLU designers” ★★★★★…

www.humanfirst.ai

https://www.linkedin.com/in/cobusgreyling

Get an email whenever Cobus Greyling publishes.

Get an email whenever Cobus Greyling publishes. By signing up, you will create a Medium account if you don’t already…

cobusgreyling.medium.com

Eliza Language Technology Community — Language Technology: Conversational AI, NLP/NLP, CCAI…

ELIZA — Where language technology enthusiasts unite.

eliza.community

The Cobus Quadrant™ Of NLU Design

NLU design is vital to planning and continuously improving Conversational AI experiences.

cobusgreyling.medium.com

The Cobus Quadrant™ Of Conversation Design Capabilities

∗ This is part one of a two part series, please also take a look part two, the Cobus Quadrant of NLU Design.

cobusgreyling.medium.com

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

We analyze the growth of dataset sizes used in machine learning for natural language processing and computer vision…

arxiv.org

OpenAI Optimised LLMs For Dialogue With ChatGPT

How does ChatGPT compare with the new DaVinci 3 model? Is ChatGPT an implementation of KI-NLP? And how OpenAI is…

cobusgreyling.medium.com

What Is Needed For ChatGPT To Be Enterprise Ready?

What elements need to be introduced for ChatGPT to be used in an enterprise and customer facing implementation…

cobusgreyling.medium.com

What Are Realistic GPT-4 Size Expectations?

This article aims to cut through the hype by considering historic scaling and current trends of existing LLM models.

I say this for 3️⃣ reasons…

1️⃣ The LLM Trends point to a one trillion model, or close to it.

2️⃣ The Experts point to a scenario of less parameters, not more

3️⃣ There is only so much quality data

In Conclusion

NLU design tooling

“Conversation Designer, Retail, 10k+ employees The tool that turned conversation designers, into NLU designers” ★★★★★…

Get an email whenever Cobus Greyling publishes.

Get an email whenever Cobus Greyling publishes. By signing up, you will create a Medium account if you don’t already…

Eliza Language Technology Community — Language Technology: Conversational AI, NLP/NLP, CCAI…

ELIZA — Where language technology enthusiasts unite.

The Cobus Quadrant™ Of NLU Design

NLU design is vital to planning and continuously improving Conversational AI experiences.

The Cobus Quadrant™ Of Conversation Design Capabilities

∗ This is part one of a two part series, please also take a look part two, the Cobus Quadrant of NLU Design.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

We analyze the growth of dataset sizes used in machine learning for natural language processing and computer vision…

OpenAI Optimised LLMs For Dialogue With ChatGPT

How does ChatGPT compare with the new DaVinci 3 model? Is ChatGPT an implementation of KI-NLP? And how OpenAI is…

What Is Needed For ChatGPT To Be Enterprise Ready?

What elements need to be introduced for ChatGPT to be used in an enterprise and customer facing implementation…

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Cobus Greyling

No responses yet