GPT-4o mini

OpenAI states that they are advancing cost-efficient intelligence with their most cost-efficient Small Language Model.

Cobus Greyling
4 min readJul 18, 2024

--

Introduction

Today Sam Altman stated that in 2022, the best model in the world was text-davinci-003 which was much worse than GPT-4o mini, at it cost 100 times more.

GPT-4o-mini is already available within the OpenAI Playground

Advantages of GPT-4o-mini

  • GPT-4o mini supports text & vision in the API and playground
  • Text, image, video & audio inputs and outputs coming in the future.
  • The model has a context window of 128K tokens and knowledge up to October 2023.
  • The model does have multi-language capabilities
  • Enhanced inference speeds
  • The combination of inference speed and cost make the model ideal for agentic applications with multiple parallel calls to the model.
  • Fine-tuning for GPT-4o mini will be rolled out soon.
  • Cost: 15 cents / million input tokens & 60 cents per million output tokens.

Considerations

  • With open-sourced SLMs the exciting part is running the model locally and having full control over the model via local inferencing.
  • In the case of OpenAI, this is not applicable due to their commercial hosted API model.
  • Hence OpenAI focus on speed, cost and capability.
  • And also following the trend of small models.
  • There are highly capable text based SLM’s which are open-sourced in the case of Orca-2, Phi3, TynyLlama, to name a few.
  • A differentiators for GPT-4o-mini will have to be cost, speed, capability and available modalities.

Why Small Language Models?

Before delving into Small Language Models (SLMs), it’s important to consider the current use-cases for Large Language Models (LLMs).

LLMs have been widely adopted due to several key characteristics, including:

  • Natural Language Generation
  • Common-Sense Reasoning
  • Dialogue and Conversation Context Management
  • Natural Language Understanding
  • Handling Unstructured Input Data
  • Knowledge Intensive nature

While LLMs have delivered on most of these promises, one area remains challenging: their knowledge-intensive nature.

We have opted to supersede the use of LLMs trained knowledge by making use of In-Context Learning (ICL) via RAG implementations.

RAG serves as an equaliser when it comes to Small Language Models (SLMs). RAG supplements for the lack of knowledge intensive capabilities within SLMs.

Apart from the lack of some Knowledge Intensive features, SLMs are capable of the other five aspects mentioned above.

This is an example of an image sent to GPT-4o-mini and the LM’s response after analysing the image.

Small Language Models Offer Several Advantages:

Local inference

They enable efficient on-device processing, reducing the need for cloud-based resources. This is in general for SLMs, obviously OpenAI’s offering will be commercial API based.

Cost

Small models are more cost-effective to run, requiring less computational power and storage.

Privacy

By processing data locally, they enhance user privacy and reduce data exposure risks.

Open Source Models

They provide flexibility and customisation options, allowing users to modify and adapt models to their specific needs.

Model Control & Management

They allow for easier control and management, enabling fine-tuning and optimisation for particular tasks without relying on external dependencies.

--

--

Cobus Greyling
Cobus Greyling

Written by Cobus Greyling

I’m passionate about exploring the intersection of AI & language. www.cobusgreyling.com

Responses (1)