The Language Model Landscape — Version 7
Language Models are entering a new era of Specialised Functionality & Advanced Abilities.
Introduction
Firstly, the term Large Language Model (LLM) has become a catch-all: a generic reference to language-based models that deal mostly with unstructured data. That in itself is not a problem.
However, from a technical perspective the term has become inaccurate: many models are no longer large, and many are no longer purely language models.
Models now vary in size, with Small Language Models (SLMs) showing exceptional capabilities in reasoning, vision and more. Multiple models have vision capabilities and function-calling capabilities, where the model decides internally whether it should make use of a function and which one, or whether its own base knowledge should be leveraged instead.
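To make the function-calling decision concrete, here is a minimal sketch. It assumes the model replies either with plain text (answering from its base knowledge) or with a JSON object naming a function and its arguments. The `get_weather` function, the `TOOLS` registry and the JSON shape are all hypothetical, chosen only for illustration; real providers each define their own format.

```python
import json

# Hypothetical tool; a real implementation would call a weather service.
def get_weather(city: str) -> str:
    return f"Weather for {city}: (stubbed)"

# Hypothetical registry of functions the model is allowed to call.
TOOLS = {"get_weather": get_weather}

def handle_model_reply(reply: str) -> str:
    """Run a tool if the reply is a JSON tool call, else return the text as-is."""
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return reply  # plain text: the model answered from its base knowledge
    if not isinstance(call, dict) or call.get("tool") not in TOOLS:
        return reply  # valid JSON, but not a call to a known tool
    return TOOLS[call["tool"]](**call.get("arguments", {}))
```

The key point is that the choice between calling a function and answering directly is made by the model; the surrounding code only dispatches on what the model returned.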
Terms like Foundation Models or Multi-Modal Models are also in use; however, the term Language Model feels the most appropriate (for now).
Ripples & Disruption
The image above shows the ripples caused by the advent and development of Language Models, which can be divided into seven bands or zones. As these ripples extend, they create requirements and opportunities for products and services.
The opportunities lie in creating frameworks and supporting structures around Language Models that allow the power of LMs to be harnessed. The most common example is leveraging the ability of LMs to perform in-context learning (ICL) via Retrieval-Augmented Generation (RAG) frameworks.
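As a rough illustration of how RAG leans on in-context learning, the sketch below retrieves the most relevant snippets and places them in the prompt. It is a simplification under stated assumptions: retrieval here is naive word overlap rather than a vector search, and the function names and prompt wording are hypothetical.

```python
def retrieve(query, documents, k=2):
    """Rank documents by how many lower-cased words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(query, documents, k=2):
    """Inject the retrieved snippets into the prompt as context for the model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The model is not retrained; it simply reads the supplied context at inference, which is why so much supporting technology has grown up around the retrieval step.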
A whole industry has developed around RAG build frameworks and supporting technologies.
Some of these opportunities have already been discovered; others are yet to be discovered or truly leveraged. Some recent and exciting developments have been around AI Agent Computer Interfaces (ACI), where an AI Agent can navigate and act on what is presented on a user’s desktop.
I would argue that the danger of being superseded as a product is greater in Zone 7 than in Zone 6.
Zone 6 offers a bigger opportunity for differentiation, substantial built-in intellectual property and stellar UX enabling enterprises to leverage the power of LLMs. Exciting developments in Zone 5 include quantisation, Small Language Models, model gardens/hubs and data-centric tooling.
Zone 5 can be considered an emerging area, with specialised models seeing the light and model providers having to extend beyond mere models; more about this later.
The Opportunity For Orchestration
The language model ecosystem has become increasingly fragmented as specialised models are being developed to handle niche tasks such as reasoning, behaviour prediction, and multimodal processing.
Each implementation requires supporting infrastructure and technology, including integration frameworks, deployment tools, and monitoring systems, adding complexity for enterprises.
Despite the technological advancements, the lack of a unified approach has created challenges for organisations, leading to inefficiencies when deploying AI solutions across diverse domains.
This fragmentation presents a significant opportunity to orchestrate these specialised models and supporting infrastructure into a cohesive, enterprise-ready AI solution.
By integrating these elements into a unified framework, businesses can harness the full potential of AI while simplifying adoption, maintenance, and scalability, giving them a competitive edge in leveraging advanced technologies.
Zone 1 — Language Models Disruption
As stated at the top of this article, LLMs are in essence language bound; however, multi-modality has been introduced in the form of images, audio and more. This shift gave rise to a more generic term, namely Foundation Models. However, the term Language Model seems to be more accurate.
Apart from increased modalities, there has been model diversification from the large commercial providers, which now offer multiple, more task-specific models. There has also been a slew of open-sourced models made available. The availability and performance of open-sourced models have given rise to easy hosting options, where users can select and deploy models in a no-code fashion.
New prompting techniques have illustrated how model performance can be enhanced, and how the market is moving towards a scenario where data discovery, data design, data development and data delivery can be leveraged to achieve greater model autonomy.
Zone 2 — General Use-Cases
With the advent of large language models, functionality was initially more segmented: models were trained for specific tasks. Models like Sphere and Side focussed on Knowledge Answering, something Meta called KI-NLP. Models like DialoGPT, GODEL, BlenderBot and others focussed on dialog management.
There were models focussing on language translation, specific languages, etc.
Recent developments in LLMs followed an approach where one model consolidates most, if not all, of these functions. Added to this, astounding performance can be extracted using different prompting techniques.
The main implementations of LLMs are listed here, with text generation encompassing tasks like summarisation, rewriting, keyword extraction and more.
Text analysis and RAG are becoming increasingly important, and embeddings are vital for these types of implementations.
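To show why embeddings matter here, the sketch below compares vectors with cosine similarity, a common way to find the stored text closest in meaning to a query. The three-dimensional vectors are hand-made stand-ins for real embeddings, which would come from an embedding model and have far more dimensions; the labels and values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (hypothetical values).
embeddings = {
    "refund policy": [0.9, 0.1, 0.0],
    "delivery times": [0.1, 0.9, 0.2],
}
query_vector = [0.8, 0.2, 0.1]

# Pick the stored entry whose vector points most nearly the same way as the query.
best_match = max(
    embeddings, key=lambda name: cosine_similarity(query_vector, embeddings[name])
)
```

With these toy values the query lands closest to "refund policy", which is the kind of lookup a vector store performs at scale.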
Automatic speech recognition (ASR) is the process of converting audio speech into text. The accuracy of any ASR process can be measured with a metric called Word Error Rate (WER). ASR opens up vast amounts of recorded language data for LLM training and use.
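For reference, WER is the number of word substitutions, deletions and insertions needed to turn the ASR transcript into the reference transcript, divided by the number of words in the reference. A minimal sketch follows, assuming simple whitespace tokenisation and no text normalisation (real evaluations usually normalise case and punctuation first); the function name is my own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, if the reference is "hello world" and the transcript is "hello there", one of two words is wrong, giving a WER of 0.5. Note that WER can exceed 1.0 when the transcript contains many inserted words.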
Notable shifts in this zone are:
- Knowledge answering and Knowledge Intensive NLP (KI-NLP) approaches have been superseded by RAG-based prompt engineering at inference.
- LLM functionality consists of a few elements: dialog & context management, logic & reasoning, unstructured input and output, natural language generation and a knowledge-intensive base model. All of these elements are leveraged extensively, except the knowledge-intensive nature of LLMs.
- The knowledge-intensive nature of the base LLM is being replaced by In-Context Learning strategies at inference. Most notable here is RAG, which most technology providers are standardising on.
- Dialog generation was spearheaded by developments like GODEL and DialoGPT. These have been superseded by specific implementations like ChatGPT, HuggingChat and Cohere Coral, and by prompt engineering approaches where few-shot learning is used with the dialog context presented in the prompt.
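The last point, few-shot examples combined with dialog context in the prompt, can be sketched as follows. This is an illustrative prompt layout only; the function name, the "User:"/"Assistant:" labels and the opening instruction are assumptions, and chat-style APIs typically take a list of messages instead of one string.

```python
def build_dialog_prompt(examples, history, user_message):
    """Assemble few-shot examples, prior turns and the new message into one prompt.

    `examples` and `history` are lists of (user_text, assistant_text) pairs.
    """
    lines = [
        "The following is a conversation between a user and a helpful assistant.",
        "",
    ]
    # Few-shot examples show the model the expected style of reply.
    for user_text, assistant_text in examples:
        lines += [f"User: {user_text}", f"Assistant: {assistant_text}", ""]
    # The running dialog supplies context for the current turn.
    for user_text, assistant_text in history:
        lines += [f"User: {user_text}", f"Assistant: {assistant_text}"]
    # End with an open assistant turn for the model to complete.
    lines += [f"User: {user_message}", "Assistant:"]
    return "\n".join(lines)
```

Nothing is trained here: the examples and the conversation so far are simply resent with every request, and the model continues from the open "Assistant:" line.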
Zone 3 — Specific Implementations
A few specific-use models are listed in this zone. As mentioned before, models have become less use-case specific, and have started to incorporate multiple, if not all, of these elements in one model.
However, this changed to some degree with the advent of Zone 5.
Zone 4 — Commercial Model Providers
The most notable Large Language Model suppliers are listed here. Most of the LLMs have inbuilt knowledge and functionality, including human language translation, the ability to interpret and write code, and dialog and contextual management via prompt engineering.
Some of these model suppliers make APIs available, while some models are open-sourced and freely available to use. The only impediments are hosting, model management and API management.
Zone 5 — Model Diversification
As I have alluded to, LMs were more focused and task specific (Zone 2), after which we saw a unification of all these capabilities within single, behemoth models. The general consensus at that stage was that a single model would be used for almost all of the model needs of an implementation.
However, the market is now developing in the opposite direction, with specialised models seeing the light that are trained for very niche tasks. Large Behaviour Models (LBMs) focus on creating user context based on subtle cues from the training data via supplementary social media data, likes, etc.
Large Action Models are geared towards excelling in AI Agent implementations where structured data output is important. Models are also increasing in reasoning capabilities, with an inherent ability to decompose complex tasks into sub-tasks and solve complex, compound queries in a step-by-step fashion, analogous to how we as humans do.
The latest addition to this list is Computer Use, where LM providers are creating a framework in which the model uses its vision capabilities to interpret user screens, navigate those screens via character recognition, and make use of design affordances to achieve a certain outcome.
Zone 6 — Foundation Tooling
This sector considers tooling to harness the power of LLMs, including vector stores, playgrounds and prompt engineering tools. Hosting like HuggingFace enables no-code interaction via model cards and simple inference APIs.
Listed in this zone is the idea of data-centric tooling which focusses on repeatable, high value use of LLMs.
Recent additions to this area are local offline inference servers, quantisation, and small language models.
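For readers new to quantisation: the idea is to store model weights at lower numeric precision so that a model needs less memory and can run on local hardware. The sketch below shows one simple variant, symmetric 8-bit quantisation of a list of weights, chosen purely for illustration; production toolchains use more elaborate schemes, and the function names here are my own.

```python
def quantise_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantised = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantised, scale

def dequantise(quantised, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantised]

weights = [0.5, -1.0, 0.25, 0.0]
quantised, scale = quantise_int8(weights)
restored = dequantise(quantised, scale)
```

Each restored weight differs from the original by at most half the scale, which is the precision traded away for the smaller footprint.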
The market opportunity in this area lies in creating foundation tooling that will address a future need for data delivery, data discovery, data design and data development.
Zone 7— End User UIs
On the periphery, there is a whole host of applications which focus on flow building, idea generation, content and writing assistants. These products focus on UX and adding varying degrees of value between LLMs and the user experience.
Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.