The DeepSeek Phenomenon Demystified

Why Did DeepSeek Take the AI World by Storm, While Qwen2.5-Max and Qwen Chat Struggled to Generate the Same Hype?

6 min readJan 30, 2025

--

The Language Model Landscape

Somehow DeepSeek’s aspirations surprises the west during the last two weeks. It was described as a Sputnik moment, and a wakeup call to the US, etc. etc.

I believe the hype is due to most people in this field not understanding or having a basic knowledge of the market.

Consider the slide below from Artificial Analysis, showing their quality index from 16 December 2024.

DeepSeek V2.5 sits ahead of AI21Labs and Cohere, which is already a significant achievement; with an index of 72.

The Qwen 2.5 model sits at 77 on the index and finds itself ahead of models like GPT-4o and GPT-4o mini.

Hence DeepSeek and Qwen both have made a significant leap in performance but did not just come out of nowhere, by any stretch of the imagination.

Also the question around China’s dominance, from the graph below it is also clear that the US is firmly established and this is by no means over for US dominance. Added to this, from the foundation of DeepSeek-R1, DeepSeek AI has created a series of distilled models based on both Meta’s Llama and Qwen architectures, ranging from 1.5–70 billion parameters.

Looking at the image below, also from Artificial Analysis it is clear that model quality of proprietary and open source models were improving considerably.

And that the gap was closing…o1 presented quite a leap forward at the end of 2024, and we are sure to see more iminent leaps in quality and performance.

Capturing The Imagination

So why did DeepSeek capture the imagination of the public?

It was most probably not the fact that DeepSeek V3 & R1 is open-sourced or the stellar benchmark performance of DeepSeek V3 against its counterparts.

I believe the main reason and cause for the general excitement around DeepSeek is their web UI, which is analogous to ChatGPT. With the same basic features as ChatGPT, but with the added streaming of the internal reasoning and context management of the UI.

This is such a clever UX trick to engage the user while the model works out the final answer, but gives a level of transparency and awe where the user sees the internal reasoning.

OpenAI has had reasoning models for a while, which manages the context of the conversation, but it is not shown as the way DeepSeek is showing it.

The UI also made it possible for the wider and general public to experiment and hence affording most everyone a personal experience, and by implication an opinion which was shared.

There were alarmist posts saying if you use DeepSeek you are handing the keys of your kingdom to China, and all the user data is forwarded to DeepSeek in China, etc. etc.

Please read the following…

DeepSeek R1/V3: Open-Source AI with Full Data Control

DeepSeek R1/V3 are open-source AI models that can be privately or commercially hosted, offering full control over data governance and security.

These models, when self-hosted, operate independently from DeepSeek as a company, ensuring that when you deploy a private instance, your data (prompts) remain within your infrastructure and are not sent to DeepSeek’s servers.

DeepSeek vs. DeepSeek-Chat: Understanding the Difference

  • DeepSeek R1/V3 → Open-source models that can be self-hosted for full privacy and customisation.
  • DeepSeek-Chat (App & Website) → A publicly available chat interface (similar to ChatGPT) that allows users to interact with the AI model via DeepSeek’s servers. When using this service, your data is processed by DeepSeek’s infrastructure.

Insights from Perplexity CEO Aravind Srinivas

Perplexity CEO Aravind Srinivas clarified the significance of hosting DeepSeek R1/V3 independently:

  • Full Control Over AI Inference: When organizations download and host the DeepSeek model on their own servers, all AI inference happens locally, ensuring data does not leave their infrastructure.
  • No External Computation: The AI model can generate responses entirely on the hosted server, meaning no data needs to be sent to external locations (e.g., China or elsewhere) during inference.
  • Enterprise Customisation: Companies can further customize the AI by integrating web search, external tools (e.g., code execution, Wolfram Alpha), or other functionalities into their hosted instance.

DeepSeek R1/MoE offers a powerful open-source alternative for organisations looking for privacy-first AI deployment. While DeepSeek-Chat provides a convenient web-based interface, self-hosting ensures that businesses and researchers can fully control their data, optimise model inference, and customise capabilities to fit their needs.

Qwen2.5-Max

So here is where it really becomes interesting…

Alibaba launched the Qwen2.5-Max model on 28 January 2025, and considering the chart below, Qwen2.5-Max exceeds DeepSeek V3 in all of the benchmarks.

Also, Alibaba launched Qwen Chat, which is basically a clone of DeepSeek Chat, with the same strategy to stream the internal reasoning of the model.

The model is also available via Alibaba Cloud and HuggingFace, the only difference btween the Qwen2.5-Max and DeepSeek V3 is the fact that Qwen2.5-Max is not open-sourced. And from my own casual testing, there does not seem to be any political censorship as was found with DeepSeek.

A Real Danger

I would say the true and real danger is employees using hosted solutions like ChatGPT, Cohere Coral, DeepSeek Chat, Qen Chat and others for business use or personal use with sensitive data.

Not that I have any reason to think those companies are involved in any nefarious actions. But from a corporate data governance perspective it makes sense to have a bespoke enterprise solution with all the elements required.

In Closing

So why the hype regarding DeepSeek and not Alibaba’s efforts? Was the fact that DeepSeek was open-sourced the reason it was seen as a Chinese trojan horse?

Maybe the answer lies somewhere in the news cycle…for instance, I thought OpenAI Operator received much more media attention than Anthropic’s Computer Use AI Agent.

The Computer Use AI Agent is actually such a comprehensive framework, and really shows a level of PC autonomy, compared to OpenAI Operator. Yet the news hype with regards to Operator was much larger. At least in my estimation.

So I would say, find news sources and people who see through the hype and paints a balanced picture. I think the technical understanding of most news sources are questionable, to say the least.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

Qwen Chat

Qwen Chat offers comprehensive functionality spanning chatbot, image and video understanding, image generation…

chat.qwenlm.ai

Qwen2.5 Max Demo — a Hugging Face Space by Qwen

Discover amazing ML apps made by the community

huggingface.co

Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model

QWEN CHAT API DEMO DISCORD It is widely recognized that continuously scaling both data size and model size can lead to…

qwenlm.github.io

--

--

Cobus Greyling
Cobus Greyling

Written by Cobus Greyling

I’m passionate about exploring the intersection of AI & language. www.cobusgreyling.com

No responses yet