DeepSeek R1 Explained: China's ambitious answer to ChatGPT and Gemini, and why it is making waves
Almost two years ago, when OpenAI’s ChatGPT burst onto the scene, many thought China had been caught on the back foot when it came to AI. First Microsoft and OpenAI, and then the likes of Google, Meta, and Amazon, ushered in the age of artificial intelligence. Chinese players such as Alibaba and Baidu started investing billions of dollars but seemed to be playing catch-up. Not anymore, though: a new large language model originating from China has become the talk of tech town.
What is DeepSeek and is it better than ChatGPT and other AI models out there? Here’s a detailed explainer:
DeepSeek R1, a state-of-the-art reasoning model developed by the Chinese AI startup DeepSeek, was launched earlier this month and has garnered significant attention for its exceptional performance and competitive pricing. DeepSeek R1 aims to augment reasoning and analytical capabilities, positioning it as a formidable contender to other prominent AI models such as OpenAI’s o1 and ChatGPT.
DeepSeek R1 is an advanced language model built on a hybrid architecture similar to that of its predecessor, V3. Yes, it is not the first model from the Chinese startup, but it is better in almost every respect.
It incorporates large-scale reinforcement learning (RL) and chain-of-thought reasoning to enhance the precision of its responses. The model comes in two versions: DeepSeek-R1 and DeepSeek-R1-Zero. Notably, the latter was trained through reinforcement learning alone, without supervised fine-tuning, yet demonstrates remarkable reasoning abilities.
Although R1 has generated significant attention, DeepSeek itself remains a lesser-known entity. Headquartered in Hangzhou, China, the company was established in July 2023 by Liang Wenfeng, a Zhejiang University graduate specialising in information and electronic engineering, according to a report by MIT Technology Review. DeepSeek was incubated by High-Flyer, a hedge fund founded by Liang in 2015. Like OpenAI’s Sam Altman, Liang aims to develop artificial general intelligence (AGI), an advanced AI capable of performing a wide range of tasks at or beyond human-level proficiency.
It’s always about the money, right? One of the most notable features of DeepSeek R1 is its pricing. In contrast to OpenAI’s o1, which charges $15 per million input tokens and $60 per million output tokens, DeepSeek R1 costs just $0.55 per million input tokens and $2.19 per million output tokens. This makes it an appealing option for developers, researchers, and organisations seeking cost-effective AI solutions.
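For a sense of what that gap means in practice, here is a minimal back-of-the-envelope comparison in Python using the per-million-token prices quoted above; the workload size of 10 million input and 2 million output tokens is purely illustrative.

```python
# Rough cost comparison using the published per-million-token prices.
# The workload size below is an illustrative assumption, not a benchmark.
input_tokens = 10_000_000   # 10M input tokens
output_tokens = 2_000_000   # 2M output tokens

o1_cost = (input_tokens / 1e6) * 15.00 + (output_tokens / 1e6) * 60.00
r1_cost = (input_tokens / 1e6) * 0.55 + (output_tokens / 1e6) * 2.19

print(f"OpenAI o1:   ${o1_cost:,.2f}")   # $270.00
print(f"DeepSeek R1: ${r1_cost:,.2f}")   # $9.88
```

On this hypothetical workload, the same job would cost roughly 27 times less on DeepSeek R1.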
What’s even more impressive is that, according to DeepSeek, it took the startup about two months to develop the model. While OpenAI, Google, and Microsoft have been pumping billions of dollars into the development of AI models, DeepSeek invested ‘just’ $6 million to create its latest model.
In terms of performance, DeepSeek R1 has demonstrated comparable results to OpenAI’s o1 across various benchmarks, including mathematics, coding, and reasoning tasks. Notably, it even outperforms OpenAI’s o1 in certain areas, such as coding tasks, where it achieves a remarkable 97% success rate.
Furthermore, DeepSeek has unveiled six compact versions of its R1 model, designed to run efficiently on laptops. The company claims that one of these smaller models surpasses OpenAI’s o1-mini in specific benchmarks. “DeepSeek has effectively replicated o1-mini and made it open source,” said Perplexity CEO Aravind Srinivas in a post on X.
Microsoft CEO Satya Nadella, referring to DeepSeek, remarked: “I think we should take the development out of China very, very seriously.”
The recent launch of DeepSeek R1 has sparked a discourse on social media platforms, with numerous users sharing their experiences and comparisons between DeepSeek R1 and other AI models. Notably, AI and technology educator Paul Couvert highlighted how smoothly the 1.5B version of DeepSeek R1 ran on his smartphone, surpassing GPT-4o and Claude 3.5 Sonnet in mathematical computations. Another X user, ZeroEdge, pitted the models against each other on a coding task involving a rotating triangle with a red ball, with DeepSeek R1 producing the better results.
Chinese companies are increasingly adopting open-source practices alongside a focus on efficiency. For example, Alibaba has introduced over 100 open-source AI models in recent months, supporting 29 languages and addressing diverse applications such as coding and mathematics, according to the MIT Technology Review report.
A report published by the China Academy of Information and Communications Technology, a state-affiliated research institute, highlights the global scale of AI development. As of last year, there were 1,328 large language models worldwide, with China accounting for 36% of them, making it the second-largest player in AI development behind only the United States. While China still trails the US, models like DeepSeek R1 could give it an edge and have the US worried about what might come next. With its rapid development, competitive pricing, and open-source initiatives, DeepSeek is set to make its presence felt in the global AI landscape, and it also signals China’s growing influence in the field.
Users can access the DeepSeek Chat interface at chat.deepseek.com. Signing up requires a valid email address, and selecting the “DeepThink” option on the homepage switches the conversation to the R1 reasoning model.
For developers looking to integrate DeepSeek-R1 into their applications, API access is available via the DeepSeek Developer Portal. After obtaining an API key, developers can set up their environment using tools like Python’s requests library or the OpenAI package. The API client should be configured with the base URL: api.deepseek.com.
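As a minimal sketch of what such an integration might look like with the OpenAI Python package, the snippet below sends a single prompt to the API. The model identifier "deepseek-reasoner", the placeholder API key, and the example prompt are assumptions; check the DeepSeek Developer Portal for the current model names and authentication details.

```python
# Minimal sketch: calling the DeepSeek API through the OpenAI Python package.
# The model name "deepseek-reasoner" is an assumed identifier for R1; verify it
# against the DeepSeek Developer Portal before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # key obtained from the Developer Portal
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible base URL
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed R1 model identifier
    messages=[
        {"role": "user", "content": "Explain chain-of-thought reasoning in one sentence."}
    ],
)

print(response.choices[0].message.content)
```

Because the endpoint accepts the same chat-completions format as the OpenAI package, switching an existing application over largely comes down to changing the base URL, API key, and model name.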