DeepSeek’s AI Upgrade Rewrites More Than Just Code.

Key Takeaways

  • Chinese AI startup DeepSeek has updated its R1 reasoning model, naming it R1-0528.
  • The new version boasts significantly better reasoning, handles complex tasks more effectively, and reduces “hallucinations” or false outputs by nearly half.
  • This upgrade brings DeepSeek’s model closer in performance to OpenAI’s o3 and Google’s Gemini 2.5 Pro.
  • DeepSeek also used this new model’s logic to enhance Alibaba’s Qwen 3 model, improving its performance.
  • The company’s progress challenges the idea that U.S. export controls are holding back China’s AI development.

Chinese artificial intelligence startup DeepSeek rolled out an update to its popular R1 reasoning model early Thursday, intensifying competition with American giants like OpenAI.

DeepSeek announced via the developer platform Hugging Face that R1-0528, while a minor version upgrade, substantially boosts the model’s depth of reasoning and inference capabilities. This includes better handling of complex tasks.

The company aims for this upgrade to bring its performance nearer to OpenAI’s o3 reasoning models and Google’s Gemini 2.5 Pro. The initial launch of R1 in January made global headlines and called into question the belief that scaling AI always requires massive computing power and investment.

Since R1’s debut, Chinese tech firms such as Alibaba and Tencent have introduced models they claim surpass DeepSeek’s original offering.

Thursday’s update was initially light on specifics, unlike the detailed academic paper accompanying R1’s January launch, which the global AI community studied to understand DeepSeek’s approach.

Later, the Hangzhou-based firm stated in a brief post on X that R1-0528 featured improved performance. A more detailed post on WeChat revealed that the rate of “hallucinations”—false or misleading outputs—was cut by about 45-50% in tasks like rewriting and summarizing.

DeepSeek also mentioned the update enables creative writing across genres like essays and novels, and has improved capabilities in generating front-end code and role-playing scenarios.

“The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic,” DeepSeek stated.

DeepSeek’s success has shaken the assumption that U.S. export controls were significantly hindering China’s AI advancements. The company has released AI models comparable to, or even better than, leading U.S. models, often at a lower cost.

On Thursday, the startup also revealed that a variant of its update was created by applying the R1-0528 model’s reasoning process to further enhance Alibaba’s Qwen 3 8B Base model. This “distillation” process resulted in performance surpassing the original Qwen 3 model by over 10%.
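For readers unfamiliar with the term, distillation here means training a smaller "student" model on reasoning traces written by a larger "teacher" model. The Python sketch below shows only the data-preparation step, under stated assumptions: the `TeacherSample` fields, the `<think>` wrapper, and the filtering rule are illustrative choices, not DeepSeek's published pipeline.

```python
# Illustrative sketch only. Field names, the <think> wrapper, and the filter
# are assumptions for this example, not DeepSeek's actual recipe.
from dataclasses import dataclass


@dataclass
class TeacherSample:
    question: str
    chain_of_thought: str  # reasoning trace produced by the teacher model
    answer: str


def to_sft_record(sample: TeacherSample) -> dict:
    """Turn one teacher output into a prompt/completion pair for the student."""
    completion = f"<think>\n{sample.chain_of_thought}\n</think>\n{sample.answer}"
    return {"prompt": sample.question, "completion": completion}


def build_distillation_set(samples: list[TeacherSample]) -> list[dict]:
    """Keep only samples that have both a reasoning trace and an answer."""
    return [
        to_sft_record(s)
        for s in samples
        if s.chain_of_thought.strip() and s.answer.strip()
    ]
```

The resulting records would then be fed to an ordinary supervised fine-tuning run on the student model, so the student learns to reproduce the teacher's step-by-step reasoning as well as its final answers.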

“We believe that the chain-of-thought from DeepSeek-R1-0528 will hold significant importance for both academic research on reasoning models and industrial development focused on small-scale models,” DeepSeek added.

Bloomberg first reported the update on Wednesday, mentioning that a DeepSeek representative had informed a WeChat group about completing a “minor trial upgrade” and that users could begin testing it. This information was also reported by Reuters.

In response to the competition from DeepSeek, Google’s Gemini has introduced discounted access tiers, while OpenAI has cut prices and released an o3 Mini model that requires less computing power.

DeepSeek is still widely anticipated to release R2, the successor to R1. Sources indicated in March that R2’s release was initially planned for May. DeepSeek also upgraded its V3 large language model in March.
