Alibaba’s Quietly Dropped AI Is Shaking Up Open Models

Key Takeaways

  • Alibaba’s Qwen team has released Qwen3, a new family of powerful open-source AI models.
  • The series includes eight models, featuring both “mixture-of-experts” (MoE) and standard “dense” architectures.
  • Benchmark tests show Qwen3 models compete strongly with leading open-source options and even approach the performance of proprietary models from OpenAI and Google.
  • These models support 119 languages and offer a flexible “hybrid reasoning” capability for different task complexities.
  • Qwen3 is available under the permissive Apache 2.0 license, encouraging broad use and modification.

Alibaba’s AI team, Qwen, has unveiled a new lineup of open-source artificial intelligence models called Qwen3. This release marks a significant step, offering performance that rivals some of the best models currently available.

The Qwen3 family is quite diverse, featuring eight distinct models. Two of these use a “mixture-of-experts” (MoE) design, in which each input is routed to a small subset of specialized “expert” sub-networks so that only a fraction of the model’s parameters are active per token, an approach popularized in open-weight models by Mistral AI’s Mixtral. The other six are more traditional “dense” models of varying sizes.

According to performance results shared by the team and detailed by VentureBeat, the largest Qwen3 model stacks up impressively against competitors. It reportedly outperforms DeepSeek’s R1 and OpenAI’s o1 model on key benchmarks and comes close to Google’s new Gemini 2.5 Pro.

This positions the top Qwen3 model as one of the most capable publicly accessible AI options out there right now.

A standout feature is “hybrid reasoning.” Users can switch between quick, direct answers and a more intensive “Thinking Mode” that works step by step through complex problems in fields like math or engineering. The mode can be toggled via the Qwen Chat website or, when running the model elsewhere, through a chat-template parameter and in-prompt commands, as sketched below.
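Here is a minimal sketch of the hard switch using Hugging Face transformers, based on Qwen’s published usage notes; the checkpoint name and the enable_thinking flag follow the Qwen3 model cards, so confirm them against the card of the checkpoint you actually use:

```python
# Sketch of Qwen3's hard reasoning switch via Hugging Face transformers.
# The enable_thinking flag is passed through to the chat template, per
# Qwen's published model cards.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"  # one of the dense checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Is 9991 prime? Explain briefly."}]

# enable_thinking=True makes the template emit the step-by-step
# reasoning block; set it to False for quick, direct answers.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

Qwen also documents an in-prompt soft switch: appending /think or /no_think to a user message toggles the behavior turn by turn.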

Qwen3 significantly boosts multilingual capabilities, now understanding 119 languages and dialects. This makes the models far more useful globally for research and development.

The training process for Qwen3 was substantially upgraded from its predecessor, using double the data (around 36 trillion tokens) from web pages, documents, and synthetic content. This improved training allows even the smaller dense models to perform remarkably well.

Developers and researchers can find these models readily available on popular platforms like Hugging Face, ModelScope, and GitHub. They can also be used directly through the Qwen Chat interface.

The models come with an Apache 2.0 open-source license. This permissive license allows for commercial use without many of the restrictions seen with other models, like Meta’s Llama series, making it an attractive option for businesses.

Deployment is designed to be flexible, supporting integration with serving frameworks like vLLM and local use via tools such as Ollama and LMStudio. This makes it easier for engineering teams to adopt Qwen3 or migrate from other model stacks with minimal code changes, as the sketch below suggests.
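Because vLLM exposes an OpenAI-compatible endpoint, existing client code can often simply be repointed at a local Qwen3 server. A hedged sketch, assuming a vLLM server has already been started locally (for example with `vllm serve Qwen/Qwen3-8B`) and is listening on vLLM’s default port 8000:

```python
# Sketch of calling a locally served Qwen3 model through vLLM's
# OpenAI-compatible API. Assumes a server is already running locally.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local vLLM endpoint
    api_key="EMPTY",  # vLLM does not require a real key by default
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-8B",
    messages=[
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."}
    ],
)
print(response.choices[0].message.content)
```

The same pattern applies to any OpenAI-compatible gateway, which is what makes switching from another provider largely a matter of changing the base URL and model name.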

For developers, the MoE models deliver reasoning the team reports as comparable to GPT-4 while activating only a fraction of their parameters per token; the flagship Qwen3-235B-A22B, for instance, uses about 22B of its 235B parameters per token, putting its compute cost roughly in line with a medium-sized dense model. The range of dense models (from 0.6B to 32B parameters) allows prototyping on smaller machines and scaling up as needed, as the sketch below illustrates.
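An illustrative sketch of that prototype-then-scale path: the same transformers pipeline code runs unchanged across the dense Qwen3 checkpoints, so moving up in size is just a matter of swapping the model ID (the names below follow the Qwen3 Hugging Face repos):

```python
# Prototype on the smallest dense checkpoint, then swap the model ID
# (e.g. to "Qwen/Qwen3-32B") when moving to bigger hardware.
from transformers import pipeline

model_id = "Qwen/Qwen3-0.6B"
generator = pipeline(
    "text-generation", model=model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Give three test cases for a URL parser."}]
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```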

The Qwen team says the hard engineering work behind releases like this one is now feeding into its next focus: AI agents capable of complex, real-world tasks. The release also heats up the competitive AI landscape, providing another strong alternative to models from major players in North America and China.

Ultimately, Qwen3’s release under an open license lowers barriers for innovation, allowing more people to experiment with and build upon state-of-the-art AI technology. The team sees this as a step towards more advanced AI concepts like Artificial General Intelligence (AGI).
