The Allure of Tiny AI Models
The latest models from tech giants like OpenAI, Anthropic, and Google are impressive. However, their sheer size is a significant limitation: they are far too large to run on consumer hardware, so users can reach them only through the vendors’ hosted chatbots and APIs. In stark contrast, Alibaba has recently released the “Qwen 3.5 Small Models,” a family of four variants designed for efficiency and accessibility.
Exploring the Qwen 3.5 Small Models
Alibaba’s Qwen 3.5 small lineup comprises four models, each named for its parameter count:
- Qwen3.5-0.8B: 800 million parameters
- Qwen3.5-2B: 2 billion parameters
- Qwen3.5-4B: 4 billion parameters
- Qwen3.5-9B: 9 billion parameters
In comparison, the latest models from major competitors are estimated to have parameters in the hundreds of billions, making Alibaba’s smaller offering particularly noteworthy.
Tiny but Mighty
The two smallest models, Qwen3.5-0.8B and Qwen3.5-2B, are optimized for deployment on modest devices and prioritize battery efficiency. The Qwen3.5-4B model, meanwhile, is multimodal: it accepts both text and images as input and supports a context window of 262,144 tokens. Its 4-bit quantized version is under 3 GB, small enough to run on mobile devices.
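The size figure can be sanity-checked with simple arithmetic. The following is a rough sketch, not a statement about how any particular file is packaged: it counts only the raw weights (parameters times bits per weight) and ignores quantization scales, layers kept at higher precision, and runtime memory such as the context cache. The function name `weight_footprint_gb` is made up for this illustration.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Parameter counts taken from the model list in the article.
MODELS = {
    "Qwen3.5-0.8B": 0.8e9,
    "Qwen3.5-2B": 2e9,
    "Qwen3.5-4B": 4e9,
    "Qwen3.5-9B": 9e9,
}

if __name__ == "__main__":
    for name, params in MODELS.items():
        at_16_bit = weight_footprint_gb(params, 16)
        at_4_bit = weight_footprint_gb(params, 4)
        print(f"{name}: ~{at_16_bit:.1f} GB at 16-bit, ~{at_4_bit:.1f} GB at 4-bit")
```

For the 4B model this gives about 2 GB of raw weights at 4 bits, which leaves room for the overhead a real quantized file carries and is consistent with the reported size of under 3 GB.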
The Best Essences of AI
The star of Alibaba’s smaller models, Qwen3.5-9B, is a reasoning model that reportedly surpasses the much larger gpt-oss-120B from OpenAI. Its weights are openly available on platforms like Hugging Face and ModelScope.
A New Approach to AI Architecture
Alibaba’s models use an Efficient Hybrid Architecture, which combines a newer attention mechanism known as Gated Delta Networks with the established Mixture-of-Experts (MoE) framework. This design is intended to circumvent the “memory wall” issue that often plagues smaller models.
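The article does not give the underlying equations, so the following is only an illustrative sketch of the gated delta rule as it is commonly described in the Gated Delta Networks literature, not Alibaba’s implementation. It assumes a unit-length key, a decay gate `alpha` and a write strength `beta` (both between 0 and 1), and the update S <- alpha * S (I - beta k k^T) + beta v k^T. All function names are hypothetical. The point it shows is that the memory is a fixed-size matrix that is overwritten in place, rather than a cache that grows with every token.

```python
def zero_state(d_v, d_k):
    """Fixed-size memory matrix of shape d_v x d_k."""
    return [[0.0] * d_k for _ in range(d_v)]


def read(state, query):
    """Retrieve a value from memory: o = S q."""
    return [sum(s * q for s, q in zip(row, query)) for row in state]


def gated_delta_step(state, key, value, alpha, beta):
    """One recurrent update: S <- alpha * S (I - beta k k^T) + beta v k^T."""
    # What the memory currently returns for this key (S k).
    recalled = read(state, key)
    return [
        # Erase the old association for this key, decay the rest, write the new value.
        [alpha * (s - beta * r * k) + beta * v * k for s, k in zip(row, key)]
        for row, r, v in zip(state, recalled, value)
    ]


if __name__ == "__main__":
    state = zero_state(2, 2)
    key_a, key_b = [1.0, 0.0], [0.0, 1.0]

    state = gated_delta_step(state, key_a, [2.0, 3.0], alpha=1.0, beta=1.0)
    print(read(state, key_a))  # value stored under key_a

    # Writing to the same key replaces the old value instead of adding to it.
    state = gated_delta_step(state, key_a, [5.0, 7.0], alpha=1.0, beta=1.0)
    print(read(state, key_a))

    # A decay gate below 1 fades older memories while a new key is written.
    state = gated_delta_step(state, key_b, [1.0, 1.0], alpha=0.5, beta=1.0)
    print(read(state, key_a), read(state, key_b))
```

However many tokens are processed, `state` stays a 2 x 2 matrix here. Production implementations process tokens in parallel chunks and interleave such layers with other components, which this sketch does not attempt to reproduce.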
Promising Returns
Benchmark results show that Qwen3.5-4B and Qwen3.5-9B perform exceptionally well, especially on multimodal tests. For instance, Qwen3.5-9B outperformed Gemini 2.5 Flash-Lite on the MMMU-Pro visual reasoning test and beat gpt-oss-120B on the GPQA reasoning test. AI expert Paul Couvert noted that Qwen3.5-4B matches the output quality of previously acclaimed larger models, narrowing the gap between size and performance.
Models for Everyone
These models stand out for their ability to run on everyday devices such as laptops and smartphones, which makes them accessible to a much broader audience. Running offline also brings privacy and security benefits: user data is never sent to the cloud, so conversations remain confidential.
The Competitive Landscape
In the West, Google appears to be the main company exploring smaller models, exemplified by its Gemma 3 270M, released in August 2025. Microsoft has also introduced Phi-4, though wider interest seems limited. Startups like Liquid are beginning to develop smaller models, but Alibaba currently leads the pack in the small AI model sector.
In conclusion, while large AI models dominate the conversation, Alibaba’s Qwen 3.5 Small Models present a promising alternative, poised to democratize access to highly capable AI.

