The Once Times

AI

DeepSeek Unveils V3.2 Model, Slashing AI Costs in Half

2 MIN READ

By Timmy

Chinese AI startup DeepSeek rolled out its latest experimental model Monday, doubling down on efficiency as the company seeks to prove powerful artificial intelligence doesn't require massive computing budgets.

The new DeepSeek-V3.2-Exp builds on the company's previous V3.1-Terminus model, introducing what the firm calls DeepSeek Sparse Attention—a technique that cuts operating costs by 50% while maintaining performance levels.

The technology works by filtering out less relevant data during processing, focusing computational resources only on information deemed important for the task at hand. This approach allows the model to handle lengthy documents and extended conversations more efficiently than traditional systems that process every piece of available data.
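
To make the idea concrete, here is a minimal, hedged sketch of one common form of sparse attention, in which each query keeps only its top-k highest-scoring keys and ignores the rest. It is a generic NumPy illustration, not DeepSeek's actual DSA implementation; the function name, tensor shapes, and top_k value are assumptions chosen for the example.

import numpy as np

def sparse_topk_attention(q, k, v, top_k=4):
    """Scaled dot-product attention that keeps only the top_k keys per query."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (num_queries, num_keys)
    # Mask every score outside each query's top_k with -inf so it gets zero weight.
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving keys only.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 8 query tokens attending over a 32-token context with 16-dim heads.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(8, 16)), rng.normal(size=(32, 16)), rng.normal(size=(32, 16))
print(sparse_topk_attention(q, k, v, top_k=4).shape)   # (8, 16)

Because each query attends to a fixed number of keys rather than the full context, the cost of the attention step grows far more slowly as documents and conversations get longer, which is the efficiency gain the company is claiming.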

DeepSeek shook Silicon Valley earlier this year when it demonstrated that capable large language models could be trained on less powerful chips and with far fewer resources, a stark contrast to the industry's prevailing wisdom about AI development costs.

The latest release continues that trajectory. The company said the sparse attention mechanism delivers "substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality."

Benchmark testing shows DeepSeek-V3.2-Exp performs on par with its predecessor across various domains, including reasoning tasks, coding challenges, and tool-based applications. The model scored 89.3% on the AIME 2025 mathematics test and achieved a 2121 rating on Codeforces, a competitive programming platform.

The company slashed API pricing by more than 50% effective immediately, making the technology more accessible to developers and smaller firms.

DeepSeek acknowledged V3.2-Exp represents "an intermediate step toward our next-generation architecture" and released the model's code publicly under an MIT license for others to study and build upon.

The release comes as AI remains at the center of U.S.-China tech competition, with both nations racing to dominate the field. DeepSeek noted the model works seamlessly with Chinese-made AI chips, allowing it to run on domestic hardware without additional configuration.

The company has made the model available through its app, web interface, and API, with support from major inference platforms including vLLM and SGLang.
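
For developers, access through the API follows DeepSeek's OpenAI-compatible conventions. The brief sketch below assumes the published base URL and the generic "deepseek-chat" model alias; whether that alias routes to V3.2-Exp at any given time, and the DEEPSEEK_API_KEY environment variable used here, are assumptions made for illustration.

import os
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint, so the standard client works.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],   # placeholder for your own key
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",                    # assumed alias; check current docs
    messages=[{"role": "user", "content": "Summarize sparse attention in one sentence."}],
)
print(response.choices[0].message.content)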
