Technology - September 18, 2025

Huawei Unveils Next-Gen Ascend Chip Line and SuperPoDs, Aiming to Surpass NVIDIA in Computing Power

At the Huawei Connect 2025 event in Shanghai this week, the tech giant unveiled its vision for future iterations of the Ascend chip series.

During his keynote address, Eric Xu, deputy chair of Huawei’s board, reflected on 2025 as a remarkable year and cited the launch of DeepSeek-R1 in January as a pivotal moment for the company. He also acknowledged that China might lag behind in semiconductor manufacturing process nodes for a prolonged period.

To navigate ongoing tariffs and trade embargoes, Huawei plans to accelerate infrastructure design and technology advancement while open-sourcing significant portions of its software, including openPangu foundation AI models and Mind series SDKs.

In the coming years, Huawei will introduce three new lines of the Ascend chip: the 950, 960, and 970. The Ascend 950PR and 950DT will share a common die, offering enhanced support for low-precision data formats like FP8. The 950 is rated at one PFLOPS, with MXFP8 rated at two PFLOPS, and the chips will offer improved vector processing and finer-grained memory access (128-byte chunks, down from 512 bytes).
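To illustrate why finer memory access granularity matters, the sketch below compares how many bytes are moved for a given request at 512-byte and 128-byte granularity. It is a simplified model, not a description of Huawei's memory system: it assumes aligned accesses where every request is rounded up to a whole number of chunks, and the function name `bytes_transferred` is hypothetical.

```python
def bytes_transferred(request_bytes: int, granularity: int) -> int:
    """Bytes moved when a request is rounded up to whole chunks.

    Assumption: accesses are aligned and always fetch complete chunks.
    """
    if request_bytes < 0 or granularity <= 0:
        raise ValueError("request_bytes must be >= 0 and granularity > 0")
    # Integer ceiling division, then scale back up to bytes.
    chunks = -(-request_bytes // granularity)
    return chunks * granularity


if __name__ == "__main__":
    for size in (64, 128, 200, 512):
        old = bytes_transferred(size, 512)  # granularity cited for current chips
        new = bytes_transferred(size, 128)  # granularity cited for the Ascend 950
        print(f"request={size:>3} B  512 B chunks: {old} B  128 B chunks: {new} B")
```

Under this model a 64-byte request moves 512 bytes at the old granularity but only 128 bytes at the new one, which is the kind of saving small, scattered accesses would see.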

The Ascend 950 series will offer an interconnect bandwidth of 2 TB/s, 2.5 times that of the current Ascend 910C. The Ascend 950PR is expected to debut in Q1 2026, with the launch of the Ascend 950DT slated for Q4 2026.

The Ascend 960, scheduled for release a year later in Q4 2027, will boast twice the computing power, memory access bandwidth, memory capacity, and number of interconnect ports compared to the 950. It will support Huawei’s proprietary HiF4 data format, which the company claims offers greater precision than other FP4 technologies.

The most powerful chip in the lineup will be the Ascend 970, expected to reach the market in Q4 2028. Xu stated that while some details are still being finalized, the general goal is to significantly enhance all specifications of the Ascend 970 series. The chips are anticipated to offer an interconnect bandwidth of 4 TB/s, deliver 8 PFLOPS in FP4, and have a larger memory capacity.

Huawei’s strategy is to provide hyperscalers with raw compute in the form of SuperPoDs, with the first such offering – the Atlas 950 SuperPoD equipped with new Ascend 950DT chips – set to appear in Q4 2026.

Competitor NVIDIA’s NVL144 system (a SuperPoD analogue) is expected to launch in mid- to late 2026. Huawei claims that its first SuperPoD will have 56.8 times as many NPUs as the NVL144 has GPUs and deliver nearly seven times the processing power. Even after the scheduled arrival of the NVIDIA NVL576 in 2027, Huawei expects the Atlas 950 SuperPoD to remain the superior performer.
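The two ratios Huawei quotes can be combined in a quick back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the NVL144 name denotes 144 GPUs (the article does not state the count), reads "56.8 times" as a device-count ratio, and rounds "nearly seven times" up to 7. The function names are hypothetical.

```python
# Figures quoted in the article (Huawei's claims).
NPU_RATIO = 56.8      # Atlas 950 SuperPoD NPUs per NVL144 GPU
COMPUTE_RATIO = 7.0   # "nearly seven times", so this is a slight overestimate

# Assumption: the product name NVL144 denotes 144 GPUs.
NVL144_GPUS = 144


def implied_npu_count(device_ratio: float, gpus: int) -> int:
    """NPU count implied by the quoted device ratio, rounded to a whole device."""
    return round(device_ratio * gpus)


def per_device_ratio(compute_ratio: float, device_ratio: float) -> float:
    """Throughput of one NPU relative to one GPU implied by the two ratios."""
    return compute_ratio / device_ratio


if __name__ == "__main__":
    print(implied_npu_count(NPU_RATIO, NVL144_GPUS))               # 8179
    print(round(per_device_ratio(COMPUTE_RATIO, NPU_RATIO), 3))    # 0.123
```

On these assumptions the SuperPoD would contain roughly 8,200 NPUs, each delivering about an eighth of the throughput of a single NVL144 GPU, so the claimed advantage comes from scale rather than per-chip performance.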

For general computing, Huawei plans to launch two models of its Kunpeng 950 processor in Q1 2026: one with 96 cores and 192 threads, and a faster one with 192 cores and 384 threads. The company also intends to unveil “the world’s first general-purpose computing SuperPoD,” the Kunpeng 950-based TaiShan 950 SuperPoD, in Q1 2026.

The NPU and general-computing SuperPoDs will use UnifiedBus 2.0, the successor to the existing UnifiedBus 1.0. This interconnect technology was first used in the Atlas 900 A3 SuperPoD, which entered service in March this year and has been installed over 300 times.

UnifiedBus 2.0 will be an open protocol, with technical specifications released immediately to the developer community. It will be used internally in new generations of SuperPoDs and will also connect clusters of SuperPoDs, forming SuperClusters.

The first cluster product is set to be the Atlas 950 SuperCluster, offering 2.5 times more NPUs and 1.3 times more computing power than xAI’s Colossus – currently the world’s most powerful computing cluster.

In the last quarter of 2027, Huawei intends to launch the Atlas 960 SuperCluster, which will integrate over a million NPUs and deliver 4 ZFLOPS in FP4 (one ZFLOPS being 10^21 floating-point operations per second). “SuperPoDs and SuperClusters powered by UnifiedBus are our answer to surging demand for computing, both today and tomorrow,” Xu concluded.
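For a sense of scale, the cluster figure can be divided across its NPUs. The sketch below is a rough estimate rather than a published specification: it takes exactly one million NPUs as the count, whereas the article says "over a million", so the result is an upper bound on per-NPU FP4 throughput. The function name `per_npu_pflops` is hypothetical.

```python
ZFLOPS = 10**21  # floating-point operations per second in one ZFLOPS
PFLOPS = 10**15  # floating-point operations per second in one PFLOPS


def per_npu_pflops(cluster_zflops: int, npu_count: int) -> float:
    """Average per-NPU throughput in PFLOPS for a cluster-wide ZFLOPS figure."""
    if npu_count <= 0:
        raise ValueError("npu_count must be positive")
    return cluster_zflops * ZFLOPS / npu_count / PFLOPS


if __name__ == "__main__":
    # Assumption: exactly 1,000,000 NPUs; the real count is stated as higher.
    print(per_npu_pflops(4, 1_000_000))  # 4.0
```

At a million NPUs, 4 ZFLOPS works out to 4 PFLOPS of FP4 per NPU; any NPU count above a million lowers that average.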