Huawei strikes again in the midst of the AI chip war. The Chinese company has unveiled Flex:ai, an open-source orchestration platform designed to maximize the use of GPUs, NPUs, and other accelerators, promising to raise their average utilization by around 30% and bring China a step closer to a self-sufficient, competitive AI infrastructure.
The announcement is accompanied by an ambitious narrative: Chinese media frame it as part of a strategy that even includes the development of analog AI chips that, on paper, could be up to 1,000 times faster than NVIDIA GPUs for certain tasks. This message targets both the market and the geopolitical landscape.
What is Flex:ai and what does it solve?
Flex:ai is orchestration software built on Kubernetes, the de facto standard for managing containerized applications across large clusters. Instead of focusing solely on CPUs, the platform acts as a management layer for GPUs, NPUs, and other accelerators from various vendors, built around two core ideas:
- Group and “slice” accelerators: a single card can be divided into multiple virtual computing units, allowing multiple AI workloads to run in parallel without resource waste.
- Leverage idle resources: Flex:ai can detect underutilized processors across different nodes in the cluster and dynamically reassign them to AI tasks that need them.
The system’s core is Hi Scheduler, an intelligent scheduler that decides which task runs on which accelerator, how to share a GPU or NPU among multiple jobs, and when to pool resources for training or inference of large models. According to Huawei, this approach can boost the utilization rate of AI chips by around 30% on average, a significant figure at a time when compute capacity is expensive and scarce.
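Huawei has not published Flex:ai’s API in detail, but in Kubernetes terms this kind of slicing typically surfaces as an extended resource that pods request in whole “virtual units” (stock Kubernetes counts extended resources in integers, which is why sliced cards are usually exposed as units rather than fractions of a device). A minimal sketch with the official Python Kubernetes client follows; the flexai.huawei.com/vgpu resource name and the hi-scheduler scheduler name are hypothetical placeholders, not confirmed Flex:ai identifiers.

```python
# Hypothetical pod requesting one virtual slice of an accelerator.
# "flexai.huawei.com/vgpu" and "hi-scheduler" are illustrative names only.
from kubernetes import client

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="bert-finetune", labels={"team": "nlp"}),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="trainer",
                image="registry.example.com/bert-train:latest",
                resources=client.V1ResourceRequirements(
                    # Ask for a slice of a card, not a whole device, so
                    # several jobs can share one physical accelerator.
                    requests={"flexai.huawei.com/vgpu": "1",
                              "cpu": "4", "memory": "16Gi"},
                    limits={"flexai.huawei.com/vgpu": "1"},
                ),
            )
        ],
        # A device-aware scheduler (Hi Scheduler, in Flex:ai's case) would
        # pick the node and the physical card backing this virtual unit.
        scheduler_name="hi-scheduler",
    ),
)
```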
A “souped-up” Kubernetes for the AI era
In practice, Flex:ai functions as an extension of Kubernetes tailored to modern AI needs:
- Treats GPUs and NPUs as first-class resources, not just attachments to pods.
- Allows running multiple models or tasks on the same card, without dedicating an entire device to a small job.
- Enables consolidating heterogeneous clusters with accelerators from different vendors—crucial in a context of sanctions and export restrictions affecting access to cutting-edge hardware.
For system administrators and MLOps teams, the promise is clear: fewer GPUs sitting idle, and more productive work for every euro invested in hardware.
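To see why slicing moves the utilization needle, a toy first-fit packing model is enough; this illustrates the general principle, not Huawei’s actual Hi Scheduler algorithm, which has not been published.

```python
# Toy model: each job needs a fraction of one card (capacity 1.0).
# Without slicing, every job reserves a whole card; with slicing,
# jobs are packed first-fit onto the fewest cards that fit them.
def pack(jobs, fractional=True):
    cards = []  # remaining capacity per card
    for demand in jobs:
        need = demand if fractional else 1.0  # whole card when not sliced
        for i, free in enumerate(cards):
            if free >= need:
                cards[i] -= need  # reuse an existing card
                break
        else:
            cards.append(1.0 - need)  # open a new card
    return len(cards), sum(jobs) / len(cards)  # cards used, avg utilization

jobs = [0.25, 0.5, 0.3, 0.2, 0.6, 0.15]  # demands as fractions of a card
print(pack(jobs, fractional=False))  # (6, 0.33...): one card per job
print(pack(jobs, fractional=True))   # (3, 0.66...): half the fleet
```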
Context: sanctions, local chips, and “democratization” of AI
The launch of Flex:ai cannot be understood without considering the broader backdrop of the US-China tech war. Restrictions on exporting NVIDIA’s most advanced GPUs are forcing Chinese companies to turn to:
- Domestic chips, such as Huawei’s Ascend accelerators.
- Optimization software that extracts maximum performance from available hardware.
Flex:ai fits squarely into this second category: if you can’t access NVIDIA’s latest chips, make the most of what you do have. Plus, it adopts an open-source approach to attract universities, startups, and developers eager to build on this platform. Huawei plans to publish Flex:ai through its ModelEngine developer community, in collaboration with universities like Shanghai Jiao Tong, Xi’an Jiaotong, and Xiamen.
The company has already been strengthening this strategy with tools like Unified Cache Manager (UCM), designed to optimize data access across different memory levels and reduce reliance on high-bandwidth memory from foreign suppliers.
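UCM’s internals have not been published either, but the general idea of tiered placement, keeping the hottest data in scarce HBM and spilling the rest to cheaper memory, can be sketched in a few lines. Tier sizes, block names, and the greedy policy below are all illustrative.

```python
# Conceptual sketch of tiered data placement: try the fastest tier first,
# fall through to cheaper ones. Not UCM's actual design.
TIERS = [("HBM", 8), ("DRAM", 64), ("SSD", 1024)]  # (name, capacity in GiB)

def place(blocks):
    """Greedily place (block_id, size_gib) pairs, hottest first."""
    free = {name: cap for name, cap in TIERS}
    placement = {}
    for block_id, size in blocks:
        for name, _ in TIERS:  # fastest tier first
            if free[name] >= size:
                free[name] -= size
                placement[block_id] = name
                break
    return placement

blocks = [("attn-cache", 6), ("embed-table", 40), ("cold-logs", 500)]
print(place(blocks))
# {'attn-cache': 'HBM', 'embed-table': 'DRAM', 'cold-logs': 'SSD'}
```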
And where do the “1,000 times faster” analog chips fit in?
The message that China is working on an analog AI chip up to 1,000 times faster than NVIDIA GPUs stems from parallel research conducted by Chinese universities and R&D centers exploring non-traditional architectures to accelerate neural networks.
In this landscape, Flex:ai is not the chip itself but the software layer that could:
- Simultaneously orchestrate digital GPUs, NPUs, and future analog chips within a single cluster.
- Abstract the complexity of each hardware type and present it as a unified “compute pool” to data teams.
- Allow AI applications to benefit from this new hardware without needing to rebuild the entire stack.
In other words, if these analog chips reach production and are integrated into data centers, a platform like Flex:ai will be needed to combine them efficiently with other accelerators. Huawei aims to get ahead of that scenario with Flex:ai.
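From a data team’s point of view, such an abstraction layer could look something like the sketch below. Everything here (class names, device names, the preference heuristic) is illustrative, not a real Flex:ai interface.

```python
# A unified "compute pool" over heterogeneous accelerators: callers ask
# for virtual units and optionally a preferred hardware kind; they never
# deal with individual devices directly.
from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    kind: str        # "gpu", "npu", "analog", ...
    free_units: int  # virtual compute units still available

class ComputePool:
    def __init__(self, devices):
        self.devices = devices

    def allocate(self, units, prefer=None):
        # Rank preferred kinds first, then devices with the most headroom.
        ranked = sorted(self.devices,
                        key=lambda d: (d.kind != prefer, -d.free_units))
        for dev in ranked:
            if dev.free_units >= units:
                dev.free_units -= units
                return dev
        raise RuntimeError("compute pool exhausted")

pool = ComputePool([
    Accelerator("node1/ascend-0", "npu", free_units=4),
    Accelerator("node2/gpu-0", "gpu", free_units=2),
    Accelerator("node3/analog-0", "analog", free_units=8),  # future hardware
])
print(pool.allocate(2, prefer="analog").name)  # -> node3/analog-0
```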
A direct competitor to Run:AI and others
The proposal inevitably brings to mind Western platforms like Run:AI, acquired by NVIDIA in 2024, which also offers advanced cluster orchestration to improve utilization and simplify MLOps workflows.
The similarities are evident:
- Logical slicing of GPUs for multiple jobs.
- Intelligent scheduling of training and inference queues.
- Support for large fleets of containerized applications on Kubernetes.
The key difference lies in the strategic approach: while Run:AI has become part of NVIDIA’s ecosystem, Flex:ai positions itself as a core piece of a sovereign ecosystem designed so China can continue training and deploying large AI models with reduced dependence on U.S. suppliers.
Implications for companies, data centers, and AI teams
If Huawei fulfills its promise, Flex:ai could have tangible impacts on how AI clusters are scaled and operated:
- Better hardware ROI: a 30% increase in GPU and NPU utilization could mean fewer servers for the same workload, or more effective capacity without additional budget (see the back-of-the-envelope sketch after this list).
- Reduced queues and wait times: slicing cards to match task sizes reduces bottlenecks in environments where many small workloads block entire GPUs.
- Improved R&D flexibility: teams can run experiments in parallel without “contending” for resources, using fractions of GPUs instead of reserving full nodes.
- Isolation and multitenancy: virtualized accelerators allow multiple teams or clients to share infrastructure securely.
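As a rough illustration of the ROI point above, reading Huawei’s figure as a relative gain and assuming a purely illustrative 40% baseline utilization:

```python
# Back-of-the-envelope math for the claimed ~30% utilization gain.
# The 40% baseline and 100-server fleet are assumptions for illustration.
baseline_util = 0.40                  # assumed average utilization today
improved_util = baseline_util * 1.30  # +30% relative, per Huawei's claim
servers = 100

# Useful work is proportional to fleet_size * utilization, so the same
# workload fits on proportionally fewer servers:
needed = servers * baseline_util / improved_util
print(f"{improved_util:.0%} utilization -> ~{needed:.0f} servers instead of {servers}")
# 52% utilization -> ~77 servers instead of 100
```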
For system administrators and data center managers, the takeaway is clear: the challenge is no longer just about acquiring chips, but about using them intelligently. Orchestration will become as strategic as the silicon itself.
Another step in the AI infrastructure race
Flex:ai is essentially another building block in Huawei’s ongoing effort to develop its own AI ecosystem: Ascend chips, Pangu models, compilation tools, and now an orchestration layer meant to compete with the best solutions on the market.
It remains to be seen how much of the international community will adopt the project beyond China, and what real impact the software and future analog chips will have in practice—beyond the headlines. But the underlying message is clear: in the AI race, software that manages compute is as strategic as the hardware itself, and Huawei intends to be a major player in that arena.
via: SCMP