A collaboration between Arm and Alibaba has taken multimodal artificial intelligence on mobile devices to a new level. By integrating Arm KleidiAI into MNN, the deep learning framework developed by Alibaba, the companies have improved performance on multimodal AI workloads at the edge by up to 57%. This enables faster and more efficient experiences in applications such as chatbots and visual search in e-commerce.
AI Optimization at the Edge with KleidiAI
Multimodal AI applications are becoming increasingly common, combining text, images, audio, and video to provide more accurate and contextual responses. However, running them on mobile devices poses a challenge due to power and memory limitations.
To address this, KleidiAI offers optimizations that accelerate the inference of AI models on Arm CPUs without requiring additional adjustments from developers. This technology has already been integrated into popular frameworks such as ExecuTorch, Llama.cpp, LiteRT, and MediaPipe, and now also into MNN from Alibaba.
The optimization allows Qwen2-VL-2B-Instruct, a 2-billion-parameter model designed for image understanding and multimodal generation in multiple languages, to run efficiently on mobile devices.
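To give a sense of why memory is a constraint for a model of this size, the sketch below estimates the storage needed for the weights alone at several numeric precisions. Only the 2 billion parameter count comes from the announcement. The list of precisions is an illustrative assumption, since the precision used in the Arm and Alibaba demo is not stated here, and the estimate ignores activations, caches, and runtime overhead.

```python
# Rough weight-memory estimate for a model with 2 billion parameters.
# The parameter count comes from the article; the precisions below are
# illustrative assumptions, not the configuration used by Arm or Alibaba.

PARAMS = 2_000_000_000

# Bits per weight for a few common storage formats (assumed for illustration).
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(params: int, bits: int) -> float:
    """Return the memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Halving the bits per weight halves the weight footprint, which is one reason lower-precision formats are attractive on phones with limited memory.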
Improvements in Speed and Efficiency
The integration of KleidiAI in MNN has resulted in:
✅ 57% improvement in pre-fill – Faster processing of the combined inputs before the model begins generating a response.
✅ 28% improvement in decoding – Reduction in the time needed to generate text from the processed input.
✅ Lower computational cost – Reduction in resource consumption on devices with limited hardware.
These improvements allow for smoother user experiences in chatbots, virtual assistants, and product searches using images.
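As a rough illustration of how the pre-fill and decoding gains could combine in a single request, the sketch below models response time as prompt-processing time plus generation time. The baseline throughput figures and token counts are hypothetical placeholders, not measurements from Arm or Alibaba, and the 57% and 28% figures are assumed here to be increases in tokens processed per second. The announcement does not say exactly how the percentages were measured.

```python
# Illustrative latency model for the two inference phases: pre-fill and decoding.
# Baseline rates and token counts are hypothetical; only the 57% and 28%
# gains come from the article, and they are ASSUMED to be throughput increases.

BASE_PREFILL_TPS = 100.0  # hypothetical prompt tokens processed per second
BASE_DECODE_TPS = 10.0    # hypothetical output tokens generated per second
PREFILL_GAIN_PCT = 57.0   # from the article
DECODE_GAIN_PCT = 28.0    # from the article


def apply_gain(rate: float, gain_percent: float) -> float:
    """Return the rate after a percentage throughput increase."""
    return rate * (1 + gain_percent / 100)


def response_time(prompt_tokens: int, output_tokens: int,
                  prefill_tps: float, decode_tps: float) -> float:
    """Seconds spent processing the prompt plus seconds spent generating output."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


if __name__ == "__main__":
    prompt_tokens, output_tokens = 400, 100  # hypothetical request size
    before = response_time(prompt_tokens, output_tokens,
                           BASE_PREFILL_TPS, BASE_DECODE_TPS)
    after = response_time(prompt_tokens, output_tokens,
                          apply_gain(BASE_PREFILL_TPS, PREFILL_GAIN_PCT),
                          apply_gain(BASE_DECODE_TPS, DECODE_GAIN_PCT))
    print(f"baseline: {before:.2f} s, optimized: {after:.2f} s")
```

Under this model, the overall benefit depends on the mix of prompt length and output length: long multimodal prompts lean on the pre-fill gain, while long generated answers lean on the decoding gain.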
Demonstration at MWC 2025
At Mobile World Congress 2025, Arm and Alibaba will showcase these improvements at the Arm booth (Hall 2, Stand I60). The demo will show how Qwen2-VL-2B-Instruct interprets text and images and generates responses in real time while running on smartphones powered by the MediaTek Dimensity 9400 chip.
A Step Forward in Mobile AI
The integration of KleidiAI in MNN represents a significant advancement in the development of edge AI, enabling complex models to operate on devices with limited power.
With these optimizations, millions of developers will be able to create more efficient multimodal AI applications, bringing advanced artificial intelligence closer to mobile users and paving the way for the next generation of smart computing.
via: ARM