Arm accelerates artificial intelligence on Android with SME2 and KleidiAI: the future of mobile AI is here

Arm’s new extension will enable the execution of generative models and complex AI tasks directly on smartphones without changing app code.

Arm has taken a significant step toward democratizing AI on mobile devices with the introduction of Scalable Matrix Extension 2 (SME2), a key evolution in its Armv9 architecture. Together with its software layer KleidiAI, this technology will allow Android developers to incorporate advanced AI functions such as computer vision, natural language processing, or speech generation directly on the device—no cloud connection or app modifications needed.

From enhancing an image just before pressing the shutter, to keeping calls free of background noise or interacting with AI assistants offline, on-device AI is redefining the mobile experience. The challenge has been delivering this real-time performance without draining the battery, overheating, or complicating development. That’s where SME2 comes in.

SME2: a new era for inference on mobile CPUs

SME2 is designed to accelerate matrix workloads, which are essential for generative models and computer vision, directly on the phone’s main processor (the CPU). Unlike solutions that rely solely on the GPU or NPU, SME2 is part of a heterogeneous approach that distributes AI tasks across the available compute units.

The real value of SME2 lies in its transparent accessibility: thanks to KleidiAI, developers do not need to modify a single line of code. This acceleration layer automatically integrates into popular libraries such as Google XNNPACK, MediaPipe, ONNX Runtime, Alibaba MNN, LiteRT, or even llama.cpp. When SME2 is enabled in hardware, the system automatically redirects intensive operations to it.
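Although KleidiAI is meant to handle this dispatch on its own, it can be useful to check whether a given device reports SME2 at all. The following is a minimal sketch, not part of KleidiAI or any Arm tooling: it assumes a Linux or Android arm64 kernel that lists CPU capabilities on the "Features" lines of /proc/cpuinfo and that SME2 appears there as the flag "sme2". The function names are hypothetical and chosen for illustration only.

```python
def cpu_features(cpuinfo_text: str) -> set[str]:
    """Collect the flags listed on 'Features' lines of arm64 /proc/cpuinfo text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "Features : flag1 flag2 ..." are of interest.
        if sep and key.strip().lower() == "features":
            flags.update(value.split())
    return flags


def has_sme2(cpuinfo_text: str) -> bool:
    # Assumption: the kernel reports SME2 with the flag name "sme2".
    return "sme2" in cpu_features(cpuinfo_text)


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Return the contents of /proc/cpuinfo, or an empty string if unreadable."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("SME2 reported by kernel:", has_sme2(read_cpuinfo()))
```

A check like this is only diagnostic. According to the article, frameworks that integrate KleidiAI select the accelerated path themselves, so application code does not need to branch on the result.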

“SME2 enables running more advanced AI models, like Gemma 3, directly on a wide range of devices. This will benefit end users with low-latency experiences accessible from any smartphone.”
— Iliyan Malchev, Android Software Engineer (Google)

Concrete results: 6× faster responses and sub-second summaries

Testing on devices with SME2 enabled showed that Google’s Gemma 3 model delivered up to six times faster responses in conversational interactions than the same device with SME2 disabled. In addition, running on a single CPU core with this acceleration, it can summarize an 800-word text in less than a second, demonstrating that high-level inference no longer requires the cloud.
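To put these figures in perspective, the arithmetic can be sketched as follows. The baseline latency of 3 seconds is a purely hypothetical number chosen for illustration. Only the 6× factor and the 800 words in under a second come from the reported results, and the helper names are invented for this example.

```python
def latency_after_speedup(baseline_seconds: float, speedup: float) -> float:
    """Latency after applying a speedup factor (speedup = old time / new time)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_seconds / speedup


def words_per_second(words: int, seconds: float) -> float:
    """Average processing rate for a text of the given length."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return words / seconds


if __name__ == "__main__":
    # Hypothetical baseline: a reply that takes 3.0 s without SME2.
    print(latency_after_speedup(3.0, 6.0))
    # 800 words in at most 1.0 s is a lower bound on the processing rate.
    print(words_per_second(800, 1.0))
```

Because the reported speedup is "up to" six times, the first result should be read as a best case rather than a typical one.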

An independent software vendor announced that it will shift most of its token generation from the cloud to the device, driven by these performance and efficiency improvements.

SME2 on Android and iOS: an ecosystem with over 9 million apps

SME2 is already available on the latest iOS devices and will arrive on upcoming Android devices. Arm highlights that the impact will be widespread, with over 22 million developers and 9 million active applications built on its designs.

This also means greater portability and energy efficiency in a landscape where developers must optimize across diverse devices, thermal budgets, and energy consumption without sacrificing performance.

Arm’s advice: prepare today, win tomorrow

Arm recommends that mobile developers ensure their apps are built on frameworks compatible with KleidiAI now, so they can automatically benefit from future accelerations like SME2 as hardware implementations roll out. No code rewrites or model modifications are needed—KleidiAI takes care of everything.

Additionally, the company has launched a Developer Launchpad, offering resources and practical examples to help mobile developers adopt SME2 today and be ready to deploy next-generation AI functions as soon as devices support them.

Native AI, extreme performance, unchanged code

With SME2, Arm is not only optimizing AI model execution on mobile CPUs but also providing a clear strategy for making generative, personalized, real-time AI a standard in future apps—without altering app logic or user experience.

In a world where every second counts and every milliwatt matters, Arm argues that the key to the future is not just raw power but how intelligently that power is deployed. SME2 and KleidiAI lead the way.

Source: Arm
