An Unprecedentedly Fast AI Experience: A 5x Increase in Prompt Processing Speed
As AI technology evolves, the latest large language models (LLMs) run everywhere from the cloud to the edge and are indispensable for maximizing the potential and opportunities of AI.
The challenge, however, is that they require enormous computing resources and energy.
Eliminate the challenges of computational workload and energy consumption! The next-generation open-source LLM is now available.
To address this issue, Meta has released Llama 3.2, the latest version of its open-source LLM, with improved efficiency, delivering an unprecedentedly fast AI experience to users.
According to the company, running the latest LLM on an Arm CPU improves prompt processing by a factor of 5 and token generation by a factor of 3, reaching 19.92 tokens per second in the generation phase.
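The measurement setup behind these figures is not given here, but the sketch below shows one rough way to measure token-generation throughput for Llama 3.2 on a CPU yourself. It is a minimal illustration, not the company's benchmark: it assumes the public Hugging Face checkpoint meta-llama/Llama-3.2-1B-Instruct (a gated model requiring license acceptance) and stock PyTorch/transformers, so the numbers it prints will differ from Arm's results on its optimized stack.

```python
# Minimal throughput sketch: count generated tokens over wall-clock time.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Public (gated) Llama 3.2 checkpoint; any causal LM checkpoint works here.
model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 runs on CPU in PyTorch; Arm cores with BF16 support run it faster.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "Explain why on-device inference reduces latency."
inputs = tokenizer(prompt, return_tensors="pt")

start = time.perf_counter()
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
elapsed = time.perf_counter() - start

# Generated tokens = total output length minus the prompt length.
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.2f} tokens/s")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```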
On-device AI workloads in particular see improved latency, enabling more efficient AI processing.
Scaling AI processing at the edge also cuts energy use and cost by avoiding the power consumed moving data to and from the cloud.
AI performance on Arm CPUs has improved dramatically, and more than 100 billion Arm-based devices are expected to be AI-enabled in the future.
This is expected to further expand the use of AI in everyday life and business.
This report details the latest version of the open-source LLM jointly developed by Arm and Meta, and explains how rapidly advancing AI technologies, in particular tools such as “Kleidi” and “PyTorch,” have contributed to the improvement in AI performance.
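The report does not show here how Kleidi and PyTorch fit together; KleidiAI kernels are consumed through frameworks rather than called directly by applications. As a loose, hypothetical illustration of the kind of framework-level CPU optimization involved, the sketch below applies PyTorch's dynamic int8 quantization (torch.ao.quantization.quantize_dynamic, an existing PyTorch API) to a stand-in model. TinyMLP is invented for the example; a real workload would be the Llama 3.2 network, and this is not Arm's or Meta's actual code.

```python
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """Stand-in model for the example; not a real LLM."""

    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = TinyMLP().eval()

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly; on Arm builds of PyTorch these integer matmuls
# can dispatch to optimized CPU kernels.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    print(quantized(x).shape)  # torch.Size([1, 512])
```

Int8 weights quarter the memory traffic of float32 for the dominant matrix multiplications, which is one reason quantized inference maps well onto CPU-only edge devices.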
Companies looking to accelerate and scale AI inference by leveraging the latest LLM on Arm are encouraged to purchase the report.