Revolutionizing generative AI with innovative NPU technology

Korean researchers develop a cutting-edge NPU core enhancing generative AI performance by over 60% while reducing energy consumption by 44%.

  • Monday, 28th July 2025 · Posted by Aaron Sandhu

In the quest to improve the efficiency of the burgeoning generative AI sector, Korean researchers have taken a significant stride with a novel NPU (Neural Processing Unit) core technology. As the memory requirements of powerful AI models like OpenAI's GPT-4 and Google's Gemini 2.5 continue to swell, such advancements are crucial.

Professor Jongse Park and his team at the KAIST School of Computing, in partnership with HyperAccel Inc., have introduced an NPU core notable both for its performance and for its energy efficiency. The work will be presented at the 2025 International Symposium on Computer Architecture (ISCA 2025).

The core aim of the research is to optimise performance for large-scale generative AI services by making the inference process lighter-weight without sacrificing accuracy. The innovation is recognised for its co-design of AI semiconductors and system software, both integral to AI infrastructure.

Traditionally, GPU-based AI setups require multiple devices to meet memory bandwidth and capacity needs. The NPU technology introduced here instead employs KV cache quantisation, compressing the key-value cache that dominates memory use during inference. Fewer devices are therefore needed, cutting the cost of building and operating generative AI platforms.
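The article names KV cache quantisation without implementation detail. As a minimal sketch of the general idea (symmetric per-row int8 quantisation, a common scheme and not necessarily the KAIST/HyperAccel algorithm), the float32 key-value tensors are mapped to 8-bit codes plus a small scale factor, shrinking the cache roughly 4x:

```python
import numpy as np

def quantize_kv(kv: np.ndarray):
    """Symmetric per-row int8 quantisation of a float32 KV-cache block.
    Returns int8 codes plus the per-row scales needed to dequantise."""
    scale = np.abs(kv).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)            # avoid divide-by-zero
    q = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

# Toy cache: 1024 cached tokens, head dimension 128 (sizes chosen for illustration).
kv = np.random.randn(1024, 128).astype(np.float32)
q, scale = quantize_kv(kv)

fp32_bytes = kv.nbytes               # full-precision cache
int8_bytes = q.nbytes + scale.nbytes # codes + one fp32 scale per row
print(f"compression: {fp32_bytes / int8_bytes:.1f}x")   # ~3.9x smaller
```

The compression falls slightly short of 4x because each row keeps a float32 scale; the reconstruction error is bounded by half a quantisation step per element.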

Key to the hardware architecture is a design that retains compatibility with existing NPUs while integrating advanced quantisation algorithms and page-level memory management. Together these ensure available memory is used as fully as possible, streamlining operations and further cutting power requirements.

  1. Cost-effectiveness: With power efficiency surpassing that of cutting-edge GPUs, operating expenses are expected to plummet.
  2. Broader Implications: Beyond AI cloud data centres, this technology is anticipated to shape the AI transformation landscape, facilitating environments like 'Agentic AI'.

With over 60% performance enhancement against traditional GPUs while using 44% less power, this achievement underscores the potential of NPUs in architecting robust and sustainable AI solutions. As AI technology continues its rapid ascent, the fruits of this research signify a pivotal turning point in striving toward state-of-the-art AI ecosystems.
