Chimera GPNPU Supports Large Language Models

Run the newest generative transformer models with a simple software port, not a silicon respin!

Porting the Llama2 LLM in record time

In July 2023, Meta announced the Llama2 large language model (LLM), designed to be run on-device, not in the cloud. This exciting breakthrough model prompted numerous SoC vendors and IP core vendors to announce their intent to support Llama2 - but only far into the future. Early announcements promised Llama2 support in 2024, or a new IP core available for early 2024 tape-out and thus 2025 shipment.

But Quadric's Chimera GPNPU is a fully programmable NPU - programmable in C++. A small team of four Quadric engineers ported Llama2 in just under five weeks! If YOUR chip had a Chimera GPNPU, you could have raced to market a half-year ahead of your competition!
Porting a new ML workload to Chimera cores was fast - because it's done in C++. But Chimera processors also deliver ML inference efficiency dramatically higher than what CPUs or GPUs provide. Chimera GPNPUs uniquely combine the easy programmability of a processor with the ML efficiency of an NPU "accelerator".

Experience Chimera yourself

Sign in to your existing account, or sign up for a new Quadric DevStudio account today and see it for yourself.

© Copyright 2024 Quadric. All Rights Reserved.
