In July 2023, Meta announced the Llama2 large language model (LLM), designed to be run on-device, not in the cloud. This breakthrough model prompted numerous SoC vendors and IP core vendors to announce their intent to support Llama2, but only far into the future. Early announcements promised Llama2 support in 2024, or a new IP core available for an early-2024 tape-out and thus 2025 shipment.
But Quadric's Chimera GPNPU is a fully programmable NPU, programmable in C++. A small team of four Quadric engineers ported Llama2 in just under five weeks! If YOUR chip had a Chimera GPNPU, you could have raced to market a half-year faster than your competition!