News Coverage

Choosing the Right Memory Configuration for AI/ML Accelerators

(February 15, 2024) AI2.news - Quadric’s Chimera GPNPU addresses the memory challenge with its intelligent approach. By analyzing data usage across ML graphs and leveraging advanced operator fusion techniques, Quadric’s technology eases memory bottlenecks.
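To make the operator fusion point concrete, here is a minimal sketch of why fusing a chain of elementwise operators reduces memory traffic. It is a generic toy model, not Quadric's compiler or the Chimera toolchain. The function names and the traffic-counting scheme are assumptions made for illustration.

```python
# Illustrative toy model only: all names here are hypothetical and this
# does not represent Quadric's actual fusion implementation.

def unfused_traffic(num_elements: int, num_ops: int, bytes_per_element: int = 1) -> int:
    """Bytes moved when every operator reads its input from memory
    and writes its full output back before the next operator runs."""
    return num_ops * 2 * num_elements * bytes_per_element


def fused_traffic(num_elements: int, bytes_per_element: int = 1) -> int:
    """Bytes moved when the whole chain runs as one fused operator:
    the input is read once and only the final output is written."""
    return 2 * num_elements * bytes_per_element


def scale_bias_relu_unfused(xs, scale, bias):
    """Three separate passes, each materializing an intermediate list."""
    scaled = [x * scale for x in xs]
    shifted = [s + bias for s in scaled]
    return [max(0.0, v) for v in shifted]


def scale_bias_relu_fused(xs, scale, bias):
    """One pass over the data with no intermediate lists."""
    return [max(0.0, x * scale + bias) for x in xs]


if __name__ == "__main__":
    data = [-1.0, 0.5, 2.0]
    print(scale_bias_relu_unfused(data, 2.0, 1.0))
    print(scale_bias_relu_fused(data, 2.0, 1.0))
    print(unfused_traffic(1000, 3), fused_traffic(1000))
```

Both versions compute the same result. Under this simple model, the fused form moves a third of the bytes for a three-operator chain.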

As AI Takes Off, Chipmakers Pump Up Performance

(February 1, 2024) Electronic Design - The Chimera can provide strong ML inference performance while also running traditional C++ code. There’s no need to partition code between multiple kinds of processors. The GPNPU uses a single pipeline to handle matrix and vector operations as well as scalar (control) code.

2024 Outlook with Steve Roddy of Quadric

(January 29, 2024) SemiWiki - In a marketplace with more than a dozen machine learning “accelerators,” ours is the only NPU solution that is fully C++ programmable and can run any and every AI/ML graph without the need for any fallback to a host CPU or DSP.

Synsense, Intel, NVIDIA, Quadric, Esperanto ~ The AI Hardware Show

(January 23, 2024) The AI Hardware Show - Dr. Ian Cutress discusses Chimera at approximately 7 minutes in this video.

Chip Industry Silos Are Crimping Advances

(January 11, 2024) SemiEngineering - “Designers at all stages of product development need to pay even greater attention to what their co-workers wrestle with, both upstream and downstream. An engineer needs to be aware that specifications, algorithms, and interfaces may need to change as precursor or follow-on groups progress through their design processes.”

Is Transformer Fever Fading?

(January 11, 2024) SemiEngineering - At Quadric, we’d advise that just as doomsday predictions about transformers were too hyperbolic, so too are predictions about the imminent demise of transformer architectures. But make no mistake, the bright minds of data science are hard at work today inventing the Next New Thing that will certainly capture the world’s attention in 2024 or 2025 or 2026 and might one day indeed supplant today’s state of the art.

BYO NPU Benchmarks

(December 14, 2023) Semiconductor Engineering - There is a straightforward, low-investment method for an IP evaluator to short-circuit all the vendor shenanigans and get a solid apples-to-apples result: Build Your Own Benchmarks. BYOB! ... Many IP vendors may have laboriously hand-tuned, hand-pruned and twisted a reference benchmark beyond recognition in order to win the benchmark game.

Partitioning Processors for AI Workloads

(October 12, 2023) Semiconductor Engineering - A transformer structure is far more complicated, and attempts to partition on a heterogeneous multi-core architecture will result in dozens of round trips of moving data between the separate memory subsystems of each of the multiple processing engines. “Each of those shufflings of data burns power and kills throughput while accomplishing zero real work,” Steve Roddy said. “Attempting to partition and map a new, complex LLM or ViT model onto a heterogeneous architecture is orders of magnitude more difficult and time consuming than the 2019 SOTA models.”

Does Your NPU Vendor Cheat On Benchmarks?

(October 12, 2023) Semiconductor Engineering - There are two common major gaps in collecting useful “apples to apples” comparison data on NPU IP: [1] not specifically identifying the exact source code repository of a benchmark, and [2] not specifying that the entire benchmark code be run end to end, with any omissions reported in detail.

Fast Path to Baby Llama BringUp at the Edge

(September 26, 2023) SemiWiki - Assuming that Baby Llama is a good proxy for an edge-based LLM, Quadric made the following interesting points. First, they were able to port the 15 million parameter network to their Chimera core in just 6 weeks. Second, this port required no hardware changes, only some (ONNX) operation tweaking in C code to optimize for accuracy and performance. Third, they were able to reach 225 tokens/second/watt, using a 4MB L2 memory, 16 GB/second DDR, a 5nm process and 1GHz clock. And fourth, the whole process consumed 13 engineer weeks.

© Copyright 2024 Quadric. All Rights Reserved.
