News Coverage | Quadric in the news

BYO NPU Benchmarks

(December 14, 2023) Semiconductor Engineering - There is a straight-forward, low-investment method for an IP evaluator to short-circuit all the vendor shenanigans and get a solid apples-to-apples result: Build Your Own Benchmarks. BYOB!...Many IP vendors may have laboriously hand-tuned, hand-pruned and twisted a reference benchmark beyond recognition in order to win the benchmark game.

Partitioning Processors for AI Workloads

(October 12, 2023) Semiconductor Engineering - A transformer structure is far more complicated, and attempts to partition on a heterogeneous multi-core architecture will result in dozens of round trips of moving data between the separate memory subsystems of each of the multiple processing engines. “Each of those shufflings of data burns power and kills throughput while accomplishing zero real work,” Steve Roddy said. “Attempting to partition and map a new, complex LLM or ViT model onto a heterogeneous architecture is orders of magnitude more difficult and time consuming than the 2019 SOTA models.”

Does Your NPU Vendor Cheat On Benchmarks?

(October 12, 2023) Semiconductor Engineering - There are two common major gaps in collecting useful “apples to apples” comparison data on NPU IP: [1] not specifically identifying the exact source code repository of a benchmark, and [2] not specifying that the entire benchmark code be run end to end, with any omissions reported in detail.

Fast Path to Baby Llama BringUp at the Edge

(September 26, 2023) SemiWiki - Assuming that Baby Llama is a good proxy for an edge based LLM, Quadric made the following interesting points. First, they were able to port the 15 million parameter network to their Chimera core in just 6 weeks. Second, this port required no hardware changes, only some (ONNX) operation tweaking in C code to optimize for accuracy and performance. Third they were able to reach 225 tokens/second/watt, using a 4MB L2 memory, 16 GB/second DDR, a 5nm process and 1GHz clock. And fourth the whole process consumed 13 engineer weeks.

Unlocking the Power of Operator Fusion to Accelerate AI

(September 12, 2023) Towards AI - For these reasons, GPNPUs strike a balance between thoughtfully designed hardware for AI compute with the programmability and flexibility of a general-parallel compute platform like a GPU. GPNPUs satisfy the general compute and the memory organization requirements for operator fusion and are only limited occasionally by the memory availability requirement.

Transformers in Auto: Who Does it, Who Needs it?

(August 23, 2023) The Ojo-Yoshida Report - Roddy sees Quadric’s strength in its ability to run different kernels doing different jobs – tasks classically thought of as DSP code, kernels for classic neural nets including one for a detector, another for an authenticator, and something in between doing CPU-like tasks – all on Quadric’s “single processor.” Roddy explained, “There’s no multiple engines inside the hood. There’s one actual processor, one execution pipeline, one code stream that all gets compiled together.”

AI, Rising Chip Complexity Complicate Prototyping

(August 24, 2023) Semiconductor Engineering - What’s new today, in 2023, is that the rapid rush of machine learning inference is simultaneously disturbing nearly all types of subsystems. The known characteristics of proven building blocks that might have allowed a team to use heuristic approaches to size system resources — memory, bus bandwidth, I/O bandwidth, power management — are all disrupted.

Compiler-Driven Performance Boosts For GPNPUs

(August 10, 2023) Semiconductor Engineering - While most fixed-function NN accelerators cannot run Vision Transformers at all, Quadric not only runs transformers, but is poised to continue to deliver large increases in performance – without hardware changes – as the CGC to LLVM compiler stack continues to mature and improve.

A Bridge From Mars To Venus

(July 20 2023) SemiEngineering - In today’s modern world of electronics, the headlong rush to build and deploy machine learning (ML) based artificial intelligence into devices and systems creates its own Mars – Venus clash of cultures.

Silicon 100: Startups Worth Watching in 2023

(July 17, 2023) - EETimes - Quadric’s Chimera family of general-purpose neural processor cores combines a neural processing accelerator with the full C++ programmability of a digital-signal processor.