News Coverage - Page 2

Fallback Fails Spectacularly

(May 16, 2024) Semiconductor Engineering - Our analysis of ConvNext on an NPU+DSP architecture suggests a throughput of less than 1 inference per second. Note that these numbers for the fallback solution assume perfect 100% utilization of all the available ALUs in an extremely wide 1024-bit VLIW DSP. Reality would undoubtedly be below the speed-of-light 100% mark, and the FPS would suffer even more. In short, fallback is unusable.

Dealing With AI/ML Uncertainty

(April 25, 2024) Semiconductor Engineering - When it comes to the question of how to test and correct the model, the first thing most companies need to do is establish the realistic goal of what kind of error rate is acceptable, what severity of error needs to be eliminated completely, and then guardband the model using those criteria.

Embrace the New!

(March 14, 2024) Semiconductor Engineering - Perhaps those who are not Embracing The New are limited because they chose accelerators with limited support of new ML operators. If a team four years ago implemented a fixed-function accelerator in an SoC that cannot add new ML operators, then many newer networks – such as Transformers – cannot run on those fixed-function chips today. A new silicon respin – which takes 24 to 36 months – is needed.

Thanks for the Memories!

(February 15, 2024) Semiconductor Engineering - “I want to maximize the MAC count in my AI/ML accelerator block because the TOPS rating is what sells, but I need to cut back on memory to save cost,” said no successful chip designer, ever.

Choosing the Right Memory Configuration for AI/ML Accelerators

(February 15, 2024) AI2.news - Quadric’s Chimera GPNPU addresses the memory challenge with its intelligent approach. By analyzing data usage across ML graphs and leveraging advanced operator fusion techniques, Quadric’s technology eases memory bottlenecks.

As AI Takes Off, Chipmakers Pump Up Performance

(February 1, 2024) Electronic Design - The Chimera can provide strong ML inference performance while also running traditional C++ code. There’s no need to partition code between multiple kinds of processors. The GPNPU uses a single pipeline to handle matrix and vector operations and scalar (control) code.

2024 Outlook with Steve Roddy of Quadric

(January 29, 2024) SemiWiki - In a marketplace with more than a dozen machine learning “accelerators” ours is the only NPU solution that is fully C++ programmable that can run any and every AI/ML graph without the need for any fallback to a host CPU or DSP.

Synsense, Intel, NVIDIA, Quadric, Esperanto ~ The AI Hardware Show

(January 23, 2024) The AI Hardware Show - Dr. Ian Cutress discusses Chimera at approximately 7 minutes in this video.

Chip Industry Silos Are Crimping Advances

(January 11, 2024) Semiconductor Engineering - “Designers at all stages of product development need to pay even greater attention to what their co-workers wrestle with, both upstream and downstream. An engineer needs to be aware that specifications, algorithms, and interfaces may need to change as precursor or follow-on groups progress through their design processes.”

Is Transformer Fever Fading?

(January 11, 2024) Semiconductor Engineering - At Quadric, we’d advise that just as doomsday predictions about transformers were too hyperbolic, so too are predictions about the imminent demise of transformer architectures. But make no mistake, the bright minds of data science are hard at work today inventing the Next New Thing that will certainly capture the world’s attention in 2024 or 2025 or 2026 and might one day indeed supplant today’s state of the art.