(April 24, 2025) Semiconductor Engineering - For many architectures aimed at deploying inference models, people have chosen very inflexible, fixed-function AI accelerators, and that's the trap. If you look at today's set of models and build something that accelerates them with low power and high efficiency, and then the state-of-the-art model changes in two years, you could be in trouble. You could wind up with a chip that you spent a lot of money developing, and it can't run the latest thing, and now you're dead in the water.