Nvidia’s CEO defends his moat as AI labs change how they improve their AI models

Nvidia raked in more than $19 billion in net income during the last quarter, the company reported on Wednesday, but that did little to assure investors that its rapid growth would continue. On its earnings call, analysts prodded CEO Jensen Huang about how Nvidia would fare if tech companies start using new methods to improve their AI models.

The method that underpins OpenAI’s o1 model, known as “test-time scaling,” came up repeatedly. It’s the idea that AI models will give better answers if you give them more time and computing power to “think” through questions. Specifically, it adds more compute to the AI inference phase, which is everything that happens after a user hits enter on their prompt.

Nvidia’s CEO was asked whether he was seeing AI model developers shift over to these new methods, and how Nvidia’s older chips would hold up for AI inference. Huang indicated that o1, and test-time scaling more broadly, could play a larger role in Nvidia’s business moving forward, calling it “one of the most exciting developments” and “a new scaling law.” Huang did his best to assure investors that Nvidia is well positioned for the change.

The Nvidia CEO’s remarks aligned with what Microsoft CEO Satya Nadella said onstage at a Microsoft event on Tuesday: o1 represents a new way for the AI industry to improve its models.

This is a big deal for the chip industry because it places a greater emphasis on AI inference. While Nvidia’s chips are the gold standard for training AI models, a broad set of well-funded startups, such as Groq and Cerebras, is building lightning-fast AI inference chips. It could be a more competitive space for Nvidia to operate in.

Despite recent reports that improvements in generative models are slowing, Huang told analysts that AI model developers are still improving their models by adding more compute and data during the pretraining phase.
Anthropic CEO Dario Amodei also said on Wednesday, during an onstage interview at the Cerebral Valley summit in San Francisco, that he is not seeing a slowdown in model development.

“Foundation model pretraining scaling is intact and it’s continuing,” said Huang on Wednesday. “As you know, this is an empirical law, not a fundamental physical law, but the evidence is that it continues to scale. What we’re learning, however, is that it’s not enough.”

That’s certainly what Nvidia investors wanted to hear, since the chipmaker’s stock has soared more than 180% in 2024 on sales of the AI chips that OpenAI, Google, and Meta train their models on. However, Andreessen Horowitz partners and several other AI executives have previously said that these methods are already starting to show diminishing returns.

Huang noted that most of Nvidia’s computing workloads today involve the pretraining of AI models, not inference, but he attributed that more to where the AI world is today. He said that one day there will simply be more people running AI models, meaning more AI inference will happen. He also noted that Nvidia is the largest inference platform in the world today, and that the company’s scale and reliability give it a huge advantage over startups.

“Our hopes and dreams are that someday, the world does a ton of inference, and that’s when AI has really succeeded,” said Huang. “Everybody knows that if they innovate on top of CUDA and Nvidia’s architecture, they can innovate more quickly, and they know that everything should work.”