OpenAI CEO warns that GPU crisis jeopardizes the future of LLMs

By Soumik Sinharoy | Apr 26, 2023

Editor’s note: This story was first published in April 2023 in our internal newsletter and is being released for all readers as part of our AI April program.

Key insight:

Today’s LLMs require extensive GPU computational resources. As nations strive to build sovereign AI supercomputing platforms, demand for NVIDIA GPUs will only increase. Scarcity will be one of the leading drivers of innovation in the search for more computationally efficient LLMs and neural network ASICs.

Speaking recently at MIT, OpenAI CEO Sam Altman predicted, “I think we’re at the end of the era where it’s going to be these, like, giant, giant models. We’ll make them better in other ways.” He went on to say that OpenAI faces physical limits to how many data centers it can build; that training GPT-4 cost the company more than $100 million; and that OpenAI is not currently developing GPT-5. “We are not,” he said, “and won’t for some time.”

Today’s Large Language Models (LLMs) are driving the ongoing generative AI revolution, and they require extensive GPU computational resources. GPUs are needed first to train the underlying Deep Neural Networks (DNNs); once trained, the networks still require powerful enterprise-grade GPUs to run (i.e., AI inferencing). For example, OpenAI’s GPT-3 is a decoder-only model with a 2,048-token context window and 175 billion parameters, and it can take up as much as 800GB of storage. With the explosive growth in ChatGPT usage, the number of users has exceeded existing capacity, and OpenAI and Microsoft have significantly increased their GPU procurement to support both ongoing DNN training and live inferencing workloads.
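To make the scale concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of a 175-billion-parameter model occupy at different numeric precisions. The function names, the precision table, and the 80GB-per-GPU figure are illustrative assumptions rather than details from OpenAI, and the estimate covers weights only; optimizer state, activations and caches used during training or inference would add to it.

```python
import math

# Parameter count quoted in the article for GPT-3.
GPT3_PARAMS = 175_000_000_000

# Bytes needed to store one parameter at common precisions (assumed options).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weights_size_gb(num_params: int, precision: str) -> float:
    """Return the size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def min_gpus_for_weights(num_params: int, precision: str,
                         gpu_memory_gb: float = 80.0) -> int:
    """Return the fewest GPUs whose combined memory can hold the weights.

    gpu_memory_gb is an assumed per-GPU capacity; this ignores all
    memory needed for anything other than the weights themselves.
    """
    return math.ceil(weights_size_gb(num_params, precision) / gpu_memory_gb)


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        size = weights_size_gb(GPT3_PARAMS, precision)
        gpus = min_gpus_for_weights(GPT3_PARAMS, precision)
        print(f"{precision}: {size:.0f} GB of weights, at least {gpus} GPUs")
```

At 32-bit precision the weights alone come to about 700GB, the same order of magnitude as the size quoted above, which is why even serving such a model means spreading it across several enterprise-grade GPUs.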

GPT-4 is reportedly an even larger model, with more than 1 trillion parameters (OpenAI has not disclosed the figure), and requires significantly more GPU compute capability for training. In terms of sustainability and environmental impact, training GPT-3 alone took 10,000+ GPUs and consumed 700,000 liters of water for evaporative cooling, enough water to fill the tank of a giant nuclear reactor. AI supercomputers are so energy intensive that, according to Elon Musk, AI data centers can be identified by their heat signatures in Earth observation satellite imagery. To make AI sustainable, researchers are working relentlessly to design more efficient models.

However, a larger AI model is not necessarily a more accurate one, and the OpenAI team is also working to improve the accuracy of its models. As the data science community works to optimize LLMs, we will see more specialized AI ASICs enter the market with higher TOPS/watt efficiency. During Tesla’s most recent earnings call, Elon Musk said that Tesla is still buying NVIDIA GPUs in massive quantities for the GPU cluster that trains its self-driving neural nets. The DOJO supercomputer, once operational with Tesla’s own multi-SOC chip/tile, is expected to be far more efficient than today’s state-of-the-art NN training GPUs such as NVIDIA’s H100.

As nations strive to build their sovereign AI supercomputing platforms, demand for NVIDIA GPUs will only increase, while TSMC’s fab production capacity is limited. Today NVIDIA dominates the AI computing hardware market with more than 88% market share, and the ability to scale large transformer models for training and inferencing now depends on GPU supply. Hence, scarcity will be one of the leading drivers of innovation in the search for more computationally efficient LLMs and neural network ASICs.

Soumik Sinharoy, Technology Group Principal, Orange Silicon Valley

Soumik leads the activities around HPC, Video Analytics and Accelerated Computing for AI in Datacenters and Edge at Orange Silicon Valley. He brings extensive experience in supercomputing platforms (terrestrial and air/spaceborne), high broadband networks, and GPU computing for HPC. His 19-year career spans companies such as Cisco and AT&T. He also serves as Technical Advisor to a few Orange portfolio companies. He holds a Master of Science in Computer Engineering from the University of South Carolina and a Bachelor of Engineering in Electronics and Telecommunications from Jadavpur University, India.
