AI chatbots such as ChatGPT and Bard face a significant problem: they lose money on every chat. The exorbitant cost of running the large language models behind these tools not only limits their quality but also threatens the global AI boom they have ignited.
The substantial expense, and the scarcity of the computer chips needed to run these models, creates barriers for companies that wish to deploy them. The cost pressure is pushing even the wealthiest companies to turn chatbots into moneymakers sooner than they intended.
Tom Goldstein, a computer science professor at the University of Maryland, highlights that the current models deployed are not the best available due to cost constraints. These limitations result in weaknesses, including biases and false information.
The implications of unreliable AI language models become apparent when they disseminate false information about real individuals. While tech giants, including OpenAI, Microsoft, and Google, are tight-lipped about the costs, industry experts acknowledge that it is the most glaring obstacle to realizing Big Tech’s vision of AI ubiquity across industries, streamlining processes, and reducing workforce size.
Unlike earlier approaches to machine learning, generative AI demands enormous computational power and specialized graphics processing units (GPUs), required in quantities that only the most affluent companies can afford. That demand entrenches the giants of cloud computing and chip manufacturing, and the scramble for GPU access has made the leading chip providers some of the most highly valued companies in tech.
Silicon Valley’s success in the internet economy was built partly on offering services free at first and turning a profit later through personalized advertising. Analysts suggest, however, that advertising alone may not be enough to make cutting-edge AI tools profitable anytime soon.
Companies providing AI models for consumer use must strike a balance between gaining market share and managing financial losses. Additionally, the quest for more reliable AI is expected to generate profits primarily for chipmakers and cloud computing giants who dominate the digital landscape.
It is worth noting that the leading developers of AI language models are either major cloud computing providers, such as Google and Microsoft, or closely affiliated with one, as in OpenAI’s partnership with Microsoft. Businesses using AI tools from these providers may not realize they are locked into heavily subsidized services that cost far more to run than they are currently paying, warns Clem Delangue, CEO of Hugging Face.
OpenAI CEO Sam Altman indirectly acknowledged the issue during a Senate hearing, noting that GPU shortages are one reason OpenAI does not design its systems to maximize engagement, an admission that underscores how expensive these models are to run.
The cost of AI language models spans their development, training, and ongoing computational demands. These models rely on massive amounts of data and employ star researchers whose salaries rival those of professional athletes. Each query to a chatbot is routed to a data center, where supercomputers perform high-speed calculations to interpret the user's prompt and predict a response. The cost per chat, estimated at single-digit cents, adds up quickly when multiplied across millions of users.
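To see why single-digit cents matters at scale, consider a back-of-envelope estimate. Here is a minimal sketch in Python, where the per-chat cost, usage rate, and user count are all illustrative assumptions rather than reported figures:

```python
# Back-of-envelope estimate of chatbot serving costs.
# All figures below are illustrative assumptions, not reported numbers.

cost_per_chat_usd = 0.04        # assumed: "single-digit cents" per chat
chats_per_user_per_day = 5      # assumed average usage
active_users = 10_000_000       # assumed user base

daily_cost = cost_per_chat_usd * chats_per_user_per_day * active_users
monthly_cost = daily_cost * 30

print(f"Daily serving cost:   ${daily_cost:,.0f}")    # $2,000,000
print(f"Monthly serving cost: ${monthly_cost:,.0f}")  # $60,000,000
```

Because the total scales linearly with both users and chats per user, a few cents per chat can translate into tens of millions of dollars per month for a popular service.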
Efforts to optimize costs are ongoing, with the industry striving to build smaller, more cost-effective models, as the sketch below illustrates. Experts caution, however, that such cost-cutting can compromise the quality and reliability of the models.
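One reason smaller models are cheaper to serve is that inference compute scales roughly linearly with parameter count: a common rule of thumb puts generation at about 2 floating-point operations per parameter per output token. A minimal sketch under that approximation, using hypothetical model sizes and response length:

```python
# Rough comparison of inference compute for models of different sizes.
# Uses the common ~2 FLOPs per parameter per generated token
# approximation; the model sizes and 500-token response length
# are assumptions for illustration.

FLOPS_PER_PARAM_PER_TOKEN = 2
RESPONSE_TOKENS = 500

model_sizes = {                       # hypothetical parameter counts
    "small (7B params)": 7e9,
    "medium (70B params)": 70e9,
    "large (500B params)": 500e9,
}

for name, params in model_sizes.items():
    flops = FLOPS_PER_PARAM_PER_TOKEN * params * RESPONSE_TOKENS
    print(f"{name}: ~{flops:.1e} FLOPs per response")
```

By this approximation, a 7-billion-parameter model is roughly 70 times cheaper to serve than a 500-billion-parameter one, which is exactly the trade-off against quality and reliability that the experts above describe.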
As the economics of AI chatbots continue to evolve, Microsoft and OpenAI are exploring different avenues, including potential ad integration and subscription models, to make them financially viable. Altman is confident that the value of AI justifies finding profitable solutions.
Nevertheless, critics caution that generative AI carries societal costs, including the greenhouse gas emissions produced by its heavy energy consumption, energy that could otherwise be directed to computing tasks with broader benefits.