The current centralized model of AI compute is not sustainable. The financial and environmental costs are too high, and the benefits are not only concentrated but also contrived. We live in a world where, roughly speaking, over 95% of GenAI compute happens in gigantic data centers of connected GPUs, and where Nvidia holds an 85% market share with aspirations of over 75% margins on each chip sold.
Concentrated compute is driving power-demand surges in the regions that must host the data centers doing the computation for the world. The extreme centralization of GenAI models has also left large efficiency gains unrealized, gains that will likely require breaking the computation up. Any efficiency improvements from decentralization should also translate into material improvements in the environmental footprint.
With AI compute on the cloud, the upstream GPU makers and cloud service providers are generating meaningful revenues against freshly minted equity and debt paper from their largest clients (though in some cases the financing comes from parties other than the vendors reporting the revenues). End users, by contrast, report relatively scant revenue and almost no tales of vast profits so far, which is another reason the GenAI build-out must change: to broaden its appeal and create room for innovation beyond AI models and their underlying chips. As things stand, these unsustainable financial arrangements will lead not just to a massive cyclical downturn but also to numerous investigations in the not-too-distant future.
Extreme compute concentration – or “AI inequality” – raises numerous other issues. It carries systemic security risks that must be “diversified” or distributed away. For real-time multimodality, the latencies involved make centralized solutions impractical, driving a rising need for near-shoring. A global data ocean is less helpful for personal and corporate AI, which need far more tailored data lakes. For corporates and professionals, the ability to reproduce and duplicate results also matters, which will further accelerate the move towards localization. Solutions involving mixtures of experts and agents (in GenAI speak) could also significantly reduce costs.
The share of AI compute on the cloud must decrease dramatically, possibly to as low as 10% or even less from nearly 100% today. A small number of companies may prefer centralization, but the case for splitting compute rests less on the disproportionate spread of benefits, also reflected in stock markets, than on the other reasons discussed above.
Those who extrapolate in a straight line, projecting data-center growth and the attendant demand for GPUs, memory, and power, are likely ignoring the fundamental need for compute to split. That shift would democratize GenAI, making it more accessible and sustainable while driving innovation across sectors.