One of the quirks of stock markets and economic forecasting is the dominance of one-sided narratives, even when contrary evidence abounds—until a major event or influential voice forces a rethink. We saw this with quantum computing, where market euphoria deflated only after industry titans echoed the cautious outlook of seasoned scientists. This propensity to dismiss inconvenient truths, favoring simplistic projections and seductive narratives over nuanced trends, is glaringly apparent in the AI space. In the past ten days alone, three seismic shifts with far-reaching implications for hyperscalers, data centers, energy markets, and beyond have ignited fervent debate among those in the know while barely registering a blip on the radar of most market analysts.
We repeatedly read comments about how hyperscalers will continue to do well, partly because of fast-rising training compute demand, fueled by ever-larger models and scaling laws that require disproportionate compute increases for marginal gains. This presumption, which supports the narrative for investors in data centers and energy stocks, ignores potential disruptions that could shake up the industry and leave investors vulnerable.
It’s Chinese Models That Are Truly Transforming GenAI Now
As we discussed last year, faced with soaring venture capital costs and compute constraints, Chinese GenAI trainers have sidestepped the Western obsession with building the largest models at any cost or throwing hundreds of thousands of GPUs at the problem in the hope that sheer compute will conjure miracles. Instead, these developers have been forced to pursue efficiency-driven solutions, and the results, underscored by two major announcements in a single week, are nothing short of stunning. While the breakthroughs may have escaped the notice of an investor community that assumes price-moving announcements emerge only from a handful of elite Western companies, they have sent ripples through the global AI community and are likely upending plans and priorities for countless training and development teams as new techniques redefine what is possible.
DeepSeek was the first major announcement. It showed how models can be developed on years-old GPUs at costs 15-20x (for emphasis, fifteen to twenty times) lower than is generally assumed. The under-the-hood changes are not a secret, and the models’ industry-leading abilities are not in doubt, because access is as easy as something like ChatGPT: anyone can verify the claims firsthand.
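For readers who want to test those claims themselves, here is a minimal sketch of what that access looks like in code, assuming DeepSeek’s OpenAI-compatible endpoint and model name as publicly documented at the time of writing (both are assumptions to verify, not guarantees):

```python
# Hedged sketch: querying a DeepSeek model through its OpenAI-compatible API.
# Assumptions: the base_url "https://api.deepseek.com" and the model name
# "deepseek-chat" follow DeepSeek's public documentation; check both before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # key issued by the DeepSeek platform
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",  # assumed model name
    messages=[
        {"role": "user", "content": "Summarize the key risks in this filing: ..."},
    ],
)
print(response.choices[0].message.content)
```

The point is less the specific endpoint than the openness: anyone can benchmark these models against the incumbents without special access.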
Overnight, MiniMax-01 displayed a different type of engineering marvel. This model can take 20-32x (for emphasis, twenty to thirty-two times) larger inputs than the best current models. In simpler terms, one can feed it a 16,000-page PDF for analysis. This writer does not have anything so large to test, but the model proved that it could extract figures from various parts of a 450-page IPO prospectus that other models found too large to handle and run a DuPont analysis on them.
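For readers unfamiliar with the term, a DuPont analysis simply decomposes return on equity into profitability, efficiency, and leverage. A minimal sketch of the arithmetic, with purely illustrative numbers rather than figures from any actual prospectus:

```python
# Minimal DuPont decomposition: ROE = net margin x asset turnover x equity multiplier.
# All figures below are illustrative placeholders, not data from any real filing.

def dupont_roe(net_income: float, revenue: float, total_assets: float, equity: float) -> dict:
    net_margin = net_income / revenue          # profitability
    asset_turnover = revenue / total_assets    # efficiency
    equity_multiplier = total_assets / equity  # leverage
    return {
        "net_margin": net_margin,
        "asset_turnover": asset_turnover,
        "equity_multiplier": equity_multiplier,
        "roe": net_margin * asset_turnover * equity_multiplier,
    }

print(dupont_roe(net_income=120.0, revenue=1_000.0, total_assets=2_400.0, equity=800.0))
# -> net margin 12%, turnover ~0.42, multiplier 3.0, ROE 15%
```

The hard part for a model is not this arithmetic; it is pulling net income, revenue, assets, and equity from widely separated sections of a very long document before the arithmetic can even begin, which is exactly where the larger input window matters.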
Why Data Center Investors Need Not Panic Yet
The breakthroughs showcased by DeepSeek and MiniMax-01 underscore the transformative potential of architectural improvements and engineering ingenuity in GenAI development. These advancements are a clarion call for teams worldwide to rethink their strategies: instead of chasing more GPUs and data, the focus is shifting to unlocking efficiency gains through smarter design and execution. This paradigm shift not only lowers barriers to entry but also accelerates innovation by making cutting-edge AI development more accessible.
While these efficiency gains in training might seem to threaten the hyperscaler and data center growth narrative, the reality is more nuanced. Our detailed analysis confirms that inference—not training—accounts for 70-90% of data center compute demand. As AI models become more efficient to train, their deployment in real-world applications is poised to explode, driving sustained demand for centralized compute resources. From enterprise AI tools to consumer-facing applications, the proliferation of AI-driven solutions will require robust infrastructure to handle the growing volume of inference workloads. Far from undermining the growth story, these efficiencies could act as a catalyst, enabling faster adoption and scaling of AI technologies across industries.
Why Hyperscalers and Data Center Investors Cannot Breathe Easy
Another groundbreaking development is NVIDIA’s GB10 chip, introduced at CES 2025 as the heart of a $3,000 desktop AI machine. Hailed by some as having the potential to be as disruptive as the IBM PC, it ushers in the era of decentralized computation, directly challenging the current centralized model. This shift poses a significant risk to hyperscalers and data center investors who have been banking on the continued growth of cloud-based AI workloads.
The appeal of decentralized computation extends beyond cost considerations. Data security, privacy concerns, and the desire for greater control over AI infrastructure are driving many to explore on-premises solutions. But perhaps most importantly, decentralization offers significant latency advantages. Imagine a surgeon using an AI-powered tool for real-time image analysis during a critical operation. With local processing on an AI PC, the lag between image capture and analysis is minimized, enabling faster and more accurate decision-making. This responsiveness is crucial in time-sensitive applications like healthcare, manufacturing, and autonomous vehicles, where milliseconds can make a difference. Governments and businesses dealing with sensitive information may also find it more palatable to perform AI analysis with resources under their own control, reducing reliance on third-party providers and mitigating potential risks associated with data breaches or geopolitical tensions.
While the transition from centralized to decentralized AI computation may not be as dramatic as the shift from mainframes to PCs, dismissing the potential impact of this trend would be a grave mistake. The emergence of powerful AI PCs, coupled with the growing need for data sovereignty, security, and low latency, could significantly alter the landscape of AI infrastructure.
The Rise of Physical AI: A New Era of Embedded Intelligence
Jensen Huang has officially announced the arrival of the Physical AI era. Our condolences to the "Agentic AI" tag for its short-lived existence at the top. It is clear that embedding AI capabilities directly into everyday devices is the new trend, reminiscent of the 1990s push to connect every device in every office and home to the Internet. This creates a network of intelligent objects that can sense, process, and respond to their environment in real time.
This shift towards "Physical AI" is evident in the growing number of consumer products with on-device AI capabilities, alongside the arrival of new products, particularly in robotics. Advanced robot vacuums with enhanced cleaning capabilities, AI-powered security cameras with object and facial recognition, and even solar-powered gadgets with AI for optimized energy consumption are becoming increasingly commonplace. The applications extend further into smart cities, healthcare with wearable monitoring devices, and manufacturing with predictive maintenance and quality control.
However, this burgeoning Physical AI era will demand more than powerful AI PCs. It necessitates further innovations in GenAI compute infrastructure. Efficient and scalable solutions are needed to support the development, deployment, and management of AI models across a vast network of interconnected devices. This includes advancements in edge computing, new AIoT platforms, and the integration of AI with emerging technologies like 6G networks. The future of Physical AI hinges on our ability to build a robust and adaptable infrastructure that can keep pace with this rapidly evolving landscape.
2025: The Year GenAI Broke Free From the Clouds!
The opening salvos of 2025 have shattered the simplistic narratives surrounding GenAI, demanding a reassessment of investment strategies and priorities. Forget the hype about chatbots, ever-larger models, or the fleeting dominance of "agentic AI." The true revolution lies in the convergence of efficiency-driven innovation and the rise of Physical AI, poised to transform industries and services in ways we are only beginning to grasp. As AI breaks free from the confines of the cloud and embeds itself into the very fabric of our world, the possibilities are limitless.