DeepSeek: The Biggest Splash in AI Right Now
The little company is making seismic waves in the AI industry with massive geopolitical ramifications. Or, it's lying.
Over the past few years, the AI foundation model industry has been playing an ever higher-stakes game. Big players like OpenAI, Anthropic, and xAI have set the rules: massive investments in compute-heavy infrastructure, partnerships with cloud giants like Microsoft and Oracle, and proprietary models that demand billions in training dollars.
Last week, the $500 billion "Stargate" AI initiative, announced at the White House with OpenAI, Oracle, and SoftBank, promised to funnel staggering resources into cementing dominance in this space. The media frenzy was complemented by the gossipy real/fake rifts between Elon Musk (xAI), Sam Altman (OpenAI), Satya Nadella (Microsoft), and current and past U.S. presidents.
Yet, over the last month, a quiet drumbeat has been building behind that hype, gradually drowning out the noise and dominating discussions in AI circles. DeepSeek, a tiny startup from China, has released and open-sourced a powerful model, DeepSeek-V3, alongside a smaller family of models, throwing the industry, and its assumptions, into disarray.
But is everything as it seems? Possibly. The implications are pretty massive either way. Here are five statements surrounding DeepSeek and what they could mean for the industry.
1. DeepSeek Is Misrepresenting the Cost of Training Its Models
DeepSeek claims to have trained its powerful models for a mere $5 million—a fraction of the hundreds of millions spent by OpenAI and others. But is this really possible? Industry experts argue that even with state-backed subsidies, training such a model would likely cost significantly more. So why the low-ball figure? It’s a perfect PR move. By creating buzz around its efficiency, DeepSeek not only attracts global attention but also undermines the narrative that bigger budgets lead to better AI. If this is the lie, it’s a calculated move to shape the perception of Chinese AI innovation as lean and efficient.
Then again, what if DeepSeek is lying and training the model actually cost $50M, ten times the original claim? The ramifications would still be incredible, as that would remain a fraction of what other major players are spending to train similar or worse models. Additionally, will the open-source community learn enough to reproduce similar performance and erode the dominance of big vendors?
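To make the size of that gap concrete, here is a rough back-of-the-envelope sketch. The $5M and $50M figures come from the discussion above; the $500M competitor figure is purely an assumption standing in for "hundreds of millions" and is not a reported number.

```python
import math

# Figures taken from the article.
claimed_cost = 5_000_000        # DeepSeek's claimed training cost
hypothetical_cost = 50_000_000  # the "ten times the claim" scenario

# ASSUMPTION: an illustrative stand-in for "hundreds of millions".
# This is not a reported figure for any specific vendor.
competitor_cost = 500_000_000


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Return how many powers of ten separate two costs."""
    return math.log10(larger / smaller)


gap_if_claim_true = orders_of_magnitude(competitor_cost, claimed_cost)
gap_if_ten_times = orders_of_magnitude(competitor_cost, hypothetical_cost)

print(f"Gap if the $5M claim holds: {gap_if_claim_true:.1f} orders of magnitude")
print(f"Gap if the true cost is $50M: {gap_if_ten_times:.1f} orders of magnitude")
```

Under this assumed competitor budget, a tenfold understatement shrinks the gap from roughly two orders of magnitude to roughly one. That is still a large difference, but the exact size depends entirely on the competitor figure you plug in.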
2. China Is Making the U.S. Tech Sector Blink
What if the cost claim is a state-sponsored strategic bluff? On the heels of the U.S. tech sector's announcements of multibillion-dollar investments in data centers and compute-heavy infrastructure, the DeepSeek announcement could compel investors to reevaluate before they break ground. If China convinces the world that cutting-edge AI can be built without such massive expenditures, it could slow the momentum of U.S. initiatives that seemed to guarantee U.S. dominance. By introducing uncertainty, China might be trying to force its competitors into a costly game of second-guessing. If this is the lie, it's a masterstroke of economic and psychological warfare.
3. China Is Pulling a TikTok 2.0…but for Corporations
Could DeepSeek be a Trojan horse? TikTok showed the world how quickly a Chinese application could dominate global markets while raising questions about data privacy and security. DeepSeek is offering direct API access that is 97% cheaper than OpenAI's. At that price, some developers and companies will move to DeepSeek's API.
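As a rough illustration of why that price gap pulls developers in, the sketch below applies the 97% figure to a made-up monthly API bill. The $10,000 baseline is a hypothetical number chosen only for the arithmetic; actual pricing varies by model and usage.

```python
def discounted_price(baseline: float, percent_cheaper: float) -> float:
    """Return the price after applying a 'percent cheaper' discount."""
    return baseline * (1 - percent_cheaper / 100)


# ASSUMPTION: a hypothetical monthly bill with the more expensive provider.
baseline_monthly_bill = 10_000.0

# 97% is the discount quoted in the article.
cheaper_monthly_bill = discounted_price(baseline_monthly_bill, 97)

# How many times more expensive the baseline is than the discounted price.
price_ratio = baseline_monthly_bill / cheaper_monthly_bill

print(f"Baseline bill:   ${baseline_monthly_bill:,.2f}")
print(f"Discounted bill: ${cheaper_monthly_bill:,.2f}")
print(f"Baseline is about {price_ratio:.0f}x more expensive")
```

Put differently, "97% cheaper" means paying about three cents on the dollar, so the same workload costs roughly one thirty-third as much.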
What are the privacy and security implications? DeepSeek could be attracting widespread adoption while quietly collecting corporate and research data at an unprecedented scale.
DeepSeek's open-source models offer some protection, as they can be self-hosted. Moreover, DeepSeek's reasoning approach lets users see what the model is thinking along the way. Yet there are enough user-reported examples, most notably what happens when DeepSeek is asked about Tiananmen Square, to give any company pause before making this its corporate reasoning engine.
It’s a stark reminder that geopolitical strategy often plays a role in technological advancement. The potential for data espionage and subterfuge through such a model cannot be overlooked.
4. The Big Proprietary Vendors Are Toast
Perhaps the real question is not about China, but about the vulnerability of proprietary vendors like OpenAI and Anthropic. If DeepSeek truly rivals or surpasses their offerings, it could signal the beginning of the end for closed models. OpenAI's and Anthropic's multibillion-dollar valuations are built on the premise that their proprietary technology remains unmatched. DeepSeek's open-source approach undermines this, offering competitive tools without the price tag or restrictions. If this is the lie, it highlights how fragile the dominance of proprietary vendors has become in the face of open-source disruption.
5. U.S. Export Controls Are Not Working
Finally, there's the simplest explanation: DeepSeek's performance metrics are overstated. But if its performance claims are true and training actually required massive numbers of advanced GPUs, then the efficacy of U.S. export controls comes into question. The U.S. has restricted the sale of its most advanced chips to China in an effort to stymie its AI progress. However, if DeepSeek managed to obtain and leverage these resources despite these measures, it suggests that current export controls are either ineffective or insufficiently enforced.
Specific cases of chip smuggling or intermediary trade routes could highlight weaknesses in enforcement. Furthermore, the U.S. may need to reevaluate its strategy to ensure that sensitive technologies are not slipping through regulatory gaps. If this is the lie, it raises urgent questions about the need for stronger oversight and coordination in containing the spread of advanced compute resources.
What Does This Mean for the Future of AI?
Each of these statements carries profound implications. If the training cost claim is false, it raises the question: why mislead the public? If China's strategy is to disrupt U.S. investments, it highlights the geopolitical stakes in the AI arms race. If DeepSeek is a data-siphoning tool, it raises urgent questions about trust and security in open-source AI. If proprietary vendors are indeed vulnerable, it could accelerate the shift toward open-source dominance. And if export controls are failing, it suggests a need for a significant recalibration of U.S. policy.
In the end, whether any of these statements are true or not, DeepSeek has already succeeded in forcing the global AI community to rethink its assumptions. The rules of the game are changing, and DeepSeek is proving to be the ultimate wildcard.
I said they "may" be lying, or strongly misleading about what it took to train the model. And that they would still be very efficient if it cost an order of magnitude or more m.
I'm surprised that you think they may be lying and that that is your take. https://x.com/karpathy/status/1872362712958906460?s=46&t=0aWNZ2ZtEd80VMc4Zu1JTQ They got hyper efficient.