Tech stocks tumbled. Giant companies such as Meta and Nvidia faced a barrage of questions about their future. And tech executives took to social media to proclaim their fears.
And it was all because of a little-known Chinese artificial intelligence start-up called DeepSeek.
DeepSeek caused waves all over the world on Monday as one of its accomplishments – having created a very powerful AI model with far less money than many AI experts thought possible – raised a host of questions, including whether US companies were even competitive in AI any more.
DeepSeek is “AI’s Sputnik moment,” Marc Andreessen, a tech venture capitalist, posted on social media Sunday.
How could a company that few people had heard of have such an effect?
What is DeepSeek?
DeepSeek is a start-up founded and owned by the Chinese stock trading firm High-Flyer. Its goal is to build AI technologies along the lines of OpenAI’s ChatGPT chatbot or Google’s Gemini. By 2021, DeepSeek had acquired thousands of computer chips from the US chipmaker Nvidia, which are a fundamental part of any effort to create powerful AI systems.
In China, the start-up is known for grabbing young and talented AI researchers from top universities, promising high salaries and an opportunity to work on cutting-edge research projects. Both High-Flyer and DeepSeek are run by Liang Wenfeng, a Chinese entrepreneur.
Over the past few years, DeepSeek has released several large language models, the kind of technology that underpins chatbots such as ChatGPT and Gemini. On January 10th, it released its first free chatbot app, which was based on a new model called DeepSeek-V3.
Why did the stock market react to it now?
When DeepSeek introduced its DeepSeek-V3 model the day after Christmas, it matched the abilities of the best chatbots from US companies such as OpenAI and Google. That alone would have been impressive.
But the team behind the new system also revealed a bigger step forward. In a research paper explaining how it built the technology, DeepSeek said it used only a fraction of the computer chips that leading AI companies relied on to train their systems.
The world’s top companies typically train their chatbots with supercomputers that use 16,000 chips or more. DeepSeek’s engineers said they needed only about 2,000 Nvidia chips.
Why is that important?
Since late 2022, when OpenAI set off the AI boom, the prevailing notion had been that the most powerful AI systems could not be built without investing billions of dollars in specialised AI chips. That would mean that only the biggest tech companies – such as Microsoft, Google and Meta, all of which are based in the United States – could afford to build the leading technologies.
But DeepSeek’s engineers said they needed only about $6 million in raw computing power to train their new system. That was about one-tenth of what Meta spent building its latest AI technology.
How did DeepSeek make its tech with fewer AI chips?
Top AI engineers in the United States say that DeepSeek’s research paper laid out clever and impressive ways of building AI technology with fewer chips.
In short, the start-up’s engineers demonstrated a more efficient way of analysing data using the chips. Leading AI systems learn their skills by pinpointing patterns in huge amounts of data, including text, images and sounds. DeepSeek described a way of spreading this data analysis across several specialised AI models – what researchers call a “mixture of experts” method – while minimising the time lost by moving data from place to place.
Others have used similar methods before, but moving information between the models tended to reduce efficiency. DeepSeek did this in a way that allowed it to use less computing power.
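The routing idea behind a “mixture of experts” can be illustrated with a toy sketch. This is purely illustrative and assumes nothing about DeepSeek’s actual architecture: real systems use learned neural gating networks over billions of parameters, but the principle – a gate sends each input to only one of several specialised components, so most of the model stays idle for any given token – looks like this:

```python
# Toy "mixture of experts" routing sketch (illustrative only; the
# gate, experts and features here are invented for this example).

def gate(token: str) -> list[float]:
    """Toy gating scores: rate each expert's fit for this token."""
    return [
        1.0 if token.isdigit() else 0.0,  # expert 0 handles numbers
        1.0 if token.isalpha() else 0.0,  # expert 1 handles words
        0.5,                              # expert 2 is a fallback
    ]

# Each "expert" is a tiny stand-in for a specialised sub-model.
EXPERTS = [
    lambda t: f"num:{t}",
    lambda t: f"word:{t.lower()}",
    lambda t: f"other:{t}",
]

def route(token: str) -> str:
    scores = gate(token)
    best = max(range(len(EXPERTS)), key=lambda i: scores[i])
    return EXPERTS[best](token)  # only the chosen expert runs

print(route("42"))     # num:42
print(route("Hello"))  # word:hello
print(route("?!"))     # other:?!
```

The efficiency gain comes from that last line: per input, only one expert does any work, so total computation grows far more slowly than the total number of experts. The engineering difficulty DeepSeek addressed is the cost of shuttling data between experts spread across many chips.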
Is DeepSeek’s tech as good as systems from OpenAI and Google?
DeepSeek-V3 can answer questions, solve logic problems and write its own computer programs as effectively as anything already on the market, according to standard benchmark tests.
Just before DeepSeek released its technology, OpenAI had unveiled a new system, called OpenAI o3, which seemed more powerful than DeepSeek-V3. But OpenAI has not released this system to the wider public.
OpenAI o3 was designed to “reason” through problems involving math, science and computer programming. Many experts pointed out that DeepSeek had not built a reasoning model along these lines, which is seen as the future of AI.
Then, on January 20th, DeepSeek released its own reasoning model called DeepSeek R1, and it, too, impressed the experts. That eventually sent US investors and others into a panic late last week and over the weekend as they realised the importance of DeepSeek’s new technology.
US tech giants are building data centres with specialised AI chips. Does this still matter, given what DeepSeek has done?
Yes, it still matters.
Large numbers of AI chips can still help companies in many ways. With more chips, they can run more experiments as they explore new ways of building AI. In other words, more chips can still give companies a technical and competitive advantage.
Hasn’t the United States limited the number of Nvidia chips sold to China?
Yes. To maintain the US lead in the global AI race, the Biden administration had put in place rules limiting the number of powerful chips that could be sold to China and other rivals.
But the impressive performance of the DeepSeek model raised questions about the unintended consequences of the US government’s trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet.
Some experts continue to argue in favour of US trade restrictions, saying that they were only recently put in place and that they will have a greater effect on China’s abilities to create AI as the years pass.
Does DeepSeek’s tech mean that China is now ahead of the United States in AI?
No. The world has not yet seen OpenAI’s o3 model, and its reported performance on standard benchmark tests was more impressive than anything else on the market. But experts are concerned that China is jumping ahead on open-source AI systems.
What exactly is open-source AI?
Like many other companies, DeepSeek has “open sourced” its latest AI system, which means that it has shared the underlying computer code with other businesses and researchers. This allows others to build and distribute their own products using the same technologies.
This is part of the reason DeepSeek and others in China have been able to build competitive AI systems so quickly and inexpensively.
In the AI world, open source first gathered steam in 2023 when Meta freely shared an AI system called Llama. At the time, many assumed that the open-source ecosystem would flourish only if companies such as Meta – giant firms with huge data centres filled with specialised chips – continued to open source their technologies.
But DeepSeek and others have shown that this ecosystem can thrive in ways that extend beyond the American tech giants.
Why is that important?
Many experts have argued that the big American companies should not open source their technologies because they could be used to spread disinformation or cause other serious harm. Some US lawmakers have explored the possibility of preventing or throttling the practice.
But other experts have argued that if regulators stifle the progress of open-source technology in the United States, China will gain a significant edge. If the best open-source technologies come from China, these experts argue, US researchers and companies will build their systems atop those technologies.
In the long run, that could put China at the heart of AI research and development, which could further accelerate its effort to build a wide range of AI technologies, including autonomous weapons and other military systems. – This article originally appeared in The New York Times.