Developers at leading US AI firms are praising the DeepSeek AI models that have leapt into prominence, while also trying to poke holes in the notion that their multibillion-dollar technology has been bested by a Chinese newcomer’s low-cost alternative.
Chinese start-up DeepSeek sparked a stock sell-off on Monday as its free AI assistant overtook OpenAI’s ChatGPT atop Apple’s App Store in the US, harnessing a model it said it trained on Nvidia’s lower-capability H800 processor chips for under $6 million.
As worries about competition reverberated across the US stock market, some AI experts applauded DeepSeek’s strong team and up-to-date research but remained unfazed by the development, said people familiar with the thinking at four of the leading AI labs.
OpenAI CEO Sam Altman wrote on X that R1, one of several models DeepSeek released in recent weeks, “is an impressive model, particularly around what they’re able to deliver for the price”.
Nvidia said in a statement that DeepSeek’s achievement proved the need for more of its chips.
Software maker Snowflake decided Monday to add DeepSeek models to its AI model marketplace after receiving a flurry of customer inquiries.
With employees also calling DeepSeek’s models “amazing,” the US software seller weighed the potential risks of hosting AI technology developed in China before ultimately deciding to offer it to clients, said Christian Kleinerman, Snowflake’s executive vice-president of product.
“We decided that as long as we are clear to customers, we see no issues supporting it,” he said.
Meanwhile, US AI developers are hurrying to analyse DeepSeek’s V3 model. DeepSeek published a research paper accompanying the model, the basis of its popular app, in December, but the document leaves many questions, such as total development costs, unanswered.
China has now narrowed the gap behind state-of-the-art AI models developed in the US from 18 months to six months, one person said. Yet with DeepSeek’s free release strategy drumming up such excitement, the firm may soon find itself without enough chips to meet demand, this person predicted.
DeepSeek’s strides did not flow solely from a $6 million shoestring budget, a tiny sum compared with the $250 billion that analysts estimate big US cloud companies will spend this year on AI infrastructure. The research paper noted that this figure referred specifically to chip usage during the final training run, not the entire cost of development.
The training run is the tip of the iceberg in terms of total cost, executives at two top labs told Reuters. Working out how to design that training run can cost orders of magnitude more, they said.
The paper stated that the training run for V3 was conducted using 2,048 of Nvidia’s H800 chips, which were designed to comply with US export controls released in 2022, rules that experts told Reuters would barely slow China’s AI progress.
Sources at two AI labs said they expected earlier stages of development to have relied on a much larger quantity of chips. One of the people said such an investment could have cost north of $1 billion.
Some American AI leaders lauded DeepSeek’s decision to launch its models as open source, which means other companies or individuals are free to use or change them.
“DeepSeek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen – and as open source, a profound gift to the world,” venture capitalist Marc Andreessen said in a post on X on Sunday. – Reuters