DeepSeek’s new artificial intelligence model has drawn questions from US competitors
How a small Chinese lab says it built a competitive AI model for a fraction of what OpenAI needed to train GPT-4
“It’s been clear for some time now that innovating and creating greater efficiencies—rather than just throwing unlimited compute at the problem—will spur the next round of technology breakthroughs,” says Nick Frosst, a cofounder of Cohere, a startup that builds frontier AI models. “This is a clarifying moment when people are realizing what’s long been obvious.”
DeepSeek’s technology was developed by a relatively small research lab in China that sprang out of one of the country’s best-performing quantitative hedge funds. A research paper posted online in December claimed that an earlier model, DeepSeek-V3, cost a fraction of what it takes to build similar projects. OpenAI has previously said that some of its models cost upwards of $100 million each. The latest models from OpenAI as well as Google, Anthropic, and Meta likely cost considerably more.
OpenAI told the Financial Times that it found evidence linking DeepSeek to the use of distillation, a common technique in which developers train AI models by extracting data from larger, more capable ones. It’s an efficient way to train smaller models at a fraction of the more than $100 million that OpenAI spent to train GPT-4. OpenAI’s terms of service forbid the use of its outputs to build competing models, so distilling from them would be a violation. OpenAI did not give details of the evidence it found.
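In its simplest form, distillation looks something like the sketch below: a small "student" network is trained to mimic the output distribution of a larger "teacher" network rather than learning from raw labels alone. This is a toy illustration with made-up model sizes and random inputs, not a description of DeepSeek's or OpenAI's actual pipelines.

```python
# Minimal distillation sketch (assumptions: generic teacher/student MLPs, random data).
import torch
import torch.nn as nn
import torch.nn.functional as F

# Larger, already-trained "teacher" and smaller "student" being trained to imitate it.
teacher = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution so the student sees richer signal

for _ in range(100):
    x = torch.randn(64, 16)  # stand-in for real training inputs
    with torch.no_grad():
        teacher_logits = teacher(x)  # query the capable model for its outputs
    student_logits = student(x)
    # KL divergence between the softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The appeal is economic: once a capable model exists, its outputs can stand in for expensive human-labeled data, which is why training a smaller model this way costs so much less than training the original from scratch.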
There is an irony to the situation: OpenAI made much of the early progress on its GPT models by hoovering up vast swaths of the written web as training data.